As curiosity grows around the potential of the next wave of technology – Linked Data – and early exemplars emerge exploiting its capabilities, the information world finally has the opportunity to exploit fully the richness of relationships between cultural artefacts. This has considerable potential for academics and students seeking to discover the precise nature of the complex relationship between, say, Homer's Odyssey and James Joyce's Ulysses. However, the ramifications extend beyond academia – a network that seamlessly directs the interested layperson from their current interest to works that are related in a defined way has transformative potential for cultural life.

To date, there is no openly available or commercialized resource anywhere which surfaces such relationships. Yet cultural life constantly throws up new examples which need to be made more easily discoverable. There is scarcely a cult television programme these days without its obligatory, knowing references to popular culture and other texts, as viewers of The Simpsons will be aware. And observe the vast popularity, in all media, of the glamorous vampire: Edward Cullen of Twilight and his brethren emerge from complex transformations of the original, monstrous Dracula of Bram Stoker's seminal novel.

In the pre-web era, there was only one type of information resource for discovering relationships between texts – the citation index. But these indexes covered only those texts with formal integral references, and even then, the precise relationship between the citing and the cited texts occasionally remained unclear. The advent of the web and global search technologies, although undoubtedly transformative, still required searchers to know what they were looking for – there was no mechanism for starting at one text and serendipitously exploring all related cultural outputs.

The second generation of web technology, the social web, and particularly Amazon's recommendation engine, established myriad relationships between cultural artefacts based on consumer behaviour. However, the fundamental nature of the links remains undefined.

It was Julia Kristeva who coined the term ‘intertextuality’ in 1966 to denote the kind of relationships under discussion here. J A Cuddon summarizes her claims:

that a literary text is not an isolated phenomenon but is made up of a mosaic of quotations, and that any text is the ‘absorption and transformation’ of another. […] But this is not connected with the study of sources. […] Kristeva is not merely pointing to the way texts echo each other but the way that discourses or sign systems are transposed into one another – so that meanings in one kind of discourse are overlaid with meanings from another kind of discourse.1

Julia Kristeva thus employs the term to describe an inescapable property of all texts, and all signifying systems. Kristeva postulated a literary system which acts like a collective mind that writers unconsciously draw on. Any literary text will be related to any other, not merely through direct quotation or indirect allusion, but inescapably as a subliminal presence that is part of the very notion of literature. The term is encountered often these days in literary discussions. However, it has come to be applied to a whole range of notions from sources, influences and deliberate allusion, to chance resemblance.

Whatever one thinks of Kristeva's version, note that she does not differentiate between the conscious kinds of reference, allusion and so on, and her own universal intertextuality: she ignores the former and, indeed, cannot account for them (authorial agency being ruled out). The nebulousness of Kristeva's concept would not lend itself easily to the Semantic Web.

Gérard Genette's transtextuality

It may be more productive to confine the scope initially to Gérard Genette's more methodical approach, whilst remaining theoretically neutral and making available the possibility of representing Kristeva's types of discursive transpositions at a future date. In Palimpsests, Genette commences a detailed study of transtextuality (his preferred term for relationships between texts) and develops a thorough taxonomy that would be most useful; he proposes five kinds of relationship: architextuality, intertextuality, paratextuality, metatextuality and hypertextuality.2 These he subsumes under the common term, ‘transtextuality’.

Architextuality links the text to a grouping such as ‘types of discourse, modes of enunciation, literary genres’.3 This, then, is a relationship not with another text, but with something more abstract. The relationship is one of ‘inclusion’.4 Texts are included in genres, which can in turn be nested within others. The representation of architextuality as Linked Data would require an ontology of genres and modes that can be hierarchical and overlapping. The novel itself is a mosaic of incorporated genres, other discourses, and other speech genres (the latter being, according to Bakhtin, a component part of all literature).5 Existing theory would be carefully consulted in the formulation of such a taxonomy, taking into account the instability and contentiousness of genre (Google Books committed the error of basing genres entirely on US retailer categories).
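A minimal sketch of how such a hierarchical and overlapping genre ontology might be held in memory, assuming a simple ‘broader than’ relation in which a genre may have more than one parent. The genre keys and their placement are illustrative assumptions only, not a proposed taxonomy.

```python
# Hypothetical genre hierarchy: each genre maps to the broader genres that
# include it. A genre may have several parents, so this is a graph, not a tree.
BROADER = {
    "gothic-novel": {"novel", "gothic"},
    "epistolary-novel": {"novel"},
    "novel": {"prose-fiction"},
    "gothic": set(),
    "prose-fiction": set(),
}

# Architextual links: a text may be included in several genres at once.
TEXT_GENRES = {
    "dracula": {"gothic-novel", "epistolary-novel"},
}


def all_genres(text_id):
    """Return every genre that includes the text, directly or by nesting."""
    seen = set()
    stack = list(TEXT_GENRES.get(text_id, ()))
    while stack:
        genre = stack.pop()
        if genre not in seen:
            seen.add(genre)
            stack.extend(BROADER.get(genre, ()))
    return seen
```

Calling all_genres("dracula") walks upwards from both of the text's immediate genres, so a search on the broad heading would still find a text that was catalogued only under a narrower one.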

Intertextuality: ‘a relationship of copresence between two texts or among several texts: […] typically as the actual presence of one text within another. In its most explicit and literal form, it is the traditional practice of quoting […] In another less explicit and canonical form, it is the practice of plagiarism […] Again, […] it is the practice of allusion’.6 Allusions may be conscious or unconscious; or, better, attestable or not. Quotation may be implicit or explicit; intentional or incidental; marked or unmarked.7

Paratextuality: the ‘generically less explicit and more distant relationship that binds the text […] to what can be called its paratext: a title, subtitle, intertitles; prefaces, postfaces, notices, forewords, etc.; marginal, infrapaginal, terminal notes; epigraphs; illustrations; blurbs, book covers, dust jackets, and many other kinds of secondary signals, whether allographic or autographic’.8 Allography (written by someone other than the author) and autography (by the author) can make significant differences; for example, a paratext, when allographic, may become a metatext, with a critical relationship to the original text (see below). Because of this, it may not be clear when, say, an introduction becomes a metatext, but Genette stresses authorial intentions here, as paratexts: ‘ensure for the text a destiny consistent with the author's purpose’.9

Genette breaks these features of paratextual messages down further into categories such as peritext/epitext; prior/original/delayed; posthumous/anthumous. Again, it is these fine distinctions which offer the potential user a rich set of pathways through literary connections.10

Metatextuality: ‘the relationship most often labelled “commentary”. It unites a given text to another, of which it speaks without necessarily citing it’.11 This is thus the realm of literary criticism. Metatextuality has not yet been analyzed in enough detail to implement in more than a sketchy fashion (though Genette indicates some preliminary moves).

Hypertextuality: ‘any relationship uniting a text B (which I shall call the hypertext) to an earlier text A (I shall, of course, call it the hypotext), upon which it is grafted in a manner that is not that of commentary’.12

Hypertextuality might include such processes as translation, interpretation, adaptation (i.e. to a different genre or medium; Genette's ‘intermodal transmodalization’13), illustration, sequels and prequels, paraphrase, editing, rearranging (e.g. montage), mash-up, variation, censorship, bowdlerization, or rendering for children.

(NB: This is not to be confused with the familiar notion of ‘hypertext’ in electronic media.)

Note that these are not firmly fixed boundaries; as Genette says, ‘one must not view the five types of transtextuality as separate and absolute categories without any reciprocal contact or overlapping […] their relationships to one another are numerous and often crucial’.14

Finally, it is important to note that transtextuality is not confined to print; it can take place in film, painting, even music (and there are technologies to assist the mark-up of these media which could be enlisted in this project). And it can take place between media.

Transtextuality and Jane Eyre: a test case

Charlotte Brontë's Jane Eyre is a particularly illuminating choice to illustrate Genette's transtextuality. In the novel, woven-in quotes from Milton and the Bible and countless others are a deliberate and significant feature.15

The fairy tale is one architext that lies behind Jane Eyre; the specific fairy tales, ‘Cinderella’ and ‘Bluebeard’, form possible hypotexts, but are also specific intertextual allusions. But there are plentiful examples of all five of Genette's categories; our diagram (Figure 1) offers a simplified illustration of some of these.

Figure 1. Example from ‘Jane Eyre’ showing some of the five transtextual elements identified by Genette

Paratextuality: For Genette, such components of a text as titles count as paratexts. Jane Eyre is subtitled, ‘An Autobiography’; this has effects upon the reader such as imparting the sense of psychological realism that is so characteristic of the work. Another paratextual feature is the apparently allographic Preface by ‘Currer Bell’. The ideological processes at work here in terms of Brontë's adoption of a male persona are, of course, very important (as metatextual links to contemporary reviews would reveal).16

Intertextuality: Jane Eyre is replete with allusions to other texts. There are intertextual references to Richardson's Pamela (which is also a hypotext to Jane Eyre), plus allusions to and quotations from Shakespeare, Milton, the Bible, Byron, Wordsworth, Scott, and many others, as well as conduct guides, evangelical tracts, phrenology manuals, and the chemistry treatises of Humphry Davy. Brontë quotes from the written text of Bewick's Birds, but also performs transmodal transformations (that is, between different media) of the illustrations therein. This transformation is that of ekphrasis – the depiction in words of a picture.

The intertextuality of Jane Eyre is not merely formal decoration; Jane, as narrator, is in dialogue with the Bible and Milton, re-reading these texts in order to rediscover in them original truths regarding the sanctity of authentic, companionate marriage that have been distorted by false citation. In her enforced solitariness, Jane herself is an intertextual reader, making parallels between the domestic tyranny she endures and the tyranny recounted in Goldsmith's History of Rome.17

Architextuality: architexts of Jane Eyre are the Gothic novel, Bildungsroman; allegedly, so the paratext tells us, autobiography; folk tale (‘Cinderella’ motif); Puritan confession; realist novel; romantic novel; ‘governess novels’, such as Anne Brontë's Agnes Grey and Fanny Burney's The Wanderer.

Metatextuality: The representation of metatextuality can also lead to important insights at an interdisciplinary level. In Figure 1, we show that in the Introduction to the Penguin Classics edition of Jane Eyre, Stevie Davies contends that Humphry Davy, the nineteenth-century chemist, was a significant influence on authors such as Charlotte Brontë (the intertextual allusions have been pointed out above).18 Brontë drew heavily on his ideas to represent physical attraction and repulsion between her fictional characters.

A network model pointing to authoritative URIs also offers a robust model for scholarly verification. In this instance, the hypertextual relationship between Jane Eyre and Jean Rhys's prequel, Wide Sargasso Sea (discussed below), is validated by the co-presence of both novels in Graham Allen's critical study, Intertextuality.19 Validation and traceability are significant benefits of a Linked Data approach.
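The idea of validation by co-presence can be sketched as a simple lookup: a claimed link between two works is corroborated when some critical study discusses both. The example.org URIs below are placeholders and the DISCUSSES table is a hypothetical stand-in; a real model would point at authoritative identifiers.

```python
# Hypothetical URIs standing in for authoritative identifiers.
JANE_EYRE = "http://example.org/work/jane-eyre"
WIDE_SARGASSO_SEA = "http://example.org/work/wide-sargasso-sea"
ALLEN_STUDY = "http://example.org/work/allen-intertextuality"

# Metatextual links: each critical study maps to the works it discusses.
DISCUSSES = {
    ALLEN_STUDY: {JANE_EYRE, WIDE_SARGASSO_SEA},
}


def corroborating_studies(work_a, work_b):
    """Return the studies in which both works are co-present."""
    return sorted(
        study
        for study, works in DISCUSSES.items()
        if work_a in works and work_b in works
    )
```

An empty result would not refute a claimed relationship; it would only show that the model does not yet hold a study in which both works appear.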

Hypertextuality: the novel not only alludes to earlier texts, but is also derived from them through transformations that Genette delineates in depth. Richardson's Pamela has already been mentioned; there are others. In turn, Jane Eyre is the hypotext for numerous hypertexts. There are two main ways in which texts become transformed into hypertexts: transformation and imitation, each of which can involve other processes.

Rhys's Wide Sargasso Sea becomes what it is through transformations of cutting and amplification, and transvocalization,20 where the narrative voice shifts from Jane Eyre to Rochester's first wife, and then to Rochester. The film, I Walked with a Zombie, is a transmodalization of the novel – reshaped into a different medium. Daphne du Maurier's Rebecca is, in Genette's terms, an imitation, as are countless lesser romantic novels.
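The relationships described in this test case could be recorded as typed links. The following is a minimal sketch only: the identifiers, the Link record and the process labels are our own illustrative assumptions, and each link carries a set of categories because, as Genette stresses, the five types overlap.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Genette's five types of transtextuality."""
    ARCHITEXTUAL = "architextuality"
    INTERTEXTUAL = "intertextuality"
    PARATEXTUAL = "paratextuality"
    METATEXTUAL = "metatextuality"
    HYPERTEXTUAL = "hypertextuality"


@dataclass(frozen=True)
class Link:
    source: str        # the later or dependent text
    target: str        # the text it relates to
    kinds: frozenset   # one or more Kind values, since categories overlap
    process: str = ""  # optional finer label (hypothetical vocabulary)


LINKS = [
    # Pamela is both alluded to and a hypotext, so two kinds apply.
    Link("jane-eyre", "pamela",
         frozenset({Kind.INTERTEXTUAL, Kind.HYPERTEXTUAL})),
    Link("wide-sargasso-sea", "jane-eyre",
         frozenset({Kind.HYPERTEXTUAL}), "transformation"),
    Link("i-walked-with-a-zombie", "jane-eyre",
         frozenset({Kind.HYPERTEXTUAL}), "transmodalization"),
    Link("rebecca", "jane-eyre",
         frozenset({Kind.HYPERTEXTUAL}), "imitation"),
]


def hypertexts_of(hypotext):
    """Return the texts recorded as hypertexts of the given hypotext."""
    return sorted(
        link.source
        for link in LINKS
        if link.target == hypotext and Kind.HYPERTEXTUAL in link.kinds
    )
```

Asking for hypertexts_of("jane-eyre") gathers the film, Rebecca and Wide Sargasso Sea, while hypertexts_of("pamela") returns Jane Eyre itself, showing how one work can sit on both sides of the relationship.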


An argument needs to be made for what the Linked Data approach would offer. The publishing community and academic librarians need an appreciation of the transformative potential of Linked Data not only for a collective understanding of the relationships between texts, but more broadly of cultural history. The Linked Data approach lays the basis for new pedagogical and research technologies.

In our model, we have linked Jane Eyre to the Library of Congress Subject Heading ‘Gothic Writing’. We can thus start to group together all the instances of a genre, identifying influences and transformations within the grouping. This data could then be reused in interesting ways – a timeline would represent the development of the genre visually, for example. Elsewhere in the model, we have linked up to the Library of Congress Subject Heading ‘Chemistry’, in order to start to trace the influence of non-literary disciplines in fiction, as scientific fields enter the popular consciousness over the course of the nineteenth century. Thus, by modelling literary relationships within a broader context, we can see the development of literature as part of a wider cultural history.
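As a rough sketch of the kind of reuse envisaged, works linked to a shared heading can be filtered and ordered by date to feed a timeline. The heading keys are hypothetical, and the works other than Jane Eyre are our own illustrative additions rather than part of the model described here.

```python
# Hypothetical records: (title, year of first publication, heading keys).
WORKS = [
    ("Dracula", 1897, {"gothic"}),
    ("Jane Eyre", 1847, {"gothic"}),
    ("The Castle of Otranto", 1764, {"gothic"}),
    ("Pamela", 1740, {"epistolary-fiction"}),
]


def timeline(heading):
    """List (year, title) pairs for works linked to a heading, oldest first."""
    return sorted(
        (year, title)
        for title, year, headings in WORKS
        if heading in headings
    )
```

The sorted pairs returned by timeline("gothic") could then be handed to any visualization tool to plot the development of the genre.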

Library catalogues and course reading lists could be extended to show the influences, allusions, sources and transformations related to a given text, with metadata specifying each relationship, who ascribed it, and the justification given. In interactive catalogues and reading list systems, students and academics could potentially make their own interpretations concerning relationships between texts, and share them in a seminar context or among other researchers.
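One way to picture such metadata is as a record that attaches an ascriber and a justification to every claimed relationship, so that competing interpretations can coexist. This is a sketch under our own assumptions: the field names, relation labels and reader identifiers are all hypothetical.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    source: str         # the text the claim is about
    relation: str       # hypothetical relation label
    target: str         # the related text
    asserted_by: str    # who ascribed the relationship
    justification: str  # the reason given


ASSERTIONS = [
    Assertion("wide-sargasso-sea", "hypertext-of", "jane-eyre",
              "reader-1", "prequel; narrative voice shifts to the first wife"),
    Assertion("rebecca", "imitation-of", "jane-eyre",
              "reader-2", "hypothetical seminar note"),
    Assertion("rebecca", "imitation-of", "jane-eyre",
              "reader-1", "hypothetical seminar note"),
]


def who_asserts(source, relation, target):
    """Return everyone who has ascribed this exact relationship."""
    return sorted({
        a.asserted_by
        for a in ASSERTIONS
        if (a.source, a.relation, a.target) == (source, relation, target)
    })
```

Keeping the ascriber on each record, rather than on the relationship itself, means that two readers can make the same claim for different reasons and both contributions remain traceable.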

Paratexts, expressed as Linked Data, could depict what Genette calls pre-texts – earlier stages of a work.21 This would be a very useful resource (for example, in genetic criticism, tracing the creative process), and again, exploits the traceability of a Linked Data approach.


In cultural life, relationships of a literary nature between cultural artefacts proliferate. By taking a Linked Data approach and applying Gérard Genette's typology of transtextuality, we are able to model these relationships. Such a web-scale model will provide the scholar with a means of sharing discoveries and insights in this area, extend the interest of laypeople into outlying areas of cultural output, and produce a clearer and more granular picture of cultural history as that model is augmented over time.