Tuesday 24 July 2012

Ruling the Screen: compromising decisions and decisive compromises - DRH 99

I was lucky: my first paper at an international conference got published right away, and in two different places. I presented 'Where is the editor? Resistance in the creation of an electronic critical edition' at the DRH Conference (Digital Resources for the Humanities) in Glasgow in 1998. The original paper got published in Human IT (1/1999: 197-214) where my name was vikingized as 'Edvard'. A revised version appeared a year later in DRH 98. Selected papers from Digital resources for the Humanities 1998 (Marilyn Deegan, Jean Anderson & Harold Short (eds.), London: OHC, 2000, p. 171-183). My second international paper, however, didn't make it into publication, partly because it was too sketchy and reported on research in progress. The main aim of 'Ruling the Screen: compromising decisions and decisive compromises', which I presented at DRH99 in London, was to introduce the Electronic Streuvels Project and to report on the work so far. I focused especially on six decisive compromises I had to make because of the financial and infrastructural context of the project. Two of these compromises concerned the encoding architecture for the textual variation and the design of a project-specific DTD for the encoding of letters.

Because this paper was never published, my use of nested <NOTE>-tags instead of the TEI parallel segmentation method to generate an inclusive view of all variant versions of the text in the edition was misunderstood by the encoding community when the electronic critical edition of De teleurgang van den Waterhoek was published in 2000 by Amsterdam University Press. The venture was not so much about documenting the textual variation among the different versions of the novel, but about creating a model and an interface by which parts of the text could be optically compared to one another independently. Criticism has been voiced by Dan O'Donnell, for instance, in his review of the electronic edition in Literary and Linguistic Computing (17/4 (2000): 491-496). O'Donnell pointed out that my solution was a poor one because it strayed significantly from the TEI definition of <NOTE> and because it ignored several features of the TEI standard intended for precisely the type of functionality that I was suggesting. O'Donnell suggested that a combination of <APP>, <RDG>, and optionally <LEM> elements could have been chosen within the TEI Guidelines and that <LINKGRP> could have been used to link the variant versions to the orientation text. These choices, however, were the result of one of the compromising decisions outlined in the current paper, namely that I had to use TEI-Lite because of time and financial constraints. O'Donnell rightly pointed out that adding them to the TEI-Lite DTD wouldn't have been too difficult, but I've always been against modifying a digested DTD like TEI Lite in order to lift it up to the level of full TEI. The choice was also one of ease of formatting. With only a basic knowledge of SGML and TEI, I was unable to do many transformations, and the low-cost low-tech SGML publication suite Multidoc was perfect for my purpose of getting out an electronic edition in 21 months' time.

The nested <NOTE>-construction also served my model of the linkeme on which I elaborated in my article 'A Linkemic Approach to Textual Variation. Theory and Practice of the Electronic-Critical Edition of Stijn Streuvels' De teleurgang van den Waterhoek.' (Human IT, 1/2000: 103-138). In the current paper I also presented the DTD I had produced for the encoding of modern correspondence materials. This DTD was the very first attempt at what later became the DALF-Scheme. It is important to know, when reading this paper, that we were then still living in the SGML world.

Abstract

At the end of every discussion on textual criticism and scholarly editing, there is this question about lay-out: 'How will the editor present his or her theories, findings and editorial decisions to the interested public?' The scholarly editions already published on paper seem to have opted, not for maximum legibility and usability of the edition, but for a form that shows its well-researched content in such a way that it can expect all the academic recognition it certainly deserves, leaving the interested user amidst a labyrinth of diacritics and codes and, most of all, without enough fingers to use as bookmarks in order to read the edition as it was designed.

This paradoxical situation, in which the explanation of the choice for and the constitution of a particular version of a text is laid down in some sort of barely legible apparatus, culminates in the production and publication of static and paper-based historic-critical editions in which the genesis and the transmission of a particular work is articulated and presented against a base text conforming to well-defined guidelines. The synoptic, lemmatized and inclusive organisations of historic-critical editions are all meant to clarify the results of the research, but by their condensed form they problematize their accessibility and usability (McGann, 1996; Lavagnino, 1995; De Tienne, 1996 & Vanhoutte, 1999). On top of that, their physical record impedes their longevity and interchangeability. And these are exactly the parameters by which academic work is being assessed.

The theory and practice of genetic and textual criticism show that this is not a simple question to answer. There are in fact as many answers to this question about lay-out as there are potential editors thinking of possible editions. With the creation and publication of the electronic-critical edition of Stijn Streuvels’ De teleurgang van den Waterhoek (The decline of the Waterhoek) (De Smedt & Vanhoutte, 2000), the Electronic Streuvels Project (ESP) hopes to suggest yet another answer to this basic question, not by producing a complete historic-critical edition, but by aiming at a compromise which incorporates intellectual integrity, usability and utility.

Paper

Enfin,'t is er - of het nu inslaat,
of weer in stilte begraven wordt,
kan me minder schelen1

Stijn Streuvels

In 1996 Marcel De Smedt published a genetic article on Stijn Streuvels’ novel De teleurgang van den Waterhoek from 1927 (De Smedt, 1996), based on a close reading of the author’s correspondence with friends and publishers and on thorough research of the extant primary sources. These sources can all be found in the Archive and Museum for Flemish Cultural Life (AMVC, Archief en Museum voor het Vlaamse Cultuurleven) in Antwerp and are described in Streuvels (1999) and Streuvels (2000). They are:

  • a defective draft manuscript from 1926 (S935/H15),
  • a complete neat manuscript from 1927 (S935/H18),
  • a corrected typescript (1927) (S935/H16),
  • a corrected and annotated copy of the pre-publication of the novel in the literary journal De Gids which functioned as manuscript for the first print edition of 1927 (S935/H17),
  • a defective corrected proof (S935/H17),
  • an elaborately edited version of the first print which functioned as manuscript for the second revised edition of 1939 (S935/H24).

This drastically revised edition of 1939, which retained only 73.4% of the original text of the first edition, was probably the author’s response both to the publisher's request to produce a shorter and hence more marketable book,2 and to the Catholic critics who had fulminated against the elaborate depiction of the erotic relationship between two of the main characters. It goes without saying that this revision resulted in a different text, telling a different story with different conclusions for literary criticism. Up to 1987, this revised text had been the basis for 13 reprints of the book.

De Smedt's conclusion of his genetic study was a plea for a new scholarly edition of the novel based on the restored text of the first print edition.

We believe that in this case, the first print edition prevails over the journal publication. It wasn’t till the redaction of this first edition that Streuvels had the complete text of his work at hand, and that he could overlook and edit his book as a whole. (De Smedt, 1996: 326; my translation)

and further on

It is obvious that manifest mistakes in this first edition have to be corrected with the use of the proofs and the manuscript. (Ibidem)

In what form and when this edition should become reality remained an open question to him.

In January 1998, the Royal Academy of Dutch Language and Literature (Belgium), charmed by De Smedt’s proposal and on the lookout for a new challenging profile for this learned society, employed one full-time research fellow to design and realize this project, which was called the Electronic Streuvels Project (ESP). From the very start of the project it was clear that it had to include an electronic component which would make the exclusive choice for a specific well-defined form or for one kind of edition (e.g. a documentary, historic-critical, diplomatic, study or reading edition) obsolete. The project would have to include elements of all of these, but be none of them. Very soon, the choice was made for an electronic edition project which aims:

  • To deliver an electronic edition of Stijn Streuvels’ De Teleurgang van den Waterhoek in 21 months' time.
  • To obtain expertise in using SGML/TEI in creating electronic editions.
  • To function as a pilot-project in electronic scholarly editing in the field of modern Dutch literature.
  • To deliver a project report which will be helpful as guidelines for further electronic editing projects in Flanders and the Netherlands.

With the inefficiency of conventional paper-based editions and the illegibility of their apparatus variorum as an omnipresent demon, the project wants to explore new ways of producing editions for a diverse audience and suggest alternative solutions for the presentation of variants in an electronic environment to the interested user. I believe we succeeded in doing so with the publication of both an electronic-critical edition on CD and a text-critical edition in book form as a first spin-off product.

The hard-copy spin-off product (paperback and hardback editions) by all means qualifies as a scholarly reading edition in that it answers the central criterion as defined by Bowers in his essay Notes on theory and practice in editing texts:

Perhaps the central criterion for such a reading edition is that its text is intended to serve two audiences - the scholarly and the generally informed non-professional public, in each case without essential compromise. (Bowers, 1992: 245)

Whereas the hard-copy version will present the constituted reading text of the first edition, accompanied by a glossary list, scholarly articles on the text-constitution and the transmission of the work together with an exemplary article on the genetic variants, the electronic edition will include the fully searchable texts of the pre-publication published in De Gids, the first edition from 1927 and the second revised edition from 1939, the digital facsimiles of three primary sources (i.e. the complete manuscript, the corrected version of De Gids and the corrected version of the first print), a glossary list, a (genetic) chapter on the production and the transmission of the work, including relevant correspondence between the author and his publishers in full text (ca. 70 letters), and a study on the reception of the work. Taking the (edited) text of the first edition as the orientation text, this hypertext edition will link the different versions of the text on the paragraph level in order to show the variant readings. With this choice of the paragraph instead of the variant as linkeme, we think we have found a gentle compromise for the aforementioned dichotomy between intellectual integrity and legibility.

The scope of this enterprise (and hence the form and formality of its spin-off products) was largely defined by a set of reality-driven compromising decisions on the level of the project administration, which had their repercussions on the methodology of the project, i.e. a set of six decisive compromises had to be put forward.

1.

The Royal Academy funds the ESP with private money from their patrimony. Therefore, only the employment of one full-time research fellow could be financed, leaving a small budget for hard- and software and the production of the CD. As a consequence, all the electronic work such as OCR’ing, imaging, text encoding, system architecture etc. had to be done in-house by one person with very basic hard- and software and a lot of creativity. On top of that, the funds only allowed the project to run for 21 months. Therefore, a choice had to be made as to what to include in the edition.

2.

From the very start of the project, the steering committee was focused on the lay-out of the result, i.e. the CD-ROM, without knowing anything about text-encoding, markup, imaging or Humanities Computing in general. The aim of the project was something along the lines of: 'Creating a CD-ROM by which users can compare different versions of the text by showing at the same time all variants on the paragraph level as well as the corresponding digital facsimiles of the document witnesses.' Hyperlinking and digital facsimiles were the keywords in this description. This 'narrowing down' of the text-critical description of variants and the limited time, which didn’t allow for a long learning curve, made me skim the option for the full-cream TEI DTD to the reality of the TEILite DTD. A compromise which was in itself a compromising decision.

3.

In looking for a valid way to encode the relation between the corresponding paragraphs of the different versions, I first wanted to document the relationship using the CORRESP attribute on <P>, but although this seems to work conceptually, it didn’t work in the practical world of the browsers. Most popular browsers (including Panorama Pro and MultiDoc Pro) have difficulties in expressing it in a visible way that is useful for editions. The same is true of the CopyOf and the SameAs attributes in the full TEI.

The compromise was found in a combined construction of nested <NOTE> tags to show the paragraph agreement and <XREF>’s for references to the digital facsimiles. By giving each paragraph of each full text document source a unique ID and drawing up correspondence tables from these IDs, it was relatively easy to generate one mammoth-SGML instance by running a suite of AWK and PERL scripts on the three basic SGML instances which contained the encoded full text versions of the pre-publication, the first and the second print. The XREF’ing then had to be done manually, because no machine-readable transliteration of the facsimiles had been made.

This resulted in a typical construction of the following type. (NB: this formatted structure is not valid SGML because of mixed content. The whitespaces between <NOTE> and <P> should be removed.)

<P ID="ed.td1.1.003" N="1.003">
   <NOTE>
      <P>MS</XREF></P>
      <P>DG
         <NOTE>
            <P><SEG>DG</SEG></P>
            <P ID="ed.tg.1.003" N="1.003">"Aho! Aho!"</P>
         </NOTE>
      </P>
      <P>DGcor<XREF DOC="g1.064065"></XREF></P>
      <P>D1cor<XREF DOC="d1.004005"></XREF></P>
      <P>D2
         <NOTE>
            <P><SEG>D2</SEG></P>
            <P ID="ed.td2.1.003" N="1.003">
            <Q TYPE="speech" DIRECT="Y" WHO="reiziger">
            — Aho! Aho!</Q></P>
         </NOTE>
      </P>
   </NOTE>
<Q TYPE="speech" DIRECT="Y" WHO="reiziger">— Aho! Aho!</Q></P>

This explicit articulation of the virtual paragraph correspondences is by far the clearest case of the ruling of the screen in the project.

To overcome the fact that this mammoth-instance is theoretically and methodologically speaking “unsound” for analytic research operations limited to one version of the text, the three basic SGML instances are supplied with this edition-instance on the CD.
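The generation step described under this compromise, paragraphs with unique IDs plus a correspondence table merged into one instance, can be sketched roughly as follows. This is a minimal illustration in Python and not a reconstruction of the original AWK and PERL scripts: the function names, the dictionary layout and the simplified sample content are all assumptions made for the example, and the manually added <XREF> references to the facsimiles are left out. Paragraph content is treated as already-encoded SGML and copied verbatim, and no whitespace is written between tags, in line with the remark on mixed content above.

```python
def variant_note(siglum, para):
    """Wrap one variant paragraph in the nested <NOTE> construction."""
    return (
        f"<P>{siglum}<NOTE>"
        f"<P><SEG>{siglum}</SEG></P>"
        f'<P ID="{para["id"]}" N="{para["n"]}">{para["content"]}</P>'
        f"</NOTE></P>"
    )


def merge_paragraph(orient, variants):
    """Build one paragraph of the merged instance.

    orient   -- paragraph of the orientation text (dict: id, n, content)
    variants -- list of (siglum, paragraph) pairs from the other versions
    """
    opening = f'<P ID="{orient["id"]}" N="{orient["n"]}">'
    if not variants:
        # No corresponding paragraphs: keep the paragraph as it is.
        return f"{opening}{orient['content']}</P>"
    notes = "".join(variant_note(siglum, para) for siglum, para in variants)
    return f"{opening}<NOTE>{notes}</NOTE>{orient['content']}</P>"


def merge_edition(orientation, versions, table):
    """Merge all versions into one list of paragraphs.

    orientation -- list of paragraphs of the orientation text
    versions    -- {siglum: {paragraph id: paragraph}} (hypothetical layout)
    table       -- correspondence table {orientation id: {siglum: variant id}}
    """
    merged = []
    for orient in orientation:
        row = table.get(orient["id"], {})
        variants = [(siglum, versions[siglum][pid]) for siglum, pid in row.items()]
        merged.append(merge_paragraph(orient, variants))
    return merged


# Simplified sample data; the IDs follow the pattern of the example above.
orientation = [{"id": "ed.td1.1.003", "n": "1.003", "content": "<Q>Aho! Aho!</Q>"}]
versions = {
    "DG": {"ed.tg.1.003": {"id": "ed.tg.1.003", "n": "1.003", "content": '"Aho! Aho!"'}},
    "D2": {"ed.td2.1.003": {"id": "ed.td2.1.003", "n": "1.003", "content": "<Q>Aho! Aho!</Q>"}},
}
table = {"ed.td1.1.003": {"DG": "ed.tg.1.003", "D2": "ed.td2.1.003"}}

for paragraph in merge_edition(orientation, versions, table):
    print(paragraph)
```

Keeping the correspondence table separate from the three basic instances means that those instances stay untouched and usable on their own, while the merged instance can be regenerated whenever the table is corrected.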

4.

De teleurgang van den Waterhoek, like many modern novels, mixes poetry and prose. Several paragraphs in the novel contain songs, which can be identified as poetry. Because of an error in the content model of <P> in P3, which hasn't been corrected in the revised P3 (sometimes called P4 beta) as I have it,3 <LG> and <L> are not allowed inside <P>. To solve this problem, two options are open:

  1. Modify the TEILite DTD by changing the content model for <P> so that it allows both <LG> and <L> as its children.
  2. Embrace <LG> or <L> with <Q>-tags. This compromise creates yet another pitfall, known as the mixed content problem. Every element whose content model in the TEI DTD is %specialPara; only allows #PCDATA or a sequence of children.

5.

The edition contains a corpus of 70-odd letters whose encoding with TEILite was problematic. The theory of letter editing imposes strict rules on its practice and on the visualisation of the edition, something which couldn’t be neglected even in our electronic edition.

First of all, in editing letters it is common practice to transcribe the physical appearance of the writing, e.g.

  • what’s underlined in the letter is put in italics in the edition
  • what’s double-underlined in the letter is put in italics and underlined in the edition
  • what’s been added is put between /slashes/
  • what’s been deleted is put between <-lower than and greater than marks with a minus sign>
  • what’s been altered is put between <lower than and double greater than signs>>

This asks for generic markup to describe procedural information.
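As a rough illustration of how the bracket conventions listed above could be carried over into markup, the following Python sketch rewrites the three plain-text conventions for additions, deletions and alterations as tagged elements. The element names ADD, DEL and ALT are hypothetical stand-ins chosen for this example, not the element names of the DTD used in the project, and the function name is likewise an assumption. The underlining conventions are typographic and cannot be recovered from plain text, so they are not covered. All three conventions are matched in a single pass, so that the slash in a generated end tag is never mistaken for an addition marker.

```python
import re

# Hypothetical element names, chosen for this sketch only.
TAGS = {"deleted": "DEL", "altered": "ALT", "added": "ADD"}

# One pattern with three alternatives, tried in this order at each position.
PATTERN = re.compile(
    r"<-(?P<deleted>[^<>]+)>"    # <-deleted text>
    r"|<(?P<altered>[^<>]+)>>"   # <altered text>>
    r"|/(?P<added>[^/<>]+)/"     # /added text/
)


def to_markup(text):
    """Replace the transcription conventions in `text` with tagged elements."""

    def replace(match):
        kind = match.lastgroup           # name of the alternative that matched
        tag = TAGS[kind]
        return f"<{tag}>{match.group(kind)}</{tag}>"

    return PATTERN.sub(replace, text)


print(to_markup("I will /certainly/ come <-tomorrow> <on Monday>>"))
```

A sketch like this would also treat other text between slashes, such as a date written as 02/06/1938, as an addition, so in practice the input would have to be checked by hand.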

Further, there is the need to markup specific elements of a letter as such:

  • catalogue number
  • envelope
  • postmark
  • sender
  • receiver
  • sender address
  • receiver address
  • initials
  • subject statement
  • editorial commentary
  • words which are unclear
  • the distinction between a correction by the author and an editorial correction

Instead of modifying the TEI DTD I found it more useful and quicker to write my own STREULET DTD which defines, amongst others, all of these elements and allows for the use of both procedural and descriptive markup.

6.

Due to lack of time, the digitised facsimiles are supplied as JPEGs, derived from 24-bit 300 dpi TIFF files, without any supplementary tagging or documentation.

Notes

1. Letter of Stijn Streuvels to Joris Eeckhout, 27.11.1927. Archive UB KULeuven, archive Joris Eeckhout (P23/93).

2. Cf. Letter of R. Van der Velde to Stijn Streuvels of 02/06/1938. AMVC (S 935 / B) and included in (Streuvels, 2000).

3. LG and L should be removed from m.chunk and added to m.inter which would result in the following entity declarations:

<!ENTITY % m.inter '%x.inter %m.bibl; | %m.hqinter; |
      %m.lists; | %m.notes; | %m.stageDirection; | castList |
      figure | l | lg | stage | table | text'>

<!ENTITY % m.chunk '%x.chunk ab | eTree | graph | p |
      sp | tree | witList'>

References

  • Bowers, Fredson (1992). 'Notes on theory and practice in editing texts.' In Peter Davidson (ed.), The Book Encompassed. Studies in Twentieth-Century Bibliography. Cambridge: Cambridge University Press, 244-257.
  • De Smedt, Marcel & Edward Vanhoutte (2000). Stijn Streuvels, De teleurgang van den Waterhoek. Elektronisch-kritische editie/electronic-critical edition. Amsterdam/Gent: Amsterdam University Press/KANTL.
  • De Smedt, Marcel (1996). 'Uit de ontstaansgeschiedenis van De teleurgang van den Waterhoek.' In Rik Van Daele & Piet Thomas, De vos en het Lijsternest. Jaarboek 2 van het Stijn Streuvelsgenootschap. Tielt: Lannoo, 309-326.
  • De Tienne, André (1996). 'Selecting Alterations for the Apparatus of a Critical Edition.' TEXT, 9 (1996): 33-62.
  • Lavagnino, John (1995). 'Reading, Scholarship, and Hypertext Editions.' TEXT, 8 (1995): 109-124.
  • McGann, Jerome (1996). 'The Rationale of HyperText.' TEXT, 9 (1996): 11-32.
  • Streuvels, Stijn (1999). De teleurgang van den Waterhoek. Tekstkritische editie door Marcel de Smedt en Edward Vanhoutte. Antwerpen: Manteau.

Wednesday 18 July 2012

First steps in Digital Humanities

Back in 1995, at Lancaster University where I undertook an MA in Mediaeval Studies (with 'ae'!), I met Professor Meg Twycross, who turned out to become one of the most influential women in my life. At a time when it was still possible to read through the complete internet (which we attempted in the computer labs at night) and when we were trying out different nicknames on #IRC chat channels, Meg Twycross not only caught my attention with her tremendously well taught courses on Medieval literature and culture and Paleography, but especially with the pilot for the York Doomsday Project which she was building at that time. In one of my nightlong Internet sessions, I came across Stuart Lee's Break of Day in the Trenches Hypermedia Edition, which also exploited hypertext as a didactic means in the teaching of literature and culture. This appealed so much to me that I started to build similar editions of poems by Hugo Claus when I was a research assistant at the University of Antwerp in 1996. This was picked up by people from the Department of Didactics at the University of Antwerp who invited me to present at a conference on teaching Dutch in secondary education. I presented my first conference paper on 15 November 1996 under the title "Retourtje Hypertekst. Een reis naar het hoe en waarom van hypertekst in het literatuuronderwijs" and a revised version was published as:

However, this wasn't my first publication. Before this article came out, I had already published two pieces about the same matter:

  • 'Oorlogspoëzie en HyperTekst: Gruwel of Hype?' WvT, Werkmap voor Taal- en Literatuuronderwijs, 20/79 (najaar 1996): 153-160.
  • 'Met een Doughnut de bib in. Over de rol van HyperTekst in het literatuuronderwijs' Vonk, 25/5 (mei-juni 1996): 51-56.

In the following years, I wrote some more on this subject:

  • 'De geheugenstunt van hypertekst.' Leesidee, 3/10 (december 1997): 777-779.
  • 'De soap 'Middeleeuwen.' Leesidee, 3/9 (november 1997): 697-698.
  • 'Het web van Marsua.' Tsjip/Letteren, 7/3 (oktober 1997): 9-12.

Meg Twycross not only stimulated my interest in the use of hypertext for literary studies and through this for models of electronic editions, she also charged me with an important mission which changed my life forever. One day, when I was awarded the County College Major Award which came with a cheque for £250, I asked her what I should do with that money and she told me to go away, learn everything I could about SGML and come back and tell her. The first book I bought with the prize money was Charles Goldfarb's The SGML Handbook.

Both the interest in the hypertextual modeling of scholarly editions and the markup of texts using SGML formed what I have been doing since. And the only person to kindly blame for it is Meg Twycross.

What if your paper doesn't make it into print?

Recently a graduate student at UCL emailed me with a request to have access to a number of conference papers I presented at the beginning of my academic career, none of which made it into a publication. Apart from the abstracts which were published in the conference book, nothing of the research or argument presented in these papers survives.

Is this a bad thing? Not necessarily. One of the reasons why they were never turned into a publication is probably that they were simply not good enough. Another reason may have been that the 20-minute presentation didn't have enough body to write out as a full paper submission to a journal or a chapter in a book. A third, and more plausible, reason is that I was too busy doing other stuff to revisit the conference paper and rewrite it as an academic paper.

Nevertheless, I do think they have some value to some people, including myself. I myself am interested in the history of the field and in the evolution of ideas - I can't just read an article by my colleagues without looking up the previous publications on which the argument builds - and I sometimes find it difficult to reconstruct a history of thoughts because the documentation is lacking. Therefore, I decided to dig up my old conference presentations and make them publicly available on this blog over the coming weeks. For me personally, it'll probably be a confronting revisit of my first steps in academia, but it will hopefully generate a better understanding of the provenance of my current ideas.

For my own documentation and for the sake of contextualisation I will provide each paper with a short introduction explaining the circumstances of the research and the occasion of the presentation. I'll also try to reconstruct which conference papers were the inspiration to published papers.