Posted by scott

Last week, I spent a few days in Evanston attending the joint conference of the TEI and the Chicago Colloquium on Digital Humanities and Computer Science (DHCS 2014). 

I didn’t send myself there as a representative of the Harry Watkins Diary—I had other goals pertaining to doing more teaching and research at the intersection of DH and CS. But I had a few interactions which may have some impact on the production and publication of this project, as well as subsequent Scholarly Investigation:

Editing XML. Since the beginning of the project, I’ve been pining for a web-based XML editor, one which could be made to integrate with Drupal’s database. This way, the transcription work could be completely contained within the site, rather than distributed between the site (which tracks revisions and workflow) and oXygen (a desktop XML editor). On the first morning of the conference, I got into a conversation with the CEO of FontoXML, a new Dutch company developing… a very nice web-based XML editor. As he was explaining how it worked and how it could be used, he mentioned plugging it into content management systems, and he rattled off a few examples. “And Drupal?” I asked. He flashed me an odd smile. “Well, maybe.”

“Because,” I said, “I’m working with this Drupal-powered transcription project…” I explained a little bit about our project and why a nice web-based “plug-in” would be good for us (their editor is quite user-friendly; it doesn’t require users to know XML!). In response, he said, “Next week I’m meeting with a Dutch Drupal guy to talk about whether/how the two could be integrated. Maybe I could use your project as a use case?” Well, this is tremendously exciting. I emailed him a bit more about the project, and he said he’d get back to me after their meeting.

Publication. I’m still undecided about the best way to publish the diary, when the time comes. The eXtensible Text Framework (XTF) is probably the top contender, especially as there’s been some discussion on the development forum about moving away from a publication model that relies on multi-frame webpages (ugh). But I’m always curious about other possibilities. At the conference, I talked with people at the ARTFL Project (that’s American and French Research on the Treasury of the French Language), a collaboration between some folks in France and at U-Chicago. They’re developing a new version of their PhiloLogic “reader” that works with TEI-encoded documents, offering searchability and both web and mobile interfaces (they have Android apps for the complete Shakespeare and the Eighteenth Century Collections Online, and they’re developing an iOS app). This is certainly a different way to think about publishing the diary electronically: Harry on a tablet!

Search and Other Investigations. Finally, I saw a presentation about a very impressive German project: a multi-university, multi-layered, and certainly multi-euro effort to produce a digital scholarly version of Der Freischütz: TEI encodings of the various libretti, MEI encodings (that’s the Music Encoding Initiative) of various versions of the score, and (most eyebrow-raising) a tool that allows you to hear selected voices extracted from extant recordings of the opera.(*) But what struck me as relevant to Harry is their technique of developing “topic maps” of ideas, keywords, people, and source materials, which they use to augment searching. Their maps are handmade rather than computer-generated. But if our titleography includes information about authors and roles, and our “role-ography” and/or personography include information about who played what, then the computer could easily produce a network/graph of affinities structured by (say) when these people/plays/places were mentioned in the diary. This could offer a richer sort of search result and might also reveal noteworthy but otherwise-obscure connections. To make a long story short, I realized that it wouldn’t hurt to think a bit more creatively about what we could do with all this painful hand-encoding, and about how to get funding to pay our staff to generate all these high-quality -ographies.
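To make the computer-generated graph idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the entry dates, the identifiers (`person:A`, `play:X`, and so on), and the function names are hypothetical placeholders, not our actual data or encoding. It simply treats two people/plays/places as connected whenever they are mentioned in the same diary entry, and counts how often that happens.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: diary entry date -> identifiers mentioned in that entry.
# In a real pipeline these would be extracted from the TEI encoding and
# resolved against the personography/titleography; here they are made up.
mentions_by_entry = {
    "entry-1": ["person:A", "play:X", "place:P"],
    "entry-2": ["person:A", "play:X"],
    "entry-3": ["person:B", "play:X"],
}


def co_mention_graph(mentions):
    """Return a Counter mapping (id1, id2) pairs to the number of
    entries in which both identifiers appear together."""
    edges = Counter()
    for ids in mentions.values():
        # sorted(set(...)) removes duplicates and gives each pair a stable order
        for a, b in combinations(sorted(set(ids)), 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges, node):
    """List the identifiers connected to `node`, strongest connection first."""
    scores = Counter()
    for (a, b), weight in edges.items():
        if a == node:
            scores[b] += weight
        elif b == node:
            scores[a] += weight
    return scores.most_common()


edges = co_mention_graph(mentions_by_entry)
related_to_play = neighbours(edges, "play:X")
```

A search for `play:X` could then show `related_to_play` alongside the ordinary hits, surfacing whoever and whatever tends to turn up in the same entries. Co-mention within an entry is only one possible notion of affinity; a window of several days, or links drawn from the role-ography, would be equally plausible choices.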

(*) As Google's Germlish puts it, "Algorithms for temporal and spectral segmentation of audio recordings are designed to ensure maximum flexibility in handling the acoustic domain in the edition. Since in this area first basic research is still to run, the three selected numbers of the opera are recorded using special imaging techniques to have a hand reference results for the optimization of the process and on the other hand ideal example material for the demonstration of the Edition model in the project."