Friday, March 17, 2017

Text Markup and THATcamps

As I mentioned in last week's blog post, text markup is incredibly complicated and incredibly technical. Having a very basic understanding of coding, I understand the theory behind the work that goes into building markup tools; however, for my purposes, I believe that examples of practical application would be most helpful.

I found myself running into a wall in the past week as I attempted to figure out the best way to learn more about TEI and text markup. I'm excited about these methodologies and I've already considered using them in my thesis next semester. However, understanding theory will only go so far -- I need practice. There are several articles that teach theory but, when it comes to learning the programs that implement it, the field is pretty DIY. That being said, to take a step toward solving my problem I've registered for a THATcamp taking place in Washington D.C. on the weekend of March 25th, and I'm very excited to meet others in the field. It'll be great to meet others interested in DH, and I'll certainly blog my experience afterwards!

This blog is going to be two parts this week. Because I'm piggybacking off of my last two blogs, the reading material I have for this post is one main article and two resources, which have been helpful to me in understanding the building blocks of text markup.

First up, the article! My main reading for this week was “The Text Encoding Initiative and the Study of Literature” by James Cummings.

Cummings introduces his article with a brief history of the Text Encoding Initiative (TEI) by introducing some of the guidelines and sponsors that together make up the initiative. He states the chapter's thesis as follows:
This chapter will examine some of the history and theoretical and methodological assumptions embodied in the text-encoding framework recommended by the TEI. It is not intended to be a general introduction...nor is it exhaustive in its consideration of issues...This chapter includes a sampling of some of the history, a few of the issues, and some of the methodological assumptions...that the TEI makes.
It is still fascinating to me that TEI is such a young endeavor. According to Cummings, it was formed at a conference at Vassar College in 1987, and very few of the principles established at that time have changed. This is exciting because the field is new and accessible -- the people who dive in are free to determine how the tools are used.

I've chosen this article because I feel that it's important not only to have a grasp of the technologies, but also to understand the history. The article includes technical language relating to different markup languages, SGML (Standard Generalized Markup Language) and XML (Extensible Markup Language), explains the history of these languages, and describes how they are used. I was intrigued by Cummings's explanation of the transition from GML (Generalized Markup Language), a noted "milestone system based on character flagging, enabling basic structural markup of electronic documents for display and printing," to SGML, which was "originally intended for the sharing of documents throughout large organizations." As time went on, SGML proved not universal enough, and XML was adopted and is still used because of its flexible nature.
XML has been increasingly popular as a temporary storage format for web-based user interfaces. Its all-pervasive applicability has meant not only that there are numerous tools which read, write, transform, or otherwise manipulate XML as an application-, operating-system- and hardware-independent format, but also that there is as much training and support available.
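To make the idea of "text markup" a little more concrete, here is a small sketch of my own (not from Cummings's article) that parses a simplified, TEI-flavored XML fragment with Python's standard library. The fragment and its element names are illustrative only -- real TEI documents use the TEI namespace and a much richer vocabulary -- but it shows the core idea: the markup records structure alongside the text itself.

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical TEI-style fragment: <sp> marks a speech,
# <l> marks verse lines, and attributes carry metadata (speaker, line number).
tei_fragment = """
<text>
  <body>
    <sp who="#hamlet">
      <l n="1">To be, or not to be, that is the question:</l>
      <l n="2">Whether 'tis nobler in the mind to suffer</l>
    </sp>
  </body>
</text>
"""

root = ET.fromstring(tei_fragment)

# Because the structure is explicit, a program can pull out exactly
# the units a reader cares about -- here, numbered verse lines.
lines = [(l.get("n"), l.text) for l in root.iter("l")]
for number, text in lines:
    print(number, text)
```

This is the "application-, operating-system- and hardware-independent" quality Cummings describes: any XML-aware tool, in any language, could read this same fragment and recover the same structure.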
Throughout the article Cummings highlights key points and goals of the TEI. The design goals section examines the standards set for the TEI to be as straightforward and accessible as possible for anyone interested in learning the text encoding methodology. He examines the community-centric nature of the TEI, and the emphasis on keeping the field open and collaborative. I'm excited to be coming into the academic world at this time because, although the DH field has a distinct technological learning curve, I'd rather face that curve in a community setting than in the traditional closed-off world of academic hazing.

Cummings also discusses the user-centric nature of the TEI. Due to the community-based nature of the field, it must deliver what users of all different disciplines need. This can be a challenge, but it also exemplifies the versatile nature of the beast. As I have explained, I'm interested in using text markup and the TEI to see what they can uncover about texts that have been close-read to death. In the field of literature, we all know close reading; we all know how to compare the elements of a book. I want to take this to the next level: I want to see what technology can show me, and I want to learn how to use the programs.

Cummings explains that the TEI may have been influenced by New Criticism, a school of literary criticism with which I am quite familiar. He argues that the TEI, rather than reacting against this structuralism as many poststructuralists might desire, is in fact compatible with New Criticism, describing "the TEI's assumptions of markup theory as basically structuralist in nature as it pairs record (the text) with interpretation (markup and metadata)." This is something I would like to delve into further, because I can understand both sides of the New Criticism comparison argument.

I highly suggest reading this article, as Cummings successfully accomplishes his proposed thesis. I came away feeling as if I had learned the key points in the history of the TEI without being drowned in technical conversation. I am increasingly interested in learning to code, as I am amazed by the things we can achieve with computer programs.

If you are interested in the technical side, the Stanford University Digital Humanities department's website includes many helpful resources, particularly "Metadata and Text Markup," which further explains buzzwords and phrases in the field, and "Content Based Analysis," which explains more about text content mining.

Additionally, the TEI website has several helpful links that may take one down many rabbit holes. I got stuck for a long while going through examples of projects that use TEI encoding.
