DH Deluge

Wednesday, May 1, 2013

What I Learned from DH: I Still Love TEI

Perhaps this post should be re-titled "One Thing I Learned from DH: I Still Love TEI," as I definitely learned more than this post will touch on, but I would like to focus on what I have found to be my greatest takeaway from my coursework in the digital humanities.

To give some background, a little over a year ago I was a bit of a wistful first-year graduate student drifting through student-job applications, hoping to land something to give me some more money and experience while in school. I ended up being very fortunate and landing a job as an assistant to the arts & humanities librarian at IU, under whom one of my duties has been assisting on the Victorian Women Writers Project. This project is an initiative at IU to create TEI versions of texts written by women from the Victorian period. I was quickly grateful for the opportunity to contribute to it, as I found TEI to be a fascinating endeavor.

Of course, I also happened to choose about as difficult a text as possible for my first encoding project. Not only was it replete with citation after citation and footnote after footnote, not only was it full of a bajillion historical and mythical figures that I would need to include in my prosopography, it also happened to have a bounty of quotations in ancient Greek. These were not only rendered in the Greek alphabet throughout the text, but with the bajillion diacritics used on ancient Greek. So a word of advice to those picking their first text to encode: make sure to read more than the title page and table of contents when making your choice.

In my time working on this text, which I would definitely call a labor of love and hate, I also had the opportunity to work as an intern under the same librarian, and since I was quite enamored with the TEI I had been doing, I decided to devote some of my research as an intern to reading further into the background of TEI. This led to my being even more interested in the variety of applications TEI has, due to its extensibility and adaptability. So I decided I needed to sign up for the digital humanities course and try to pursue a project that would give me an even better understanding of TEI.

Thus, throughout the semester, my group and I took on the task of creating guidelines on how to use TEI when encoding nonlinear narratives. Since this involved gaining an intimate knowledge of all the elements and many of the attributes TEI has to offer, as well as the rules of what can contain what, I am definitely more ready than ever to tackle encoding. In addition, it provided the opportunity to devise new elements and attributes, which was an exciting challenge: learning how to create elements whose structures within TEI were sensible, and whose definitions were not too limiting. Overall, this was a rewarding lesson in how to reach a group consensus when writing such specifications, and in getting into the grit of making them general enough to be useful for a plethora of nonlinear narratives, instead of just for the texts we focused on. I can definitely walk away knowing that I have a better grasp of TEI than I did at the beginning of the semester, and I hope that wherever I find employment I am able to take part in TEI projects.

And I haven't even gotten to write an actual ODD yet! So much more to learn, and I have to say it's quite exciting.

Thursday, April 25, 2013

Video Games, Take II: The Problems of Preservation

In my last post I discussed why video games should be studied, and why the digital humanities is a field uniquely suited to carry that torch. There is one key to being able to properly study a particular video game that I had not addressed, though: the continuing existence of that video game. Unfortunately, the preservation of video games is hardly a straightforward affair, and there are several aspects that must be considered and hopefully addressed when preserving a video game for future use and study.

1) Source code vs. ROM

First, there's the issue of what exactly you should preserve to preserve the game itself. Many people would probably think, "Hey, I've kept around this old NES cartridge of Zelda II: The Adventure of Link. That's good enough, right?" Well, not necessarily and not really. See, the file you keep on that cartridge (or on a disc, on your hard drive, in the cloud, whatever) is a Read-Only Memory file, i.e. an iteration of that game as software formatted to run in a particular environment (e.g. your NES). So, if you want to save the file that says how the game should function, you want to save the source code, which is something usually only the developer has. And unfortunately, many developers don't keep that around as well as they should once a game has been released, either because they don't care or because they don't really understand the value of their source code vs. the released product.

2) Hardware & Software

Okay, so let's say you've successfully got a copy of the game preserved that you should be able to play. Now you just need to have kept around something to play it on, or have the capability to modify or build something to play it on. This is pretty straightforward to understand if what you have is something like an NES cartridge: you need the right video game console (or a replica of it). But what if you've only saved the software from that cartridge as a ROM file? Well, hopefully you can get an emulator or virtual machine to run it on your computer. And at that, an emulator or virtual machine that can run it as it was meant to run, since you're clearly not running it on the hardware and software that ROM was designed for. So in addition to preserving the game, you have to at least preserve the specifications of the hardware and software it was supposed to run on.

3) Interface

Speaking of hardware, what about the interface components that define how you interact with the game? Is it adequate to play an NES game on an HD television, when it was designed with CRT televisions in mind? In other words, you also want to consider preserving the type of display it was designed to be shown on, or at least replicating what that type of display would have provided. In addition to the display, there are also the input interfaces, such as a keyboard, mouse, or game controller. To truly replicate the game experience as designed, you really should have the same type of input device, if not the identical one, for controlling the game.

4) Multiplayer Gaming Environment

Multiplayer games get their own bonus entry on this list! The reason for this is relatively simple: when the experience in a game is reliant upon there being multiple players in the game, it's pretty much impossible to replicate the game as designed by yourself. Now, this isn't as much of an issue with games that feature a limited number of players, such as 2 to 4, although it obviously does require having around enough hardware and software to support all those players at once. However, what is truly difficult, and presents a unique challenge in preservation, is the massively multiplayer game, such as World of Warcraft, which is intended to have a vast community of players online at the same time. Without the player base, the environment of such a game is significantly altered.

As you can see, there are a myriad of challenges to be overcome when it comes to preserving video games for the future, some of which I still don't see good solutions for (such as the preservation of massively multiplayer games without enslaving masses of people to keep playing them for you into eternity).

Tuesday, April 9, 2013

Video Games: Art or Not, Why They Should Be Studied in the Digital Humanities

Whenever a jarringly new creative medium emerges, there comes without fail a debate about the merits of approaching and analyzing it. This has happened with photography and film, two media whose critical study we no longer question today (barring, of course, fringe voices), and we now see the same debate raging anew over video games. One of the most common forms such debate takes is whether or not the new medium can be considered art, and with video games this has been a very public debate, as exemplified when Roger Ebert declared "video games can never be art." While I very much disagree with Ebert's opinion (to an extent; I have a subjective definition and will agree that they were never art to him, but that does not mean they cannot be art to others), this kind of debate is very much welcome, as it intrinsically creates critical discussion of such works, and helps forge a new discourse for how this medium should be studied, regardless of its status as art or not.

Since the study of art is only one aspect of the humanities (as the field is also engaged with culture, history, et al.), I can think of no field of study more suited to picking up the task of engaging video games critically than the digital humanities. Video games are objects that do not lend themselves well to many of our extant discourses because of their amalgamation of technology, design, creativity, and dynamism. The digital humanities not only supports the interdisciplinary, multimedia aspect of video games much better than many other fields due to the interdisciplinary nature of the humanities in general, it is also one of the few fields of study to already critically engage other dynamic texts (e.g. in critical code studies, since code is practically always dynamic in how we can interact with it).

Granted, new media studies is another more recent field that can also pick up this mantle and help us engage with video games critically. But what it cannot offer that the digital humanities can is the amount of support needed to connect this new mode of discourse with other extant humanities discourses. In other words, digital humanities acts much better as an interdisciplinary communicator among various disciplines, and new media studies is but one of those humanities disciplines which the digital humanities can include in such a diverse engagement. Video games should be studied as new media, as narratives, as historical objects, as visual design, and as more, and the digital humanities will allow such a complex discourse to be created around them.

Thankfully, I'm not the only one thinking this. In 2012, the Journal of Digital Humanities had a special section devoted to the intersection of video games with the field of history. While this is far from perfect (in that it only looked at this medium from one field's point of view, whereas the digital humanities allows for a forum to compare the different insights available through different fields of study), it is still a welcome step in the right direction.

Monday, April 1, 2013

Marking-Up Multimedia

One of my favorite aspects of the digital humanities is TEI, and how it allows scholars to embed metadata with semantic weight into the text itself. We no longer have to present a reading of the text simply as an external artifact; we can integrate that reading into the digital rendition of the text. For example, if we want to examine the role of location in a corpus, that corpus can be tagged directly for instances of place, and through this tagging we can literally manipulate the texts themselves to extract meaning for display and analysis.
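To make the place example concrete, here is a rough sketch in Python of what that extraction could look like. It assumes places have been tagged with TEI's placeName element; the sample passage, the function name count_places, and the choice of the standard library's xml.etree.ElementTree are all illustrative assumptions rather than anything from a real project, and the fragment is cut down so far that it would not validate as a full TEI document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The TEI namespace, needed to find elements in a namespaced document.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample passage for illustration only (no teiHeader, so not valid TEI).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>She left <placeName>London</placeName> for <placeName>Bath</placeName>,
    then returned to <placeName>London</placeName> in the spring.</p>
  </body></text>
</TEI>"""


def count_places(tei_xml: str) -> Counter:
    """Count how often each tagged place name occurs in a TEI document."""
    root = ET.fromstring(tei_xml)
    names = (
        "".join(element.itertext()).strip()
        for element in root.iterfind(".//tei:placeName", TEI_NS)
    )
    return Counter(names)


place_counts = count_places(SAMPLE)
print(place_counts.most_common())  # [('London', 2), ('Bath', 1)]
```

The point is only that once the reading lives in the markup, a few lines of code can pull it back out across a whole corpus; a real project would likely also normalize spellings or use reference attributes to tie variant names to one place.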

Since this type of mark-up was designed with textual documents in mind, it works fantastically for them, but runs into problems when we try to expand it to include multimedia outside of strictly textual materials, such as audio and video. But why limit ourselves to simply marking up transcriptions of lyrics and dialogue, or to scripts?

Now, I'm sure you're wondering, how can you adequately mark up something that you hear over time? Or visuals that are constantly in motion? These things are non-static in how we experience them, but they do not have to be non-static when we mark them up. I would suggest turning to non-linear editing for a model of how to treat them as static in a way that would be adequate for adapting TEI to such media.

For those unfamiliar with what non-linear editing looks like, here's a screencap from Adobe Premiere to help illustrate it:

[Image: screencap of the Adobe Premiere editing interface]

The part you want to pay especial attention to there is that timeline sequence along the bottom, which has clips of audio and video (those colorful boxes) arranged along it. That's a view commonly used for non-linear editing. (N.B. It's called non-linear because you can jump to and from anywhere on the timeline with disregard for original linearity, unlike in traditional, physical editing.)

Now, imagine you have an interface similar to this, except instead of loading multiple clips to edit, you have but one object on the timeline, which would be whatever object you'll be marking up. The upper right pane would be used for what it is now: as a "monitor" through which you can view (or listen to) the content you are editing. The pane to the left of the monitor would then be for editing the TEI tag(s) being opened or closed at that particular point on the timeline (on which your cursor would currently rest). The granularity of this temporal mark-up could be set to a specified interval, such as a tenth or a hundredth of a second, allowing each project to be as precise as necessary as to when tags open and close.

By working in conjunction with a timeline, you could also "wrap" portions of the file in the appropriate tags. Say a particular character gives a speech from 2h10m05s in the footage until 2h13m18s; you could then highlight that portion of the file in this view and then wrap the selected time segment with the <said> element, which would insert an opening tag at 2h10m05s and a closing tag at 2h13m18s.
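As a very rough sketch of the bookkeeping such an editor would need, the Python below turns the two timecodes from that example into seconds, snaps them to a chosen granularity, and records the wrapped segment. Everything here is hypothetical: the "2h10m05s" timecode format, the names to_seconds, snap, wrap, and TimedTag, and the tenth-of-a-second default are assumptions made for illustration, not an existing tool or a TEI serialization.

```python
import re
from dataclasses import dataclass

# Hypothetical timecode format such as "2h10m05s"; each part is optional.
TIMECODE = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?")


def to_seconds(timecode: str) -> float:
    """Convert a timecode like '2h10m05s' into seconds from the start."""
    match = TIMECODE.fullmatch(timecode)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised timecode: {timecode!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def snap(seconds: float, granularity: float) -> float:
    """Round a time to the nearest multiple of the project's granularity."""
    return round(round(seconds / granularity) * granularity, 6)


@dataclass(frozen=True)
class TimedTag:
    """One element wrapped around a stretch of the timeline (times in seconds)."""
    element: str
    start: float
    end: float


def wrap(element: str, start_code: str, end_code: str,
         granularity: float = 0.1) -> TimedTag:
    """Record an element that opens at start_code and closes at end_code."""
    start = snap(to_seconds(start_code), granularity)
    end = snap(to_seconds(end_code), granularity)
    if end <= start:
        raise ValueError("a tag must close after it opens")
    return TimedTag(element, start, end)


speech = wrap("said", "2h10m05s", "2h13m18s")
print(speech)  # TimedTag(element='said', start=7805.0, end=7998.0)
```

Keeping the tags as separate records that point at times, rather than splicing them into the media file, is one possible design; it leaves the audio or video untouched and lets overlapping tags coexist.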

Of course, this is just a nascent idea, and there would be technical specifics to be worked out, such as supported multimedia formats, software for such editing, and how this could actually be made beneficial on the consumer's end (e.g. how can this be used to dynamically enrich the manner in which a researcher studies this object?), but it would be wonderful if it could be made to work as a way to encode multimedia that has a temporal dimension. (Or, if there's something out there like it already, I'd love to know of it.)

Wednesday, March 20, 2013

Spreading the Pages of Naked Lunch

If there was ever a novel that could benefit from the analysis and display techniques digital technologies have brought to the humanities, it's William S. Burroughs' seminal work Naked Lunch (ha, "seminal," I'm so mature; but who doesn't love a good double entendre?). And it's not just the radical structure of the novel itself (or should I say the structure in each iteration of the novel?) that could benefit more than most novels from such analysis and reconstructive manipulation, it's also the novel's changing structure over time. So I guess in a way both of those are still due to its radical structure. Hm. I think it would be best if I let this excerpt from an essay by Carol Loranger do the talking:

"Naked Lunch has undergone at least five significant changes in the three and a half decades since its first publication. The changes in each case have consisted of the addition or deletion of large, often self-contained portions of text. None of these changes can be considered accidental variants, since changes of this magnitude and these particular kinds were enacted by author or publisher in response to specific pressures. But neither can these changes be satisfactorily marked in each case as deliberate authorial revisions in the sense that, for example, passages in the 1909 "New York" edition of Daisy Miller can be clearly marked as the late James's late-Jamesifying amplifications of the 1878 edition. Some of Burroughs's additions pre-date Naked Lunch, others are mutually contradictory, and yet others were written or transcribed by third parties and were included in some editions but omitted from others, presumably with Burroughs's blessing. Moreover, Burroughs's history of abandoning the text to circumstance and necessity and his authorial claim to have "no precise memory of writing the notes which have now been published under the title Naked Lunch" (NL xxxvii)--coupled with his subsequent experiments with the unauthored cut-up in the Nova books and his call for guerrilla assault on the idea of authorial ownership in The Third Mind--suggest very strongly that authorial intent is antithetical to the very spirit of Naked Lunch."

Honestly, I think it's almost impossible to explain that in any shorter a summary, at least not without losing a good sense of just why it's so impossible to say what exactly constitutes the text of Naked Lunch. Granted, in 2001, two years after the aforementioned Loranger piece was published, the William S. Burroughs Trust published "The Restored Text," which is what can now be more or less called the most definitive edition, and it does include supplementary introductory material and appendices that give a solid overview of the different iterations of the text throughout its history. But I very deliberately say most definitive, because the tools available to us now do allow us to create what would be a definitive version of the text.

But how's that possible, you ask, when we don't have a truly definitive edition? Well, that's just it, there is not a truly definitive edition: to create a "definitive edition," we have to include every edition, and allow a reader to consume the text, or texts, in any order, as published or otherwise, jumping between passages and editions, or not, as desired or required by the reader. As the editors of the restored edition noted, "Naked Lunch resists the idea of a fixed text;" in fact, in the "Atrophied Preface" within Naked Lunch itself, the narrator (or is it a narrator? the author? an author?) explains its dynamic structure as such: "You can cut into Naked Lunch at any intersection point...I have written many prefaces. They atrophy and amputate spontaneous[...]" (second ellipsis mine).

With the use of digital transcription methods, such as TEI, we now have the means to produce a hypertextual edition of the novel that could properly resist fixation, that would adequately represent its fluidity and allow the reader dynamic interactions with the text(s). You would think the 50th anniversary in 2009 of the first publication of the text, in any published form, would have been a great occasion to unveil such an undertaking, or at least some sort of project that would truly make use of the unique advantages and dynamic fluidity available in digital objects. Unfortunately, the most significant undertakings were a traditional retrospective done by Columbia University Libraries, a print edition packaged with essays, a web site for reference, and a series of various events. While all of these were perfectly fine, this was definitely a missed opportunity, especially considering 2009 was in the midst of the e-book boom. Hopefully sometime in the near future such an undertaking will take place. Anyone interested?

Wednesday, February 27, 2013

Redefining the Pendular Arc of Analysis

BIG data. MACRO analysis. Both are hot terms in the digital humanities, and they both point to one of the strengths of the discipline: the ability to take a step back from the text, to look at its underlying and overarching structures from an outside view. To extract ourselves from the inside position of more traditional close readings, where we are more akin to entomologists obtaining a micro view of the segmented nature of the text, dissecting its thorax, abdomen, and head. Instead we can take a step back and more readily say, "This is the pattern underlying the connections among the thorax, abdomen, and head. These are the prominent features that dominate this text's body: the antennae, pincers, and multifaceted eyes."

Such analyses are in many ways refreshing after the many decades throughout which close reading has held the humanities fiercely in its grip. Not that it's completely let go, of course: grade schools tend to be less current with humanities scholarship, and are more likely to still rely on older methodologies in the classroom. Perhaps, within the next few decades, and as the digital divide hopefully continues to shrink, we will see more of such macro-analytic thought taught to our children, but for now it is more the domain of higher academia. And when it does reach even the humblest grade school classrooms, I hope we do not forget the value of close readings and displace them too much in favor of big data and the long-distance views digital humanities is bringing to the table.

We must be diligent, for too often dichotomous cultural structures can swing back and forth as a pendulum from generation to generation. In generation A we see it on the left, in generation B it has struck the center, and by generation C it is fully on the right, ready to make a trans-generational journey across its arc again. These new techniques digital humanities has given us are wonderful, they are beautiful and new and shiny, but we must not be blinded by their luster to the virtues of our parents' methodologies. Cultural trends do not have to travel along an arc; we can instead intercede with our arms and minds, striking the pendulum whimsically along its arc, suiting our methodologies to whatever point along its curvature is situationally best, perhaps even shattering its path into new dimensions it could not reach without the tangential intersections we impose from other intellectual disciplines.

If we are not diligent, if we do not intercede, we will find ourselves and our children eventually at the mirrored disadvantage of what we had before: instead of too much of a focus on close reading, we will focus too much on the analysis from afar. As digital humanists on the vanguard of this weather-change, we are in a unique position to forestall this possible eventuality, and we should work consciously toward situationally dependent mixtures of micro and macro analysis.

Tuesday, February 19, 2013

Fandoms and Gamers: Oil Reserves Lurking Just Beneath the Surface of the Web

Most people would probably scoff if I suggested to them that their profession had countless free hands at its disposal, willing to work for the sheer love of it. Most people would probably scoff if I suggested that there are countless people out there willing to work together on complex projects, and that they would do so with little more than a set of rules and a command to go play. Yet both can be said truthfully more and more in this digital age, and we have to start paying attention or we'll be missing out on a lot of volunteer labor, including in the digital humanities.

But where, you are probably wondering, can I find such an invaluable pool of people? Where can such a large group be hiding? Many of them in plain sight, really: we tend to call it fandom. These are groups of people severely devoted to a subject, who will spend countless hours collaborating, theorizing, discussing, number-crunching, and otherwise worshiping that subject simply for the love of it. Granted, accepting such droves into the fold would take getting over elitist hurdles about the usefulness of incorporating such lay research. We do like everyone to be properly vetted by exclusionary systems. And yes, there is a lot of dreck out there to wade through. But do not dismiss the products of fandom out of hand simply because they lack the proper rubber stamp.

As an example, I will showcase two resources I have used myself as part of a fandom I am proud to partake in: Robert Jordan's The Wheel of Time. (For those unfamiliar, it's an epic fantasy series that makes most other epic series look short, coming in at nearly 4 and a half million words.) The first of these resources can be found at encyclopaedia-wot.org, a fan-driven site that provides incredibly rich reference for practically any person, place, or what-have-you in the series, as well as very detailed synopses of the works with embedded hyperlinks for reference, and invaluable footnotes. More specifically, I would like to point to the synopsis page given to each novel in the series, including a chapter breakdown and a map of the points of view each chapter is written from in that novel. My favorite is probably that for book 6, Lord of Chaos, which handily visualizes the fragmented structure of the points of view in this novel, especially in comparison to the preceding novels. In other words, this map gives a macro view of how this novel's structure lives up to the chaos in its title.

The second resource I would like to point to is a wikia site devoted to the series, specifically A Wheel of Time Wiki. If you go to any of the pages for a specific volume in the series (let's use Lord of Chaos again), near the bottom of that page you will find a statistical table with a link above that reads "See also the full statistical analysis for this book." On that page you will then see a statistical analysis of the points of view in this book, including a chapter-by-chapter breakdown, given as percentages and as word counts.
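For a sense of how little machinery that kind of point-of-view analysis needs, here is a minimal Python sketch. The chapter numbers and word counts are invented for illustration and do not come from the wiki or the books, and the function name pov_statistics and the data layout are assumptions of mine, not a description of how the fans actually produced their tables.

```python
from collections import defaultdict

# Invented sample data: (chapter number, point-of-view character, word count).
chapters = [
    (1, "Rand", 5200),
    (1, "Egwene", 1800),
    (2, "Perrin", 4000),
    (3, "Rand", 3000),
]


def pov_statistics(chapters):
    """Return {character: (word_count, percent_of_book)} across all chapters."""
    totals = defaultdict(int)
    for _chapter, character, words in chapters:
        totals[character] += words
    book_total = sum(totals.values())
    return {
        character: (words, round(100 * words / book_total, 1))
        for character, words in totals.items()
    }


stats = pov_statistics(chapters)
for character, (words, percent) in stats.items():
    print(f"{character}: {words} words ({percent}%)")
```

The hard part, of course, is not the arithmetic but the patient counting and attribution of every passage, which is exactly the labor the fandom has already volunteered.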

With fandoms out there so willing to do such analyses and visualizations simply because they enjoy it, and in fashions that are much in keeping with the goals and methods of the digital humanities, we would be remiss as scholars not to tap into such resources. Granted, some coaxing may be required to get the exact help you're looking for, such as setting up a wiki page of your own and posting to message boards to try and get people to help you out with that wiki page's goals, but fandoms are generally very ready to explore their objects of devotion. And fandoms are hardly limited to contemporary popular culture; if something exists, there will be geeks for it.

And such methods need hardly be kept to simpler enterprises. In addition to fandoms in general, we must not forget that there are more and more gamers online every day, and, as a gamer myself, I will fully attest to how much we like standing up to a challenge if it is presented as a game. In the past couple of years, gamers helped unlock the structure of an enzyme crucial to our understanding of AIDS. My girlfriend has recently been entranced by a game called EteRNA, which happens to be developed by Carnegie Mellon and Stanford to learn more about folding and synthesizing RNA.

Why pay people to do work they don't want to do, when you can just get people to do it because they love to do it?