Monday, December 6, 2010

Back to the Future

Maybe this is just Patricia Cohen's take, but it's interesting to note that she casts both of the text mining projects she's put on the Times site this week (Victorian books and the Stanford Literature Lab) as attempts to use modern tools to address questions similar to those posed by the vast, comprehensive tomes written in the 1950s. There are good reasons for this. Those books are some of the classics that informed the next generation of scholarship in their field; they offer an appealing opportunity to find people who should have read more than they did; and, more than some recent scholarship, they contribute immediately to questions that are of interest outside narrow disciplinary communities. (I think I've seen the phrase 'public intellectuals' more times in the four days I've been on Twitter than in the month before.) One of the things that the Times articles highlight is how this work can re-engage a lot of the general public with current humanities scholarship.

But some part of my ABD self is a little uncomfortable with reaching so far back. As important as it is to get the general public on board with digital humanities, we also need to persuade less tech-interested, but theory-savvy, scholars that this can create cutting-edge research, not just technology. The lede for P. Cohen's first article, that the Theory Wars can be replaced by technology, isn't going to convince many inside the academy. Everybody's got a theory. It's better if you can say what it is.

I, alas, am not quite there yet. But the echo of Dan Rodgers's voice in my head says that the temptation to use these massive datasets to recreate the "American Mind" is pretty problematic, even though it's hard not to succumb to it from time to time. Still, I think it's important to point out some of the discourses this work contributes to:

  1. History of the book: this is the most obvious one, because book data is, obviously, our most important resource on the publishing industry. There are a lot of charts, if I remember right, in _The Literary Underground of the Old Regime_, with some heavy caveats. We're approaching a time in which anyone doing an analysis of a journal or a publisher would have to be perverse to avoid _some_ textual analysis, even if the dataset has problems.
  2. Discursive/keyword/conceptual histories. I probably need to dive back into the Williams/Skinner/Koselleck nexus to figure out just how to write about this. (Particularly if I'm to use it in the diss.) But preliminarily: the reason I found myself drawn to the IA sources in the first place is that the way to talk about meaning begs for some sort of network analysis. Just tracing keywords on their own isn't that useful. Anyone will tell you that. But at least at Princeton, historians spend a lot of time obsessing about anachronistic word combinations, minute distinctions between words (darwinian vs. darwinist, say), and the evolution and diffusion of 'concepts'. These are things that happen almost entirely on the printed page. Exploring interconnections between words can only help here.
  3. Post-structuralism. The only time I ever got a prize for my writing, it was for a paper in which I reluctantly included the phrase "gendered language" after not being able to find any other way to express the concept. A lot (too much?) of American history of the last twenty or thirty years has played around with the ways that metaphors of gender, of race, of slavery inflected beliefs about everything else. But while it doesn't always play in Peoria, this definitely wasn't a lost quarter-century. My first, anonymous correspondent suggested a month ago that I look into verifying, for instance, Kristin Hoganson's claims about the importance of gender in American imperialism. Data and these sorts of language games are odd bedfellows, on some level. But this legacy is a large part of the reason, I think, that English in particular (too few links, I know--send me more...) has been so progressive about adopting computational tools for texts. And numbers could be invaluable for convincing recalcitrant undergraduates that individuals are embedded in historical linguistic worlds that shape what they say and how they say it. Of course, a lot of that history has been deeply and incisively critical about the practice of statistics itself. (I follow that in my own work quite a bit.) But there's some room for a rapprochement, I'm convinced.
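The kind of "interconnections between words" mentioned in item 2 can be made concrete with a minimal sketch: a co-occurrence network, where word pairs that appear near each other become weighted edges. Everything here is a hypothetical illustration — the `cooccurrence_network` function, the window size, and the toy documents are my own inventions, not part of the Victorian books or Literature Lab projects.

```python
from collections import Counter

def cooccurrence_network(documents, window=5):
    """Count how often each pair of words appears within `window`
    words of each other; the counts serve as edge weights for a network."""
    edges = Counter()
    for doc in documents:
        words = doc.lower().split()
        for i, w in enumerate(words):
            # Only look ahead, so each pair is counted once per occurrence.
            for other in words[i + 1 : i + window]:
                if w != other:
                    edges[tuple(sorted((w, other)))] += 1
    return edges

# Toy corpus standing in for, say, Internet Archive full texts.
docs = [
    "darwinian theory of natural selection",
    "darwinist readings of natural history",
]
net = cooccurrence_network(docs)
```

On real corpora, edges like `('natural', 'selection')` would carry weights that a graph library could chart, showing, for instance, whether 'darwinian' and 'darwinist' travel in different company — exactly the kind of minute distinction historians obsess over.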

1 comment:

  1. Hi Ben:

    I like this post. Looking down the three avenues of potential value seems especially good, though I have some issues with the last one (so do you, judging by its length -- the lady doth protest too much...).

    Partly, my sense is that the sorts of questions and answers these tools might be oriented toward, ultimately, can and should fall out of the standard ways of thinking you summarize here. Post-post-post-structuralism, we might just decide that having digital tools of this sort means we should develop a new language for our explanatory frameworks or the way they work together with our interpretation to produce new work.

    OR, a "New Structuralism" - not just testing past claims, but generating new ones. You've still got to be able to ask the right questions of the database (see your latest post, "First Principals," for Exhibit A), but if you keep this up (as you well know) the sorts of results you're capable of pursuing will feed back into questions in ways (I hope) we can't predict.

    All this is to say, it's good to pitch the project to potential audiences (Historians of the Book, or at least those who aren't too busy writing Op-Eds), but we also want to push the envelope. Yes, historians are stubborn, and yes, most hold to whatever theory they latch onto in the third year of graduate school for a career, but it seems to me that digital humanities (or whatever we end up calling it, esp. in history-specific contexts) might yet constitute a disruption to standard field-hiring practices by departments.

    We'll see!