The first part is an interactive map with every place name we could find using the Stanford Natural Language Toolkit and some (Fraas-flavored) elbow grease. Then we got two great historians of American foreign policy, Dael Norwood and Gretchen Heefner, to explain some of the things in the maps.
The second is about individual words presidents use. So the recent rise in "Freedom," the references to the Constitution predominantly in the time of crisis, and so forth.
My favorite feature, and one that the Atlantic team executed beautifully, is the deep access into individual texts: click on a circle or a bar, and you are off reading the actual paragraph from the state of the union that uses that word on mentions that place. This has always been a core feature of Bookworm on various levels--by treating paragraphs as documents for the modelling, it's easy to drill straight to the interesting stuff. One thing that's mostly missing are the Ngrams-style line charts. I've been saying for a while that I hope people see Bookworm enabling other forms of visualization. These pages are a great example of that; maps and bar charts of words are just as engaging, and sometimes things like "presidents" and "the world" are more engaging than individual years.
But for the text analysis crowd, I also wanted to tell you a little more about the link down right at the bottom of the second (words) piece, and get a little technical about why that, although we decided not to include on the Atlantic site, contains the germ of something I find pretty interesting for online text analysis in general.
That page gives you an in-browser live corpus comparison using Dunning Log-Likelihood between the complete speeches of any two presidents. Click on any word on either chart and you get a bar graph showing which presidents used it the most. (I'm obligated to Isabel Mereilles for pointing out to me that this was the best approach over some awful wordcloud thing I had originally).
This visualization is a little more inside baseball--but for those who know either presidential history or text analysis, it should be pretty interesting. These aren't pre-compiled numbers: every search runs the Dunning keyword analysis anew. That means that although this front end is strictly about comparing presidents, you can use it for all sorts of other questions. I used myself, in fact, to find interest words to graph: I settled on "freedom" as one of my blurbs under the chart after rewriting the display to show words that showed marked differences between Republican and Democratic presidents since 1960.
I find this exciting--and I think you should too--because as I've been saying for a while, comparison is the most underused tool in the toolkit of the digital humanities. A lot of pre-2010 work did fascinating comparative work. (The MONK interface for Dunning comparisons on TEI-annotated documents, for example). This is a proof-of-concept in a very similar space, but implemented inside quite a rich environment for text analysis that makes it easier to adopt the same tools for very different applications.
People who topic model talk about the potential to generate insights. I believe them. But I also think that in the presence of rich metadata, good comparison metrics (which Dunning approaches) can be far, far more productive. For the State of the Union, there are all sorts of useful comparisons to make: president vs. president, republican vs. Democrat, lame duck vs recently elected, opposition congress vs. friendly crowd... And for every other corpus, there are just as many. We currently treat these kinds of analytics as things that should be run client side, requiring individuals to obtain digital texts (frequently impossible) and install and run some tools for corpus comparison (a high barrier to entry.) But libraries and other content holders can--and I would argue, should--support these things as a form of exploration out of the box.