Friday, December 16, 2011

Genre similarities

When data exploration produces Christmas-themed charts, that's a sign it's time to post again. So here's a chart and a problem.

First, the problem. One of the things I like about the posts I did in the spring on author age and vocabulary change is that they have two nice dimensions along which we can watch changes happen. This captures the fact that language as a whole doesn't just up and change--things happen among particular groups of people, and the change that results has shape not just in time (it grows, it shrinks) but across those other dimensions as well.

There's nothing fundamental about author age for this--in fact, I think it probably captures what I would have thought, at least at first, were the least interesting types of vocabulary change. But author age has two nice characteristics.

1) It's straightforwardly linear, and so can be set against publication year cleanly.
2) Librarians have been keeping track of it, pretty much accidentally, by noting the birth year of every book's author.

Neither of these attributes is that remarkable on its own; but the combination is.
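
To make the combination concrete, here is a rough sketch of how the two dimensions fall out of catalog data. The records and the author_age helper are made up for illustration; they aren't from any real catalog or from the code behind the earlier posts. The only assumption is that each book comes with a publication year and an author birth year, so age at publication is just the difference.

```python
from collections import Counter

# Hypothetical catalog records: (publication_year, author_birth_year).
records = [
    (1850, 1810),
    (1850, 1820),
    (1860, 1810),
    (1860, 1830),
    (1860, 1830),
]

def author_age(publication_year, birth_year):
    """Approximate age of the author when the book was published."""
    return publication_year - birth_year

# Count books in each (publication year, author age) cell of the grid.
grid = Counter((pub, author_age(pub, born)) for pub, born in records)

for (pub, age), n in sorted(grid.items()):
    print(pub, age, n)
```

Every book lands in exactly one cell of a publication-year-by-age grid, which is what lets the two dimensions be set against each other cleanly.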