I'm pulling this discussion out of the comments thread on Scott Enderle's blog, because it's fun. This is the formal statement of what will forever be known as the efficient plot hypothesis for plot arceology. Nobel prize in culturomics, here I come.
Brief background: Enderle shows pretty persuasively that all the fundamental plot arcs described in a paper by a math-based computational story lab can be ascribed to random (Brownian) noise. As I wrote earlier, and Hannah Walser explored in more depth recently, that this happens with their data isn't so surprising; the "stories" they are modeling are mostly random documents to begin with.
Still, there's some reason to think that maybe sentiment trajectories are random walks even in actual databases of stories like those Matt Jockers uses. Enderle finds that, well, weird: "Should we find that sentiment data from novels does indeed amount to “mere noise,” literary critics will have some very difficult questions to ask themselves about the conditions under which noise signifies." The idea that plots are random seems offensive to the idea of plot at all. Others in the field, like Jockers and Ted Underwood, have also expressed the idea that there should be some regularities to plot, particularly ones that map across genre.
I had earlier raised the idea that the null hypothesis for plot testing should be a random walk (Brownian noise, as Enderle calls it) but I thought of it as just that--a null hypothesis that indicates nothing interesting is going on.
But of course, it *would be interesting if nothing was going on.* It would demand explanation! And now I've got one: the efficient plots hypothesis, a corollary of the efficient markets hypothesis (EMH) for the literary world.
The EMH states that stock prices are efficient; you can't know reliably if they're about to go up or down, because if anyone could, they would already have bought or sold on it. There's been a lot of research on whether stock prices move as Brownian noise; they don't, totally, but they come pretty close.
The EPH, as I imagine it, says that the ideal reader can't know if the mood of a book is about to get sunnier or darker at any given point in the plot. This is not because of market forces directly, but because the purpose of a narrative is to engross the reader. Engrossment proceeds through uncertainty. If you knew what was about to happen, you'd skim ahead or stop reading.
That is: at any moment in a story, the emotional trajectory is a random walk for the reader because anything else would be *boring.* And stories aren't boring.
This could be tested empirically by asking readers if a book will get more positive or more negative over the next five pages, and by how much. In a pure EPH world, they'll only be right about half the time. Enderle thinks the EPH is obviously wrong, particularly for genre fiction.
I'm not so sure. To take an example: I read some John le Carré novels over the summer. Periodically, a spy has to secretly pass from the East to the West without getting caught by the commies. (Through the Berlin wall, over the Chinese border to Hong Kong, etc.) Do you know if they'll make it? The emotional sentiment of the next few pages depends on whether they get killed or not. I can see two models here:
1. Genre determines plot arceology: There are conventions to the spy novel that make it possible to tell in advance.
2. The EPH: The whole point of reading a spy novel is that you don't know what will happen; the job of a spy novelist is to make you unsure.
My reading experience is much closer to the latter: the conventions of genre fiction are *precisely* that you don't know what's going to happen next; otherwise no one would read it.
For most good genre fiction, I think this holds. Will Lockhart/Gardner win the case? Is Don Draper going to hit the bottle or stay sober? The rise of "anyone can die" as the predominant trope of 2010s TV suggests that the economics are forcing stronger and stronger forms of the EPH onto us every day.
The major objection to this would be: "but there *are* genres where you know the outcomes precisely!" In a Hardy Boys novel, they'll rebound from danger and catch the bad guy every time. One response to this is: sure, *you* know that; but you don't read Hardy Boys novels. The people who do are 10-year-olds who legitimately think that, just maybe, the killer's going to drown the brothers in the quarry and the next 20 books on the shelf will turn out to be prequels.
Even if you know how certain books will *end*, that doesn't mean that you'll ever be able to predict the next two pages, which is what this is about. I think this distinction is crucially important and maybe underestimated. Sure, a romantic comedy always has a temporary breakup in the middle; but whether that happens 40% of the way through or 70% of the way through makes all the difference; and if you've made it 90% of the way through without the breakup happening, you start to think "maybe this is one of those comedies without a breakup in it."
If the EPH holds, then, it doesn't suggest that fiction is truly arbitrary; rather, that it's an elaborately constructed game between reader and writer, socially conditioned and in no way permanent. It would suggest that there are enough fundamental plots that at any point in a book you are unsure what plot you are in; and that plots tend to wear themselves out over time.
It does completely put my analogy between musical tonality and emotional valence through the wringer. Key signatures in music are highly predictable. But I think that's OK: it's really clear that there aren't underlying structures quite so strong as sonata form under novels; this would explain why.
For a lunatic idea, the EPH is actually empirically kind of testable. Just ask people to predict the direction of books as they're reading them. Someone could totally do this. Maybe some movie studios even do.
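Here's a rough sketch of what the null result of that experiment would look like, with a simulation standing in for the readers. Everything in it is my own assumption for illustration: the "book" is a cumulative sum of independent Gaussian steps, the "reader" is a crude momentum rule (the mood will keep moving the way it moved over the last five pages), and the names `random_walk` and `momentum_hit_rate` are made up. Real readers and real sentiment scores would replace both.

```python
import random

def random_walk(n_steps, rng):
    """Cumulative sum of independent Gaussian steps: a stand-in for page-by-page sentiment."""
    level, walk = 0.0, [0.0]
    for _ in range(n_steps):
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    return walk

def momentum_hit_rate(walk, window=5):
    """Share of checkpoints where guessing 'the recent direction continues' is right."""
    hits = trials = 0
    # Check every `window` pages, comparing the last window to the next one.
    for t in range(window, len(walk) - window, window):
        past = walk[t] - walk[t - window]
        future = walk[t + window] - walk[t]
        if past == 0 or future == 0:
            continue  # no direction to guess or to score
        trials += 1
        hits += (past > 0) == (future > 0)
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
rates = [momentum_hit_rate(random_walk(300, rng)) for _ in range(2000)]
mean_rate = sum(rates) / len(rates)
print(round(mean_rate, 2))  # should land close to 0.5 under a pure random walk
```

On a pure random walk the guesser should hover around a coin flip. The interesting question is whether actual readers of actual novels do reliably better than that.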
For more details, see my forthcoming book with Stephen Dubner, Jane Austen was a Derivatives Trader (HarperCollins 2017).