Friday, April 27, 2012

Publishing Libraries

[The American Antiquarian Society conference in Worcester last weekend had an interesting rider on the conference invitation--they wanted 500 words from each participant on the prospects for independent research libraries. I'm posting that response here.]

Here's the basic idea:


I had a persistent fantasy through the second day of the conference of pulling out the card catalogs that occupied one of the American Antiquarian Society's transepts and replacing them with a set of server racks. This was classic tool-first thinking, prompted by not much more than an easy visual analogy. (And an improper one at that--while a private cloud would be nice, libraries could do a lot more with a lot less.) But still, it's worth asking: what could libraries do if they started providing information digitally this way?

First things first: libraries would become a place where information is stored. There's been a lot of talk lately about the NYPL's off-site storage plans; for digital resources as well as analog ones, I think there's a lot to be said for having them on site. With books, the constraint is largely about space; with digital resources, it's purely money. But as with book storage, there are good reasons to have the physical embodiments of digital items (by which I probably mean hard drives) held locally. (Server administration issues are complicated, but could be done by a part-timer or consultant.) It's much easier to lose digital data than hard data, or to forget what you have: shooting for persistent media held on site means that missed bills or forgotten holdings don't lead to eradication. Which is nice. But there's a lot more.

Jim Grossman from the AHA made a really great point--my favorite of the conference, probably--during one of the Q&A sessions about where the money for libraries comes from. In the old days, it was from industrial philanthropists, who made physical things and had a deep affection for the materiality of books and artifacts. The newer millionaires are less grounded in production: we should expect their cultural philanthropy to be less interested in the physical object and more in the diffuse stuff of knowledge and opportunity. This suggests a pragmatic impetus towards more digitization.

A couple of things follow from this. First and most obviously, research libraries should have a digital permanent collection to match their physical one. The most obvious component of this collection would be the physical stuff they already have, digitized so far as possible and made available online; as I said in my talk, digitization is an ongoing process, which will likely never be complete; and even if it were done, the continuing act of metadata curation means that even a fully digitized library would still retain authority over its own holdings. Most uses would link back to the library.

Keeping data locally and physically allows other opportunities as well--all those local digitized copies of American newspapers can be used on site even if the Readex license doesn't allow it, and copyrighted works (which I don't think AAS has, but other libraries will) can be held in a digital form. Committing the library to the preservation of the material by storing materials on site should make us take the process of digital decay more seriously.

Card catalogs look like server racks. There's another object in the American Antiquarian Society's main hall, though, that has as much in common with the computers as the catalogs do: Isaiah Thomas's press. In the library, it's just a museum piece.

But there's no reason for a library not to publish. When holdings are physical, the distinction between an archive, a library, and a press is clear. With digital holdings, it is not. To put digital resources online is to publish them. A library publishes its metadata and its digital holdings as well as storing them: this is good. Libraries can do more, though. Several times, the decision to suspend publishing the Journal of the American Antiquarian Society came up; the fact that servers publish, as well as archive, opens up another big opportunity for libraries.

Researchers from a number of digital humanities projects were at the conference. (E.g., Amy Earhart, on building the project Digital Concord.) We all know the challenges facing these projects: among the most important are sustainability (what happens to a project once it's completed?) and credit (digital work doesn't count for tenure).

Libraries could solve these problems. The core mission of libraries is sustainable access to scholarly materials. Grantmakers should not be asking project teams for sustainability plans; sustainability should happen outside the creator's domain, just as it does with other research. With digital resources, publication and archiving are one and the same thing. And a library can credibly promise two things that no departmental server can:

1) Perpetual hosting with needed maintenance. (Small clouds with virtual machines could keep running that Windows 3.5 framework for years--this may be important down the line).

2) Data archiving and publication that allows future collaborations to build off the original project, without the founder being involved. Paul Ginsparg has said the arXiv "was supposed to be a three-hour tour, not a life sentence;" and it was the library that ultimately took it over. We want projects to have a life without their originators.

Moreover, having an outside organization publish a website fulfills the critical function of accreditation of digital work. Projects can incubate on local servers, but a commitment by an outside agency to house them permanently would provide a critical stamp of approval. And that would help distinguish between a 'personal' and a 'scholarly' web site.

This is a clear and pressing need in the production of knowledge right now that someone will have to fill.

Of course, the actual presses could step in and provide digital site publishing as well. But in many ways, libraries are better suited to the task. The basic infrastructure of grant and philanthropic long-term funding makes a lot more sense for scholarly projects than the cash-for-commodity model the publishers can't get away from. Libraries will need to get an acquisitions editor, somehow, but the technological hurdles are just as hard for the presses; and libraries like the AAS do have a bench of committed scholars who could provide the needed review time. And moreover, they're not as tied to continuing to churn out physical books. (If the overlap between librarianship and academic publishing is as strong as I think, it's unlikely both institutions will survive.)

Joe Adelman, a fellow at the AAS who was at the conference, has an article up about the post office as "last public guarantor of free communication in the United States." Unless we make a hard turn towards postal banking, I think the ship has probably sailed on that one. But then the same story is going to come for the libraries. Their advantage is that they've always lost money--unlike the presses and the post office, a library doesn't have to break even. That gives a library far more leeway than the presses have to keep abreast of, or lead, changes in scholarly dissemination. We'll be getting somewhere once they start leading.


  1. Ask, and ye shall receive:

  2. And they said I couldn't change the world.

    Although to be clearer (I think I cut the section about digital Concord)--I want to see born-digital projects published like monographs, not just to see journal articles move online. Although I guess that's how it starts.

  3. Ben, I'm trying to figure out what you mean by this:

    "...the continuing act of metadata curation means that even a fully digitized library would still retain authority over its own holdings. Most uses would link back to the library."

    What's the relationship here between managing metadata and authority/possession?

    1. Basically, that digital objects are meaningless without metadata, and the act of being the primary metadata authority over a digital object gives an institution something akin to 'possession'--later users will refer back to the original site as the originating authority for all information. (Not sure if I said this in my talk at the AAS, but I meant to.)
