Yes We Scan Again! The Archives chats with voters on a “We the People” teleconference

On January 10th, I blogged about the “Yes We Scan” petitions proposed by Carl Malamud on the White House’s We the People petition platform. “Yes We Scan” calls for a national strategy, and even a Federal Scanning Commission, to figure out what it would take to digitize the holdings of many federal entities, from the Library of Congress to the Government Printing Office to the Smithsonian Institution.

I have been delighted to see the many ideas discussed in response to that blogpost. I encourage you to keep them coming!

Following that initial post, I worked with the White House Director of New Media, Macon Phillips, and the Director of Online Engagement, Katelyn Sabochik, to set up a conference call, inviting those who voted for the Yes We Scan petition (about 2,500 signers total) to further discuss this important issue and hear your ideas on how to move forward.

Sitting on the call with me were Mike Wash, NARA’s CIO; Pamela Wright, our Chief Digital Access Strategist; and Jill James, our Social Media Manager.

Eighty-five people from all over the country dialed in for the call. Eighteen participants asked questions. I want to thank you for taking the time to call in and to let us know your thoughts.

The questions covered everything from the magnitude of the task at hand (FYI, NARA alone has over 10 billion pages in our holdings – and that’s just the paper) to ideas for getting more involvement in crowdsourcing and the possibilities of using automated technologies like robots.

My opening comments for the teleconference are available here. I intend to post a full question and answer list soon, but in the meantime, please keep up the conversation here. We are listening!


4 thoughts on “Yes We Scan Again! The Archives chats with voters on a ‘We the People’ teleconference”

  1. In August 2010, Google announced it intends to scan all known existing 129,864,880 books by the end of the decade, amounting to over 4 billion digital pages and 2 trillion words in total. That’s a lot of scanning, and we have more than double to conquer. Where will we store 10 billion pages plus a whole lot more audio and video, ponders this disk storage engineer…

  2. Thanks for the opportunity to join the call! I’d like to hear more about NARA’s role in the Digital Public Library of America. I had hoped to ask during the call but ran out of time.

  3. New Zealand’s ā-tihi o Aotearoa, Digital New Zealand, has been running since about 2007.

    I was relieved when people in the teleconference were excited about open source and crowdsourcing. I was also glad to hear that the nature of the content wasn’t strictly confined to paper documents.

    IMLS have awarded grants to help on the American Memory Project; it only seems logical to extend those efforts.

  4. I am pleased to see so many people taking an interest in the preservation of our history, which these documents represent whether they are scientific research or photos of World War II. There are a lot of procedures already developed in the private sector for beginning this project. I worked on one many years ago. We started with photocopies of original documents, then entered tags, identifiers, and metadata in a database which was used to cross-reference to the original documents. Since we were starting with photocopies, why couldn’t we start with half of that process: the scanning? Photocopiers have two functions: they scan the document, then they print it. The cost of paper and ink significantly increases the cost of the processing. As a result, we were often given somewhat inferior products (paper with toner that came off on our hands and keyboards).

    As for what needs to be scanned, the short answer is: everything. However, from a practical standpoint, we need to prioritize. In my opinion we should use a two-pronged approach: one group should concentrate on those documents and photos that are most in danger of deterioration. The other should start with the items that are currently most requested by the public, then work down to the least requested.

    A note on why some commenters on your other blog stated that the public wouldn’t be interested in most of it: I don’t agree. Right now, I would have to go to the NARA location in Philadelphia, at the cost of parking or regional rail, then sit in a room and read through hundreds of microfilmed documents, then pay to print them. I’ve used microfilm readers before; they are a pain in the neck and eyes. And there is no easy way for potential users to know what is out there. Digitizing the material would make an Internet search possible, opening up whole worlds of information to the public. Maybe then, certain folks wouldn’t be able to re-write history in their own way.

    As to the economics of it, yes there would be costs. Here is an idea. Instead of extending unemployment benefits for the jobless, use those funds to hire unemployed and underemployed persons to do the scanning and metadata tagging. In addition, many retired individuals, such as I am, would love to spend a day a week (or two) reviewing, categorizing, and scanning these documents. The National Park Service at Andersonville had started just such a project to list all the soldiers, both Blue and Gray, who were interred at Andersonville during the Civil War, and I would have volunteered if I weren’t transferring out of Georgia.
