How Useful is the Library of Congress’ Twitter Archive?


On Wednesday 14 April, the United States of America’s Library of Congress (LOC) announced a deal with the popular social networking service, Twitter, to archive all public messages on the site right down to the first “tweet” from @jack (Jack Dorsey, Twitter co-founder) on 21 March 2006, at 3:50 PM.

Response to the news was generally positive and made “Library of Congress” a top trend for the remainder of the week. Considering that the site has evolved into one of the most efficient means of spreading information (even by Internet standards), such enthusiasm is understandable.

Still, the LOC’s initial announcement carried little description of the archive itself or the conditions of access. As hype matured into curiosity, the LOC released a short FAQ two weeks later to follow up on questions from the public.

Let’s break it down:

Archive Description

  • the archive was a gift
  • it includes all public messages (but excludes “private account information” and “deleted tweets”)
  • it excludes links or pictures (such as short URLs and location-based services like Foursquare)
  • tweets are made available to researchers approximately half a year after their publication

Archive Direction

  • “…Library…[to work] with academic research communities to explore issues related to researcher access”
  • “…serve as a helpful case study [in the development of] policies for research use…”
  • “…Library…will not try to reproduce [Twitter’s] functionality”
  • “…interested in offering collections of tweets…” related to current events

Archive Access

The FAQ identifies access issues as a priority but does not elaborate on conditions. Shortly after its release, Martha Anderson (director of the LOC’s National Digital Information Infrastructure and Preservation Program) told Randall Stross of the New York Times that “[t]he library plans to dole out its access…only to…qualified researchers” due to privacy concerns.

However, exactly what “qualified” means is unclear. Hopefully the LOC will clarify this and elaborate on its overall archive strategy as it develops over the coming months.


While it is perhaps too early to properly assess just how useful such an archive might be, it is difficult to dismiss its potential value to the historical community as a future source of contemporary cultural commentary. It is thus less a question of whether researchers will use the archive than of how.

Overall, the LOC’s interest indicates that the still-infant networking service may indeed have staying power beyond that of others in the same rapidly evolving medium, one with a notoriously fleeting attention span. This may set a powerful precedent and persuade us to reevaluate otherwise “everyday” content on the Internet, as well as to reflect on new roles for archives.

On the other hand, it would take very little for Twitter to incorporate a more advanced, research-ready option into its existing search. Some researchers are already using Twitter for their own purposes without a formal archive structure or support beyond the platform itself. This does not diminish the significance of the LOC’s decision to build and maintain an archive at considerable expense and effort, but it does suggest that not all prospective researchers need it.

For now, the only safe conclusion here is that Twitter has become socially significant enough to merit an archive at a publicly funded institution, which is not only unprecedented for a four-year-old start-up but also somewhat prophetic for researchers and the historical community in general.


What do you think about the LOC’s efforts to collect and collate messages on Twitter (now estimated at over 15 billion)?

Do you agree with the Washington Post’s Monica Hesse that this is an important example of history from the bottom up?

Are you a Twitter user?

Do you have any privacy concerns about your content? And if so, will the collection of public messages alter how you use the network?

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License. Blog posts published before October 28, 2018 are licensed with a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Canada License.
