Internet Archive, a nonprofit offering an overwhelming amount of free content (and triggering some copyright debates)



This week’s Free Content Alert column considers the Internet Archive, and it’s a bit complex. Not that I want it to be, but it typifies DRM issues. If you bear with me, I believe you’ll find the result worthwhile.

First, the straightforward part: Internet Archive (IA) is a true nonprofit, founded in 1996 and headquartered in San Francisco. According to a lengthy wiki article on IA, its size was 15 petabytes. (A petabyte is 10 to the fifteenth power bytes, or a million gigabytes.) Its stated mission is to provide “universal access to all knowledge.” The basic stats are staggering. The wiki continues:

It provides free public access to collections of digitized materials, including websites, software applications/games, music, movies/videos, moving images, and nearly three million public-domain books… In addition to its archiving function, the Archive is an activist organization, advocating for a free and open Internet. The Internet Archive allows the public to upload and download digital material to its data cluster, but the bulk of its data is collected automatically by its web crawlers, which work to preserve as much of the public web as possible. Its web archive, the Wayback Machine, contains over 150 billion web captures. The Archive also oversees one of the world’s largest book digitization projects.
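To put the 15-petabyte figure in perspective, here is a minimal sketch of the unit arithmetic in Python. It assumes decimal (SI) units, matching the "10 to the fifteenth power" definition given above; the function and constant names are mine, chosen for illustration.

```python
# Decimal (SI) storage units: 1 GB = 10**9 bytes, 1 PB = 10**15 bytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_PETABYTE = 10**15


def petabytes_to_gigabytes(petabytes):
    """Convert a whole number of petabytes to gigabytes (decimal units)."""
    return petabytes * BYTES_PER_PETABYTE // BYTES_PER_GIGABYTE


archive_size_pb = 15  # the size quoted in this column

print(petabytes_to_gigabytes(1))                # 1000000
print(petabytes_to_gigabytes(archive_size_pb))  # 15000000
```

In other words, 15 petabytes works out to roughly 15 million gigabytes.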

It is hard to take in so much. When a user opens IA’s homepage, s/he is confronted with “Top Collections at the Archive”: at least 150 sub-collections drawn from a wide range of sources and in numerous formats. A number come from large public libraries (Boston PL, Enoch Pratt) and academic sites (Johns Hopkins, Emory, Harvard). Overwhelming. A search box is provided, along with the option of doing advanced searches. The advanced search screen allows the use of multiple parameters and includes a thorough explanation (with examples) of Boolean operators and searching techniques (such as nested searching).
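For readers who like to script their searches, the idea of a nested Boolean search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`title`, `mediatype`), the `advancedsearch.php` address, and the URL parameters (`q`, `fl[]`, `rows`, `output`) are not taken from this column and should be checked against IA's own search documentation before use. The helper functions are hypothetical names of my own.

```python
from urllib.parse import urlencode


def any_of(field, values):
    """Build a parenthesized (nested) OR group for one field."""
    return "(" + " OR ".join(f"{field}:{value}" for value in values) + ")"


def all_of(*clauses):
    """Join clauses so that every one of them must match."""
    return " AND ".join(clauses)


def search_url(query, rows=10):
    """Encode the query into a search URL (endpoint and parameters are assumptions)."""
    params = [("q", query), ("fl[]", "identifier"), ("rows", rows), ("output", "json")]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


# Nested search: the title must mention copyright, and the item may be
# either a text or an audio recording.
query = all_of('title:"copyright"', any_of("mediatype", ["texts", "audio"]))

print(query)
print(search_url(query))
```

The parentheses are what make the search "nested": the OR group is evaluated as a unit before it is combined with the title clause. The sketch only builds the query and the URL; it does not contact IA.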

From the Announcements box on the homepage, one can reach IA’s Blogs, written by IA team members, as the page notes. A search for ‘DRM’ returned hits, but nothing in the way of a policy discussion. Searching for ‘Copyright’, however, brought up numerous posts. For example, “The Copyright Office is trying to redefine libraries, but libraries don’t want it — Who is it for?” (July 2016). And so the debate continues. The list of hits typifies the facets of the discussion.

Another post from the same time deals with the DMCA (see my first post for NSR). Not everyone, as you might expect, is pleased by an endeavor of this size. One editor wrote in 2013 that IA was violating authors’ copyrights. The complexity of the debate is apparent, and not everyone agrees with the goal IA espouses. Meanwhile, even amid the uncertainties that abound, we are still able to benefit.

Ari Sigal received his MLS in 1985 and has done reference and administrative work in public, academic and special libraries. Since 2004, he has worked for Catawba Valley Community College (Hickory, NC), first as Library Director (to 2009) and currently as the Reference and Instruction Librarian. He is also Curator of the Gilde-Marx Collection for Holocaust and Genocide Studies, one of the largest of its kind in North Carolina, and offers an annual program on topics related to its material. He is the editor of Advancing Library Instruction (IGI-Global, 2013) and also serves on the editorial advisory board for the Encyclopedia of Information Science and Technology, also from IGI.
