The flawed (and outdated) art of categorizing books and knowledge in digital formats

During my years as a Library Journal book review editor, I spent countless hours each week sorting through books (then physical objects only) to figure out what goes where. When I started my editorial career (in the late 1990s), book categories made a lot more sense than they did when I left the book review job in 2010. I can’t count the times I went back and forth with my Library Journal colleagues about whether a newly arrived print galley belonged in my or someone else’s “pile,” to be assigned for review.

Is it Military History or Politics? But couldn’t it also be Law & Crime? Is it Literature because it’s literary or Self-help because it’s about a writer’s spiritual journey? Is it Philosophy or Religion? And what if it’s always at least three categories combined? Questions like these were part of our daily dialog. In retrospect, my colleagues and I made educated guesses every day when assigning books for review and I have no doubt that we didn’t always make the right ones. The way we printed book reviews in the magazine corresponded to the way books were categorized in libraries. Since we were the ones instructing librarians what to buy (by category), we were essentially driving the way books would be made available to patrons in libraries. Quite a responsibility.

When I started handling electronic products (ebooks and databases), in the early 2000s, it became easier to figure out how to categorize content in digital environments. SCOPUS really is science. SAGE resources are usually, if not always, social science. Well, kind of. “Social science” by its definition is anything but clear-cut and incorporates a number of other disciplines. I remember my LJ interview with H. James Birx, the editor of SAGE’s Encyclopedia of Time, who explained to me at the time that the entries in the A-Z ranged from covering time as a purely scientific phenomenon to those discussing literary novels that explored the concept of time in various ways. So, in a way, The Encyclopedia of Time was a classic ‘reference’ that belonged in Social Sciences, but would also be of interest and great use to literature and science scholars.

I could spend hours citing similar examples that point to how books and electronic products sold to libraries and institutions of learning are getting harder to categorize with each passing year. Which has led me to wonder: Is it time to rethink the way content (in all its forms) is categorized? More precisely: have we reached a point in the evolution of the book at which the discovery of content should be more in tune with the multi-disciplinary and inter-disciplinary nature of the world we live in? Should the future goal for publishers and libraries be to get rid of categories as we know them and completely re-wire our thinking to come up with new ways to ‘sort’ knowledge? And how would we do it?

If we consider that the categorization of content has for centuries been tied to the print book—the physical object that has always demanded to be placed on a shelf, which also includes books exploring similar topics or belonging in the same ‘category’—doesn’t it make sense to apply completely different methods when organizing digital content? If shelving books in a categorical order makes sense in a physical library, does it make as much sense in a virtual one? Of course not.

Let’s look no further than to the World Wide Web for clues. How is the world’s information sorted in this vast open space we can’t imagine our lives without? Where do we begin when we ‘enter’ the Internet and need to get to particular information? We usually begin with a simple ‘Search’ screen. If we want to look something up, we don’t first try to decipher where ‘the term’ belongs before we can ‘zoom in’ to get to ‘it.’ In the virtual world, the topic is more important than the category. The category, it seems, has become obsolete.

For the past few years I have had persistent thoughts about why our seemingly organized world of online content is limiting the research experience of students and scholars by limiting the focus of library resources. For the most part, library collections and resources still tell us that History is History, Art is Art, Literature is Literature, and Science is Science. Even publishers and content creators are recognized within the industry by the ‘type’ of category they specialize in: Alexander Street Press, for example, is synonymous with performing arts; Salem Press is synonymous with literature; Adam Matthew is synonymous with history. While I do not wish to undermine the benefits of specialized publishing in a highly competitive and over-saturated book and content market, I am becoming increasingly aware of the limitations of that specialization when placed in the context of online research.

Content formats have been blending for years now. It’s common nowadays to enter a library resource and see books next to journals and multimedia files next to plain text. Doesn’t it then also make sense to break through categorization of content in the same way we have been able to break through the rigidity of keeping ‘containers’ separate in older versions of library databases? To be fair, this isn’t a new concept but judging from the sheer number of resources produced each year (and sold to libraries), the emphasis still remains on preserving rather than defying the concept of categories.

When news broke recently of Finland’s plans to overhaul its (already advanced) education system in the near future by moving away from ‘teaching by subject’ to ‘teaching by topics’ in its secondary schools, it caught some educators by surprise, but I’d argue that it is in line with an awareness that a multi-disciplinary world demands a multi-disciplinary approach to education and research. So shouldn’t those of us in the business of creating, publishing and packaging content also move in that direction?

As explained in this article, Finnish students who are learning about the European Union, for example, will simultaneously be exposed to several subjects, including the languages of the EU members, history, geography and current events. Shouldn’t the same methods be applied to research? When a student is learning about a topic, shouldn’t he or she be exposed to as much varied content as possible covering that topic from every angle and not be sent to a subject-specific category? Thinking back to those years at Library Journal, I’d say that at least one third of the books we assigned for review defied categorization. Today, the number is probably significantly higher.

Since the beginning of time, it seems, publishers and library vendors have produced subject-specific (category-driven) resources fully aligned with school and university curricula. In fact, a significant portion of a content creator’s marketing budget is allocated toward ensuring that each new product is aligned with what is taught in schools and universities, so that it can be effectively sold to libraries. This is especially prevalent in the United States. Librarians who directly participate in the creation of products (via Advisory Boards, focus groups, etc.) have a great deal of power in how content is packaged because they echo back to the creators (i.e., vendors) what is in demand. And what is in demand is usually dictated by the institutions those libraries serve.

My point: the shift from ‘categories’ (and subjects) to ‘topics’ has been happening for some time now and is, in fact, all around us. But digital products and resources sold to and used by libraries, schools and universities, including, for example, databases, ebook collections, and e-learning platforms, aren’t exactly keeping up. To be fair, they’ve made great strides compared to what they looked like 20 years ago, but there is still a sense of disconnect about them that isn’t aligned with how modern-day research flows.

We (who work in publishing and libraries) still have the need to clean up the ‘mess,’ because what is out there manifests to us as one big inter-disciplinary mess that we need to get into ‘order’ first before producing and sharing. So we take Fiction and break it into Literary, Mystery, Thrillers, Fantasy, etc. We take Arts and break it into Fine Arts, Photography, Graphic Arts, Interior Design, Performing Arts, etc.

But what if we embraced the mess and worked toward adding more fluidity to research by insisting that resources move away from categories and subjects and move toward topics in completely new ways? What if an art history student studying the work of Van Gogh, for example, could benefit from a mystery novel written by a historian, in which Van Gogh the painter is the main character solving a murder? And this was a highly literary and intriguing novel heavily based on historical facts? How would that student discover such a novel inside an Arts resource recommended to him for research by his professor or librarian? And who would decide this novel’s value as a research tool?

Imagine that student starting his research by simply using the term ‘Van Gogh’ and not needing a subject-specific source to be recommended to him. Imagine him researching the painter across categories as if the concept of categories didn’t even exist.

The physical world will always require organization visible and discernible to the human eye. But in virtual and digital settings, the written word and knowledge will continue to defy categorization and lead us in the direction of fluidity and uniformity. And speaking of uniformity, I can’t help but think that in that not-so-distant future, there will also come a point when content produced outside academia (by qualified, knowledgeable non-scholars, of whom there are many) will co-exist with the content produced ‘inside’ institutions and touted as ‘authentic’ and ‘authoritative.’ But that merits a separate article—the kind of article that also explores the unification of public, academic and school libraries.

As far as my mind can see, it’s all going in the direction of a completely inter-disciplinary and multi-disciplinary universe. A river of knowledge that flows to everyone and everywhere. No boundaries. Categories as we know them (e.g., History, Literature, Science) become obsolete. Research is centered around topics and topics are living organisms that grow in every way already possible (and not yet possible).

Mirela Roncevic is NSR Director. She is also the founder of the Free Reading initiative. Her full employment history is available on LinkedIn. Contact her directly at

