Thinking about libraries, data retention, privacy, and you.

Police Surveillance (LAPD) flickr photo by Popwerks shared under a Creative Commons (BY) license

A while back one of my library’s regular users asked me what data the library’s proxy server collected about their usage and how much of that information was retained by the university. They asked what should have been a fairly straightforward question, and I was stymied. I looked to see if our campus had documented policies I could share with them, and couldn’t find much. I asked around – trying to figure out who would have such a policy, but not trying to create a mess – and couldn’t find any information. I kicked the question out to the collective wisdom of library workers and got no concrete information.

In the end this regular fell into a rhythm of asking me for PDFs of anything that required them to hand over personal information in order to access it, or anything they needed when they weren’t on the campus network. As a library worker who understood their concerns about privacy, I obliged. The concept of privacy is critical to the freedom of inquiry and intellectual curiosity, and central to the Library Bill of Rights. Here is a relevant section from ALA’s privacy interpretation of that document:

The right to privacy includes the right to open inquiry without having the subject of one’s interest examined or scrutinized by others, in person or online. Confidentiality exists when a library is in possession of personally identifiable information about its users and keeps that information private on their behalf. Article III of the Code of Ethics of the American Library Association states that confidentiality extends to “information sought or received and resources consulted, borrowed, acquired or transmitted,” including, but not limited to, reference questions and interviews, circulation records, digital transactions and queries, as well as records regarding the use of library resources, services, programs, or facilities.

The privacy implications of the user data we retain from library proxies fit within the broader context that Kalliopi Mathios described in “The Commodification of the Library Patron” – where we treat library users as customers, as if libraries were selling a product like coffee shops, online retailers, or media companies. It’s a logical outcome of the decades-long trend of neoliberalism reshaping libraries. The question about data retention from proxies comes back into view with RA21, oh I mean Seamless Access, the proposal from publishers and NISO. Many raised concerns about the proposal’s privacy implications, including ARL and Dorothea Salo. The piece “User Tracking on Academic Publisher Platforms” by Cody Hanson outlines many of the issues for library patron privacy as we rely on third parties who are driven by profit motives.

So that brings us to this week, when Dorothea Salo published a dataset of her own patron data that she obtained through a FOI request to the University of Wisconsin.

Her request led to the confirmation that the University of Wisconsin not only collects user data (which isn’t really a surprise), but also retains much of it (which is a really big deal). Here is where she describes more about her request for this information. I look forward to reading her further analysis on the matter. I doubt Wisconsin is the only library system that retains this much user data.

And let’s make it clear – this dataset is bad. Remember 20 years back when libraries were the protectors of confidentiality and privacy in the face of the USA PATRIOT Act? Well, that was then. I don’t think this kind of data retention happens out of any nefarious motives. There are probably some good intentions behind it, in trying to leverage this data to better serve users (just like Amazon and Google!), which could lead to student success! But of course, that feeds into learning analytics, which are often in opposition to library ethics. The circulation data should not exist. I know it’s valuable for collection assessment, but at a level of granularity tied to an individual? I guess all the talk around the PATRIOT Act was just bluster. As for the proxy data, libraries need to look at their contracts with vendors, but again… if they’re going to retain that data, we need to make it explicitly clear to library users.

So yeah… none of this is a surprise but it’s troubling as heck.

One thought on “Thinking about libraries, data retention, privacy, and you.”

  1. “The circulation data should not exist.” That’s the key sentence. And Berkeley, at least, knew that around 50 years ago. (Whether they’ve continued to practice it I don’t know.) Thanks for pointing it out again.
