Author of Data Cartels on intellectual freedom and our digital rivers on fire

Note: This post is published on behalf of Michelle Gibeault, Head of Teaching, Learning and Research at Smith College Libraries and also a member of ALA’s Intellectual Freedom Committee’s Privacy Subcommittee, who conducted this interview with Sarah Lamdan. There will be a book event on November 18, 4:30 PM EST. Sarah Lamdan will discuss her new book Data Cartels: The Companies that Monopolize and Control our Information with James Madison University’s Director of Scholarly Communication Strategies Yasmeen Shorish. Register for free.

The Interview with Sarah Lamdan

Librarian and law professor Sarah Lamdan is the author of Data Cartels: The Companies that Monopolize and Control our Information published on November 8, 2022 by Stanford University Press.

I wanted to first find out how you learned about privacy. Librarians have always embraced a professional responsibility to protect data. How are we taught this so well? Did you go to law school first, did you learn about privacy there or in library school?

These are great questions. I started going to law school, not library school. I did not know anything about librarianship. I had a professor who made me his research assistant. He sent me to a campus archives at the University of Kansas, and I was like, “This is what I want to do. I want to do the job that these archivists are doing.” And so then I learned about law librarianship, and enrolled in a library and information science program. I learned about privacy not at law school, but in my library and information science program.

Specifically, I learned about intellectual freedom. I’ve been thinking about this a lot lately, because the legal profession has its own ideas about privacy, but it seems like legal professionals don’t look to Information Science and librarians or read our scholarship as often as I’d recommend they do. Legal professionals don’t seem to delve much into our understanding of intellectual freedom, which means that the people who make the body of law around privacy often aren’t familiar with intellectual freedom. I really value the intellectual freedom perspective, because I think that when you see privacy from the perspective of what we do as librarians, fostering and supporting research, privacy means something much more intricate and nuanced related to how we behave in research spaces, including digital ones, and what we need to feel like we can research without being chilled by the prospect of surveillance. I feel like the legal treatment of privacy can be very heavy-handed, and it doesn’t focus enough on our ability to explore knowledge without being surveilled and without being tracked.

As a profession we’re into organizing information, but another focus of ours is information quality. One of my big takeaways from Data Cartels is that a duopoly of information conglomerates is normalizing mediocre information. Was this something you learned as you were writing it?

Yes, it’s something I think that all of us librarians have noticed, right? I’ve been working in libraries since 2003, which is a long time. In that time, I’ve seen resources that were very focused on vetting the quality of their offerings lose that focus. You can see it, I think, in the legal information resources that were the gold standard for being very quality-controlled like Westlaw and Lexis, and even platforms that companies like ProQuest or Ebsco or Gale offer. Information vendors were once very focused on vetting the quality of the resources that we shared with library users. And now, it feels like there is less of a focus, or a decreased standard. Vetted publications are blended in with other, mediocre publications. For instance, in legal information platforms, there are published opinions, but also things like trial court filings that aren’t proofread, where things are misspelled. A lot more information is available, but the quality of the information is no longer being checked as closely.

That may be because these data analytics companies are so focused on getting a large quantity of data and information to analyze. They’re interested in gathering robust data collections to run through their data analytic systems. In some cases, companies may be prioritizing gathering as much information as possible for analytics purposes over gathering the best information for researchers.

Another reason that content quality is declining is that, in a world where information is just a commodity, the company with the most information sources wins. It’s a race to get the most content on information platforms. I don’t know if you’ve been following the Hoopla situation, but companies that platform publishers’ content have been stuffing library collections with unvetted sources and dubious publications. Information is non-fungible: every piece of information is unique. It’s not like oil or water, where you can replace one vessel full of the resource with another. You can replace one barrel of oil with another, but you can’t replace a study about giraffes with one about elephants. That means that the system that has the most unique pieces of information to offer will be the system that libraries gravitate towards. So library vendors are fighting to get as much content as possible, and that means they’re also decreasing quality standards in order to win the information fight on that level.

We recognize that this is not just irritating, but dangerous. How does what you’re talking about affect minorities, people of color, and low-income elderly?

I see information law, information policy, and information access on a spectrum between public information and private data. Our personal data should be private. On the other side of the spectrum, public information and critically important information should be accessible.

When the entire information spectrum is commodified, both sides of this spectrum are commodified. People get harmed, especially marginalized people and communities. People are harmed by both the data privacy implications of commodifying information and the access implications.

On the data privacy side, people are more likely to be forced to part with their personal data, and more likely to be surveilled against their will. Overly policed communities are subject to more digital surveillance and data collection, and people in low-income communities have to interact with more service providers that track and collect their personal information. So government services like Social Security and educational support lead to more robust personal data dossiers. People are sometimes also compelled to trade information for discounts and certain types of access to digital services and spaces. One prime example is those key fob cards that you get at your pharmacy or your grocery store, where you implicitly agree to trade your consumer information for discounts. It’s also true of credit cards and shopper preference cards that you get in retail stores. When you sign up for one of those cards you get discounts, but you also are aware that you are giving up your personal information. The more you need those discounts, the harder it is to opt out of those types of programs.

“Alone, the kind of data you trade by using a key fob, et cetera, isn’t very invasive or useful to a data broker…When you combine your shopping data with other data points, like your age, income, and hints about your other preferences and habits (your activity tracker, your social media posts), it becomes more useful.” (Data Cartels, 14)

So that’s the data privacy end – the surveillance – and also giving away personal information in exchange for consumer benefits.

And then, on the other end of the informational spectrum you have access problems. When we turn critical public information like the law, scientific research, financial information, and the news into a private commodity, it becomes something that gets paywalled by these companies. Data companies are treating critically important information not like something that should be accessible to everyone, but like a proprietary for-profit resource. So critical news information gets paywalled. The best premium financial predictions and financial information are paywalled. The best legal information is paywalled, and the same goes for scientific studies and research about things like global pandemics, like Covid, and other really important information that we need to have as an informed public.

This book began as a post for a blog very much like this one. Are you tired of telling that story? Will you tell it one more time?

As long as you all aren’t tired of hearing it? I feel so bad because I know there are people who listen to me talk more than once, and they’re like, “This story again!” Yeah, this project would never have happened without the work of a library association blog. It wasn’t the American Library Association. In this situation it was the American Association of Law Libraries, which is another major national professional organization that thousands of librarians are members of. A colleague and I read a news article that talked about how certain companies were vying to help build ICE’s extreme vetting surveillance program. What caught my eye was that one of the companies listed was LexisNexis, and another was Thomson Reuters, which is Westlaw’s parent company.

This was in 2017, at the height of ICE’s family separation activities, and a time when we knew that the agency was confining children in caged cells. The agency’s activities were raising human rights concerns. So I was alarmed by this news. It wasn’t clear to us, as librarians, what the companies’ roles in the program would be, because we knew them as publishers. So my colleague and I wrote a blog post for the American Association of Law Libraries blog. The post was not especially provocative. It was basically explaining, “We noticed this article. We’re concerned about this because we don’t understand what’s going on. Should we, as law librarians, care about this?” The blog post went up. It was up for about two minutes, and it was taken down. The general counsel for the American Association of Law Libraries censored it because they were afraid that the companies would react negatively and react in some sort of legally punitive way to the blog post. They invoked antitrust law fears to block us from raising the topic in AALL forums. In short, a library professional organization was censoring librarians and telling them that they couldn’t talk about something that was happening with our vendors.

This takedown made me want to look deeper into what LexisNexis and Thomson Reuters actually are as companies, so I started researching them. I was really surprised by what I found, which is why I decided to write a whole book about it. The book is meant to inform my colleagues, to inform us, as a community outside of the bounds of AALL or ALA, about what our vendors are up to, and why it matters. At the beginning I didn’t know if it mattered. I was like, “Oh, these are interesting questions, about what academic publishers are up to,” but I am now convinced that it does matter that these companies are transitioning away from publishing and towards data analytics. They aren’t “publishers” anymore, in the way that we, as librarians, are familiar with.

One of the reasons you’re probably the best person to write this book is that you’ve already written a book on environmental law. People who teach in libraries often use various environmental metaphors to talk about information: “the information landscape,” the “information ecosystem,” etc. Your prevailing metaphor is bodily: you talk about the Internet as our collective circulatory system. But I also love this sentence: “Today’s information crises are a digital river on fire.” I’m curious about how your background in environmental thinking helps you focus on the clearest metaphors to talk about complex systems.

That’s a really good question. And yeah, this situation is analogous to environmental law and other situations where catastrophes force legal reform, like how securities laws were enacted after the Great Depression, and more recently, how the 2008 collapse of the financial sector led to legal reform, and to the formation of the Consumer Financial Protection Bureau. Oftentimes the government doesn’t act before there is a catastrophe that requires it to act. This is especially true in situations where private companies and private industries are involved, because in our hyper-capitalist system we depend on the free market to govern itself, and we try to foster enterprises that are profitable without regulatory “red tape.” We pretty much let corporations do their own thing until the rivers are on fire, until it’s absolutely clear that we must step in to either regulate them, or foist more oversight and transparency upon them.

I think we are at the point now, in our informational systems and informational infrastructure, where it’s time to start thinking really critically about what the forces in these industries are doing to information, access, data privacy, all the things we care about in our informational structures.

I also use a circulatory system analogy in the book because we tend to focus on Google, Amazon, Facebook, Apple, and Microsoft (the “big five” tech companies) when we think about digital information problems, instead of the companies I discuss in Data Cartels. “These are the companies, this is the Internet. Full stop.” But what librarians are in charge of a lot of the time, at the heart of it all, is the information that fuels these systems. Information is like the blood flow of the Internet, and we depend on certain companies to supply that material. The Internet, without information, would just be an empty shell. Facebook, sans information, would be an empty feed. Google search without any informational content would be just a blank page with a cute, themed search bar at the top. So I think it’s useful to think of information as a critical resource. And then also to think of that critical resource being at a tipping point where we should demand more accountability and transparency from the companies that we rely on to do our information work.

This book focuses on RELX and Thomson Reuters as a duopoly and their many, often less-than-visible tentacles in data brokering, academic information, legal information, financial information, and news. You have a lot of big picture thoughts, but also two very clear recommendations for intervention. The first one: you want to build a wall between data brokering and other critical information infrastructure. It seems like this is largely what Apple just did, and people have seen it was good. Am I oversimplifying?

You’re right! We have these companies providing us with some of the most important critical resources that we can imagine. The law. Science. The knowledge enterprise, right? Like Elsevier (an academic and scholarly information powerhouse), LexisNexis and Westlaw (the two major legal information platforms), not to mention news services and news archives, financial information, right? So these companies are our source for information that is critically important to society, and that we can’t get anywhere else. And it is as simple as saying that companies that provide that type of information resource can’t also be providing a dossier of our personal data to the police and the FBI and to our insurers and employers. There should be a wall between those two information functions in society, because one of them is critically important to the public, and we must opt into it. And the other one is one we should definitely have the ability to opt out of. Those two conflicting systems need to be separate. And right now I think they are entwined in a way that makes it impossible to see what is private and what is public, and it’s really blurring the lines between critical information resources and private information resources, which, as I describe them, are at completely opposite ends of the informational spectrum. They need to be treated completely differently, not as parts of the same product.

The other recommendation is that data companies should be information fiduciaries. I have heard Jonathan Zittrain, who is another law and law library person, make a similar case about social media. Not many people know what a fiduciary is, and it seems like the only way forward is to develop literacy around public interest concepts. Do you think that there is a way to make fiduciaries sexy? Should that be your next book?

I love that and I haven’t thought about sexy fiduciaries before. And that’s amazing. Yeah. And no, not that that’s going to be my next book. (laughter) But yeah, I think the idea of putting more accountability and responsibility on the companies through some sort of fiduciary system, or fiduciary framing and mindset is a good one. I think we should definitely make that sexy. When you talk to people about the problems these companies cause, plenty of people respond with sentiments like “Well, I’m not gonna use Facebook anymore. I’m not gonna use LexisNexis anymore. I’m gonna throw my computer into the ocean.” Or they go the opposite direction, saying things like “Well, I have to use Lexis to do my job. I have to use Facebook or else I’m totally disconnected from the world. I’m going to just live with the problem, and I’m just going to feel kind of downtrodden and bummed that I will never have privacy again.” So those are the two extremes. People usually gravitate towards the latter, because these days, it’s impossible to throw your computer into the ocean and walk away from these systems. It’s just as hard to go off the digital information infrastructure these days as it is to go live off in the middle of the forest. It’s tough. Studies show that people do not like the way their personal data is treated. Everybody feels like their privacy is being invaded, and we’ve all just kind of decided that that’s the way it has to be, because we have no other choice.

There’s an in-between, and that is a fiduciary-like system. And that’s why it’s exciting and jazzy, and people who are very cool are talking about it. There’s a discussion about this concept in Chapter 9 of my book, with lots of good citations to mine. That’s the real treasure in my book, the citations. I hope librarians find them useful.

Another one for me was around this idea that data and tech are wizardry, and that it is somehow unwise to hold wizards responsible, to establish boundaries. There’s been a conversation in librarianship about how treating our work as magic basically elides the material aspects of librarianship, its labor, etc. So basically I just wanted to talk about that. Why are we afraid to hold wizards accountable? It seems true that we are afraid.

Meg Leta Jones wrote about the fallacy of technological exceptionalism, and her work has framed my thoughts on why people in power hesitate to rein in digital information companies. Throughout time, we’ve had a kind of resistance to regulating new technologies. When the printing press was invented, we were like, “Oh, well, we can’t possibly regulate that groundbreaking, magical technology and innovation, lest we stifle it. We just have to let it do its thing. We don’t want to slow it down with all of our burdensome regulations.” Industries foster this type of thinking. When the government considers regulating new technology, the industry usually comes back with “If you regulate us, we’re not going to be able to operate. It’s going to be too expensive. It’s going to be too burdensome. We’re going to slow down and eventually shut down.” So you have this kind of battle between regulators and industry, especially when a technology is really cool and magical, like the automobile or the internet. We want to set limits on speed and set safety guidelines, but the innovative tech leaders want to build on their work unfettered by oversight and red tape. Their response to safety concerns is, generally, “Please don’t get in our way, because we need all of our resources to keep building these really cool automobiles” or tech platforms, or data analytics products…

We didn’t get seat belts for like fifty years or so.

And you know why? I teach a case about this in my administrative law class. We didn’t initially get seatbelts in all of our cars because the industry said it would be way too expensive to add the restraints to the auto manufacturing process. As with the printing press and cars, lawmakers don’t want to inhibit growth of the tech industry that’s doing all this really cool stuff. Also, among members of Congress, I think the median age is like in the sixties, which means that most federal lawmakers have lived most of their lives in a pre-internet world. And so there might be a bit less knowledge about how these systems work. There may be the sense that “We can’t possibly know how to regulate that, it’s too complicated,” which is also a fallacy. We’ve been able to regulate automobiles, even though most of us don’t know how to build an engine. We were able to regulate mass printing even though none of us invented a printing press, and most of us have probably never operated one. We can regulate things that are complicated.

What was the peer review process like for this book? Talk about how the ideas presented here were vetted.

That’s a really good question. This is a book really based on grey literature and qualitative research. Unlike a quantitative research paper, it didn’t have any deeply technical or complex methodology to present. It was really a lot of digging into corporate filings, mapping out all of the product lines in the different arms of the big information companies, reading a ton of research written by other information science experts, and talking to both tech experts who design data analytics products and data subjects who feel oppressed by the products. And there was rigorous fact-checking at every stage of the peer-review process, again and again. I’ve presented all of these pieces to legal scholars and peer reviewers who are very deep into this work. A lot of people in the fields related to the topics raised in this book have been discussing those topics, vetting the book, and correcting the record if it ever went out of the bounds of what is actually happening in this information ecosystem.

You’re doing an event next month on November 18th at Smith College, where you’ll be interviewed by an amazing data privacy librarian, Yasmeen Shorish. This is a hybrid event, and we’re trying to invite as many librarians and privacy folks as possible to assemble the hivemind. Are you ready for that?

I am so excited for this event. This is a book for librarians and people interested in or concerned about data privacy. The book actually comes out only a week before the talk. I will be able to talk to all of you, and I’m really, totally stoked for that.

October 28, 2022, via Zoom