We Need to Regulate Big Data as a Public Utility

Photo of Google Bikes at Google headquarters in Palo Alto. Photo by Roman Boed.

A majority of Americans believe that Big Tech should be regulated more than it is now. From increasingly targeted ads that make 43% of American cellphone users think their phones are listening to their conversations, to Facebook and Cambridge Analytica swaying the results of the 2016 presidential election by making susceptible users more likely to vote for GOP candidates, it is long past time for the era of the digital Wild West to come to an end. The question is: how exactly should data and machine learning algorithms be regulated?

The amalgamation of data that companies have gathered over the past couple of decades isn’t inherently good or bad: it’s a tool. In many of its applications, this tool can be used to achieve a broader social good. Big data, for example, can be used to analyze predictors of suicide in social media usage, or to correlate otherwise unnoticed factors to prevent heart disease or other illnesses. GPS data from smartphones can help transit authorities plan infrastructure that reduces traffic and improves public transportation. Even on a commercial level, 76% of people get frustrated when their interactions with brands aren’t personalized.

But big data can also be used to do immense damage to society. Under the current system, when Google or some other tech conglomerate collects information about an individual who uses its platform, the company owns that data and will use it to serve its own interest, regardless of how it may negatively affect the user. Companies can sell data to advertisers without taking into account how shady their buyer is, or use the information to make their platforms even more addictive and detrimental to a consumer’s mental health. Oftentimes, these companies don’t even know to whom their data is being sold; middlemen known as data brokers buy up data from companies and then sell aggregated profiles of targeted demographics to advertisers. These demographics can range from as mundane as “women over 45” to as insidious as “people with depression” or “consumers 90 days behind on their credit card payments.” While current consumer protection laws in some states, such as the California Consumer Privacy Act, do create some transparency in how consumers’ data is shared, those laws only cover a small percentage of Americans, and they are woefully inadequate in actually preventing abuse of data.

Common proposals for regulating data fail because they treat the internet and big data as private commodities to be sold on a market instead of what they should be: a global commons. Rather than belonging to the corporations that collect it or to the individuals whose information is being collected, data should instead belong to everyone so that it can be legislated to serve the public good.

This is how it would work: when companies collect information from their consumers, that data would have a duty to serve not only the interests of the private company that collects it, but the public interest as well. Classic examples of this kind of public utility are water and electricity. Electricity and water providers are often private companies, but they provide a service that the rest of society depends on and needs to have access to in order to live well. These companies are also natural monopolies, since the complexity of their infrastructure makes it much more efficient to have only a few operators, if not one, in a given area. It would be incredibly inefficient and expensive for multiple companies to run water and power lines to every single house, instead of just one. Because of the inherent advantages of being a sole provider of these services, these industries naturally monopolize.

Without government regulation, these natural monopolies would be able to dictate the prices of their goods, and everyone would be forced to pay them, since electricity and water are essential. However, electricity and water companies fulfill two important criteria: they are inherently uncompetitive, and they provide a service that all people must have universal access to. Because of this, these industries are classified as “public utilities,” which allows the government to heavily regulate the corporations that provide these services. In doing so, the government is able to set affordable prices for consumers, mandate environmental standards, award permits for new infrastructure, and more, thereby ensuring a corporation does not abuse its monopolistic advantage. Today, 100% of Americans have access to electricity and 99.2% have access to clean water. While there’s still work to be done in ensuring everyone has access to these goods, the public utility method of regulation has generally proven successful in providing Americans with these basic necessities.

Electricity and water companies serve as an existing model for how Big Tech should be regulated as a public utility, because companies that primarily profit off of data collection, such as Google, Facebook, and Twitter, fulfill both criteria: they provide essential goods, and they are natural monopolies. By creating platforms where people can communicate, make content, and exchange goods and services, social media companies such as Facebook and Twitter essentially provide infrastructure for the internet, the same way that electricity and water companies do in the physical world. They have created platforms that 70% of Americans use, and lacking access to these platforms makes it incredibly difficult to interact with the rest of society. This logic also applies to search engines like Google, which provide the infrastructure for online navigation. Google handles 92% of internet searches, and using the internet without it is virtually impossible: a person would need to know the exact web address of every website they wanted to visit. Clearly, these corporations provide essential communication and navigation services that people fundamentally need in order to exist online.

Tech companies can also be understood as natural monopolies, since their utility is greater the fewer platforms there are. If an individual wants to communicate with their friends on a social media platform, or an advertiser wants to reach a consumer, it is more likely that the friends, advertisers, and consumers will all be on the same platform if fewer platforms exist. It is therefore easier to connect consumers to producers of goods, services, and information when there are fewer platforms for people to access. The structure of the market intrinsically discourages competition.

Because these tech companies fulfill these two criteria, they can set whatever price they’d like for using their services. While Big Tech companies do not usually charge for their services, they exploit their monopolistic position through the incredibly invasive data collection that users must submit to in order to participate in online discourse. Regardless of how unethical these companies’ use of your data is, the services only they provide are necessary for living in the 21st century, so as a consumer you have to “pay” for them with your personal data.

In order to end this abuse of power, the government must make it clear to Big Tech that because the services their platforms provide have become necessary for society to operate, regulation must ensure that they are being used in the public interest. By classifying data as a public utility, the federal government could then pass laws mandating equal access and non-discrimination policies for data-collecting companies. Furthermore, the government could specify the guidelines that different types of data must follow for public and private usage, based on how much their use invades an individual’s privacy and how useful the data could be to the general public. This is necessary because different types of data need to be protected in different ways; for example, a person’s medical information should be protected more strictly than their sushi preferences. A Stanford Technology Law Review article outlines five classifications for different types of data sharing. These categories (Sharing Internal Data Analysis, Releasing Targeted Data, Data Pools, Granting Access to Public Actors, and Open Access) provide a framework for how data access can be oriented towards the public good.

When a company collects information that is classified as important for public use, the above categories would provide it with a framework for how to share that data with the appropriate researchers, public interest organizations, or government bodies. These institutions can then use this valuable information to prevent traffic accidents, reduce heart disease, eliminate food deserts, and more. By using this framework, the government can also outlaw data usage that adversely affects the public interest, such as swaying election results towards the highest bidder or designing addictive platforms that harm the mental health of children and adults.

A global data commons could also integrate a framework of privacy protection for consumers. Under the status quo, the incredibly personal information companies collect can be sold to anyone, from foreign governments to companies the consumer has never heard of. A global commons framework can standardize privacy policy across industries and require corporations to de-identify data by scrubbing it of identifiers like social security numbers, names, and passport IDs, similar to the way the European Union does. Although your privacy is currently in the unknown hands of whatever organization collects your data and those they sell it to, a standardized system would end that ambiguity.

Just as the advent of water and electricity infrastructure at the end of the 19th century led to the creation of a utilities method of regulation to ensure that technological progress was available to everyone, the internet of the 21st century must be subjected to the same regulatory process to ensure that everyone has access to this technology, free of exploitation. We must start the process of orienting this new age of information towards the public interest and away from dystopia, and regulating data as a public utility is a good place to start.

Max Edelstein (SEAS ’25) is a staff writer on CPR studying Environmental Engineering and Political Science. He has been engaged in politics and public policy for over half a decade and is interested in the intersection of science, technology, and policy.