Companies are spending heavily on artificial intelligence to leverage the range of business benefits the technology promises to deliver, but experts now warn that such investments may be at serious risk from regulators over the potential misuse of consumer data. Not only are significant penalties possible, but regulators could also force businesses to laboriously mine and delete specific datasets used to train AI models, as well as require them to scrap the algorithm underpinning a key service completely. This process is known as “algorithmic disgorgement,” and it has the potential to disrupt a wide range of businesses, including those not directly involved in AI development.
“Imagine a company whose entire customer service infrastructure relies on an AI-powered chatbot trained on data collected without proper consent,” said Adnan Masood, chief AI architect at tech consultancy UST. “If regulators demand disgorgement, the company could lose its core AI model overnight, crippling its customer service operations and forcing it to start from scratch.”
So far, this regulatory tool has been used rarely and usually only in cases where the degree of harm has been high. However, this is likely to change as AI adoption grows, especially since too few organizations know how the cutting-edge technology they are using works or what data it has been trained on, according to Dr. Clare Walsh, director of education at the Institute of Analytics. Without an answer to both questions, “companies may be pushing their luck,” she added.
Increasing Regulatory Scrutiny
Several cases have already demonstrated the very real business implications of algorithmic disgorgement. In a notable 2019 case, the U.S. Federal Trade Commission (FTC) forced U.K.-based political consulting firm Cambridge Analytica to delete all of its algorithms and models developed using the data of millions of Facebook users it had harvested without their knowledge or consent.
In fact, the FTC has been quite active in this arena in recent years. In 2021, after finding that Everalbum had used user-uploaded photos to build facial recognition technologies, the FTC ordered the firm to delete its models and the underlying data because it had collected the images without proper consent. The following year, the FTC ordered WW International (formerly Weight Watchers) to delete all algorithms trained on data collected through Kurbo, its weight-loss app aimed at children, because it had obtained the information without proper parental consent.
Further complicating matters, disgorgement orders have typically had short compliance deadlines—sometimes only 90 days or less. This makes operational disruption a very real prospect.
Sectors like health care, financial services and law enforcement are most obviously at risk due to the amount of sensitive data they hold and the level of potential harm they can cause, particularly in areas such as facial recognition and discriminatory bias. However, other industries that rely heavily on consumer data to shape their AI models are also under scrutiny. A common misperception is that enterprise users are not liable for third-party AI solutions or are not responsible for overseeing deployments of those solutions to mitigate the risk of bias, errors and hallucinations. In reality, anyone deploying AI is liable for how the technology is used, as well as for the data that drives it and the outputs it produces.
Risk Management Actions
Since many organizations are going to rely heavily on third-party vendors for their AI-based tools, experts caution that companies need to ask questions to ensure that data ethics and privacy concerns have been addressed appropriately when developing the technology.
The first step is ensuring data provenance—the ability to trace where data originates, how it was collected, and how it was used to train AI models. “Firms often purchase AI models or datasets from vendors, but without clear visibility into the data's origins, they expose themselves to substantial risk if that data turns out to be improperly sourced,” Masood said.
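To make that concrete, the sketch below shows one minimal way a team might record provenance alongside each training dataset. The class and field names here are illustrative assumptions, not a standard or any vendor's API; the point is simply that every dataset feeding a model carries an auditable record of its source, its consent basis, and the models it has touched.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetProvenance:
    """Hypothetical provenance record kept alongside each training dataset."""
    dataset_id: str
    source: str                    # e.g., vendor name or internal system
    collected_on: date
    consent_basis: str             # e.g., "explicit opt-in", "contract", "unknown"
    consent_documented: bool       # is there an auditable record of consent?
    models_trained: list[str] = field(default_factory=list)


def flag_for_review(records: list[DatasetProvenance]) -> list[DatasetProvenance]:
    """Return datasets whose consent basis could not be demonstrated to a regulator."""
    return [r for r in records if not r.consent_documented or r.consent_basis == "unknown"]
```

If a regulator later questions a particular dataset, a record like this makes it possible to identify every model trained on it; without one, the laborious mining of datasets described above becomes guesswork.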
As part of their AI governance strategy, companies should also conduct regular risk assessments of AI tools, particularly any that are not developed in-house, Masood advised. These assessments should evaluate not just the technical soundness of the AI, but also its ethical compliance, particularly with respect to bias, privacy and potential regulatory exposure.
Additionally, companies should prepare for the financial and technical burden of compliance. “The cost of retraining models, not to mention the potential delay in services or products, can be enormous,” Masood said. “Having contingency plans in place, such as alternative data sources or backup models, is crucial.”
According to Ojas Rege, senior vice president and general manager for privacy and data governance at software provider OneTrust, one of the biggest AI challenges is that once a model is trained on inappropriate or non-compliant data, that data cannot be “unlearned” without rolling back the model itself. This means organizations must ensure the absolute integrity of the data they use to train these models “up front and by design”; otherwise, they could face complaints around privacy, security or ethical issues. It also underlines that it is easier for companies to demand assurance about how an AI system works before implementation than to untangle it after the fact. Rege expects algorithmic disgorgement to happen “more frequently in the coming months, especially if data obtained without consent is used to train a model.”
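One way to act on that “up front and by design” principle is sketched below, under the assumption that each record carries a consent flag set by the organization's data-governance process (the function names and the `consent_documented` field are hypothetical): exclude undocumented data before any training run, and fingerprint the exact dataset used so a model version can later be traced back to, and if necessary rebuilt from, a known-clean snapshot.

```python
import hashlib
import json


def clean_training_set(records: list[dict]) -> list[dict]:
    """Keep only records with documented consent before any training run.

    Records lacking documented consent are excluded up front, since their
    influence cannot reliably be "unlearned" from a model afterward.
    """
    return [r for r in records if r.get("consent_documented") is True]


def snapshot_fingerprint(records: list[dict]) -> str:
    """Fingerprint the exact dataset used for a training run.

    Storing this hash with the resulting model version makes it possible to
    prove what the model was trained on, and to retrain from the same clean
    snapshot if a deletion order ever arrives.
    """
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

This kind of gating does not make disgorgement painless, but it shrinks the blast radius: the organization knows exactly which models were built on which data, and retraining starts from a dataset it can defend.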
Experts warn that the prospect of ramped-up enforcement means companies need to take a proactive approach to AI risk management. Since it is widely accepted that generative AI is prone to bias, errors and hallucinations, “the harms are foreseeable,” said Robert W. Taylor, of counsel with law firm Carstens, Allen and Gourley. This means companies need to show that they have effective compliance and risk management programs in place and that they are following best practices to demonstrate they are doing all they reasonably can to mitigate these foreseeable harms.
When establishing an effective compliance program, it is also critical to conduct a holistic and comprehensive legal risk assessment of the use case for an AI-based tool. As part of this assessment, companies need to ask whether their governance and oversight plan for preventing harm is appropriate for how and why they intend to use the AI.
“Required oversight varies from use case to use case,” Taylor said. “You need to focus on your use case—how you will use the third-party solution—rather than on the solution itself as the same third-party solution can have wildly differing risk profiles depending upon how you are using it.”
Critical Areas of Vulnerability
According to Jeremy Tilsner, managing director of the disputes and investigations practice at consulting firm Alvarez & Marsal, there are three critical areas where businesses need to be vigilant: 1) non-expert use of AI; 2) the risk of a vendor being forced to disgorge its algorithms; and 3) immature internal audit practices.
As AI development tools become increasingly user-friendly, more non-expert users across organizations will create or modify AI models “without fully understanding the legal and ethical implications,” he said. “This increases the likelihood that businesses unknowingly incorporate improperly sourced data into their systems” and further exposes them to the risk of algorithmic disgorgement.
Meanwhile, if a vendor is forced to undergo algorithmic disgorgement due to misuse of training data, any business relying on that vendor could face “cascading disruption,” especially if the AI underpins key processes such as customer analytics, risk assessment or automation. Because many companies have not yet developed standardized, effective audit practices for AI models, he said, they have limited visibility into how their AI services are built or how data is sourced, exposing them to disgorgement risk.
However, algorithmic disgorgement is not a straightforward remedy. Simply removing the data from the dataset is often insufficient because the model continues to “remember” the data it was trained on, even after deletion—an effect known as the “algorithmic shadow.” To address the privacy harm, regulators may require businesses to delete the entire model, erasing any benefits derived from misused data, thereby making disgorgement a significant operational risk.
While regulators in Europe have taken the lead in creating legislation around data privacy and AI use with GDPR and the AI Act, some experts believe that U.S. authorities are more likely to demand algorithmic disgorgement as they are more prone to “regulation by enforcement.” According to Tilsner, this may be the case for two reasons: First, more AI models are based in the United States than in Europe or elsewhere, so U.S. regulators have had to take a more proactive stance and enforce algorithmic disgorgement when necessary. Second, the GDPR’s focus on compliance and prevention “makes it less likely that sensitive data will find its way into the hands of would-be AI modelers in the first place.”
With no easy “unlearning” option for AI, removing data without gutting the model is nearly impossible. As a result, when regulators demand model deletion, “it can cripple years of work, disrupt entire business strategies, and shake stakeholder trust,” Rege said. “Whether you are building or buying AI systems, the message is clear: Get serious about AI governance, risk assessments and human oversight.”