May 01, 2026

CyLab Study Finds 'Privacy-Preserving' Tracking Alternatives May Still Expose Users

By Michael Cunningham

As major technology companies race to replace traditional online tracking tools with systems marketed as more privacy-conscious, new CyLab research suggests that some of those alternatives may offer far less protection than advertised.

In a recent study, researchers found that privacy systems built around grouping users by broad behavioral "topics" rather than individual identifiers can still leave people surprisingly vulnerable to re-identification when modern artificial intelligence models analyze behavior over time.

The findings, detailed in the recently published paper "Sequential Pattern Recognition Attacks against Deployed Topic-Based Mechanisms," raise broader concerns about whether many emerging privacy-preserving technologies are truly safeguarding users, or simply repackaging surveillance in a less obvious form.

Saranya Vijayakumar, a Ph.D. candidate in Carnegie Mellon's Computer Science Department and lead author, presented the paper at the 12th International Conference on Information Systems Security and Privacy (ICISSP 2026) in Marbella, Spain, where it received the ICISSP 2026 Best Student Paper Award.

The research focused on systems like Google's now-deprecated Topics API, which was designed as a replacement for third-party cookies. Instead of assigning users a persistent ID that advertisers could track across websites, Topics categorized users based on general interests, such as cooking, sports, or news, with the goal of obscuring individual identity within larger groups.

But Vijayakumar said that premise begins to unravel when user behavior is analyzed across multiple points in time.

"The broader takeaway is that when you're looking at something temporally, the privacy-preserving nature can change a lot," said Vijayakumar. "If you have multiple epochs of data, you have to examine something thinking of yourself as an advertiser who can collect data over time."

Using a transformer-based machine learning framework, a type of AI model particularly effective at detecting sequential patterns, the researchers demonstrated that aggregated topic profiles could still be used to identify individual users with striking accuracy. Their model achieved nearly 34 percent re-identification accuracy on web browsing data and more than 95 percent accuracy on music listening behavior in the study's experimental setting, substantially outperforming previous attack methods.

The problem, Vijayakumar explained, is that while any single snapshot of generalized user interests may appear anonymous, repeated snapshots create a behavioral timeline that can become highly distinctive.

"Over the course of many weeks, you can build a profile not just within one week's topics, but across many topics," said Vijayakumar. "That ends up building another behavioral profile, kind of like cookies. It takes longer, but it's still something you're able to do."
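The intuition that repeated snapshots add up to a fingerprint can be illustrated with a toy simulation. The sketch below is not the researchers' transformer-based attack and uses no real data: the population size, topic count, number of interests per user, and the rule that each epoch reveals one randomly chosen interest are all assumptions made for illustration. It simply measures what fraction of simulated users have a topic history that no one else shares as more epochs are observed.

```python
import random
from collections import Counter


def simulate_topic_histories(num_users=1000, num_topics=50,
                             interests_per_user=5, num_epochs=8, seed=0):
    """Build one topic history per simulated user (all parameters are assumed)."""
    rng = random.Random(seed)
    histories = []
    for _ in range(num_users):
        # Each user has a small, stable set of interests.
        interests = rng.sample(range(num_topics), interests_per_user)
        # Each epoch exposes a single topic drawn from those interests.
        histories.append(tuple(rng.choice(interests) for _ in range(num_epochs)))
    return histories


def unique_fraction_by_epoch(histories):
    """Fraction of users whose history so far is shared with nobody else."""
    fractions = []
    for k in range(1, len(histories[0]) + 1):
        counts = Counter(h[:k] for h in histories)
        unique = sum(1 for h in histories if counts[h[:k]] == 1)
        fractions.append(unique / len(histories))
    return fractions


if __name__ == "__main__":
    histories = simulate_topic_histories()
    for epoch, frac in enumerate(unique_fraction_by_epoch(histories), start=1):
        print(f"after {epoch} epoch(s): {frac:.1%} of users are unique")
```

Because two users who share a longer history must also share every shorter one, the unique fraction can only stay flat or rise as epochs accumulate, which is the pattern Vijayakumar describes. The specific percentages depend entirely on the assumed parameters.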

The study also found that common safeguards, such as adding small amounts of random noise to topic assignments, did little to stop advanced attacks. Even industry-standard protections were often ineffective once machine learning systems leveraged temporal consistency.
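A second toy sketch shows one way to probe how much a small amount of random noise changes things. It is again an illustration under stated assumptions, not the study's method: the simulated users, the 350-topic taxonomy, the 5 percent noise rate, and the simple count-overlap matching rule (a stand-in for the far stronger learned attack) are all hypothetical choices. Each user is observed over two separate periods, and the attacker tries to link the second-period profile back to the right first-period profile.

```python
import random
from collections import Counter


def observe(interests, num_epochs, num_topics, noise_prob, rng):
    """Count reported topics: usually a true interest, sometimes a random topic."""
    counts = Counter()
    for _ in range(num_epochs):
        if rng.random() < noise_prob:
            counts[rng.randrange(num_topics)] += 1  # noisy report
        else:
            counts[rng.choice(interests)] += 1      # genuine interest
    return counts


def overlap(a, b):
    """Number of topic reports two profiles have in common."""
    return sum(min(a[t], b[t]) for t in a)


def reidentification_accuracy(noise_prob, num_users=200, num_topics=350,
                              interests_per_user=5, num_epochs=20, seed=0):
    """Share of users correctly linked across two observation periods."""
    rng = random.Random(seed)
    users = [rng.sample(range(num_topics), interests_per_user)
             for _ in range(num_users)]
    first = [observe(u, num_epochs, num_topics, noise_prob, rng) for u in users]
    second = [observe(u, num_epochs, num_topics, noise_prob, rng) for u in users]
    correct = 0
    for i, target in enumerate(second):
        # Guess the first-period profile that overlaps most with the target.
        guess = max(range(num_users), key=lambda j: overlap(target, first[j]))
        correct += guess == i
    return correct / num_users


if __name__ == "__main__":
    for p in (0.0, 0.05):
        print(f"noise={p:.0%}: linked {reidentification_accuracy(p):.1%} of users")
```

In this simplified setting, occasional random topics barely dent the overlap between a user's two profiles, so even a crude matcher keeps linking them. The result says nothing quantitative about any deployed system, but it shows why noise applied independently in each epoch can be averaged away over time.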

For Vijayakumar, the issue extends well beyond one discontinued Google product. She emphasized that the real lesson is not about a single company's implementation, but about a broader class of privacy mechanisms increasingly used across the tech industry.

"I want to de-emphasize Topics specifically and talk more about the temporal nature of our work," she said. "The clustering privacy mechanism itself is a weak idea, because it lacks formal guarantees and can fail under composition, especially over time."

That weakness, she argues, stems from a gap between privacy marketing and privacy science. While many systems present behavioral aggregation as inherently protective, Vijayakumar noted that stronger privacy guarantees, such as differential privacy, rest on mathematically rigorous protections that many real-world products lack.

"We have the science behind what a good privacy mechanism should look like, and then companies are doing something else," she said.

The findings arrive amid growing public fatigue with cookie consent pop-ups and a widespread consumer assumption that privacy controls automatically amount to meaningful protection. Vijayakumar believes that misunderstanding can obscure larger ethical concerns.

"Privacy is a more fundamental right," she said. "Even if you're not hiding anything, we should be trying to protect privacy as a whole, and to do that, we need to protect individual privacy."

The research also highlights how implementation details can determine whether a privacy system succeeds or fails. Vijayakumar hopes the work encourages deeper scrutiny of similar clustering-based systems developed by other companies.

Rather than evaluating privacy protections only in isolated snapshots, she said, researchers and policymakers need to examine how data accumulates over time and how AI can exploit those patterns.

As companies continue searching for alternatives to invasive digital advertising practices, CyLab's findings suggest that replacing cookies with newer technologies may not be enough if the underlying assumptions about anonymity remain flawed.

"Without stronger, mathematically grounded safeguards that realistically address privacy risks, including those associated with modern AI technologies, privacy-preserving systems may still leave users more exposed than they realize," said Vijayakumar.

The research team also included S3D faculty members Norman Sadeh and Matt Fredrikson.
