IBM develops software to reduce personal data in AI training

By William Smith
The IBM AI Privacy and Compliance Toolkit allows data scientists to create machine learning models that protect the privacy of training data...

Researchers at US technology giant IBM have developed ways of improving the protection of privacy during the training of artificial intelligence models.

The AI Privacy and Compliance Toolkit allows data scientists to create machine learning models that protect the privacy of training data while following the necessary data protection regulations.

Overcoming AI security issues

The issue is that, even if the training data itself is never exposed, a model trained on real data can still leak sensitive information to a sufficiently determined attacker.

The IBM software, which assesses privacy risk, has applications in industries ranging from fintech to healthcare to insurance - anywhere that relies on sensitive training data. The software involves a number of approaches, including differential privacy (DP).

In a blog post, Abigail Goldsteen, Researcher in Data Security & Privacy, IBM Research, said: “Applied during the training process, DP could limit the effect of anyone’s data on the model’s output. It gives robust, mathematical privacy guarantees against potential attacks on a user, while still delivering accurate population statistics. [...] However, DP excels only when there’s just one or a few models to train. That’s because it’s necessary to apply a different method for each specific model type and architecture, making this tool tricky to use in large organizations with a lot of different models.”
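
To make the idea concrete, below is a minimal sketch of what DP applied during training can look like, here using IBM's open-source diffprivlib library purely as an illustration. The article does not say the toolkit is built on this library, and the dataset, epsilon value and model choice are assumptions for the example.

```python
# Illustrative only: differentially private training with IBM's open-source
# diffprivlib library. This is not necessarily the toolkit described in the
# article; the dataset, epsilon and data_norm values are assumptions.
from diffprivlib.models import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)
X = MinMaxScaler().fit_transform(X)  # bound feature values so the noise can be calibrated
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# epsilon is the privacy budget: smaller values give stronger privacy
# guarantees but typically lower accuracy. data_norm bounds each sample's
# influence on the model, which the DP mechanism needs to scale its noise.
model = LogisticRegression(epsilon=1.0, data_norm=6.0)
model.fit(X_train, y_train)

print("Test accuracy with DP:", model.score(X_test, y_test))
```

This reflects the trade-off Goldsteen describes: the noise added under a given privacy budget limits any individual's influence on the model, at some cost to accuracy, and the mechanism has to be matched to the specific model type being trained.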

The specifics

To counteract that, data can be anonymised before the model is trained. The process involves generalising the data: specific values are removed and replaced with blurred ranges. IBM's innovation in its software is to tailor the extent of that generalisation to the needs of the organisation.

“This technology anonymizes machine learning models while being guided by the model itself,” said Goldsteen. “We customize the data generalizations, optimizing them for the model’s specific analysis – resulting in an anonymized model with higher accuracy. The method is agnostic to the specific learning algorithm and can be easily applied to any machine learning model.”
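
As a rough illustration of the generalisation step described above, the sketch below replaces exact numeric values with coarse ranges before training. It is a hand-rolled example, not the toolkit's actual API; the column names, bin edges and downstream model are assumptions.

```python
# Illustrative only: a hand-rolled generalisation step, not the toolkit's API.
# Exact values are replaced with coarse ranges (bins) before training, trading
# a little accuracy for reduced re-identification risk. Column names, bin
# edges and the downstream model are assumptions for the example.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    "age":    [23, 37, 45, 52, 29, 61, 48, 33],
    "income": [31_000, 54_000, 72_000, 88_000, 41_000, 95_000, 67_000, 46_000],
    "label":  [0, 0, 1, 1, 0, 1, 1, 0],
})

# Generalise: replace exact ages and incomes with blurred ranges.
df["age_range"] = pd.cut(df["age"], bins=[0, 30, 45, 60, 120],
                         labels=["<30", "30-44", "45-59", "60+"])
df["income_range"] = pd.cut(df["income"], bins=[0, 40_000, 70_000, 1_000_000],
                            labels=["low", "mid", "high"])

# Train on the generalised features only; the model never sees the
# original precise values.
X = pd.get_dummies(df[["age_range", "income_range"]])
model = RandomForestClassifier(random_state=0).fit(X, df["label"])
print(model.predict(X[:2]))
```

The key difference from the approach Goldsteen describes is that here the bin edges are fixed by hand, whereas IBM's software is said to choose the generalisations guided by the target model itself, so that the anonymised model retains more accuracy.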
