IBM develops software to reduce personal data in AI training

By William Smith
The IBM AI Privacy and Compliance Toolkit allows data scientists to create machine learning models that protect the privacy of training data...

Researchers at US technology giant IBM have developed ways of improving the protection of privacy during the training of artificial intelligence models.

The AI Privacy and Compliance Toolkit allows data scientists to create machine learning models that protect the privacy of training data while following the necessary data protection regulations.

Overcoming AI security issues

The issue is that, even if training data itself is not exposed, AI trained on real data might leak sensitive information if someone is determined enough.

The IBM software, which assesses privacy risk, has applications in industries ranging from fintech to health care to insurance, or anywhere else that relies on sensitive training data. The software involves a number of approaches, including differential privacy (DP).

In a blog post, Abigail Goldsteen, Researcher in Data Security & Privacy, IBM Research, said: “Applied during the training process, DP could limit the effect of anyone’s data on the model’s output. It gives robust, mathematical privacy guarantees against potential attacks on a user, while still delivering accurate population statistics. [...] However, DP excels only when there’s just one or a few models to train. That’s because it’s necessary to apply a different method for each specific model type and architecture, making this tool tricky to use in large organizations with a lot of different models.”
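To make the idea of limiting any one person's influence concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a simple statistic, the mean of a bounded value. It is not IBM's toolkit and not the training-time DP that Goldsteen describes; the function name `dp_mean`, its parameters and the sample ages are illustrative assumptions, and the dataset size is assumed to be public.

```python
import random


def dp_mean(values, lower, upper, epsilon, rng=None):
    """Return a noisy mean of `values` using the Laplace mechanism.

    Hypothetical helper for illustration only. Assumes the number of
    records is public and every value is clipped to [lower, upper].
    """
    rng = rng or random.Random()
    n = len(values)
    # Clip each value so that no single record can move the mean too far.
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    # Replacing one record changes the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # The difference of two exponential samples is a Laplace(0, scale) sample.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_mean + noise


ages = [23, 27, 31, 38, 42, 45, 46, 59]  # made-up sample data
noisy = dp_mean(ages, lower=0, upper=100, epsilon=1.0, rng=random.Random(0))
```

A smaller `epsilon` adds more noise and gives a stronger privacy guarantee; a larger one stays closer to the true mean. With more records the sensitivity shrinks, which is why population-level statistics can remain accurate.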

The specifics

To get around that limitation, data can be anonymised before the model is trained. The process involves generalising the data: specific values are removed and replaced with a blurred range. IBM’s innovation in its software is to tailor the extent of that process to the needs of the organisation.
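Generalisation of this kind can be sketched in a few lines. The example below assumes simple fixed-width ranges for a numeric field; the function names, the `width` parameter and the sample ages are hypothetical. It does not reproduce IBM's method of choosing the ranges. It only shows how widening a range trades precision for larger, harder-to-single-out groups.

```python
from collections import Counter


def generalise(value, width):
    """Replace an exact value with the inclusive range that contains it."""
    low = (value // width) * width
    return (low, low + width - 1)


def smallest_group(values, width):
    """Size of the smallest group of records sharing a generalised range."""
    counts = Counter(generalise(v, width) for v in values)
    return min(counts.values())


ages = [23, 27, 31, 38, 42, 45, 46, 59]  # made-up sample data

print(generalise(38, 10))        # (30, 39)
print(smallest_group(ages, 10))  # 1: the only record in 50-59 still stands out
print(smallest_group(ages, 20))  # 4: wider ranges hide each record among four
```

Choosing the width is the tuning knob: too narrow and individuals remain identifiable, too wide and the model loses the detail it needs.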

“This technology anonymizes machine learning models while being guided by the model itself,” said Goldsteen. “We customize the data generalizations, optimizing them for the model’s specific analysis – resulting in an anonymized model with higher accuracy. The method is agnostic to the specific learning algorithm and can be easily applied to any machine learning model.”
