The General Data Protection Regulation (GDPR) goes into effect today, which means users will have more control over their data. The regulation states that users in the European Union can request that companies delete any data those companies hold on them.
Because data is at the core of AI and machine learning, the GDPR also has the potential to affect those fields.
According to Roy Pereira, CEO of Zoom.Ai, there are two components of the regulation that will affect AI companies. The first is that AI requires a lot of data and companies will have to keep track of where that data is coming from. The second is a question of how deleted data will affect AI models.
If derived data is an aggregate of all user data, the question is: do companies have to rerun their AI models and come up with new insights once a user’s data is deleted? “With our AI models, we derive user-specific data which is obviously associated with a user, but we also aggregate all that to derive company-wide data or insights,” Pereira said.
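To make the question concrete, here is a minimal Python sketch of how one user's deletion can change a company-wide figure. It is only an illustration under stated assumptions: the user names, the values, and the helper functions `company_insight` and `erase_user` are hypothetical and do not come from Zoom.ai or from the regulation, and a simple average stands in for whatever a real AI model would compute.

```python
from statistics import mean

# Hypothetical per-user derived values (for example, meetings per week).
# Names and numbers are made up for illustration only.
user_insights = {"alice": 4.0, "bob": 6.0, "carol": 8.0}


def company_insight(insights):
    """Aggregate user-level values into a single company-wide figure."""
    values = list(insights.values())
    return mean(values) if values else None


def erase_user(insights, user_id):
    """Return a copy of the insights without the given user's data."""
    return {uid: value for uid, value in insights.items() if uid != user_id}


# The aggregate computed while all three users' data is present.
before = company_insight(user_insights)

# The same aggregate recomputed after one user asks to be deleted.
after = company_insight(erase_user(user_insights, "carol"))

print(before, after)
```

In this toy case the company-wide figure shifts once one user's value is removed, so an aggregate computed before the deletion still reflects that user's data. Whether such a figure must be recomputed is exactly the open question described here.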
This is one of the questions that is not clearly laid out in the regulation, and it is something that companies will need to figure out in the coming months.
It is also unclear whether data derived from another source will need to be removed in the same way. For example, if a data processing company obtains data through a company like Google or Facebook, that data was technically not generated by the data processing company. The data processing company then generates new data derived from it. What the regulation does not make clear is whether that derived data needs to be removed as well.
The GDPR will place a new cost on data, which up until now most companies viewed as a free resource. “If it was ever associated with a cost, that cost is close to zero because there is just so much data available,” said Pereira. “If you associate a cost with it, you can potentially see it being harder to accumulate a lot more data, which is what you need for good machine learning models.”
Pereira believes that the increased cost associated with data will only be temporary, arising as companies re-tool to support the GDPR. Even though there will be an added cost, he does not see it hampering innovation in AI. Innovation will just become more expensive.
The cost will decrease over time, and Pereira predicts that in two years’ time companies will not even think about it because it will have become normalized.
“That’ll be the normal way of running a business, is to be careful about data,” Pereira said. “And you know we’ve seen this already with California companies having to be very careful about user data, having to be careful about notification of users during a breach or right after a breach. That’s part of GDPR as well.”
While the regulation only applies to users in the European Union, Pereira predicts that similar regulations on data will follow in the United States and other regions.