Companies that are scrambling to comply with the European Union's General Data Protection Regulation (GDPR) have a new tool to consider: Informatica's Compliance Data Lake, unveiled this week at the Strata Data Conference in New York.
With the new software, Informatica is bringing machine learning to bear on compliance issues, aiming to give enterprises a comprehensive view of compliance-related data stored not only in geographically dispersed databases but also in email, social media, instant messages, financial transactions and other non-traditional sources.
The idea is to provide more accurate compliance analytics and reporting to ensure and prove adherence to critical regulations like the GDPR.
The GDPR is the EU's latest rewrite of its data privacy laws. It takes effect on May 25, 2018, and any enterprise -- whether it is based in Europe or not -- that processes EU residents' personal data must comply with the new regulation or face hefty fines.
The GDPR requires, among other things, that companies erase personal data on request unless there is a legitimate reason to retain it; inform those affected by data breaches; and design data protection into their products and services.
In order to meet these requirements, enterprises essentially need to get a handle on what data resides where -- and for many companies, that can be difficult, points out Informatica CEO Anil Chakravarthy. "There are so many fundamental questions about data that are hard for companies to answer right now," Chakravarthy says. "So what we built was a solution around answering those basic questions."
The Compliance Data Lake is built on Informatica's Intelligent Data Lake, Big Data Management, and Enterprise Information Catalog -- all available via subscription. The Compliance Data Lake is designed to tap the processing capabilities and information residing in the underlying software to let enterprise compliance analysts build comprehensive, trusted compliance reports.
Key to this is Informatica's Claire technology, which applies machine learning to gathering technical, business, operational and usage metadata. The Informatica software is delivered with more than 150 connectors to various third-party software packages.
"We apply machine learning to identify relationships among data in different databases," Chakravarthy says.
Based on this technology, the Compliance Data Lake determines what data resides where, and enforces compliance policies by, for example, limiting access to certain databases. The data surfaced by the Compliance Data Lake can also help enterprises understand what they need to do to make data sets residing outside the EU meet the GDPR guidelines: Companies must guarantee compliance with privacy rules when exporting EU residents' personal information outside the EU for processing.
Also at Strata, Informatica announced that its software now integrates with Hortonworks Atlas, a data governance tool designed to exchange metadata with other tools and processes within and outside of the Hadoop stack. It also integrates with Cloudera Altus, a Platform-as-a-Service (PaaS) offering designed to speed the creation and operation of data pipelines for data lakes in the cloud.
Informatica went private in 2015 but plans to go public again, Chakravarthy says. As a public company whose quarterly reports are under great scrutiny, "it's very hard to take long-term steps like reorienting your product portfolio around new ecosystems like cloud and big data."
At this point, the company has made its offerings available via subscription -- a big step. "Our goal is to enter the public market," Chakravarthy says. "We don’t have a timeline yet but we’re definitely making good progress toward that outcome."