Thursday, May 24, 2018

General Data Protection Regulation: What will happen to all the micro level analytics?


This article was first published in Analytics India Magazine on May 19, 2018; Co-author: Sanjay Fuloria

The recent episode involving Facebook and Cambridge Analytica raised data privacy concerns once again. It led to the closure of Cambridge Analytica, once a high-flying consulting company, which allegedly stole user data from Facebook to micro-analyze the profiles of individuals.

Based on this micro-analysis, Cambridge Analytica advised political parties on election campaigning. In the fallout from the episode, Facebook CEO Mark Zuckerberg had to face some very tough questions at a Congressional hearing. Data theft notwithstanding, the underlying practice is quite rampant in today’s digital world.

Companies have terabytes of data which they analyze to promote their products and services by targeting the right audience, whether the consumer likes it or not.

Against this background, the General Data Protection Regulation (GDPR), formulated by the European Union (EU), comes into full effect on May 25, 2018, with the objective of protecting data at the individual/consumer level. Under this regulation, control of the data resides with the individuals. If they decide not to share their data, companies can’t use it. Non-compliance or a breach would result in huge fines. Some of the important points covered under GDPR are as follows:
  1. The penalty for non-compliance can be as high as €20 million or 4% of worldwide annual revenue, whichever is higher.
  2. Consent must be opt-in: consumers have to actively agree before their data is used.
  3. GPS locations are also included in the definition of personal data.
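
The penalty ceiling in the first point is simple arithmetic, and a short sketch makes it concrete. The function name and the sample revenue figures below are hypothetical, and the "whichever is higher" rule reflects our reading of the regulation's upper fine tier; actual fines are set case by case and can be far below this ceiling.

```python
# Illustrative sketch only: computes the upper bound of a GDPR fine,
# assuming the higher of a flat EUR 20 million or 4% of worldwide
# annual revenue applies. Function name and inputs are hypothetical.

FLAT_CAP_EUR = 20_000_000   # flat ceiling in euros
REVENUE_SHARE_PERCENT = 4   # share of worldwide annual revenue


def max_fine_eur(worldwide_annual_revenue_eur: int) -> int:
    """Return the maximum possible fine in whole euros."""
    revenue_based = worldwide_annual_revenue_eur * REVENUE_SHARE_PERCENT // 100
    return max(FLAT_CAP_EUR, revenue_based)


if __name__ == "__main__":
    # A company with EUR 100 million in revenue: the flat cap dominates.
    print(max_fine_eur(100_000_000))
    # A company with EUR 10 billion in revenue: the 4% share dominates.
    print(max_fine_eur(10_000_000_000))
```

For smaller companies the flat €20 million figure is the binding number, while for large multinationals the revenue-based figure quickly becomes much larger.
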
Most global companies have huge analytics departments staffed with data scientists, many of whom are PhDs, and consumer data is their stock in trade. All their algorithms need copious amounts of data to function well. Many marketing programs use individual-level data to design marketing campaigns, both online and offline.

When someone searches for a book online, for example, she gets a list of similar books she could buy. Now, if the individual doesn’t want to share this data with the company, the company will have to delete this information about the individual. 

The individual has a right to be forgotten as per the regulation. If a majority of individuals decide not to share their data, the marketing campaigns and the analytics engines would all go for a toss.
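
What opt-in consent and the right to be forgotten mean for an analytics pipeline can be sketched in a few lines. This is a minimal illustration, not a compliance recipe: the `CustomerRecord` type, its fields, and both helper functions are hypothetical, and a real system would also have to purge backups, logs and copies held by vendors.

```python
# Illustrative sketch only: all names here are hypothetical.
from dataclasses import dataclass, field
from typing import List


@dataclass
class CustomerRecord:
    customer_id: str
    search_history: List[str] = field(default_factory=list)
    # Opt-in: consent is False unless the customer actively grants it.
    marketing_consent: bool = False


def eligible_for_campaign(records: List[CustomerRecord]) -> List[CustomerRecord]:
    """Keep only the customers who have opted in to marketing use."""
    return [r for r in records if r.marketing_consent]


def forget_customer(records: List[CustomerRecord], customer_id: str) -> List[CustomerRecord]:
    """Return the data set with every record for this customer removed."""
    return [r for r in records if r.customer_id != customer_id]


if __name__ == "__main__":
    records = [
        CustomerRecord("c1", ["data privacy book"], marketing_consent=True),
        CustomerRecord("c2", ["machine learning book"]),  # never opted in
    ]
    print([r.customer_id for r in eligible_for_campaign(records)])
    records = forget_customer(records, "c1")
    print([r.customer_id for r in records])
```

In this toy data set, the recommendation engine can use only one of the two customers, and after that customer asks to be forgotten it has no consented data left to work with.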

Other global companies have outsourced their analytics work to niche vendors, and they would face the same data problem. The regulation applies to any company providing services to individuals in the EU, irrespective of whether the company has a physical presence there.

Companies will have to find an alternative approach to dealing with consumers, though they may not have sufficient time. As far as GDPR compliance goes, only 7% of companies are fully compliant as the May 25 deadline approaches.

As per a Crowd Research report, only 40% of companies expect to be compliant by the deadline (https://www.zdnet.com/article/gdpr-compliance-for-many-companies-it-might-be-time-to-panic/). This puts companies in a tricky situation even before their analytics experts ask for individual data.

If individuals complain after the deadline that their data was misused, companies could have to pay a hefty fine. What about the individual-level data that is already stored on company servers and in the cloud? It will take a huge effort and investment on the part of companies to make that data safe and reach compliance.

This applies to machine learning algorithms as well. The whole premise of machine learning is that new data should be continuously fed to the algorithms so that they can adjust and provide better predictions. Autonomous cars, for example, rely on image processing, and their algorithms need images of human beings.

Imagine a scenario where human beings refuse to share their images. Of course, the data scientists will still have images of animals, trees and other objects to feed into their algorithms, but this won’t prevent autonomous vehicles from bumping into human beings, with dire consequences.

We have already had a case in the U.S. where an autonomous car hit a human being resulting in death.

With the latest voice recognition technology and the advent of devices like Amazon’s Echo and Google’s Home, individuals are supposed to feel more comfortable with less friction in their lives. 

For this to happen, the speaker has to be ‘ON’ all the time, and it will record whatever is spoken at home. This data will also be stored somewhere in the cloud. Sooner rather than later, individuals may not want their data to be shared between devices, which would take us back to the earlier era of zero connectivity. It remains to be seen how all this plays out.

Companies are concerned that GDPR compliance will make their employees’ jobs more cumbersome and make it very difficult for them to do business. They are hoping that fines won’t be levied right away and that the deadline might be extended.

Whatever the situation on May 25, 2018, data privacy concerns are going to get shriller. Companies need to be ready with a ‘Plan B’ as far as their analytics strategies are concerned.