Will GDPR spell the end of Big Data?
November 23, 2017

Will GDPR spell the end of the Big Data mindset? Is less now more?

By Paul Laughlin

Does GDPR spell the end of Big Data thinking? Will you stop gathering or holding onto data because “it might be useful”?

With a topic as important as GDPR, it’s good to hear from more than one viewpoint. For that reason, I am delighted to welcome back regular guest blogger, Tony Boobier.

This time, it sounds like Tony has been thinking about GDPR and Big Data, including a new ‘V’ for its definition.

Over to Tony to explain his reflections…

Why might GDPR spell the end of Big Data spending?

In the news this week, a joint survey by the International Association of Privacy Professionals and EY estimates that Fortune 500 companies will each spend an average of $16m, or nearly $8bn collectively, to avoid falling foul of the EU’s GDPR, which comes into operation next May. Maybe it’s time to rethink how we use Big Data?

Add in the aggregate amount spent outside the Fortune 500 group, and huge amounts of money are at stake. The analyst firm Gartner suggests that in 2018 up to US$93bn could be spent on cyber security, with 65% of buying decisions relating to GDPR.

Big Data, which we define by the ‘4 Vs’ (volume, velocity, variety, and veracity, or accuracy), has increasingly been seen as a panacea for sustained revenue growth, especially when applied to customer information in the retail sector. But could it now be seen as an Achilles’ heel for many businesses?

GDPR, the new regulation aimed principally at personal data, places data security obligations and other broader requirements on both data ‘controllers’ and ‘processors’. The penalties for getting it wrong are substantial: fines of up to €20m or 4% of global annual revenue, whichever is higher.
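To put that exposure into numbers, here is a minimal Python sketch of the fine ceiling. It assumes the higher of the two figures applies, which is how the regulation frames the cap for the most serious infringements. The function name and the sample revenue figures are purely illustrative, and any actual fine would be set by the regulator case by case.

```python
# Illustrative only: the upper limit of a fine, not a prediction of one.
FIXED_CAP_EUR = 20_000_000   # the flat €20m figure
REVENUE_CAP_PERCENT = 4      # 4% of global annual revenue


def max_fine_eur(global_annual_revenue_eur: int) -> int:
    """Return the ceiling on a fine: the higher of €20m or 4% of revenue."""
    revenue_based_cap = global_annual_revenue_eur * REVENUE_CAP_PERCENT // 100
    return max(FIXED_CAP_EUR, revenue_based_cap)


# A firm with €100m revenue is still exposed to the flat €20m cap,
# while a firm with €5bn revenue faces a much larger ceiling.
for revenue in (100_000_000, 5_000_000_000):
    print(f"revenue €{revenue:,} -> ceiling €{max_fine_eur(revenue):,}")
```

For any business with global revenue above €500m, it is the percentage rather than the flat figure that sets the ceiling.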

Improved cyber security is at the top of the agenda and, at first sight, it is critical to ensure security throughout the entire value chain. This means securing not only every place where data is held, but also how the data is captured (on a multitude of devices) and how it finds its way to the analytical hub.

Will GDPR spell the end of Big Data or just create another V?

It’s not only a superhuman task, but one where there are multiple areas of risk: in the systems, in the processes, and in people themselves. Perhaps 100% security will prove to be impossible, despite everyone’s best efforts. Because of this, it might be time to rethink our approach to using Big Data, and aim to use less data by introducing a fifth ‘V’ to add to the other four: the ‘V’ of ‘Value’.

We already implicitly know that different types of data have different intrinsic values, dependent on their intended use. In pensions, for example, insurers take into account whether an applicant is married, as married applicants tend to live longer, and this changes the pension calculation. (They attribute this to married couples’ tendency to ‘care and share’, which apparently extends their lifespan.) In comparison, marital status is less of an issue when music companies ‘push’ an offer of the latest release at you as a suggestion for listening and buying. So data about marital status is more valuable in one business area than in another.

In a GDPR environment, it means that companies should decide which data is the most significant: in effect, what the ‘business drivers’ are. This approach of using less but more relevant data not only seems to help with data security, as there are fewer ‘points of leakage’, but is also consistent with two other key elements of the GDPR requirements: ‘data minimization’ and the ‘data retention policy’.

Will GDPR spell the end of Big Data quantities of storage?

‘Data minimization’ requires an organisation to process ‘only that information which it needs’. The data must be ‘adequate, relevant and limited to what is necessary’, but it may be difficult to be certain what meets those requirements.
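One way to picture the principle, borrowing the pensions and music examples from earlier, is an allow-list of fields for each business purpose. This is only a sketch: the purposes, the field names and the `minimise` helper are hypothetical illustrations, not anything prescribed by GDPR.

```python
# Hypothetical allow-list: the fields each business purpose actually needs.
NEEDED_FIELDS = {
    "pension_quote": {"date_of_birth", "marital_status"},
    "music_recommendation": {"listening_history"},
}


def minimise(record: dict, purpose: str) -> dict:
    """Return only the fields on the allow-list for the given purpose."""
    allowed = NEEDED_FIELDS[purpose]
    return {field: value for field, value in record.items() if field in allowed}


# Made-up customer record for illustration.
customer = {
    "date_of_birth": "1960-04-12",
    "marital_status": "married",
    "listening_history": ["jazz", "blues"],
}

# The music service never sees marital status; the pension quote does.
print(minimise(customer, "music_recommendation"))
print(minimise(customer, "pension_quote"))
```

The code is the easy part. The hard part is deciding what belongs on each allow-list in the first place.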

One particular issue is that it requires organisations to know ‘what is necessary’, which may be easier said than done. Data may need to be collected and analysed before anyone can tell whether it is relevant, or to identify ‘false positives’, for example.

Take for instance the area of insurance fraud, and more specifically the topic of arson. It was only after collecting data that insurers discovered that arsonists who were pet owners had a tendency to take their pets out of the house before setting fire to it. Who might have foreseen that in advance as a key indicator of arson?

In a different area of the regulation, the ‘data retention policy’ requires corporations to document a data retention schedule, to ensure that they don’t keep information for longer than it is needed.
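In practice, a documented schedule can be as simple as a table of data categories and maximum holding periods that systems check records against. The sketch below assumes exactly that. The categories, the periods and the `is_due_for_deletion` helper are invented for illustration and are not recommended retention periods.

```python
from datetime import date, timedelta

# Hypothetical retention schedule: data category -> maximum holding period.
RETENTION_SCHEDULE = {
    "marketing_preferences": timedelta(days=365),
    "policy_documents": timedelta(days=6 * 365),
}


def is_due_for_deletion(category: str, collected_on: date, today: date) -> bool:
    """True if the record has been held longer than its category allows."""
    return today - collected_on > RETENTION_SCHEDULE[category]


# The same collection date gives different answers for different categories.
collected = date(2016, 1, 1)
today = date(2017, 11, 23)
for category in RETENTION_SCHEDULE:
    print(category, is_due_for_deletion(category, collected, today))
```

A real schedule would also need exceptions, for instance where records have to be kept in case of legal action.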

It seems to be a natural trait to hold onto information, sometimes information that we don’t really need. Think about those old bills and papers at home, and the number of digital photos that we keep, rarely or never to be looked at. (Don’t forget all the memory and server space, with its associated energy costs, that is overflowing with old photos and obsolete documents.)

In creating a retention policy, it’s important that organisations don’t overlook other issues, such as the Statute of Limitations, which places time limits (or limitations) on certain legal actions. In the case of fraudulent breach of trust, for example, an action can seemingly be pursued indefinitely.

Who can say with certainty whether data and information will be needed at some time in the future to defend a legal action, and how might this impact on creating a retention policy?

Perhaps part of implementing a GDPR strategy might also involve taking appropriate legal advice.

Which GDPR questions do you still need answered?

Even with GDPR on the doorstep, only months away, it seems that there might still be important practical questions to be addressed. With such uncertainty, doesn’t it rest with organisations to show that they are at least following the principles of the regulation, even if following the letter of the law may be a little more complex?

Did regulators really intend that the big firms would need to spend $16m apiece to comply, costs which may well be passed on to the customers whom the regulation is meant to protect? GDPR has echoes of the Solvency II insurance regulation, which cost the UK insurance industry alone over £3bn and which some now say was unnecessary and excessive. As insurers asked for more and more clarification, what started as a set of general principles evolved into an onerous set of rules and regulations. Personal privacy is important, of course, but were the cost and complexity really what the authors of the regulation intended?

Perhaps GDPR will signal a change in the use of data, with the ‘Big Data’ era in effect coming to an end and the ‘Smaller but Significant Data’ era beginning.

Will GDPR spell the end of Big Data? What approach are you taking?

Thanks to Tony for his perspective and for focusing us on two aspects of GDPR that I’ve not covered as much. Clearly, limiting the personal data held is an expected aspect of GDPR (applying principles including purpose limitation, data minimization & storage limitation). The importance of cyber security & its potential cost are also important context for the decisions currently being taken by many leaders I know.

I hope your own work towards GDPR compliance is progressing well. If you would like to see this blog focus on any particular aspects or ‘grey areas’, just let me know. Hopefully our content is helping to inform your thinking.