Just say no…to poorly thought-out data lakes

Firms are rushing into data lake projects without having concrete strategies in place, says Virginie O’Shea, who outlines questions to consider and why sometimes it’s better to step back and put plans on hold.

Back in the 1980s, British kids were avid viewers of Grange Hill, a series that chronicled the lives of teenagers at a secondary school in a North London suburb. As a memory jog for those of you who were around at the time, the opening credits featured a comic book reel of poorly drawn school sports activities and, inexplicably, a flying sausage (no, I am not joking).

The BBC has always taken its public service projects very seriously and one such project involved Grange Hill characters campaigning against drug addiction (in concert with the descent of one of the main characters, Zammo, into heroin addiction). How does any of this relate to data lakes, I’m sure you’re wondering? Well, as a part of the public service announcement, the cast sang an off-key melody that told children to “Just Say No” to drugs (though, given the video, they should have also said no to permed mullets and triple denim, but I digress). I think many in the C-suite need to follow Zammo et al’s advice when it comes to poorly thought-out data strategies.

Just because everyone else is doing it, doesn’t mean you have to.

This is advice we give to children, but don’t often follow in the business world. How many times have you seen firms rushing to keep up with the competition for fear of missing out (FOMO) on something? Firms often panic about being left behind and rush into projects without having a concrete strategy in place beforehand. There are examples all over the industry, but having spent the best part of the last few years talking to data managers, I can point to data lakes as one area where this has happened, frequently.

Stop, think, listen.

Before embarking on such a project, firms need to ask questions such as:

  1. How will we use the data we are going to chuck into the lake?
  2. What data sets are important to us as a business?
  3. How do we prevent the lake becoming a swamp of unusable dirty data?
  4. How does this endeavour bring value to the business that key stakeholders can understand and appreciate?
  5. What are the key metrics against which we are going to judge the success of the project?
  6. How are we going to phase the project, so that meaningful change can be evidenced at regular increments?
  7. What happens when your champion exits the building?

On the last point, there sadly remains a high rate of turnover in the chief data officer (CDO) role within the financial services industry – it’s a hard job and there are many intangible challenges to embedding a data culture into a firm. If your firm is lucky enough to have one such champion in place, make sure the success of the data lake project isn’t predicated on the continued presence of your CDO.

Contingency planning and a robust data governance support structure are key items to be addressed before you think about turning your data into something usable. The success or failure of a data lake seems to be less about the details of the technology used to create the lake, though the technology does have a bearing, and more about the planning: assessing data quality, ensuring you understand the data and its various taxonomies, and spending time building the foundations of a data culture within the business community.

Very few firms stop and consider the bigger picture for the business when it comes to data projects, including how to make the change project relevant for business users. I’ve heard of too many data managers who understand the data through and through but fail to get the business buy-in they so desperately need to improve what is essentially one of the fundamentals of financial services (i.e. clean data). Telling a C-suite executive that a project will improve the data quality score from a 2 to a 4 on X scale isn’t going to help matters much. Explaining how consistency of a certain data set might enable them to more effectively target a certain market, on the other hand, will garner attention, as long as you don’t get into the weeds of what the project involves. The most successful CDOs can talk the language of data managers and IT, and the language of the business – and, most importantly, act as a translator between the two.

Data lakes may have witnessed a surge in C-suite interest over the last couple of years due to the FOMO effect. But as with many such projects, money can easily be wasted on rushing to keep up with the so-called “in crowd”. Sometimes it’s best to listen to the wisdom of the cast of Grange Hill and just say no – right now, anyway. Take your time and plan it out properly.