February 18, 2020

Getting practical with a great introduction to DataOps & why it matters

By Paul Laughlin

Chatting with a leading light in the Scottish Data Science scene recently, we both agreed that the term DataOps is coming to the fore.

There has been a focus on DevOps within IT teams over recent years, as well as continued demand for Data Science. So, highlighting the need for this overlapping skillset is not surprising.

However, a new book “Practical DataOps: delivering agile Data Science at scale” brings to life the need for this specialism. The author, Harvinder Atwal, is well qualified to share on this subject. Harvinder is currently CDO for MoneySupermarket Group. He’s also led analytics & data science development at Lloyds Banking Group & dunnhumby.

I’ve worked with Harvinder and know of old both his technical skills & business outcomes focus, so I was very pleased to receive a copy of his book to review. I’m pleased to confirm that I can heartily recommend it to data leaders. Harvinder has delivered a very helpful practitioner leader’s guide to DataOps that fills a real gap in the market.

Too many CDO handbooks or other texts focussed on data leaders either stay too generalist or focus solely on traditional data management approaches. I have not come across another book that so expertly weaves together Data Science & its data needs into a guide to achieving effective DataOps.

So, let me skim over what you can expect to get from this book and why I’m a fan.

The Problem with Data Science

It’s always helpful as a speaker or author to anchor the audience to a clear problem or challenge. Harvinder uses a wide range of sources to evidence the current problems with Data Science, including knowledge gaps, lack of support & leadership misconceptions.

The need for a Data Strategy

As a first response to the problem identified in the first chapter, Harvinder highlights the need for a big picture strategy. Starting with better clarity on what is (and is not) a data strategy, he walks the reader through how to produce one. A useful balance of organisational alignment, benchmarking & operational doability.

Learning from Lean Thinking

In the first chapter of Part 2 (“Towards DataOps”), the first focus is on lessons from the emergence of Lean Thinking at Toyota. Harvinder explains how to translate Toyoda’s focus on elimination of waste to the reasons for delays & inefficiency in Data Science delivery.

Mastering Agile Collaboration

I’ve written before on the relevance of Agile Working for Data Science & the need for many Analytics teams to have a consistent Methodology. This chapter provides a really useful introduction to Agile methods (including more on XP & SAFe than many include). Agile principles & practices are then applied to the world of DataOps. These are well worth applying in data teams, and a number of the practical recommendations reminded me of tips from Enda Ridge’s Guerrilla Analytics.

Build Feedback & Measurement

Next, Harvinder moves his attention to metrics. He builds on the wisdom of Systems Thinking & Continuous Improvement approaches (inc. Six Sigma). As well as recommending A/B testing & measurement, he shares a number of handy tools. These include a Health Checklist and Starfish & Sailboat retrospectives. All of which I can see working in practice and engaging teams with an element of ‘serious play’.

Building Trust through Data Governance

We’ve shared a number of times in this blog the critical need to gain trust. This chapter is one of the parts which evidences that this is so much more than just a technical book. Facing into the challenges of GDPR and the risks of widespread data usage, Harvinder shares recommended automation of key tasks. He also tackles the all-important stakeholder relationship management aspects of fostering trust through pragmatic data governance.

DevOps for DataOps: why they need to be friends

Having working data governance & effective data science processes is one thing. But delay and waste will still be created if there is not a mutually supported route for Data Science products to be tested & promoted to ‘live’ IT-supported systems. This requires a culture & practices that get the best out of both DataOps & DevOps teams. Harvinder proposes an understanding and approach that help make that happen.

Organising your team(s) for DataOps

All too often when seeking to advise on a new function, consultants rush to propose an org chart. It really helps in this book that Harvinder has laid so many strategy, culture & process foundations first. That said, how to organise your people matters for all teams, DataOps included. In this chapter, he stresses the importance of small, self-organising, multiskilled teams. But also the need to be pragmatic about which approach will work in your organisation. Plenty of food for thought.

The Self-Service Organisation

The first chapter in this final section is entitled DataOps technology. I so strongly agree with Harvinder’s opening paragraph that I will quote it verbatim:

Technology is deliberately left until the final chapter because while it is essential, it is less critical than people, culture, and processes. If tools were all it took to be successful, then Silicon Valley giants would not open-source their crown jewels such as Kubernetes, TensorFlow, Apache Kafka, and Apache Airflow.

Practical DataOps, Harvinder Atwal

Well said & completely in line with the warning I have sought to consistently share through posts on analytics software demos & shiny new technology.

That said, in this chapter, Harvinder shares a really useful overview of the required technology ecosystem for today’s DataOps teams. Beyond this, he also points the reader to use Wardley Maps & Technology Radars to keep a watch on tech developments & options.

Leaders with more mature teams will also benefit from reflecting on his final chapter. In ‘The DataOps Factory’, Harvinder shares a vision of how to bring it all together. Particularly helpful are the diagrams for understanding the DataOps hierarchy of needs & the steps toward a more automated self-service factory. I am a sceptic about most uses of self-service analytics, but providing this journey of options makes sense.

Do you have a DataOps function? Do you understand your need?

DataOps may be a relatively new term in the growing nomenclature for Data teams, but this book makes clear that it’s more than hype. I recommend this book to all data leaders and CDOs, especially those tasked with supporting the needs of Data Science teams.

What about you? Do you have a story to tell on understanding & mastering DataOps teams? If so, I’d love to hear from you. Please share your view in the comments below or Contact Me for a possible interview.