
What is Data Operations (DataOps) and how to implement it?

SoftwareOne blog editorial team

The information we gather as organisations can give us a lot of answers, but only if we know how to listen. After exploring the worlds of DevOps and security, let's talk about data. Data is the new oil! But is it just any data? How do you know if the information you are using is the right kind of fuel for your organisation?

Every company has three main assets:

  • Organisation: people, culture, the business it operates
  • Data: information it gathers while doing business and from the market
  • Technology enabling the organisation to leverage this data to mitigate business risk.

DevOps and SecOps are about ensuring that the technology needed to meet business goals is deployed and secure. But what about the data that flows through those systems?

Why do we gather and process data?

Organisations use data for one of three reasons:

  • Making improvements to their business
  • Developing new offerings
  • Inventing business models

How advanced a company is in using its information depends on its maturity stage. But almost all business use cases for data fall into one of these three categories.

The stages of advancement in data use

First, the organisation becomes "data-aware." There is a feeling that there is more to your data than what you are using at the moment. Questions are asked and answered by digging through the information at an individual level, in Excel files. There is no standard data model. A lot of time is spent on working out how to get the right data from many sources and how to use it.

As an organisation grows, the amount of information it gathers and processes grows as well. The way data is collected and used also matures. After the initial stage of processing data in silos comes the next one. Technology is used to build data warehouses and data marts. New people are onboarded with the task of building data capabilities. You have, or even are yourself, a Chief Data Officer.

The usage of data at the organisation matures further. There are more business cases. More people use it daily to find new insights or control operations. The company becomes "data-guided"!

Here's the catch. As the usage of information matures, its maintenance becomes more complicated.

  • The number of data sources and data flows between them grows
  • The growing complexity of data raises questions about it being correct.

Does it sound familiar? Aren't those the same issues that troubled application development? When speed and integration became an issue, we turned to DevOps to solve it. Why not turn to DataOps? Does such a thing even exist? Yes, it does! 

What is data operations, or DataOps?

DataOps is a process, like DevOps, used by data and analytics teams. Its purpose is "to improve quality and reduce the time cycle of data analytics." We got this quote from Wikipedia. Let's try to explain it in a more straightforward way.

Think about your organisation. You know its business. You have an idea that, based on data, you can improve a process and provide more value for the core business. How do you test and confirm it? What do you need to check it? Your data warehouse test environment? A copy of your production data model and the data itself? How long will it take for you to get it? Hours? Days? Weeks?

Let's assume you've got a test environment. You developed your change to the data model, and the move is brilliant! Do you want everyone to use it? How do you do it, and how long does it take? Do you have to go through a test environment deployment and manual validation? Would it take days? Weeks?

As an alternative: can you commit your change to a code repository? Can you have it deployed to your data warehouse by tomorrow?

When the change is deployed, there might be some side effects. Errors, even. How do you know if your data flow will still make sense from the business perspective? The fact that it runs doesn't mean the result is logically correct or meaningful to the business.

What else does your current data process involve?

The answers to these questions are what you need to take care of to make sure your business uses the correct data. In the standard model, all of it takes time to develop, test, and deploy. That time slows your business down in taking advantage of data. And all of these issues raise your business risk. Ready for a change? This is where DataOps steps in. Data operations is about innovating your data value chain. It makes it easy to test and validate ideas and to turn them into value for your organisation. Let's discuss how you can introduce it to your processes.

Standardise your data environment

Every journey starts with directions. A common language also helps you navigate it. Build a shared repository of information and practices around data engineering in your organisation. Knowledge is then no longer spread across many places and people's heads. It lives in a shared repository, where everyone can find and update it. What is the goal? Your data team starts to organise and standardise its approach to your solutions. Instead of hundreds of different ways of doing something, you now have your own data framework! We've built our own Data Domain Framework: a shared repository of best practices, available to everyone working on data projects.

Pro tip: don't aim to standardise everything in one shot. Don't let this effort stop you from doing work. It is a living standard. Build it along the way when delivering value from your data projects.

Turn your data environment into code

You can't speed things up if you don't turn your data projects into code. Everything lives as code nowadays. The Azure cloud makes it much more manageable.

The building blocks of a data environment are available as services in Azure. If it is in the cloud, it is based on APIs and can be automated. Once your data and the infrastructure that processes it are managed as code, you can proceed to the next step, which is delivering two crucial elements of your data operations:

  • Orchestrate and test
  • Deploy.
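
To make "data flows as code" more concrete, here is a minimal sketch in plain Python. It is not tied to any Azure service or orchestration tool: the `Step` class and the step names (`ingest_sales`, `clean_sales`, and so on) are hypothetical and exist only for illustration. The point is that once a flow is declared in code, a tool can validate it and work out the order in which to run it.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Step:
    """One step in a data flow. All names used here are hypothetical."""
    name: str
    depends_on: tuple = ()


# The whole flow lives in version control as plain code.
FLOW = [
    Step("ingest_sales"),
    Step("ingest_customers"),
    Step("clean_sales", depends_on=("ingest_sales",)),
    Step("sales_by_customer", depends_on=("clean_sales", "ingest_customers")),
]


def execution_order(flow):
    """Return step names in an order that respects every dependency."""
    known = {step.name for step in flow}
    for step in flow:
        missing = set(step.depends_on) - known
        if missing:
            raise ValueError(
                f"{step.name} depends on unknown steps: {sorted(missing)}"
            )
    # Map each step to the set of steps that must run before it.
    graph = {step.name: set(step.depends_on) for step in flow}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(FLOW))
```

Because the flow is just data in a repository, a change to it can be reviewed, tested, and rolled back like any other code change.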

Automate data flow testing and monitoring

Data flows can be described in code that is deployed into your Azure services. This makes building new environments easy and brings standardisation and automation to such deployments. Once data flows are treated as code, you can also build automated tests to verify the data streams and their quality. Instead of spending days executing manual data validation tests, you can make them part of the pipeline and run them every day or at every deployment.
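
As a rough illustration of such automated tests, the sketch below runs a few data quality checks over rows represented as Python dictionaries. The column names (`order_id`, `amount`) and the allowed range are assumptions made up for this example; in a real project the checks would run against your own tables and rules.

```python
def check_not_null(rows, column):
    """Return the rows where `column` is missing or None."""
    return [r for r in rows if r.get(column) is None]


def check_unique(rows, column):
    """Return the non-null values of `column` that appear more than once."""
    seen, duplicates = set(), set()
    for r in rows:
        value = r.get(column)
        if value is None:
            continue  # nulls are reported by check_not_null
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


def check_range(rows, column, minimum, maximum):
    """Return the rows where `column` is set but outside [minimum, maximum]."""
    return [
        r for r in rows
        if r.get(column) is not None and not (minimum <= r[column] <= maximum)
    ]


def run_checks(rows):
    """Run every check; map each check name to the problems it found."""
    return {
        "order_id not null": check_not_null(rows, "order_id"),
        "order_id unique": check_unique(rows, "order_id"),
        # The upper bound is an arbitrary, hypothetical business rule.
        "amount in range": check_range(rows, "amount", 0, 1_000_000),
    }


if __name__ == "__main__":
    sample = [
        {"order_id": 1, "amount": 120.0},
        {"order_id": 2, "amount": -5.0},
        {"order_id": 2, "amount": 40.0},
    ]
    for name, problems in run_checks(sample).items():
        status = "OK" if not problems else f"FAILED: {problems}"
        print(f"{name}: {status}")
```

An empty list means the check passed, so a pipeline can fail the run whenever any check returns problems.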

This automation provides much-needed trust in your data and saves time. You can now check whether your information is right every time a change is made to data flows or models. It also lets you find out if something has changed at the source. And all of this from a single dashboard!
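
Detecting a change at the source can be as simple as comparing the schema you expect with the one you actually receive. The sketch below assumes schemas are available as `{column: type}` mappings; the columns in `EXPECTED_SCHEMA` are invented for illustration.

```python
# Hypothetical schema that downstream data flows were built against.
EXPECTED_SCHEMA = {"order_id": "int", "amount": "float", "currency": "str"}


def diff_schema(expected, actual):
    """Compare two {column: type} mappings and describe every difference."""
    shared = set(expected) & set(actual)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "type_changed": sorted(c for c in shared if expected[c] != actual[c]),
    }


def has_drift(diff):
    """True if any category of difference is non-empty."""
    return any(diff.values())


if __name__ == "__main__":
    # Schema as observed at the source today (made-up example).
    observed = {"order_id": "int", "amount": "str", "region": "str"}
    diff = diff_schema(EXPECTED_SCHEMA, observed)
    print(diff)
    print("Drift detected" if has_drift(diff) else "No drift")
```

Running a comparison like this on a schedule surfaces source changes before they quietly break reports further down the flow.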

Automate data environment deployments

With your data project living as code, built on standard practices, and automation of tests, you are ready for the next step: deployment. A typical data project deployment is time-consuming and laborious. But after a transformation into a DataOps project, you can deploy it as any other code – with a CI/CD pipeline.
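
The core idea of such a pipeline is that later stages run only if earlier ones succeed. The sketch below shows that gating logic in plain Python; it is not the configuration format of any real CI/CD product, and the stage functions are placeholders you would replace with your own build, test, and deployment steps.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order; stop at the first failure.

    Each callable returns True on success. The function returns the names
    of the stages that completed successfully.
    """
    completed = []
    for name, stage in stages:
        if not stage():
            print(f"Stage '{name}' failed; later stages are skipped.")
            break
        completed.append(name)
    return completed


# Placeholder stages (hypothetical): each would call real tooling.
def build_environment():
    return True


def run_data_tests():
    return True


def deploy_changes():
    return True


if __name__ == "__main__":
    done = run_pipeline([
        ("build environment", build_environment),
        ("run data flow tests", run_data_tests),
        ("deploy", deploy_changes),
    ])
    print("Completed stages:", done)
```

With this ordering, a failing data test prevents the deployment stage from running at all, which is what keeps untested changes out of production.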

So, what are the benefits of DataOps?

It allows you to build environments. It allows you to execute data flow tests as part of the deployment. Finally, it allows you to merge new changes – to deliver value.

See what your data can do!

With those four elements, you have started your organisation's journey into DataOps. Now you are ready to iterate and innovate on data and to reduce your business risk. It also makes your data project fun and interesting for the teams working on it. They are no longer the SQL people, but the DataOps team!

Let's end with some resources

To help you with your journey into DataOps, here are a few links which will let you dig more into the topic:

  • DataOps Manifesto – read it with your team and find out where you are on this journey
  • DataOps resources on Wikipedia
  • DataKitchen DataOps book and blog – they provide excellent support to get started with data operations
  • DataOps is not DevOps for data – a great blog post to understand the concepts behind DataOps in a nutshell.

If you'd like to take a step back and look more into the topic of DevOps, you can complete this questionnaire to find out how your teams are doing right now and where you could improve.
