Visibility into the CI/CD Pipeline
This is the tenth post in a series we’re calling CI/CD:DB. Read from the beginning.
Collaboration is central to DevOps practices such as CI/CD. It is so important that it runs through the core ‘pillars’ of DevOps embodied by the CALMS acronym (Culture, Automation, Lean, Measurement, Sharing), most directly Culture and Sharing. It may seem obvious that a common understanding of a problem and a common framework for sharing solutions should be a given, but in many organizations that is difficult to achieve.
While we cannot address many of the common causes of weak collaboration in the scope of this blog series, we can discuss how we should use our automated CI/CD pipeline to enable collaboration. Providing visibility into our pipeline and its automated processes will help to ensure that the people working in and around those processes have good situational awareness of where things are and what is going on.
This kind of visibility comes from building our workflows with transparency in mind and making it easy for anyone who wants or needs to understand what is happening in the pipeline to get the information they seek.
Transparency can be unsettling in some organizational cultures. However, without it, the overall organization cannot achieve a critical mass of information about its systems and no one will have a holistic view of how things are actually working. This inevitably leads to inefficiencies and problems.
Consider this truth: Even the smartest people will make the wrong decisions if they have incomplete information. It will be the absolutely correct decision as far as they know, of course, but the key part of the statement is ‘as far as they know’. Think about what that means if all your smart people have only a partial context for all the decisions that they are making every day. How many ‘correct’, but wrong, decisions are happening in your organization every day just because people cannot really see what is going on?
From transparency to actionable visibility
Transparency makes everything visible, but the goal of enabling visibility into processes is to give stakeholders better situational awareness. This means that beyond just seeing things in an environment, they have to be able to put those things in context and understand their implications for future outcomes.
In database CI/CD terms, this is all about getting insight into the following:
- Identifying the database changes in the pipeline
- Understanding the current status of those changes
- Understanding which activities have happened, or are happening, within the pipeline
Identifying the changes
Before we discuss what information we want to know about a change within a pipeline, we first have to address the basic challenge of identifying individual changes so that we can track them. In order to do that, we need to be able to determine and present enough information about each change at all times so that we can answer some basic questions:
- Where did the change come from?
- What batch is the change a part of?
- What feature is the change associated with?
- When was the change created?
- Who created the change?
If we can answer those questions, we can identify changes and use that identity to help any interested person understand what change or changes they are observing or to help them find a change that they are seeking.
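As a rough illustration, the five questions above map naturally onto a small identity record. The sketch below is only one possible shape: the `ChangeIdentity` class, its field names, the `find_changes` helper, and all sample values are assumptions made for this example, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeIdentity:
    """Hypothetical identity record; every field name here is an assumption."""
    change_id: str        # unique handle used to track the change
    source: str           # where the change came from, e.g. a repository path
    batch: str            # the batch the change is part of
    feature: str          # the feature the change is associated with
    created_at: datetime  # when the change was created
    author: str           # who created the change

    def describe(self) -> str:
        """One-line summary that answers the five identifying questions."""
        return (
            f"{self.change_id}: from {self.source}, batch {self.batch}, "
            f"feature {self.feature}, created {self.created_at:%Y-%m-%d} "
            f"by {self.author}"
        )


def find_changes(changes, **criteria):
    """Return the changes whose fields equal every given criterion."""
    return [
        c for c in changes
        if all(getattr(c, name) == value for name, value in criteria.items())
    ]


# Made-up sample data, for illustration only.
changes = [
    ChangeIdentity("chg-001", "db/changes/001_add_orders.sql", "batch-7",
                   "order-history", datetime(2024, 3, 4, tzinfo=timezone.utc), "avery"),
    ChangeIdentity("chg-002", "db/changes/002_index_orders.sql", "batch-7",
                   "order-history", datetime(2024, 3, 5, tzinfo=timezone.utc), "sam"),
    ChangeIdentity("chg-003", "db/changes/003_add_refunds.sql", "batch-8",
                   "refunds", datetime(2024, 3, 6, tzinfo=timezone.utc), "avery"),
]

for change in find_changes(changes, batch="batch-7"):
    print(change.describe())
```

Keeping the record immutable (`frozen=True`) reflects the idea that a change's identity should not drift as it moves through the pipeline; only its status and history should.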
Understanding change status
Next, we have to establish the status of the change within the pipeline. That status generally has to do with its state relative to a given environment. By knowing the relationship of a change to each environment in our pipeline, we can determine how far it has progressed from its creation toward its eventual use in production or its removal from the pipeline. That means that for each and every environment, we must be able to determine and list:
- Which changes have been applied to the environment?
- Which changes, if any, failed when attempting to apply them to the environment?
- Are any changes currently undergoing a validation cycle in the environment?
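One way to picture this is a snapshot keyed by environment and change, from which each of the three lists can be derived. The following is a minimal sketch under stated assumptions: the `Status` values, the environment names `dev`, `test`, and `prod`, and the change ids are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    APPLIED = "applied"
    FAILED = "failed"
    VALIDATING = "validating"


# Hypothetical snapshot: (environment, change_id) -> latest known status.
latest_status = {
    ("dev", "chg-001"): Status.APPLIED,
    ("dev", "chg-002"): Status.APPLIED,
    ("dev", "chg-003"): Status.VALIDATING,
    ("test", "chg-001"): Status.APPLIED,
    ("test", "chg-002"): Status.FAILED,
    ("prod", "chg-001"): Status.APPLIED,
}


def environment_report(snapshot, environment):
    """Group the change ids seen in one environment by their status."""
    report = {status: [] for status in Status}
    for (env, change_id), status in sorted(snapshot.items(), key=lambda item: item[0]):
        if env == environment:
            report[status].append(change_id)
    return report


def progress_of(snapshot, change_id, pipeline=("dev", "test", "prod")):
    """List the environments, in pipeline order, where the change is applied."""
    return [
        env for env in pipeline
        if snapshot.get((env, change_id)) is Status.APPLIED
    ]


for status, ids in environment_report(latest_status, "test").items():
    print(status.value, ids)
print(progress_of(latest_status, "chg-002"))
```

The same snapshot answers both the per-environment question (what is applied, failed, or validating here?) and the per-change question (how far has this change travelled?).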
Understanding pipeline activity
Finally, we have to be able to display information about how the pipeline and its changes have reached their current situation. This means that we have to summarize all of the events that have happened to both the changes and the environments across the whole pipeline. For this, we need to know the following:
- When was each change applied to each environment?
- What were the failure events when changes were applied to each environment?
- Who applied each of those changes in each environment?
- How were the changes applied in each environment?
- When did each environment refresh event occur?
- Which changes required more than one attempt to get them successfully applied to any environment?
- When did those failures occur?
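These questions are all queries over a single append-only event log. The sketch below assumes a hypothetical `PipelineEvent` record; its field names, the `apply` and `refresh` event kinds, and the sample events are invented for illustration and do not describe any specific tool's log format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PipelineEvent:
    """Hypothetical event record; field names and values are assumptions."""
    at: datetime            # when the event happened
    environment: str        # which environment it happened in
    kind: str               # "apply" or "refresh"
    change_id: str = ""     # empty for environment refresh events
    actor: str = ""         # who applied the change
    method: str = ""        # how it was applied, e.g. "pipeline" or "manual"
    succeeded: bool = True  # False records a failure event


# Made-up sample log, for illustration only.
events = [
    PipelineEvent(datetime(2024, 3, 6, 9, 0), "test", "refresh"),
    PipelineEvent(datetime(2024, 3, 6, 9, 30), "test", "apply", "chg-001", "ci-bot", "pipeline"),
    PipelineEvent(datetime(2024, 3, 6, 9, 31), "test", "apply", "chg-002", "ci-bot", "pipeline", False),
    PipelineEvent(datetime(2024, 3, 6, 11, 15), "test", "apply", "chg-002", "avery", "manual"),
]


def history(log, change_id):
    """When, where, by whom, and how a change was successfully applied."""
    return [
        (e.at, e.environment, e.actor, e.method)
        for e in sorted(log, key=lambda e: e.at)
        if e.kind == "apply" and e.change_id == change_id and e.succeeded
    ]


def failures(log):
    """Failed apply attempts, oldest first."""
    return sorted(
        (e for e in log if e.kind == "apply" and not e.succeeded),
        key=lambda e: e.at,
    )


def refresh_times(log, environment):
    """Timestamps of the refresh events for one environment, oldest first."""
    return sorted(
        e.at for e in log if e.kind == "refresh" and e.environment == environment
    )


def retried_changes(log):
    """Map (environment, change_id) to the number of attempts it took to get
    the change applied, for changes that failed at least once first."""
    failed_first = defaultdict(int)
    succeeded = set()
    for e in sorted(log, key=lambda e: e.at):
        if e.kind != "apply":
            continue
        key = (e.environment, e.change_id)
        if key in succeeded:
            continue  # only count attempts up to the first success
        if e.succeeded:
            succeeded.add(key)
        else:
            failed_first[key] += 1
    return {key: n + 1 for key, n in failed_first.items() if key in succeeded}


print(history(events, "chg-002"))
print([(e.change_id, e.at) for e in failures(events)])
print(retried_changes(events))
```

Recording events rather than only the latest state is the design choice that matters here: the current status can always be derived from the history, but the history cannot be recovered from the current status.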
Visibility is at the heart of effective collaboration in any process, and it is especially central to DevOps practices such as CI/CD. It does not happen naturally; it requires deliberate effort when the process is being designed. Done correctly, transparent processes give teams good situational awareness and empower people to make better decisions and drive toward goals more efficiently and effectively. In the next post, we’ll cover how to handle problems with database changes.