June 16, 2020

Pipeline Design Goals & Challenges

This is the second post in a series we’re calling CI/CD:DB. Read from the beginning.

In the first blog of this series, we talked about bringing databases into the CI/CD discussion along with application changes. In this part, we take our general knowledge of pipelines, apply the five principles we discussed in blog 1, and use that combination to set a plan.

3 Stages of software delivery pipelines

To begin, let’s agree that software delivery pipelines all generally break down into three stages:

  1. Creating the needed change
  2. Validating that the change is “production-ready”
  3. Delivering the change

As soon as a change is created in the first stage, it represents potential value: the value promised by the business need that prompted the change in the first place. That potential is not realized, however, until the third stage, when the customers and users of the system are actually using the change. That makes the second stage, validation, look like the impediment or bottleneck to realizing value. That view is too simplistic. There are bottlenecks throughout the pipeline, and long validation processes are often as much a symptom of problems as a cause of them. Either way, our new pipeline structure must ensure that changes spend no more time in the validation stage than is truly necessary.

Principles of CI/CD

Next, let’s go a bit deeper into the CI/CD principles mentioned in blog 1. They help establish an interrelated way of thinking about the pipeline in order to ensure that validation work is minimized — regardless of whether the bottleneck in question is a direct cause or symptom of a deficiency elsewhere in the pipeline. Let’s consider them in this context.

  • Build quality in
    This principle is obvious: if you can consistently do something correctly the first time, you will be more efficient than someone who has to do it multiple times. The gap becomes vast when discovering that something is broken takes hours or days and consumes other people’s labor. Building trust in the consistency of initial quality means whole swaths of validation can be reduced or eliminated.
  • Work in small batches
    A small number of changes takes less time to assess for impact, to troubleshoot if it is problematic, and to correct. Consider what it would take to figure out which of 100 changes is causing a problem compared with diagnosing a problem when only one change is in flight. The single change is a lot faster to correct as well. A small-batch approach means that validating individual changes gets much simpler and carries less overhead.
  • Computers perform repetitive tasks; people solve problems
    This principle is all about using automation intelligently. This is often mistaken for just test automation. It is not. Automation applies to EVERY repetitive task in the pipeline to provide consistency and speed while minimizing the opportunity for human error and the need to wait for handoffs.
  • Relentlessly pursue continuous improvement
    This is about continuously optimizing the processes. A highly automated process that moves small, easily tracked batches is much easier to measure. Those measurements can then be applied more quickly to identify and remediate flow problems, which improves the overall efficiency of the pipeline.
  • Everyone is responsible
    The notion of responsibility in this context is not about culpability for problems, but rather about the fact that everyone is empowered to improve the process as a whole. That whole includes the parts each individual is directly responsible for as well as all the other parts. Everyone is therefore responsible for how every process tweak impacts efficiency, and for making sure that everyone else has enough information to make good decisions and get the changes right the first time. And so on.
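
To put rough numbers on the small-batch principle, here is a minimal sketch of how many validation runs it can take to isolate one bad change in a batch. It is an illustration rather than anything prescribed by this series: it assumes exactly one bad change, that changes are applied in order, and that a prefix of the batch fails validation only if it contains the bad change. The names `find_bad_change`, `batch_is_healthy`, and `worst_case_bisect_checks` are hypothetical, and bisection is just one possible search strategy.

```python
import math


def worst_case_bisect_checks(batch_size: int) -> int:
    """Worst-case validation runs needed to isolate one bad change by halving."""
    return math.ceil(math.log2(batch_size)) if batch_size > 1 else 0


def find_bad_change(changes, batch_is_healthy):
    """Return (bad_change, checks_used) by bisecting an ordered batch.

    Assumption: exactly one change is bad, and batch_is_healthy(prefix)
    is False if and only if the prefix contains that change.
    """
    low, high = 0, len(changes)  # the bad change lies in changes[low:high]
    checks = 0
    while high - low > 1:
        mid = (low + high) // 2
        checks += 1  # each call stands in for one full validation run
        if batch_is_healthy(changes[:mid]):
            low = mid   # prefix is clean, so the culprit comes later
        else:
            high = mid  # prefix already fails, so the culprit is inside it
    return changes[low], checks
```

Under these assumptions, a batch of 100 changes needs up to 7 validation runs just to locate the culprit, while a batch of one needs none because the culprit is already known. Real database changes often interact with each other, so actual troubleshooting cost is likely to grow faster than this idealized count suggests.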

Bringing CI/CD to the database

By combining the goal of a low-friction, low-validation pipeline flow with the five principles, we can focus our design efforts and identify a series of questions we will have to answer to bring CI/CD flow to our database changes:

  1. How do we bring database changes from multiple creators together to define a batch to be processed?
  2. How do we empower change creators with a self-service facility to determine whether their database change is acceptable and learn what that means?
  3. How do we use that self-service facility to evolve our definition of “good enough to test” and therefore the quality of database changes coming into the validation cycle?
  4. How do we make the validation process itself a rugged and highly tested asset?
  5. How do we ensure that the infrastructure that underpins our pipeline is trustworthy?
  6. How do we equip our change creators with the tools they need to create the best database changes?
  7. How do we provide safety and guardrails to identify when the pipeline itself has a problem?
  8. How do we track the progress of our database changes through the pipeline?
  9. How do we handle problems with database changes once they get into the pipeline?
  10. How do we measure our pipeline effectiveness and efficiency?

This list of questions will shape how we redesign our database change pipeline, and the answers will be cumulative as we move from left (change creation) to right (production). Each of the next posts in the series will address one of these questions, beginning with blog 3, where we look at defining batches.
