Reliable and Predictable Database Deployments
This is the sixth post in a series we’re calling CI/CD:DB. Read from the beginning.
In Part 5, we wrapped up our discussion of bringing Continuous Integration (CI) to database changes. At this point in the discussion, we have a good understanding of what it takes to repeatedly produce batches of database changes that consistently meet known quality standards. Now that we have those batches, we need to do something with them! This post is where we shift our focus to making sure we actually know that our batches of changes will do what we expect when they reach production.
Rehearsing for production
A key part of the pipeline’s job is to make sure that the deployment process itself is verified. In other words, we are not just testing the change; we are also verifying that we can reliably get those changes into an environment with production-grade reliability. That means that every time we deploy the changes into any environment, it serves as practice — a rehearsal — for what we will eventually do with those changes in production. This even extends to production deployments — the current deployment to production is also a practice session for the next production deployment.
We deliberately use the words ‘rehearsal’ and ‘practice’ when discussing deploying changes. Consider the old adage frequently heard in performing arts and sports: “Amateurs practice until they get it right; professionals practice until they cannot get it wrong”. The goal is to manage and improve your organization’s deployment process for its key applications so that it cannot go wrong.
After our CI process, we have the batch cleanly defined and verified to a known level of quality, so the question is really about what we do to put that defined batch of changes into a target environment. If we expect reliability, that must be a process that is done exactly the same way — with absolutely no variation — everywhere in the pipeline.
The process consists of two primary components:
- Batch handling
- Change application
In order to achieve the level of process integrity we are discussing, both components must work together.
While we have talked about creating immutable batches in the CI process, batch handling is just as important at deployment time. An immutable batch cannot change once it has been created. Period. That means that nothing in the deployment process can tolerate or, worse, expect intervention with the contents of a batch. So, a batch can only be abandoned and replaced. This principle ensures that if there is a shortcoming in the CI process that causes it to create batches needing frequent replacement, that shortcoming is quickly identified and fixed. More critically for this topic, it removes a source of variability from the process of delivering changes to environments.
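One common way to enforce this kind of immutability is to fingerprint the batch when CI creates it and refuse to deploy if the fingerprint no longer matches. The following is a minimal sketch of that idea; the batch layout (a directory of change files) and function names are illustrative assumptions, not something prescribed by this series.

```python
# Sketch: enforcing batch immutability with a content checksum.
# The directory-of-files batch layout is a hypothetical example.
import hashlib
from pathlib import Path


def batch_fingerprint(batch_dir: str) -> str:
    """Compute a deterministic SHA-256 over every file in a batch."""
    digest = hashlib.sha256()
    for path in sorted(Path(batch_dir).rglob("*")):
        if path.is_file():
            digest.update(path.name.encode())   # include file names
            digest.update(path.read_bytes())    # and file contents
    return digest.hexdigest()


def assert_unchanged(batch_dir: str, recorded: str) -> None:
    """Refuse to deploy a batch that differs from its CI-time fingerprint."""
    actual = batch_fingerprint(batch_dir)
    if actual != recorded:
        raise RuntimeError(
            f"Batch altered since creation (expected {recorded}, got {actual}). "
            "Abandon and replace the batch; do not edit it."
        )
```

The key design point is that the fingerprint is recorded once, at batch creation, and every downstream deployment verifies against it rather than trusting the artifact it received.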
The change application part of the process — where the changes are actually deployed to a target environment — also must be the same every time. This is an absolute statement, and it implies automation; only machines can deliver the required level of consistency. This means an engineered automation process that is able to do the following:
- Deal with the expected batch structure variations due to different types of database changes
- Handle target environment differences without requiring batch alterations or process modifications. For example, it will likely need a parameterization capability.
- Alert users when a batch has changed from its creation point
- Describe errors clearly when it encounters them
- Be easily updated when the process itself needs enhancement
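The second capability above, handling target environment differences without touching the batch, is usually achieved with parameterization: the batch carries placeholders, and an external per-environment configuration supplies the values at apply time. Here is a small sketch of that approach; the environment names, configuration keys, and `render_step` helper are hypothetical.

```python
# Sketch of environment parameterization: the batch stays byte-identical
# everywhere; only an external per-environment config varies.
# Environment names and keys below are illustrative assumptions.
import string

ENVIRONMENTS = {
    "dev":  {"db_host": "dev-db.internal",  "app_schema": "app_dev"},
    "prod": {"db_host": "prod-db.internal", "app_schema": "app"},
}


def render_step(template_sql: str, env: str) -> str:
    """Substitute environment values into a change script at apply time.

    A missing value fails immediately with a descriptive error rather than
    producing broken SQL, which supports the 'describe errors clearly' goal.
    """
    try:
        return string.Template(template_sql).substitute(ENVIRONMENTS[env])
    except KeyError as missing:
        raise RuntimeError(
            f"No value for placeholder {missing} in environment '{env}'"
        )
```

For example, the single batch step `ALTER TABLE ${app_schema}.orders ADD COLUMN region TEXT;` renders differently per environment while the batch itself is never edited.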
Once the process is under control and it is not subject to unexpected changes, you can systematically improve its reliability. This means that you build checks on the outcomes of the process. For database changes, some examples of these checks might be:
- Examining the post-deployment state of the database (i.e., after a batch has been applied) to ensure that it meets the expected outcome.
- Comparing the post-deployment state of a given database to that of another in the same pipeline when that second database had just received the same batch.
- Rerunning a given batch on a target database and verifying that nothing changes as a result (i.e., the process is idempotent).
These are, of course, only examples. Post-deployment checks for a given database or application system will vary with the nature of that database or application system.
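To make two of those checks concrete — comparing post-deployment state against a peer database, and verifying that a rerun changes nothing — here is a rough sketch. SQLite and its `sqlite_master` catalog stand in for whatever schema-inspection mechanism your actual database provides; the helper names are invented for illustration.

```python
# Sketch of post-deployment checks, using SQLite purely as a stand-in.
# snapshot_schema() would query your real database's catalog in practice.
import sqlite3


def snapshot_schema(conn: sqlite3.Connection) -> set:
    """Capture a comparable snapshot of the database's schema objects."""
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY name"
    ).fetchall()
    return set(rows)


def check_matches_peer(conn_a, conn_b) -> bool:
    """Post-deployment state should match a peer that received the same batch."""
    return snapshot_schema(conn_a) == snapshot_schema(conn_b)


def check_idempotent(conn, apply_batch) -> bool:
    """Re-applying the batch should leave the schema exactly as it was."""
    before = snapshot_schema(conn)
    apply_batch(conn)
    return snapshot_schema(conn) == before
```

Both checks reduce to the same primitive — a deterministic snapshot of schema state — which is why getting that snapshot right for your platform is the bulk of the work.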
A process with strong integrity and reliability is crucial for dealing with database changes and how they are delivered to a target environment. However, the effectiveness of the process will always be dependent on the quality and condition of the target environment. Maintaining a chain of databases so that they are synchronized with each other in a logical progression is a challenging topic. We will be looking into Trustworthy Database Environments in the next post.
Interested in more reliable and predictable database change for your organization? Contact us.