Keeping Test Data Management in Sync with Database Development

November 15, 2021

The key to smooth application delivery is aligning the processes of managing new changes to a pipeline of databases and ensuring that the test environments have appropriate datasets. In this blog, I’ll show you how using Liquibase and DATPROF together helps organizations take a strong step toward ensuring their teams are in sync.

Understanding Test Data Management

Test Data Management (TDM) is central to having trustworthy environments to effectively test applications, yet it’s a tricky practice to get right. Trustworthy testing environments require known datasets that represent both the good and bad data the application will see in the real Production environment. TDM is key to ensuring that data is present and in a state where it’s ready to run the test suite so that the results are truly representative and accurate. 

TDM is best described as a practice unto itself because, in addition to managing known datasets, it must deal with privacy and regulatory concerns around data handling. This requires diligence and bespoke tools such as DATPROF that provide teams the capabilities they need to handle the full set of issues around managing test datasets. These can be broken into three main sets of functionality:

  • Data Masking & Synthetic Data Generation – Teams must ensure sensitive data is properly redacted and obfuscated such that nothing is inappropriately disclosed while simultaneously maintaining the datasets at production quality to ensure test accuracy. Additional synthetic test data that is not present in production should be easily generated and added to the test data set. DATPROF solves this with their DATPROF Privacy product.
  • Data Subsetting – It is impractical to maintain a production-size dataset for high-frequency testing. To begin, it is unnecessary – the tests only require representative data. Further, large datasets are unwieldy and can take longer to provision or reset an environment. Finally, maintaining many large datasets incurs unnecessary storage costs. This can be handled by DATPROF Subset.
  • Automation – Central to any DevOps discussion is automation – removing human activity in handling repeated activities. For TDM, this shows up in the need for things like self-service provisioning or refresh of environments, upgrading environments based on incremental schema and dataset changes, and inspecting environments for unexpected differences. DATPROF provides an automation framework in the form of their DATPROF Runtime tool.

Liquibase and TDM

The hard part is keeping the TDM dataset ready to go when the application team is rapidly releasing new features, because those features often require revisions and updates to the database structure. This means the TDM practice must be tightly integrated with the database change management practices so that the two stay synchronized and changes keep flowing smoothly. If they are not synchronized, the TDM team is likely to become a bottleneck in delivering new features.

Synchronizing the database changes from the application team with the activity of the TDM team requires three main things:

  1. Change visibility: Developers (or those creating changes) need to define the changes so they are clearly visible to downstream teams, such as TDM
  2. Consuming changes: The TDM team should be able to consume those changes with automation
  3. Inspecting unexpected changes: The entire team needs the ability to identify if any manual, out-of-band changes are made in any environment

Change Visibility

The easiest way to detect new changes is to do so proactively. Using the DevOps principle of ‘shift left’, it’s easy to catch changes as they are created. For Liquibase, this visibility is intrinsic: database developers add new items to a changelog that is then stored in a shared space, such as a source control system (e.g., Git).

Once the changelog is shared, the team can use standard automation tooling to inspect the changelog for new changes. The automated check can be performed in two ways:

  • Comparing the changelog to a prior version of itself to identify new additions.
  • Testing the changelog against a target database to see if there are undeployed changes. 

If either of these checks results in ‘new changes found’, the automation tool can send notifications to the broader team – including the TDM team. By integrating a proactive alert into the process, the TDM team can address any gaps in the dataset sooner, without altering or disrupting the flow of changes in any way.
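To illustrate the first check, here is a minimal Python sketch that compares two versions of a formatted-SQL changelog and reports newly added changesets. The changelog contents and names are hypothetical, and a real pipeline might instead diff the files in Git or work with XML changelogs:

```python
import re

# Matches the "--changeset author:id" headers of a formatted-SQL changelog.
CHANGESET_RE = re.compile(r"--changeset\s+(\S+:\S+)", re.IGNORECASE)

def changeset_ids(changelog_text: str) -> list[str]:
    """Extract author:id pairs from a formatted-SQL changelog."""
    return CHANGESET_RE.findall(changelog_text)

def new_changesets(previous: str, current: str) -> list[str]:
    """Return changesets present in the current changelog but not the prior version."""
    seen = set(changeset_ids(previous))
    return [cs for cs in changeset_ids(current) if cs not in seen]

old = """--liquibase formatted sql
--changeset jsmith:42
ALTER TABLE customer ADD loyalty_tier VARCHAR(20);
"""
new = old + """--changeset jsmith:43
CREATE INDEX idx_customer_tier ON customer (loyalty_tier);
"""

print(new_changesets(old, new))  # ['jsmith:43']
```

An automation job could run this on each commit and, when the returned list is non-empty, send the notification described above.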

Consuming Changes

The TDM team should be able to consume new changes without any special action or friction — meaning changes should be immediately deployable to a database so the team can begin sorting out what is needed for a new dataset. Liquibase is very handy for TDM teams because it always applies incremental changes relative to the current state of the database.

For example, most TDM tools (such as DATPROF Runtime) have a self-service or automated provisioning capability. These tools can reset test environments on-demand to a known point based on a snapshot. However, there is often a period where the latest changes are not in the snapshot, which can cause delays. The solution is to have DATPROF Runtime call Liquibase immediately following the provisioning or reset of an environment to ensure the latest changes are present. This solution requires no special configuration or integration of the tools — it simply exploits their native behaviors to ensure the test teams have exactly the database changes they are expecting.
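As a sketch, the post-provision step could be a small script that DATPROF Runtime invokes after each reset; the connection details, changelog path, and environment variables below are placeholders, not a prescribed configuration:

```shell
#!/bin/sh
# Post-provision hook: bring the freshly reset test database up to date.
# Liquibase applies only the changesets not yet recorded in the target,
# so this is safe to run after every reset.
liquibase \
  --changelog-file=changelog.sql \
  --url="jdbc:postgresql://test-db:5432/app" \
  --username="$LB_USER" \
  --password="$LB_PASS" \
  update
```

Because `liquibase update` is idempotent with respect to already-deployed changesets, the hook adds no risk when the snapshot is already current.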

Inspecting for Unexpected Changes

Finally, there is always the case where changes get added manually or in an unofficial way. These out-of-band changes cause problems for all teams involved. They happen for a number of reasons and must be handled. Fortunately, Liquibase can inspect databases for these types of changes in several ways. 

Liquibase has a diff command that allows it to compare any given target database to either another database or a snapshot of a database (created with Liquibase’s snapshot command). This is a simple check that is easy to automate. Ideally, there should never be unexpected changes in a database, but if there are, Liquibase will report them in a way that is easy for an automation tool to parse. 
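As a sketch of how an automation tool might consume that report, here is a small Python parser for the plain-text diff output. The exact report format varies by Liquibase version, so treat the ‘Unexpected ...(s):’ section headings as an assumption:

```python
def unexpected_objects(diff_report: str) -> dict[str, list[str]]:
    """Collect items listed under 'Unexpected ...(s):' headings in a
    plain-text Liquibase diff report (format may vary by version)."""
    found: dict[str, list[str]] = {}
    section = None
    for line in diff_report.splitlines():
        stripped = line.rstrip()
        if line.startswith("Unexpected") and stripped.endswith(":"):
            # Start of a section that lists unexpected objects below it.
            section = stripped.rstrip(":")
            found[section] = []
        elif line.startswith("Unexpected") and stripped.endswith("NONE"):
            section = None  # Section explicitly reports nothing unexpected.
        elif section and line.startswith(" "):
            found[section].append(line.strip())  # Indented object name.
        else:
            section = None
    return {k: v for k, v in found.items() if v}

report = """Missing Table(s): NONE
Unexpected Table(s):
     AUDIT_TEMP
Changed Table(s): NONE
"""
print(unexpected_objects(report))  # {'Unexpected Table(s)': ['AUDIT_TEMP']}
```

A non-empty result would mean out-of-band changes were found, which is the trigger for the notification patterns described next.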

This enables several useful patterns. The simplest pattern is where an automation tool, such as DATPROF Runtime, executes Liquibase diff on some interval (e.g., once every couple of hours) against a key database, looks for unexpected changes, and then notifies the TDM team if any are found. More advanced versions might integrate the inspection as a guardrail in the CI/CD process itself, serving as a check in the deployment workflow.

Better Together

Tight alignment between the processes of managing new changes to a pipeline of databases and ensuring that the test environments have appropriate datasets is key to smooth application delivery. Team productivity increases when there is minimal friction between the two. Using tools like Liquibase and DATPROF together helps organizations take a strong step toward ensuring teams are aligned. Learn more about how Liquibase and DATPROF work together.

ARTICLE AUTHOR

Dan Zentgraf
Director of Solutions Architecture at Liquibase

Dan is a technology professional focused on bringing DevOps practices to database teams. He has worked in the IT space for over 20 years, serving in Product Management, Agile/DevOps consulting & coaching, and Solution Engineering.