Database Design Best Practices
Behind every good application is good database design. At its simplest, a database holds data. The devil is in the details of how the data is written, stored, and retrieved. With microservices, containers, and cloud considerations, the basics are still important. In fact, they are probably more important than ever.
Use the right tool(s) for the job
I’ll never forget the day my son discovered the thrill of using a hammer. He was a big fan of Bob the Builder and received a plastic set of tools for his birthday. He went for the hammer first. You’ve probably heard “when you have a hammer, everything looks like a nail”. My son saw everything as something that needed to be hammered.
The same thing is true with databases. Many DBAs and developers have a favorite data store and use it as a hammer for everything, even if it’s not necessarily the best database model for your use case.
The database is at the heart of every project. It’s important to take the time to map out the needs of the project and how the database is going to meet them. If you don’t take the time to get the database design right now, you’ll find that any substantial changes in database structure you’ll need to make in the future could have a huge negative impact on the whole project.
It’s so easy to get in the mindset of “just get it done”. Then there is “no time” to go back and fix issues that come up properly. That’s when the hacking starts. Everyone says they’ll go back and fix it later and it never happens.
There are 344 databases currently being ranked at DB-engines. There are a lot of options out there: document stores, time series, key-value, graph, the list goes on and on. It’s very important to not limit your project or application by using just one or two databases because you’re in a hurry. Don’t force a relational model when using something like Elasticsearch makes way more sense.
Side note: You can avoid database lock-in by using a tool for database migrations that works for many types of databases (like Liquibase).
Normalize and/or optimize for queries
Whether you’re using relational databases or NoSQL databases, relationships are important. In either case, you will need to think about how to set up your database, analyzing and prioritizing based on expected queries. For example, let’s say you have a database of books and their authors. Do you organize your database by book where each book contains a subfield with a list of that book’s authors? Or do you do the reverse, storing one document per author with a subfield list of that author’s books? Either way you organize the data, it makes one type of query more efficient.
It’s up to you to know your application and which type of queries will work best. Once you know that, you can design a database to support those queries. Optimizing your databases this way makes development easier and enhances database performance.
Create naming standards
While there are some general rules you should try to follow, the rules themselves don’t matter here as much as writing them down and making them consistent. No one enjoys playing the “I think this is what this means” game. Especially when it comes to data. The way data is interpreted completely revolves around names used here.
For example, database name “T_SLS_TM”. What is that? Remember that you’re not naming a start up in 2010, so you don’t have to drop vowels. It’s much more important to remove ambiguity than vowels to save space.
I crowd-sourced some naming tips from some very smart DBAs here at Datical. Here are their thoughts:
- Make all names meaningful. Tables, columns, objects. All of it should be meaningful and unique.
- Don’t allow the database to auto-generate names. You’ll end up with “SYS$1020000345”. Not helpful.
- Only use ASCII characters. Otherwise, you’ll get errors down the road.
- Decide on whether or not you’ll use singular or plural for table names. For example, should you call the table that stores all of your user information “Users” table or “User” table? If you choose “Users”, make sure to make the other tables plural form also (“Profiles”, “Case Numbers”, etc.).
- Don’t use case-sensitive object names. It will just cause problems in the future.
- Objects should correspond with table names. Outside of “t_” prepended to name, they should match identically the object name. So, if object name in the code is Customer, table should be t_Customer.
- If you’re going to use stored procedures instead of a web service, the name should be something like “getCustomerNameByAddress” or “getCustomerAddressByName”. For writing, “updateCustomerAddress” or “updateCustomerZipCode” should be used.
- If you’re going to use views instead of a web service, prepend with “v_” to differentiate from tables.
- Naming anything with `” ‘` is just icky.
Equally important to good naming standards, is good communication and documentation. By carefully naming your objects, views, and columns, everyone will understand what your database is modeling, but this isn’t enough. If no one knows you have standards (or if they are too hard to find or use) they won’t be followed.
Developers and DBAs aren’t known to have the best relationship, but it is getting better. DevOps is helping the silos come down a bit. One way to improve this further is through good culture and communication. Make sure the most junior developer can follow what’s going on. The format should be easy to use and the information available and up to date.
Understand your automation options
There are new tools available that help make it easy for developers to receive instant feedback their database code and allow them to check their code into source control along with their application code. Liquibase Dynamic Rules Engine automates your organization’s database rules so that they are applied across the board, freeing up their valuable time and making developers more productive.
Manually reviewing database code is tedious work and introduces human error. With today’s software release cadence, it makes sense to automate reviews as much as possible.
Summing it up
It’s important to follow database design best practices from the planning phase all the way through to deployment so your organization avoids downtime and database bottleneck that can delay application releases. We just scratched the surface on some best practices here (there are volumes of blogs and books on the subject). The best teacher by far is experience. Just remember that automation can be your friend and also, don’t GRANT “anything” TO PUBLIC.
Automate BigQuery schema change and version control with database DevOps
Google's BigQuery is a fully managed, serverless cloud data warehouse, or database as a service (DBaaS), that brings unparalleled scalability and convenience to data analytics.