Are you suffering from database problems?
Fear not, you’re not alone. A lot of companies are suffering from overburdened database servers. This causes frustration and will ultimately cause you loss of money because you cannot service customers and orders fast enough.
What’s the cause?
When you experience database performance problems, you will typically look at problem areas such as:
- Poor use of indexes
- Poor database configuration
- Scalability problems due to access patterns
- Naïve use of database triggers
- Database deadlocks due to transaction boundaries that overlap
While this might get your database to perform better, the performance problems are often a symptom of deeper problems, such as:
- Inappropriate data models
- The database is trying to be all things to everybody
- All business areas are held in the same database
- You use your database for things that might be kept better elsewhere
- Still believing that you have to have one common model for everything
When you have a hammer it’s easy to see all problems as nails.
Investigating the root cause of the problem
Sometimes a performance problem can be fixed by optimizing queries, adding indexes, etc. Performing these optimization is a challenging tasks that typically requires experts in your particular brand of database.
Often performance problems have a much deeper cause than just missing indexes and bad query plans. If you don’t investigate the source of your problem, then just optimizing indexes/etc. equates putting a band-aid on a broken leg. It might look like you’ve solved the problem, but you’re just covering up symptoms of a larger problem.
Often the source of the problem lies in how the database is structured and how it is being used by its client applications. To get to the root of the problem you have to analyze data access patterns & transactional-boundaries and how they affect your data models and the inherent scalability of the solution.
Examples of common causes of database problems
One big relational model
Relational databases are really good at creating a consistent and logical representation for certain types of information (through data constraints and normalization).
This is very nice for certain use cases, such as creating or updating data, where we want to be sure that data is stored optimally (e.g. 3rd normal form) and invalid data doesn’t occur.
One size fits all
The downside of this is that queries typically needs to use several levels of JOINS and/or UNIONS to create a combined data view that fits a web service or web view. Worst case you can end up with queries the can fill an entire white board…
The challenge with one model for both queries, reporting and update (reads & writes) is that the model needs to handle to a lot of different concerns. What should work best: reads or writes?
We can easily end up with something that works poorly for both reads and writes or which favors one at the cost of the other. Handling this requires a different architectural approach such as CQRS.
Are your data relational?
When you have a hammer it’s easy to see all problems as nails. Many companies automatically chose to use a Relational database for data persistence because it’s what they’re accustomed to use.
If your data is a good match for a relational model then that’s a good choice. However experience has proved to us that a lot of business data typically gets shoe-horned into a relational data model because it was the only choice for data persistence, not because it was the best option. With NoSQL databases you can get to chose the right way to store your data being it Key/Value pair, Column Oriented models, Documents or Graphs.
Size DOES matter
Another problem arises when models grow too big. This is typically caused by missing Boundary and Transactional analysis, which results in a system/data-model that takes on too many tasks/concerns. The bigger the model, the harder it is to understand. A big model also tends to have problems with many cross entity relations, which increases the risk of transaction deadlocks. It is also much harder to evolve and scale such a system due to the ACID properties of a relational database (true scalability requires relaxed constraints and consistency, which goes against most relational databases).
To ensure proper system/data boundaries we need to look at how the system works with its data and what transactional boundaries are at play. This is where Analysis and Design methods from Domain Driven Design (DDD) comes in.
One database cluster server multiple Applications Servers
It’s not uncommon to see a scaling scenario, such as this with several applications servers and one or two active database servers, for handling ALL the data and queries of the application.
A variation of this solution is to introduce an in-memory cache, e.g. MemCached, which will serve the role of the slave databases. With the introduction of e.g. MemCached, is that you now need to figure out a synchronization mechanism for writes to the database, so you can populate your in-memory cache with the right data. One way to approach such a synchronization mechanism would be to use Events. EventSourcing in combination with CQRS is one way to do this.
Need more scalability than your relational database allows?
Relational databases are perfect for certain types of applications. For other types of applications that need to provide linear scalability (double the number of servers and get double the performance and storage) or 99,999% availability you need to approach data storage differently. With distributed systems you need to think in terms of the CAP theorem which means you need to decide between instant data consistency and high availability in case of system/network failures. Certain NoSQL databases supports such scenarios much better than your typical Relational Database.
Is all lost?
No, it is not all lost. CQRS, Bounded Contexts and performing a good Domain Analysis are just some of the ways to tackle the root of the problems and not just handling the symptoms. The best part is that can achieve better performance and scalability without having to rewrite your entire application.