Database Performance Problems

Database Performance Problems

Are you suffering from database problems?

Fear not, you’re not alone. A lot of companies are suffering from overburdened database servers. This causes frustration and will ultimately cause you loss of money because you cannot service customers and orders fast enough.

What’s the cause?

When you experience database performance problems, you will typically look at problem areas such as:

  • Poor use of indexes
  • Poor database configuration
  • Scalability problems due to access patterns
  • Naïve use of database triggers
  • Database deadlocks due to transaction boundaries that overlap

While this might get your database to perform better, the performance problems are often a symptom of deeper problems, such as:

  • Inappropriate data models
  • The database is trying to be all things to everybody
  • All business areas are held in the same database
  • You use your database for things that might be kept better elsewhere
  • Still believing that you have to have one common model for everything
When you have a hammer it’s easy to see all problems as nails.

TigerTeam can help you get to the root cause of your database performance problems by doing a full analysis of how the database is used by its client application(s).

From this knowledge we can help you shape your application and database to get the best out of both without having to perform an expensive rewrite.

Investigating the root cause of the problem

Sometimes a performance problem can be fixed by optimizing queries, adding indexes, etc. Performing these optimization is a challenging tasks that typically requires experts in your particular brand of database.

Often performance problems have a much deeper cause than just missing indexes and bad query plans. If you don’t investigate the source of your problem, then just optimizing indexes/etc. equates putting a band-aid on a broken leg. It might look like you’ve solved the problem, but you’re just covering up symptoms of a larger problem.

Are you only curing the symptoms?

Often the source of the problem lies in how the database is structured and how it is being used by its client applications. To get to the root of the problem you have to analyze data access patterns & transactional-boundaries and how they affect your data models and the inherent scalability of the solution.

Examples of common causes of database problems

One big relational model

One model to rule them all?

Relational databases are really good at creating a consistent and logical representation for certain types of information (through data constraints and normalization).

This is very nice for certain use cases, such as creating or updating data, where we want to be sure that data is stored optimally (e.g. 3rd normal form) and invalid data doesn’t occur.

One size fits all

The Query of Despair

The downside of this is that queries typically needs to use several levels of JOINS and/or UNIONS to create a combined data view that fits a web service or web view. Worst case you can end up with queries the can fill an entire white board…

The challenge with one model for both queries, reporting and update (reads & writes) is that the model needs to handle to a lot of different concerns. What should work best: reads or writes?

We can easily end up with something that works poorly for both reads and writes or which favors one at the cost of the other. Handling this requires a different architectural approach such as CQRS.

Are your data relational?

When you have a hammer it’s easy to see all problems as nails. Many companies automatically chose to use a Relational database for data persistence because it’s what they’re accustomed to use.
If your data is a good match for a relational model then that’s a good choice. However experience has proved to us that a lot of business data typically gets shoe-horned into a relational data model because it was the only choice for data persistence, not because it was the best option. With NoSQL databases you can get to chose the right way to store your data being it Key/Value pair, Column Oriented models, Documents or Graphs.

Size DOES matter

Another problem arises when models grow too big. This is typically caused by missing Boundary and Transactional analysis, which results in a system/data-model that takes on too many tasks/concerns. The bigger the model, the harder it is to understand. A big model also tends to have problems with many cross entity relations, which increases the risk of transaction deadlocks. It is also much harder to evolve and scale such a system due to the ACID properties of a relational database (true scalability requires relaxed constraints and consistency, which goes against most relational databases).

To ensure proper system/data boundaries we need to look at how the system works with its data and what transactional boundaries are at play. This is where Analysis and Design methods from Domain Driven Design (DDD) comes in.

One database cluster server multiple Applications Servers

A problem with many database systems and databases is that they can be very hard or expensive to scale beyond the classic 2 server cluster with one master server and a single hot standby server.
It’s not uncommon to see a scaling scenario, such as this with several applications servers and one or two active database servers, for handling ALL the data and queries of the application.
A burning hot database cluster
A common, but also very expensive, solution to this problem is to scale up and buy a much bigger database server with more memory and more CPUs. For some database systems this is not a problem, but for those that charge per CPU this can become a big financial burden in license costs alone. Availability of the system is not improved by this solution as we typically still have the same limited number of database servers (typically 2)
Buy a bigger database Server
Another option is to setup a database Master/Slave system, where you only write to the Master but can read from all Slaves. These solutions typically work by log shipping, which means that the Slave databases will lack behind the Master (also known as Eventual Consistency). This is typically not a problem if you design your solution for it, but if you try to retro fit a Master/Slave solution onto an unknowing application you can experience some unpleasant surprises. The major issue is the inability to read changes you just wrote to the database (due to Slaves lacking behind the Master, so updates written to the Master might not have made it onto the Slave you are reading from). Setting up Master/Slave with failover and new Master selection (in case the cluster determines the master is down) requires skills and at the application level the ability to handle something called split brain.

A variation of this solution is to introduce an in-memory cache, e.g. MemCached, which will serve the role of the slave databases. With the introduction of e.g. MemCached, is that you now need to figure out a synchronization mechanism for writes to the database, so you can populate your in-memory cache with the right data. One way to approach such a synchronization mechanism would be to use Events. EventSourcing in combination with CQRS is one way to do this.

Need more scalability than your relational database allows?

Relational databases are perfect for certain types of applications. For other types of applications that need to provide linear scalability (double the number of servers and get double the performance and storage) or 99,999% availability you need to approach data storage differently. With distributed systems you need to think in terms of the CAP theorem which means you need to decide between instant data consistency and high availability in case of system/network failures. Certain NoSQL databases supports such scenarios much better than your typical Relational Database.

Is all lost?

No, it is not all lost. CQRS, Bounded Contexts and performing a good Domain Analysis are just some of the ways to tackle the root of the problems and not just handling the symptoms. The best part is that can achieve better performance and scalability without having to rewrite your entire application.


No comments yet.