Benchmarking throughput and scalability on Sql Azure Federations

January 25th, 2012 Add Your Comments

Recently I have developed a benchmark for testing the number of queries per second that Sql Azure with its recently launched Federations feature can provide. In this article I will present the methodology I have used for benchmarking, the benchmark results as well as a few personal considerations.

Purpose of the benchmark and disclaimer

There were two main reasons for creating the benchmark whose results are being presented in this article. The first one was that I was curious to see the federations in action and get a firsthand experience on how well they can scale the load on a database.

My second reason was to gather data that would help approximate the amount of resources needed to support different load amounts. As a solution provider you are often asked questions like “what would it take to make our application support X number of users per day?” and while the Azure platform is all about elasticity and about being able to add or remove resources as needed, it would still be nice to be able to provide a very rough approximate figure on the infrastructure an application would need for a requested load.

That being said, please bear in mind that the results published in this article are not intended to be used as a reference but rather as informative figures. Use them only to get an overview. For any critical application make sure you are properly researching and testing before proposing and implementing a solution.

Benchmark and test methodology

The benchmark measures the queries per second (qps) that a database can provide. All queries are represented by a single select operation, selecting just one row for the database. This is in slight contrast with some other benchmarks that are measuring the number of items that a database can read per second, but for that they are retrieving multiple rows in each select operation.

Below is a representation of the infrastructure I have been using:

Benchmark infrastructure

The test database contained only one table, with the structure below:

Column name Column type
Id int primary key
FirstName nvarchar(100)
LastName nvarchar(100)
Age int
Comments nvarchar(max)

The table only contained one index which was the default one created by Sql Azure on the primary key. The table was populated with 500k (500000) rows. The Comments field was filled with a 1024 characters text meaning that the rough size for each row is a just above 2KB.

During the tests the database was split in a variable number of federations which will be detailed when presenting the benchmark results. Sql Azure web edition databases have been used during the tests, all of them with the size limited to 1GB, the smallest size you can purchase, which at the time I am writing this article costs only 9.99$ / month, as described here.

The benchmark execution instances were all part of a single Windows Azure Web Role. For the tests presented in this article only small instances machines have been used, however their number has been increased or decreased during the various benchmark runs.

Each execution instance had the job to execute 100k (100000) select queries and return the time it took to complete the operation. Each select operation retrieved only one row which was selected by its ID, using the query below:

select id, firstName, lastName, age, comments from Person where id = @id

In order to avoid any sort of result caching from Sql Azure, each of the 100k queries was set to select a different row. Each execution instance had range of IDs from which it would make the selects (like 1…100000, 2…200000 etc.), so that at database level queries would be well distributed on the entire table.

Each execution instance used 25 threads to perform the 100k queries, with an equal number of queries distributed between all the threads.

All queries were being issued and their data retrieved by the means of the DataReader class. All the queries retrieved the entire data for each row (that is approximately 2KB), but that data was not deserialized, since that operation isn’t relevant in measuring a database’s throughput. Below is the code that has been used:

cmd = conn.CreateCommand();
cmd.CommandText = @"select id, firstName, lastName, age, comments from Person where id = @id";
cmd.Parameters.Add(new SqlParameter() { ParameterName = "@id", DbType = DbType.Int32, Value = id });
reader = cmd.ExecuteReader();
reader.Read();

result = reader["id"] as Nullable<int>;
return result;

The benchmark master had the role of running the benchmarks on multiple execution instances at the same time and compute the global result. The communication between benchmark master and the benchmark instances was being made through SOAP web services. On each benchmark run there would be a single call between the master and each of the involved machines, making the master very loosely coupled with the instances.

Considering this you might wonder why I needed the master as a machine inside Azure and not use my PC directly. The reason is because from a PC outside Azure you cannot call each of the instances individually. You can only access the Web Role as whole and let the balancer choose which machine would serve the request. Windows Azure is using a Round Robin balancer, which in theory would take care that an equal number of requests get to each machine, but in real life I found out that there are cases when the same machine would be called in two or more consecutive cases.

Since inside Azure, each machine has a local IP, the solution was to create benchmark execution master inside Azure and let it call each of the machines directly by using the local IP.

The benchmark execution master and all the execution instances were all part of the same Azure Web Role. The Web Role and the database were all part of the same Azure datacenter, the one for West Europe.

Benchmark results

This section presents the benchmark results. Please note that for each result displayed there were at least 3 benchmark runs, the displayed result being an average of each run.

 Not federated database

This first test was made on a standard Sql Azure database, without any federation. The database was made of a single table with the structure and number of rows as described above.

Benchmark infrastructure

 Federated database, 2 federations

This test was made on database with an identical structure as above, but with the test table split between two federations. The split was made at row 200001. The test involving a single machine was hitting a single federation, so it is not very relevant since it behaved much like the test on the not federated database. All the other tests had the number of queries equally distributed between the two federations, which in real life would be an ideal situation.

Benchmark infrastructure

Important performance note: My initial way of making the database calls was by creating a new SqlConnection object for each call. However, on a federated database, each time a connection is created, the federation selection also needs to be made, like in the code below:

federationCommand.CommandText = String.Format("USE FEDERATION PersonFederation (FederationId = {0}) WITH RESET, FILTERING = OFF", ID);
federationCommand.ExecuteNonQuery();

The problem is that the above command is quite costly, since in my first test I was only able to obtain 2338 queries per second using 8 executing instances. That is more than 2 times lower than a single testing instance using a not federated database! After that disappointing result, I decided to use a single SqlConnection object for each thread in each executing instance and the results got a lot better. All the results published in this article have been using this approach.

Cihan Biyikoglu, Program Manager in SQL Azure, has recently blogged about the way the USE FEDERATION command is working. As it turns out it is being handled by a layer outside the database, which has multiple nodes and is load balanced. Unfortunately, as my tests revealed, it would be best to reuse connections as often as possible to avoid reselecting the federation.

 Federated database, 4 federations

This test was made on the same database as above, but split in 4 federations. The splits were made at rows 100001, 200001 and 300001. All the tests had the number of queries equally distributed between the four federations, which in real life would be an ideal situation.

Benchmark infrastructure

 Federated database, 4 federations, simple selects

This test was made in identical conditions as the above, with the single difference that the select query didn’t include the Comments column (the one which was 2KB on each row):

select id, firstName, lastName, age from FederatedPerson where id = @id

Because the Comments field represented most of the size of each row, the returned data is now much smaller (somewhere at 60-70 bytes / row).

Benchmark infrastructure

Since each federation is in fact a standalone Sql Azure instance, as long as the queries only hit individual federations, a federated database should scale linearly with the number of federations. Considering this and the results above, we can estimate that a Sql Azure database with 10 federations would be able to easily deliver 100k (100000) queries per second (for simple queries like above). If each federation is smaller in size than 1GB, you can have a database delivering that power for 110$ / month (10$ required for the root database).

Some thoughts on the benchmark results

As mentioned above, each benchmark was run for at least 3 times and the published result is an average for all the runs. During the individual runs I noticed differences in the results as high as 50%, though usually they stayed below 15%. The reason for that is that, as in any virtual infrastructure, the performance of each machine is influenced by the overall load on the physical machine on which it is hosted.

As Web or Worker role machines are concerned, processor, memory and bandwidth are allocated in advance and they should perform quite predictably. When it comes to Sql Azure instances though, things become a bit more complicated, as those instances are in fact running on shared database servers. In this case there is no predefined resource quota that each instance will be using on a server, so from time to time each individual instance might get affected by the current load on all the other instances with which it is sharing the same physical machine. However, I expect the variations in performance to be between average levels.

Comments are closed.