SQL Server I/O Performance


SQL Server I/O performance is crucial to overall performance. Access to data on disk is much slower than from memory, so getting the most out of local disk and SAN is essential.

There is a lot of advice on the web and in books about SQL Server I/O performance, but I haven't found a single source listing everything to consider. This is my attempt to bring all the information together in one place.

So here is a list of everything I can think of that can impact I/O performance. I have ordered it starting at the physical disks and moving up the wire to the server and finally the code and database schema.

Failed Disk

When a drive fails in a disk array it will need to be replaced. The impact on performance before replacement depends on the storage array and RAID configuration used.

RAID 5 and RAID 6 use distributed parity, and when a disk fails that parity is used to reconstruct reads for data on the missing disk. Read performance suffers, losing the advantage of spreading reads across multiple disks.

This is also true, although to a lesser degree, on RAID 1 (mirrored) arrays. Reads of data on the failed disk lose the advantage of being served from both sides of the mirror, and writes may be slightly slower due to the increase in average seek time.

Write Cache

When a transaction is committed, the write to the transaction log has to complete before the transaction is marked as being committed. This is essential to ensure transactional integrity.

It used to be that write cache was not recommended, but a lot of the latest storage arrays have battery-backed caches that are fully certified for use with SQL Server.

If you have the option to vary the distribution of memory between read and write cache, try to allocate as much as possible to the write cache. This is because SQL Server performs its own read caching via the buffer pool, so any additional read cache on the disk controller has no benefit.

Description of caching disk controllers in SQL Server

Thin Provisioning

Thin provisioning is a technology provided by some SANs whereby the actual disk storage used is just enough for the data, while appearing to the server to be full sized, with loads of free space.

Where the total disk allocated to all servers exceeds the amount of physical storage, this is known as over-provisioning.

Some SAN vendors try to claim that performance is not affected, but that's not always true.

I saw this issue recently on a 3PAR array. Sequential reads were significantly slower on thin provisioned LUNs. Switching to thick provisioned LUNs more than doubled the sequential read throughput.

Where Are The Disks?

Are they where you think they are?

It is perfectly possible to be connected to a storage array, but for the I/O requests to pass through that array to another. This is sometimes done as a cheap way to increase disk space - using existing hardware that is being underutilized is less costly than purchasing more disks.

The trouble is that this introduces yet another component into the path and is detrimental to performance - and the DBA may not even be aware of it.

Make sure you know how the SAN is configured.

Smart Tiering

This is called different things by different vendors. The storage array will consist of two or more types of disk, of varying performance and cost.

There are the slower 10K disks - these are the cheapest.

Then you have the 15K disks. These are faster but more expensive.

And then there may be some super-fast SSDs. These are even more expensive, although the price is coming down.

Smart tiering migrates data between tiers so that more commonly accessed data is on the faster storage while less commonly used data drops down to the slower storage.

This is OK in principle, but you are the DBA. You should already know which data needs to be accessed quickly and which can be slower. Do you really want an algorithm making this decision for you?

And regular maintenance tasks can confuse the whole thing anyway. Consider a load of index rebuilds running overnight. Let's suppose the last database to be processed is an archive database - do you want it hogging the SSD when the users log in first thing in the morning, while the mission critical database is languishing down in the bottom tier?

This is an oversimplification, of course. The tiering algorithms are more sophisticated than that, but my point stands. You should decide the priorities for your SQL Server data. Don't let the SAN vendors (or storage admins) persuade you otherwise.

Storage Level Replication

Storage level replication is a disaster recovery feature that copies block level data from the primary SAN to another - often located in a separate data center.

The SAN vendors claim no impact on performance, and this is true if correctly configured. But I have seen poorly configured replication have a serious impact on performance.

One client suffered a couple of years of poor I/O performance. When I joined them I questioned whether the storage replication was responsible. I was told not to be so silly - the vendor had checked and it was not the problem - it must be SQL Server itself! A few months later I was contacted again - they had turned off the replication while moving to a new data center and guess what? Write latency improved by an order of magnitude.

Let me repeat that this was caused by poor configuration and most storage replication does not noticeably affect performance. But it’s another thing to consider if you're struggling with SQL Server I/O performance.

Host Bus Adapters

Check that the SAN and HBA firmware are compatible. Sometimes when a SAN is upgraded, the HBAs on the servers are overlooked. This can result in irregular errors, or even make the storage inaccessible.

Have a look at the HBA queue depth. A common default is 32, which may not be optimal. Some studies have shown that increasing this to 64 or higher can improve performance. It could also make things worse, depending on workload, SAN make and model, disk layout, etc. So test thoroughly if you can.

Some storage admins discourage modifying HBA queue depth as they think everyone will want the same on their servers and the storage array will be swamped. And they’re right, too! Persuade them that it is just for you. Promise not to tell anyone else. Whatever. Just get your extra queue depth if you think it will benefit performance.

Too Many Servers

When a company forks out a small fortune on a storage area network, they want to get value for money. So naturally, every new server that comes along gets hooked up so it can make use of all that lovely disk space.

This is fine until a couple of servers start issuing a lot of I/O requests and other users complain of a performance slowdown.

This is something I see repeatedly at clients, and there is no easy solution. The company doesn't want to, or can't afford to, purchase another SAN. If you think this is a problem for you, put together a schedule of all jobs - across all servers - and try to reschedule some so that the workload is distributed more evenly.

Partition Alignment and Formatting

I will briefly mention partition alignment, although Windows Server 2008 uses a default offset of 1MB so this is less of an issue than it used to be.

I am also not convinced that a lot of modern SANs benefit much from the practice. I performed a test on an EVA a few years ago and found just a 2% improvement.

Nevertheless, a few percent is still worth having. Unfortunately, you will have to tear down your volumes and recreate your partitions to fix this on an existing system, which is probably not worth the hassle unless you are chasing every last bit of performance.

For more information see Disk Partition Alignment Best Practices for SQL Server.

Formatting is something else that should be performed correctly. SQL Server stores data in 8KB pages, but these are retrieved in blocks of eight pages, called extents, which are 64KB in total. Formatting the disks with a 64KB allocation unit size matches the extent size and can have a significant performance benefit.

Multipathing

If you are not using local disk then you should have some redundancy built into your storage subsystem.

If you have a SAN you have a complicated network of HBAs, fabric, switches and controllers between SQL Server and the disks. There should be at least two HBAs, switches, etc., and these should all be connected together in such a way that there are multiple paths to the disks.

This redundancy is primarily for high availability, but if the multipathing has been configured as active/active you may see performance benefits as well.

SAN Multipathing Part 1: What are Paths?

Network Attached Storage

Since SQL Server 2008 R2 it has been possible to create, restore or attach a database on a file share. This has a number of possible uses, and particularly for dev/test environments it can make capacity management easier, and make moving databases between servers much quicker.

The question to be asked, though, is "Do you really want this in production?"

Performance will not be as good as local or SAN drives. There are additional components in the chain, so reliability may not be as good. And by using the network, your data uses the same infrastructure as all the other TCP/IP traffic, which again could impact performance.

But there's good news! While availability is still a worry, improvements in SMB in Windows Server 2012 (and via an update to Windows Server 2008 R2) have made it significantly faster. I saw a quote from a Microsoft employee somewhere that claimed 97% of the performance of local storage. I can't find the quote now, and I don't remember if he was measuring latency or throughput.

Disk Fragmentation

How often do you use the Disk Defragmenter tool on your PC to analyze and defragment your C: drive?

How often do you check fragmentation on the disks on your SQL Servers? For most people that is nowhere near as often, I'll bet.

Yet volume fragmentation is just as detrimental to SQL Server performance as it is to your PC.

You can reduce the likelihood of disk fragmentation in a number of ways:

  • Pre-size data and log files, rather than rely on auto-growth;
  • Set auto-growth increments to sensible values instead of the default 10%;
  • Avoid shrinking data and log files;
  • Never, ever use the autoshrink database option;
  • Ensure disks are dedicated to SQL Server and not shared with other applications;

You can check fragmentation using the same tool as on your PC. Disk Defragmenter is available on all server versions of Windows.

Another way to check is via the Win32_Volume class in WMI. This bit of PowerShell reports the file percent fragmentation for all volumes on a given server.

gwmi -cn YOURSERVERNAME -class Win32_Volume | select Name, @{n="FilePercentFragmentation"; e={$_.DefragAnalysis().DefragAnalysis.FilePercentFragmentation}} | ft -auto

If you have significant fragmentation there are a couple of ways to fix it. My preferred option is as follows, but requires some downtime.

  1. Stop the SQL services;
  2. Back up the files on the disk (especially the mdf, ndf and ldf files - better safe than sorry);
  3. Run the Windows Disk Defragmenter tool;
  4. Start the SQL services;
  5. Check the error log to ensure no errors during startup;
  6. Run CHECKDB against all databases (except tempdb). I've never seen the defrag tool cause corruption, but you can't be too careful;

Another option that doesn't require downtime is to use a third party tool such as Diskeeper. This can be very effective at fixing and preventing disk fragmentation, but it costs money and uses a filter driver - see my comments below.

Filter Drivers

A filter driver is a piece of software that sits between an I/O request and the write to disk. It allows the write to be examined and rejected, modified or audited. The most common type of filter driver is installed by anti-virus software.

You do not want anti-virus software checking every single write to your database files. You don't want it checking your backups either, or writes to the error log or default trace.

If you have AV software installed, you can specify exclusions. Exclude all folders used by SQL Server, plus the drives used by data and log files, plus the folders used for backups.

Even better is to turn off on-access AV scanning and schedule a full scan at a quiet time.

OLTP and BI on the Same Server

It is rare to find a system that is purely OLTP. Most will have some sort of reporting element as well. Unfortunately, the two types of workload do not always coexist happily.

I've been reading a lot of articles by Joe Chang, and in one article he explains why this is the case. Essentially, OLTP query plans retrieve rows in small batches (less than a threshold of 25 rows) and these I/O requests are handled synchronously by the database engine, meaning that they wait for the data to be retrieved before continuing. Large BI workloads and reporting queries, often with parallel plans, issue asynchronous I/O requests and take full advantage of the HBA ability to queue requests. As a result, the OLTP requests have to queue up behind the BI requests, causing OLTP performance to degrade significantly.

Auto-grow and Instant File Initialization

It is good to have auto-grow enabled, just as a precaution, although you should also pre-size data and log files so that it is rarely needed.
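As a minimal sketch (the database and logical file names here are made up), pre-sizing and setting a fixed growth increment is just an ALTER DATABASE per file:

-- Hypothetical names: pre-size the data and log files and use fixed growth
-- increments so that auto-grow is rarely needed, and predictable when it is.
ALTER DATABASE SalesDB
MODIFY FILE (NAME = SalesDB_Data, SIZE = 50GB, FILEGROWTH = 1GB);

ALTER DATABASE SalesDB
MODIFY FILE (NAME = SalesDB_Log, SIZE = 8000MB, FILEGROWTH = 512MB);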

However, what happens if a data file grows and you don't have instant file initialization enabled, especially if the auto-grow increment is set too big? All I/O against the file has to wait for the file growth to complete, and this may show up as the infamous "I/O requests taking longer than 15 seconds to complete" message in the error log.

Instant initialization won't help with log growth, so make sure log auto-growth increments are not too high.

For more information about instant file initialization and how to enable it, see this link Database File Initialization.

And while on the subject of auto-grow, see the section on proportional fill, below.

Transaction Log Performance

How long do your transaction log writes take? Less than 1ms? More than 5ms?

Look at virtual file stats, performance counters, or the WRITELOG wait time to see if log write latency is an issue for you.

Writes to the transaction log are sequential, so ideally the write head on the disk should still be where the last log write left it. This means no seek time, and blazingly fast write times. And since a transaction cannot commit until the log has been hardened to disk, you rely on these fast writes for a performant system.

Advice for years has been for the transaction log for each database to be on its own disk. And this advice is still good for local disk, and for some storage arrays. But now that a lot of SANs have their own battery-backed write cache, this advice is not as critical as it used to be. Provided the cache is big enough to cope with peak bursts of write activity (and see my earlier comments about allocating more cache to writes than to reads) you will get very low latency.

So what if you don't have the luxury of a mega-bucks SAN and loads of write cache? Then the advice that's been around since the 1990s is still valid:

  • One transaction log file per database, on its own drive;
  • RAID 1, RAID 1+0 or RAID 0+1;

So assuming you are happy with your log file layout, what else could be slowing down your log writes?

Virtual Log Files

Although a transaction log is written to sequentially, the file itself can become fragmented internally. When it is first created it consists of several chunks called virtual log files.

Every time it is grown, whether manually or automatically, several more virtual log files are added. A transaction log that grows multiple times can end up with thousands of virtual log files.

Having too many VLFs can slow down logging and may also slow down log backups. You also need to be careful to avoid VLFs that are too big. A virtual log file cannot be cleared until the log has moved past its end and into the next one, and in the full recovery model clearing only happens at the next log backup. So a log backup can suddenly have a lot more work to do, and may cause performance problems while it takes place.

The answer for a big transaction log is to set an initial size of at most 8000MB, and then manually grow in chunks of 8000MB up to the target size. This results in a maximum VLF size of 500MB, without creating an excessively large number of VLFs.

Note: this advice is for manual growth only. Do not auto grow by 8000MB! All transactions in the database will stop while the extra space is initialised. Autogrow should be much smaller - but try to manually size the file so that auto grow is unlikely to be needed.
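As a rough sketch (database and logical file names are made up), you can count VLFs with DBCC LOGINFO - one row per VLF - and then grow the log manually in 8000MB steps:

-- One row per VLF in the current database's log
DBCC LOGINFO;

-- Grow the log in 8000MB steps up to the target size (hypothetical names).
-- Each MODIFY FILE sets the new total size, not an increment.
ALTER DATABASE SalesDB MODIFY FILE (NAME = SalesDB_Log, SIZE = 8000MB);
ALTER DATABASE SalesDB MODIFY FILE (NAME = SalesDB_Log, SIZE = 16000MB);
ALTER DATABASE SalesDB MODIFY FILE (NAME = SalesDB_Log, SIZE = 24000MB);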

Transaction Log VLFs – too many or too few?

And see here for an obscure bug: Bug: log file growth broken for multiples of 4GB.

Log Manager Limits

The database engine sets limits on the amount of log that can be in flight at any one time. This is a per-database limit, and depends on the version of SQL Server being used.

SQL Server limits the number of outstanding I/Os and MB per second. The limits vary with version and whether 32 bit or 64 bit. See Diagnosing Transaction Log Performance Issues and Limits of the Log Manager for more details.

This is why the write latency should be as low as possible. If it takes 20ms to write to the transaction log, and you are limited to 32 I/Os in flight at a time, that means a maximum of 1600 transactions per second, well below what a lot of high volume OLTP databases require.

This also emphasises the importance of keeping transaction sizes small, as one very large transaction could conceivably hold up other transactions while it commits.

If you think these limits are affecting log write performance in your databases there are several ways to tackle the problem:

  • Work on increasing log write performance;
  • If you have minimally logged operations you can switch the database to use the BULK LOGGED recovery model. Careful though - a log backup containing a minimally logged operation has to be restored in full. Point in time restore is not possible.
  • Split a high volume database into 2 or more databases, as the log limits apply per database;

Non-Sequential Log Activity

There are actions performed by the database engine that move the write head away from the end of the log file. If transactions are still being committed while this happens, you have a seek overhead and log performance gets worse. Operations that read from the log files include rollback of large transactions, log backups and replication (the log reader agent).

There is little you can do about most of these, but avoiding large rollbacks is something that should be tackled at the design and development stage of an application.

Proportional Fill

Very active tables can be placed in a file group that has multiple data files. This can improve read performance if they are on different physical disks, and it can improve write performance by limiting contention in the allocation pages (especially true for tempdb).

You lose some of the benefit, though, if you don't take advantage of the proportional fill algorithm. Proportional fill is the process by which the database tries to allocate new pages in proportion to the amount of free space in each data file in the file group.

To get the maximum benefit make sure that each file is the same size, and is always grown by the same increment. This is for both manual and auto growth.
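Here's a minimal sketch with made-up names - two files added to the same file group with identical sizes and growth increments, so proportional fill keeps them evenly used:

ALTER DATABASE SalesDB ADD FILEGROUP ActiveData;

-- Hypothetical file names and paths, ideally on different physical disks
ALTER DATABASE SalesDB
ADD FILE (NAME = ActiveData1, FILENAME = 'E:\SQLData\ActiveData1.ndf', SIZE = 10GB, FILEGROWTH = 1GB),
         (NAME = ActiveData2, FILENAME = 'F:\SQLData\ActiveData2.ndf', SIZE = 10GB, FILEGROWTH = 1GB)
TO FILEGROUP ActiveData;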

One thing to be aware of is how the auto growth works. SQL Server does its best to fill the files at the same rate, but one will always fill up just before the others, and this file will then auto grow on its own. This then gets more new page allocations than the others and becomes a temporary hotspot until the others also auto grow and catch up.

This is unlikely to cause problems for most databases, although for tempdb it may be more noticeable. Trace flag 1117 causes all data files in a file group to grow together, so is worth considering if this is an issue for you. Personally I would rather manually size the files so that auto growth isn't necessary.

tempdb Configuration

Let's start with a few things that everybody agrees on:

  • tempdb files should be placed on the fastest storage available. Local SSD is ideal, and from SQL Server 2012 this is even possible on a cluster;
  • Pre-size the data and log files, as auto growth may cause performance issues while it occurs;
  • New temporary objects are created all the time, so contention in the GAM, SGAM and PFS pages may be an issue in some environments;
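A quick way to see whether that allocation contention is happening right now (a sketch, not a full diagnosis) is to look for PAGELATCH waits on the well-known allocation pages in tempdb:

-- Tasks currently waiting on pages in tempdb (database_id 2).
-- Resource 2:1:1 is the PFS page, 2:1:2 the GAM and 2:1:3 the SGAM of file 1.
SELECT session_id, wait_type, wait_duration_ms, resource_description
FROM   sys.dm_os_waiting_tasks
WHERE  wait_type LIKE 'PAGELATCH%'
AND    resource_description LIKE '2:%';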

And now some differences of opinion:

  • There is loads of advice all over the web to create one tempdb data file per core to reduce allocation contention. Paul Randal disagrees (A SQL Server DBA myth a day: (12/30) tempdb should always have one data file per processor core). He says that too many files can actually make things worse. His solution is to create fewer files and increase the number only if necessary;
  • There is more advice, often repeated, to separate tempdb files from other databases and put them on their own physical spindles. Joe Chang disagrees and has a very good argument for using the common pool of disks. (Data, Log and Temp file placement). I'll leave you to decide what to do!

AutoShrink

The AutoShrink database option has been around ever since I started using SQL Server, causing lots of performance problems for people who have enabled it without fully realising what it does. Often a third party application will install a database with this option enabled, and the DBA may not notice it until later.

So why is it bad? Two reasons:

  • It is always used in conjunction with auto grow, and the continuous cycle of grow-shrink-grow causes a huge amount of physical disk fragmentation. I've already covered that topic earlier in this article;
  • While it performs the shrink there is a lot of additional I/O, which slows down the system for everything else;

Disable it. Allocate enough space for the data and log files, and size them accordingly. And don't forget to fix all that fragmentation while you're at it.
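Finding and fixing the offenders takes seconds (the database name below is made up - and check with the application vendor before changing their settings):

-- Databases with auto_shrink enabled
SELECT name FROM sys.databases WHERE is_auto_shrink_on = 1;

-- Turn it off (hypothetical database name)
ALTER DATABASE SalesDB SET AUTO_SHRINK OFF;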

Turn Off AutoShrink!

Insufficient Memory

This is an article about SQL Server I/O performance, not memory. So I don't want to cover it in any detail here - that is a subject for a different article.

I just want to remind you that SQL Server loves memory - the more the better. If your entire database(s) fits into memory you'll have a much faster system, bypassing all that slow I/O.

Lack of memory can lead to dirty pages being flushed to disk more often to make space for more pages being read.

Lack of memory can also lead to increased tempdb I/O, as more worktables for sort and hash operations have to spool to disk.

Anyway, the point of this section is really to make one statement: Fill your servers with as much memory as you can afford, and as much as the edition of SQL Server and Windows can address.

SQL Server 2014 has a new feature allowing some tables to be retained in memory and accessed via natively compiled stored procedures. Some redesign of your existing code may be needed to take advantage of this, but it looks like a great performance boost for those OLTP systems that start to use it.

SQL Server In-Memory OLTP Internals Overview for CTP2

High Use of tempdb

tempdb can be a major consumer of I/O and may affect overall performance if used excessively.

It is worth looking at the various reasons for its use, and examining your system to ensure you have minimized these as far as possible.

User-created temporary objects

The most common of these are temporary tables, table variables and cursors. If there is a high rate of creation this can lead to allocation page contention, although increasing the number of tempdb data files may partially alleviate this.

Processes creating very large temporary tables or table variables are a big no-no, as these can cause a lot of I/O.

Internal Objects

The database engine creates work-tables in tempdb for handling hash joins, sorting and spooling of intermediate result sets.

When sort operations or hash joins need more memory than has been granted they spill to disk (using tempdb) and you will see Hash warnings and Sort warnings in the default trace.

I originally wrote a couple of paragraphs about how and why this happens and what you can do to prevent it, but then I found this post that explains it much better - Understanding Hash, Sort and Exchange Spill Events.

Version Store

The third use of tempdb is for the version store. This is used for row versioning.

Row versions are created when snapshot isolation or read committed snapshot option is used. They are also created during online index rebuilds for updates and deletes made during the rebuild and for handling data modifications to multiple active result sets (MARS).

A poorly written application (or rogue user) performing a large update that affects many thousands of rows when a row versioning based isolation level is in use may cause rapid growth in tempdb and adversely impact I/O performance for other users.
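If you suspect the version store, the Transactions performance counters give a rough view of its size and churn:

-- Current version store size and generation/cleanup rates
SELECT counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name LIKE '%:Transactions%'
AND    counter_name IN ('Version Store Size (KB)',
                        'Version Generation rate (KB/s)',
                        'Version Cleanup rate (KB/s)');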

Table and Index Scans

A table scan is a scan of a heap. An index scan is a scan of a clustered or non-clustered index. Both may be the best option if a covering index does not exist and a lot of rows are likely to be retrieved.

A clustered index scan performs better than a table scan - yet another reason for avoiding heaps! But what causes a scan to be used in the first place, and how can you make a seek more likely?

Out of date statistics

Before checking indexes and code, make sure that statistics are up to date. Enable "auto create statistics". If "auto update statistics" is not enabled, make sure you run a manual statistics update regularly. This is a good idea even if "auto update statistics" is enabled, as the threshold of approximately 20% of changed rows before the auto update kicks in is often too coarse, especially where new rows are added with an ascending key.
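A quick way to see how stale your statistics are (a sketch - what counts as "too old" is a judgement call for your system):

-- Last update date for every statistics object in the current database
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM   sys.stats s
JOIN   sys.objects o ON s.object_id = o.object_id
WHERE  o.is_ms_shipped = 0
ORDER  BY last_updated;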

Index Choice

Sometimes an existing index is not used. Have a look at improving its selectivity, possibly by adding additional columns, or modifying the column order.

Consider whether a covering index could be created. A seek is more likely to be performed if no bookmark lookups will be needed. See these posts on the "tipping point" by Kimberly Tripp. The Tipping Point.

Inefficient TSQL

The way a query is written can also result in a scan, even if a useful index exists. Some of the reasons for this are:

Non-sargable expressions in the WHERE clause. "Sarg" means Searchable ARGument, so move calculations away from the columns and onto the constants instead. For example, this will not use the index on OrderDate:

WHERE DATEADD(DAY, 1, OrderDate) > GETDATE()

Whereas this will use an index if it exists (and it is selective enough):

WHERE OrderDate > DATEADD(DAY, -1, GETDATE())

Implicit conversions in a query may also result in a scan. See this post by Jonathan Kehayias Implicit Conversions that cause Index Scans.
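As a simple, hypothetical example: if LastName is a VARCHAR column with an index on it, an NVARCHAR literal or parameter forces the column to be converted on every row:

-- Hypothetical table dbo.Customers with LastName VARCHAR(50) and an index on it.
-- The N'' literal is NVARCHAR, so the column is implicitly converted - likely a scan:
SELECT CustomerID FROM dbo.Customers WHERE LastName = N'Smith';

-- Matching the column's data type allows an index seek:
SELECT CustomerID FROM dbo.Customers WHERE LastName = 'Smith';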

Bad Parameter Sniffing

Parameter sniffing is a good thing. It allows plan re-use and improves performance. But sometimes it results in a less efficient execution plan for some parameters.

Here's a great explanation, using an elephant and a mouse! The Elephant and the Mouse, or, Parameter Sniffing in SQL Server

Index Maintenance

Every index has to be maintained. I'm not talking about maintenance plans, but about the fact that when rows are inserted, deleted and updated, the non-clustered indexes also have to be changed. This means additional I/O for each index on a table.

So it is a mistake to have more indexes than you need. Check that all indexes are being used. Check for duplicates and redundant indexes (where the columns in one are a subset of the columns in another). Check for indexes where the first column is identical but the rest are not - sometimes these can be merged. And of course, test, test, test.
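A starting point for the "is it used?" question is sys.dm_db_index_usage_stats (a sketch only - these stats reset on a service restart, so make sure the instance has been up long enough to be representative):

-- Non-clustered indexes with no reads recorded since the last restart
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name,
       us.user_seeks, us.user_scans, us.user_lookups, us.user_updates
FROM   sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats us
       ON us.object_id = i.object_id
      AND us.index_id = i.index_id
      AND us.database_id = DB_ID()
WHERE  i.type_desc = 'NONCLUSTERED'
AND    OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
AND    ISNULL(us.user_seeks + us.user_scans + us.user_lookups, 0) = 0;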

Index Fragmentation

Index fragmentation affects I/O performance in several ways.

  • Range scans are less efficient, and less able to make use of read-ahead reads;
  • Empty space created in the pages reduces the density of the data, meaning more read I/O is necessary;
  • The fragmentation itself is caused by page splits, which means more write I/O;

There are a number of things that can be done to reduce the impact of fragmentation, or to reduce the amount of fragmentation in the first place; a sketch of checking and fixing it follows the list.

  • Rebuild or reorganize indexes regularly;
  • Specify a lower fill factor so that page splits occur less often (though not too low, see below);
  • Change the clustered index to use an ascending key so that new rows are appended to the end, rather than inserted in a random place in the middle;
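To act on the first bullet, here's a sketch using sys.dm_db_index_physical_stats with the commonly quoted 5%/30% reorganize/rebuild thresholds (tune the thresholds, and the table and index names are made up):

-- Fragmentation of all indexes in the current database (LIMITED is the cheapest mode)
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps
JOIN   sys.indexes i ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE  ps.avg_fragmentation_in_percent > 5 AND ps.page_count > 1000;

-- Then, per index (hypothetical names):
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE;                  -- roughly 5-30% fragmented
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD WITH (ONLINE = ON);  -- over 30%; ONLINE needs Enterprise edition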

Here is an excellent way of tracking page splits: Tracking Problematic Page Splits in SQL Server 2012 Extended Events.

Forwarded Records

When a row in a heap is updated and requires more space, it is copied to a new page. But non-clustered indexes are not updated to point to the new page. Instead, a pointer is added to the original page to show where the row has moved to. This is called a forwarding pointer, and there could potentially be a long chain of these pointers to traverse to find the eventual data. Naturally, this means more I/O.
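sys.dm_db_index_physical_stats exposes forwarded record counts when run in SAMPLED or DETAILED mode - a quick (but not free) check:

-- Heaps (index_id = 0) with forwarded records in the current database.
-- SAMPLED mode reads a sample of pages, so this can still be I/O heavy on big tables.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       ps.forwarded_record_count,
       ps.record_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ps
WHERE  ps.index_id = 0
AND    ps.forwarded_record_count > 0;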

A heap cannot be defragmented by rebuilding the index (there isn't one!). The only way to do this is to create a clustered index on the heap, and then drop it afterwards. Be aware that this will cause all non-clustered indexes to be rebuilt twice - once for the new clustered index, and again when it is dropped. If there are a lot of these it is a good idea to drop the non-clustered indexes first, and recreate them afterwards.

Better still is to avoid heaps where possible. I accept there may be cases where they are the more efficient choice (inserting into archive tables, for example), but always consider whether a clustered index would be a better option - it usually is.

Wasted Space

In an ideal world every data page on disk (and in memory) would be 100% full. This would mean the minimum of I/O is needed to read and write the data.

In practice, there is wasted space in nearly all pages - sometimes a very high percentage - and there are a lot of reasons why this occurs.

Low fill factor

I've mentioned fill factor already. If it is too high, and page splits are occurring when rows are inserted or updated, it is sensible to rebuild the index with a lower fill factor. However, if the fill factor is too low you may have a lot of wasted space in the database pages, resulting in more I/O and memory use. This is one of those "suck it and see" scenarios. Sometimes a compromise is needed.

Page splits

This is also discussed above. But as well as fragmentation, page splits can also result in wasted space if the empty space is not reused. The solution is to defragment by rebuilding or reorganizing indexes regularly.

Wasteful Choice of Data Types

Use the smallest data types you can. And try to avoid the fixed length datatypes, like CHAR(255), unless you regularly update to the longest length and want to avoid page splits. The reasoning is simple. If you only use 20 characters out of 200, that is 90% wasted space, and more I/O as result. The higher density of data per page the better.

Lazy thinking might make developers create AddressLine1, AddressLine2, etc as CHAR(255), because they don't actually know what the longest should be. In this case, either do some research, find out that the longest is 50 characters (for example) and reduce them to CHAR(50), or use a variable length data type.

Schema Design

I've already mentioned choice of data types above, but there are other schema design decisions that can affect the amount of I/O generated by an application database.

The most common one is designing tables that are too wide. I sometimes see a table with 20, 30, 50, even 100 columns. This means fewer rows fit on a page, and for some extreme cases there is room for just one row per page - and often a lot of wasted space as well (if the row is just slightly wider than half a page, that's 50% wasted).

If you really do need 50 columns for your Customer table, ask yourself how many of these are regularly accessed. An alternative is to split into 2 tables. Customer, with just a few of the commonly used columns, and CustomerDetail with the rest. Of course, the choice of which columns to move is important. You don't want to start joining the tables for every query as that defeats the object of the exercise.

Page or Row Compression

Compression is another way of compacting the data onto a page to reduce disk space and I/O.

Use of row or page compression can dramatically improve I/O performance, but CPU usage does increase. As long as you are not already seeing CPU bottlenecks, compression may be an option to consider.

Be aware that compression is an Enterprise edition feature only.
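A sketch with a hypothetical table - estimate the savings first, then rebuild with the chosen compression setting:

-- Estimate how much space PAGE compression would save (hypothetical table name)
EXEC sp_estimate_data_compression_savings
     @schema_name = 'dbo',
     @object_name = 'Orders',
     @index_id = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Apply it to the table (and its clustered index)
ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);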

Data Compression

Backup Compression

Since SQL Server 2008 R2, backup compression has been available in Standard edition as well as Enterprise. This is a major benefit and I recommend that it be enabled on all instances. As well as creating smaller backups, it is also quicker and means less write I/O. The small increase in CPU usage is well worth it.

Enable it by default so that if someone sets off an ad hoc backup it will have minimal I/O impact.
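Enabling it instance-wide takes a few seconds, and an individual backup can still override it either way (the database name and path below are made up):

-- Make compression the default for all backups on this instance
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- Or request it explicitly for a single backup (hypothetical database and path)
BACKUP DATABASE SalesDB TO DISK = 'R:\Backups\SalesDB.bak' WITH COMPRESSION;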

Synchronous Mirroring/AlwaysOn

High safety mode in database mirroring, or synchronous commit mode in AlwaysOn, both emphasise availability over performance. A transaction on the mirroring principal server or primary replica does not commit until it receives a message back from the mirror or secondary replica that the transaction has been hardened to the transaction log.

This increases transactional latency, particularly when the servers are in different physical locations.

Resource Governor in 2014

Up to and including SQL Server 2012, Resource Governor has only been able to throttle CPU and memory usage. Finally, the ability to include I/O in a resource pool has been added in SQL Server 2014.

This has obvious use as a way of limiting the impact of reports on the system from a particular user, department or application.
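A sketch of the new syntax (the names are made up, and the classifier function that routes sessions into the workload group is left out):

-- Cap physical I/O per volume for a reporting workload (SQL Server 2014)
CREATE RESOURCE POOL ReportingPool
WITH (MAX_IOPS_PER_VOLUME = 200);

CREATE WORKLOAD GROUP ReportingGroup
USING ReportingPool;

ALTER RESOURCE GOVERNOR RECONFIGURE;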

Gathering The Evidence!

There are a lot of ways you can measure SQL Server I/O performance and identify which areas need looking at.

Most of what follows is available in SQL CoPilot in graphical and tabular form, both as averages since last service start and as snapshots of current activity.

Wait Types

Use sys.dm_os_wait_stats to check number of waits and wait times for IO_COMPLETION, LOGBUFFER, WRITELOG and PAGEIOLATCH_*.

See sys.dm_os_wait_stats (Transact-SQL) for a description of all wait types.

Use this script to focus on the I/O wait types:

SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms - signal_wait_time_ms AS total_wait_time_ms,
       1. * (wait_time_ms - signal_wait_time_ms) /
           CASE WHEN waiting_tasks_count = 0 THEN 1 ELSE waiting_tasks_count END AS avg_wait_ms
FROM   sys.dm_os_wait_stats
WHERE  wait_type IN ('IO_COMPLETION', 'LOGBUFFER', 'WRITELOG',
                     'PAGEIOLATCH_SH', 'PAGEIOLATCH_UP', 'PAGEIOLATCH_EX',
                     'PAGEIOLATCH_DT', 'PAGEIOLATCH_KP');

This shows averages since the last service restart, or since the wait stats were last cleared.

To clear the wait stats, use

DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

You can also check sys.dm_os_waiting_tasks to see what is currently being waited for.

Virtual File Stats

Query sys.dm_io_virtual_file_stats to find out which data and log files get the most read and write I/O, and the latency for each file calculated using the stall in ms.

SELECT d.name AS database_name,
       mf.name AS logical_file_name,
       num_of_bytes_read,
       num_of_bytes_written,
       num_of_reads,
       num_of_writes,
       1. * io_stall_read_ms  / (num_of_reads + 1)  AS avg_read_stall_ms,
       1. * io_stall_write_ms / (num_of_writes + 1) AS avg_write_stall_ms
FROM   sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN   sys.master_files mf ON vfs.database_id = mf.database_id
                          AND vfs.file_id = mf.file_id
JOIN   sys.databases d ON mf.database_id = d.database_id;

Performance Counters

There are two ways of looking at performance counters. Select from sys.dm_os_performance_counters, which shows all the SQL Server counters, or use Windows Performance Monitor (perfmon) to see the other OS counters as well.

Some counters to look at are:

SQL Server:Buffer Manager

  • Lazy writes/sec
    The number of times per second that dirty pages are flushed to disk by the Lazy Writer process. An indication of low memory, but listed here as it causes more I/O.
  • Checkpoint pages/sec
    The number of dirty pages flushed to disk per second by the checkpoint process.
  • Page reads/sec
    Number of physical pages read from disk per second
  • Page writes/sec
    Number of physical pages written to disk per second
  • Readahead pages/sec
    Pages read from disk in advance of them being needed. Expect to see high values in BI workloads, but not for OLTP

SQL Server:Access Methods

  • Forwarded records/sec
    Should be as low as possible. See above for explanation of forwarded records.
  • Full scans/sec
    The number of unrestricted full scans. Use of UDFs and table variables can contribute to this, but concentrating on seeks will help to keep the value down
  • Page splits/sec
    The number of page splits per second - combining splits due to pages being added to the end of a clustered index as well as "genuine" splits when a row is moved to a new page. Use the technique from the link in the section on index fragmentation, above, to get a more accurate breakdown
  • Skipped ghosted records/sec
    For information about ghosted records see An In-depth Look at Ghost Records in SQL Server
  • Workfiles created/sec
    A measure of tempdb activity
  • Worktables created/sec
    A measure of tempdb activity

SQL Server:Databases

  • Log bytes flushed/sec
    The rate at which log records are written to disk
  • Log flush wait time
    The duration of the last log flush for each database
  • Log flush waits/sec
    The number of commits per second waiting for a log flush

Logical Disk

  • Avg. Disk sec/Read
  • Avg. Disk sec/Write
  • Disk Read Bytes/sec
  • Disk Write Bytes/sec

In the sys.dm_os_performance_counters DMV, many counters expose a raw cumulative value that has to be sampled over time to derive a per-second figure, and others have to be divided by a matching base value to get a percentage or average. This makes the DMV less useful unless you perform these calculations yourself, either by monitoring over time or by taking an average since the last service restart.

This script uses the tempdb creation date to get the number of seconds since the service started and calculates the averages for these counters. It also retrieves all other counters and calculates those that are derived from a base value.

USE master;
SET NOCOUNT ON;

-- Seconds since the service started, using the tempdb creation date
DECLARE @up_secs bigint;
SELECT @up_secs = DATEDIFF(second, create_date, GETDATE())
FROM   sys.databases
WHERE  name = 'tempdb';

-- Point-in-time counters: report the raw value
SELECT RTRIM(object_name) object_name, RTRIM(instance_name) instance_name, RTRIM(counter_name) counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  cntr_type = 65792
UNION ALL
-- Cumulative counters: divide by uptime to get an average per-second rate
SELECT RTRIM(object_name), RTRIM(instance_name), RTRIM(counter_name),
       1. * CAST(cntr_value AS bigint) / @up_secs
FROM   sys.dm_os_performance_counters
WHERE  cntr_type = 272696576
UNION ALL
-- Ratio counters: divide by the matching base counter to get a percentage
SELECT RTRIM(v.object_name), RTRIM(v.instance_name), RTRIM(v.counter_name),
       100. * v.cntr_value / CASE WHEN b.cntr_value = 0 THEN 1 ELSE b.cntr_value END
FROM   (SELECT object_name, instance_name, counter_name, cntr_value
        FROM   sys.dm_os_performance_counters
        WHERE  cntr_type = 537003264) v
JOIN   (SELECT object_name, instance_name, counter_name, cntr_value
        FROM   sys.dm_os_performance_counters
        WHERE  cntr_type = 1073939712) b
       ON  v.object_name = b.object_name
       AND v.instance_name = b.instance_name
       AND RTRIM(v.counter_name) + ' base' = RTRIM(b.counter_name)
UNION ALL
-- Average counters: divide by the matching base counter to get an average value
SELECT RTRIM(v.object_name), RTRIM(v.instance_name), RTRIM(v.counter_name),
       1. * v.cntr_value / CASE WHEN b.cntr_value = 0 THEN 1 ELSE b.cntr_value END
FROM   (SELECT object_name, instance_name, counter_name, cntr_value
        FROM   sys.dm_os_performance_counters
        WHERE  cntr_type = 1073874176) v
JOIN   (SELECT object_name, instance_name, counter_name, cntr_value
        FROM   sys.dm_os_performance_counters
        WHERE  cntr_type = 1073939712) b
       ON  v.object_name = b.object_name
       AND v.instance_name = b.instance_name
       AND REPLACE(RTRIM(v.counter_name), ' (ms)', '') + ' Base' = RTRIM(b.counter_name)
ORDER BY object_name, instance_name, counter_name;

Dynamic Management Views and Functions

As well as the DMVs in the above scripts, there are a number of others that are useful for diagnosing SQL Server I/O performance problems. Here are all the ones I use. I'll add some sample scripts when I get the time:

  • sys.dm_os_wait_stats
  • sys.dm_io_virtual_file_stats
  • sys.dm_os_performance_counters
  • sys.dm_io_pending_io_requests
  • sys.dm_db_index_operational_stats
  • sys.dm_db_index_usage_stats
  • sys.dm_db_index_physical_stats
  • sys.dm_os_buffer_descriptors

Tracing

It can also be useful to see what activity there is on the instance. Here are your options:

Profiler

The Profiler tool is quick and easy to use - you can start tracing in a matter of seconds. However, there is some overhead and it may impact performance itself - especially when a lot of columns are selected. A server side trace is a better option.

Server-side Trace

A server-side trace has less of an impact than running Profiler. It has to be scripted using system stored procedures, but Profiler has the ability to generate the script for you.

Here's a great tutorial on how to do that.
How to create a SQL Server trace file, schedule it to run, and analyze the trace file results

Extended Event Sessions

Extended events were first introduced in SQL Server 2008, and have been considerably enhanced in SQL 2012. They are very lightweight, and the use of server-side traces and Profiler is now deprecated. Nevertheless, use of extended events may impact performance of high transaction systems if you are not careful. Use an asynchronous target and avoid complicated predicates to limit the overhead.

Using SQL Server 2008 Extended Events

Tools

There are a number of tools for gathering performance data from your servers.

SQLIO is a simple tool that creates a file on disk and tests latency and throughput for random/sequential I/O, at various block sizes and with a variable number of threads. These are all fully configurable. SQLIO is a great way of getting a baseline on a new server or storage, for future comparison.

Third party tools are another option for viewing performance metrics. Some show you what is happening on the server right now. Others are built into more complex (and expensive) monitoring solutions.

Performance metrics obtained on virtual servers can be unreliable. Performance counters and wait stats may give the impression that everything is OK when it is not. I recommend using the performance monitoring tools provided by the VM vendor. In the case of VMware, this is very easy to use and is built into Virtual Center.

Summary

This turned into a much bigger article than I expected - SQL Server I/O performance is a big subject!

I started with everything I knew, and double checked my facts by searching the web and checking books. In the process I learnt a whole lot of new stuff and found a lot of useful links. It has been a useful exercise.

Hopefully this has been useful for you too.
