Monday, August 24, 2015

Summary for Performance Tuning Using SQL Server DMVs Book / Chapter 6: Physical Disk Statistics and Utilization

Chapter 6: Physical Disk Statistics and Utilization

  • It is vital that, before even installing SLQ Server, you gain a good understanding of the I/O capacity of your disk subsystem, using tools such as SQLIO.
  • sys.dm_db_partition_stats helps us find the largest tables and indexes, especially those that are subject to heavy updates, monitor the effectiveness of their partitioning scheme, or investigate the need to upgrade/configure the disk I/O subsystem appropriately. It returns one row per table partition; if the table is not partitioned it will return one row for the table. Note that when querying this view, we filter out system objects which reach to around 100 for each database.
  • The query in Listing 6.1 will provide the total number of rows in all clustered indexes and heaps on a given SQL Server instance. System objects such as sys.dm_db_partition_stats are updated asynchronously, for performance reasons, so the counts may not be completely up to date.
  • Rebuilding indexes can bloat transaction logs and, when using differential backups, it means that all the pages in the index/tables will have changed since the last full backup … P230.
    https://social.msdn.microsoft.com/Forums/en-US/5146912a-395c-4129-8aeb-4f82b949bdfc/does-reorganizing-indexes-bloat-transaction-logs-and-differential-backups-as-rebuilding-them-the?forum=sqlkjmanageability
  • The occurrence of page splits (which happens due to inserting new records in full data pages, which in turn causes fragmentation) can be minimized to some degree, though not avoided altogether, by setting the appropriate fill factor for the clustered table (e.g. 90%) thus allowing space for new data on each page. In fact, a common cause of fragmentation is rebuilding clustered tables and indexes and forgetting to set the fill factor appropriately. By default, the fill factor will be 0 (meaning zero spare space). This can end up causing a lot more subsequent fragmentation than was resolved by rebuilding! … P233.
  • Many designers persist in the habit of using GUIDs for surrogate keys, and clustering on them. GUIDs are random in nature, and tend not to be created sequentially and, as a result, insertions of data into the middle of the table are common which causes pages splitting and fragmentation in P233-235 there are examples to illustrate this.
  • With heaps, the storage engine inserts all rows at the end of the table, in order of arrival. This makes inserting into a heap super-fast. As such, many people use heaps as a place to drop rows (for instance, when logging operations, and even when loading data using bulk copy), while avoiding the performance impact of index maintenance. On the contrary reading from a fragmented heap, is a performance nightmare.
  • Diagnosing I/O Bottlenecks: start with PerfMon counters such as PhysicalDisk Object: Avg. Disk Queue Length and Avg. Disk Reads/Sec, which can help you work out the number of I/Os per disk, per second, and how many physical I/O requests are being queued, on a given disk. And also use sys.dm_os_wait_stats DMV and look if you can find PAGEIOLATCH_EX or PAGEIOLATCH_SH among the top waits, this indicates that many sessions are experiencing delays in obtaining a latch for a buffer, since the buffer is involved in physical I/O requests … P239
  • sys.dm_io_virtual_file_stats DMF gives cumulative physical I/O statistics, indicating how frequently the file has been used by the database for reads and writes since the server was last rebooted. It also provides a very useful metric in the form of the "I/O stall" time, which indicates the total amount of time that user processes have waited for I/O to be completed on the file in question. Ultimately, high stall rates could simply indicate that the disk I/O subsystem is inadequate to handle the required I/O throughput Note that this DMF measures physical I/O only. Logical I/O operations that read from cached data will not show up here. Explanation of the columns it returns are in P240-241.
  • Investigating physical I/O and I/O stalls: the writers store in a temporary table the result of the query in listing 6.13, which captures the baseline disk I/O statistics from sys.dm_io_virtual_file_stats, then wait for 10 seconds (or depending on the workload on your server), then they run the query in listing 6.15 which compares the current statistics with the baseline. And they are suggesting that, in any event, it is certainly worrying to see that the stall times on a drive are substantially greater than the elapsed time on that same drive … P243-246.
  • sys.dm_io_pending_io_requests DMV returns a row for each currently pending I/O request at the file level. More than two or three pending process could indicate an issue. If you regularly observe a high number of pending I/O requests on a single drive, you should consider moving some of the files onto a separate drive, on a different access channel. Using the query in Listing 6.16, we can view the file name, the status, and how long the operation has been waiting.
  • In an optimized system the read:write ratio should ideally be close to 50:50. In reality there are almost always more reads than writes. A higher ratio than around 80:20, start to suspect non-optimal queries, or insufficient cache memory to avoid physical disk access. If you find that the read:write ratio is 50:50 based on counts (NUMBER of read operations versus write operations), and 99:1 based on data (SIZE or amount of data read versus written), this indicates that you are reading a lot of data to write a little data, which could be caused by inefficient database code, perhaps allowing users to search in a very inefficient manner, resulting in table scans. Bear in mind that this includes only actual writes to the files, or reads from the file, and will not reflect data read from the cache, or written to the cache and not yet flushed to the disk. Listing 6.17 calculates this based on the AMOUNT (size) of data. Listing 6.18 does the same, but also slice and group on the drive letter. If those queries results something to worry about, the next step is to obtain the read:write ratios for this database in terms of the number of read and write operations. If this method reveals a ratio much closer to 50:50 then we know that reads are reading a disproportionately high amount of data. You can find this using query in listing 6.19. Listing 6.21 gets number of reads and writes at the table level, but note that the returned counts include logical operations also.
  • Listing 6.22 demonstrates how to get an overview of tempdb utilization.
Referencehttp://www.amazon.com/Performance-Tuning-Server-Dynamic-Management/dp/1906434476

No comments:

Post a Comment