Wednesday, June 3, 2015

Summary for Accidental DBA Book / Chapter 2: Disk I/O Configuration

Chapter 2: Disk I/O Configuration

Random versus sequential I/O

  • SQL Server employs a read-ahead mechanism that can read a number of contiguous pages, up to 128 pages on Standard Edition and 1,024 pages on Enterprise Edition, in a single I/O operation … P45.
  • Sequential I/O (is any operation where the blocks can be read from, or written to, disk without having to reposition the disk head on the drive) can benefit from the read-ahead mechanism. Unlike Random I/O where the disk head on the drive has to change positions on the platter, incurring seek latency as a part of the operation, which reduces the performance and number of operations in comparison to sequential I/O … P45.
  • Read operations in general, especially in OLTP systems, are random I/O operations … P46.

Choosing the Right RAID Level

  • RAID technology is used to achieve the following objectives:
    • Increase levels of I/O performance, measured in Input/Output Operations Per Second (IOPS).
    • Increase levels of I/O throughput, measured in Megabytes Per Second.
    • Increase storage capacity available in a single logical device.
    • Gain data redundancy … P46.
  • RAID 0: strips the data across multiple drives, allowing the read and write operations to be shared amongst the drives inside the array. This level of RAID provides the best performance for both read and write operations, but provides no redundancy or protection against data loss … P48.
  • RAIN 1: provides protection against the loss of data from a single disk by mirroring the writes to a second disk, but doesn't provide added write performance to the system. RAID 1 can be used for storing a single transaction log because of sequential nature of the operations on the transaction logs, yet having multiple transaction log files will have the effect of random I/O, because of the movement of the disk head to perform operations against each of the files being written to sequentially.  … P49.
  • RAID 5: is commonly known as "striping with parity;" the data is striped across multiples disks, as per RAID 0, but parity data is stored in order to provide protection from single disk failure. The minimum number of disks required for a RAID 5 array is three. RAID 5 provides redundancy with minimal reduction in storage capacity, Striping the data across multiple disks improves read performance, but the need to maintain parity data incurs a performance penalty for writes. For heavy read but low write databases, RAID 5 can be optimal for the data files. RAID 5 is not recommended for the transaction log files, due heavy write activity … P50, P51.
  • RAID 6: is an extension of RAID 5 but, instead of a single distributed parity bit, it uses double-distributed parity bits. RAID 6 has a performance penalty similar to RAID 5 for write operations … P51, P52.
  • RAID 10: It provides redundancy by first mirroring each disk, using RAID 1, and then striping those mirrored disks, with RAID 0, to improve performance. Cost is the problem in this configuration … P53.
  • NTFS format allocation unit sizes: the 4 K default for NTFS is good for file servers and the operating system drives, but not database data files which perform better using a 64 K allocation unit … P55.
  • The two most common tools used for benchmarking storage configurations for SQL Server are SQLIO and IOmeter. Of the two, IOmeter is the most flexible, and can generate mixed I/O workloads that more closely reflect what might be generated by SQL Server. IOmeter also has a graphical user interface that is used for configuring the tests and monitoring their progress. … P56.
  • SQLIOSim is a tool by Microsoft to tests the storage using the same disk operations that SQL Server would perform. This tool should be used to validate that the I/O subsystem functions correctly under heavy loads, but it should not be used for performance benchmarking the configuration … P57.

Workload considerations

Data files

  • The appropriate disk configuration for the data files of a database depends heavily on the read-to-write ratio for the database. SQL Server tracks the I/O usage of the database files for an instance and makes this information available in the sys.dm_io_virtual_file_stats Dynamic Management Function … P57.
  • For a database that is primarily read-only, RAID 5 or RAID 6 can offer good read performance, while also maximizing the available storage. RAID 5 or 6 arrays are commonly used for data warehouses, or for storing data where write latency doesn't impact overall system performance. For OLTP implementations of heavy-write databases, RAID 1+0 provides the best performance … P58.

Log files

  • Since the transaction log is written to sequentially, RAID 1 can be used in most situations.
  • Having the log files for multiple highly transactional databases on the same physical disks can result in write I/O bottlenecks, often shown by high WRITELOG waits in sys.dm_os_wait_stats, and by high io_write_stall_ms values in sys.dm_io_virtual_file_stats() for the transaction log file … P58.

Special considerations for tempdb

  • As a general rule, the tempdb database files should be physically separate from the user data files and transaction log files, on a dedicated disk array. Since tempdb is a writeheavy database, RAID 1 or RAID 1+0 are usually the configurations best able to support the concurrent workload of tempdb … P59.
  • Creating multiple files for tempdb is highly recommended as this will reduce contention on pages when allocations are made. Generally create one file per processor, but not more than eight unless there is still contention … P60.

Diagnosing Disk I/O Issues

  • A primary tool for investigating disk I/O issues is PerfMon and specifically the Physical Disk\Disk sec/Reads and Physical Disk\Disk sec/Writes counters. The key for performance is having the lowest latency possible and the guideline latency values for each of these counters are as follows:
    • Less than 10 ms = good performance
    • Between 10 ms and 20 ms = slow performance
    • Between 20 ms and 50 ms = poor performance
    • Greater than 50 ms = significant performance problem … P65.

Common Disk I/O Problems

  • The first step to resolve I/O issues it to make sure the reason is not missing indexes or poorly written queries.
  • The following misconfigurations are at the heart of many of the disk I/O issues: sizing for capacity instead of I/O performance, incorrect workload isolation, incorrect partition alignment, and incorrect bandwidth using SAN configurations … P66.
  • Use WMI query to investigate possible disk partition misalignment
    wmic partition get BlockSize, StartingOffset, Name, Index
    If the StartingOffset value is not evenly divisible without a remainder, or decimal result, by the stripe unit size being used by the RAID controller, then the disk is misaligned. Fixing the misalignment is easy, yet a destructive operation that erases all the data on the disk. If you are using Windows 2008 or newer, you can format your drive through Disk Management GUI with a 64K allocation unit. If you are using an older OS, you should use DISKPART … P69. 
Referencehttp://www.amazon.com/Troubleshooting-SQL-Server-Guide-Accidental/dp/1906434786

No comments:

Post a Comment