Thursday, June 4, 2015

Summary for Accidental DBA Book / Chapter 5: Missing Indexes

Chapter 5: Missing Indexes


  • Is it better to create multiple indexes, where each individual index has the optimal key column order, based on column selectivity, or to create a single index that covers multiple queries, but has a less selective column order? For a data warehouse, where there are significantly more read operations than write operations, the multiple indexes option may be appropriate. For an OLTP system, where there are more writes than reads, a less selective index that covers multiple queries using a less than optimal column order may be most appropriate … P165.
  • A query cannot seek on an index unless the query filters on a left-based subset of the index key. In other words, it is often (though not always) best to order the index key columns such that the most selective column is the first column in the index. This reduces the number of database pages that must be read by the database engine while traversing the index, in order to satisfy the query.
  • Listing 5.1 demonstrates this. Listing 5.2 is a solution for Listing 5.1 … P166.
  • A covering index is one that contains all of the columns needed by a query, as either key or non-key columns, preventing the need to access the table or clustered index using lookup operations, and so decreasing the number of I/O operations required to return the data … P169.
  • In data warehouse environments having indexes with long included columns is acceptable, and it is recommended to disable or drop indexes during ETL to eliminate the impact of indexes maintenance. In contrast, OLTP DB would generally use fewer included columns due to the impact on the performance of data manipulation operations … P170.
  • You can use the following to identifying missing indexes: 
    1. Database Engine Tuning Advisor: can be used to analyze a single query or an entire database workload, in the form of a trace file generated by SQL Server Profiler. The quality of the DTA's index analysis will only be as high as the quality of the workload that is provided, thus the trace in SQL Server Profiler should be run for a period that convers a full cycle of the work load and covers most of the queries. P174-180 shows how. 
    2. Missing index feature: the database engine tracks information about indexes that do not exist but that the optimizer could have used during query execution. This information is stored in the missing index DMVs, and in the MissingIndexGroup in showplan for a query. Those missing indexes must be used with due care, to avoid overlapping or duplicate indexes, as those missing index recommendations are not workload based; they are derived from the execution of individual queries …P181. (Listing 5.10 is XQuery to the ShowPlan XML)
  • Missing index DMVs: 
    • sys.dm_db_missing_index_details – stores detailed information regarding indexes the optimizer would have used had they been available, such as columns that could have been used to resolve equality or inequality predicates, and suggested INCLUDE columns for covering a query.
    • sys.dm_db_missing_index_columns – accepts an index_handle and returns a list of columns that would comprise the suggested index.
    • sys.dm_db_missing_index_group_stats – returns summary information regarding the potential benefit of a "missing" index, based, for example, on the number of seeks and scans that would have benefited.
    • sys.dm_db_missing_index_groups – a join view between _group_stats and _index_details … P182.
  • Missing index DMVs limitations:
    • Its information exists in memory and doesn't exist beyond SQL service restarts.
    • Index key columns are not ordered according to cardinality … P182.
  • To estimate performance improvement associated with a specific missing index in the database use the query in Listing 5.9 … P183. Focus on the indexes with an impact value higher than 50,000.
  • A good rule of thumb is for any FOREIGN KEY columns that are commonly used in JOIN operations to have an associated index, either with the FOREIGN KEY column as the leading column in the index, or as a column further down the index key, depending on the queries. The code in Listing 5.11 can be used to identify non-indexed FOREIGN KEY columns in a database. That query is effective for database using single-column PRIMARY KEYs, yet partially helpful … P188.
  • The query shown in Listing 5.12, is any non-clustered index that has never been used for a seek, scan, or lookup operation by SQL Server, but is associated with a significant number of update operations. These indexes can be considered to be unused and should be dropped from the database. This code should only be used when the database has been online for a significant period of time, in order to ensure that the appropriate workload has been executed.  … P190.
  • In her following two blog posts, Kimberly Tripp offers further insight into the complications of identifying duplicate indexes, and provides a stored procedure for identifying and removing them: "How can you tell if an index is REALLY a duplicate?" (http://sqlskills.com/BLOGS/KIMBERLY/post/UnderstandingDuplicateIndexes.aspx) and "Removing duplicate indexes" (http://sqlskills.com/BLOGS/KIMBERLY/post/RemovingDuplicateIndexes.aspx) … P191.

No comments:

Post a Comment