Saturday, April 2, 2016

Analyzing Index Usage in SQL Server

For our applications, we need to have effective indexes to insure acceptable database performance.  This really boils down into two questions.

Am I missing any indexes that I should have in my database?
If I am missing an index, it is likely that SQL Server will have to resort to a scan operation over the table in order to find the data needed for a SQL Statements.  This means SQL Server will have to read and process each row of the table, which is not only slow but also resource intensive.

Do I have any indexes that are not being used that I should drop?
Indexes are not free.  They are a separate physical structure that SQL Server has to maintain whenever a DML (insert, update or delete) statement is executed against the table.  As long as an index is being used, it is a reasonable trade off to pay the price of maintaining the index because usually, a queries that use the index are run far more frequently than DML statements that have to update the index.  But how do we know this is the case?

These questions can be easily answered using SQL Server dynamic management views.  DBA's have known about these DMV's for years, but many developers do not.  While in some organizations a DBA is fully involved with the development of an application, in my experience this is not the case in most organizations.  So it is the development team that is creating tables and writing SQL for their data access layer.  To do this effectively and have a well performing application, they need visibility to what is happening inside of SQL Server, especially views like the ones I am about to show you that helps someone determine if they have the right indexes created.

So let's see how DMV's can help answer these questions


Finding Missing Indexes

Here is the query we are going to use

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
SELECT     
    TableName = d.statement,
    d.equality_columns, 
    d.inequality_columns,
    d.included_columns, 
    s.user_scans,
    s.user_seeks,
    s.avg_total_user_cost,
    s.avg_user_impact,
    AverageCostSavings = ROUND(s.avg_total_user_cost * (s.avg_user_impact/100.0), 3),
    TotalCostSavings = ROUND(s.avg_total_user_cost * (s.avg_user_impact/100.0) * (s.user_seeks + s.user_scans),3)
FROM sys.dm_db_missing_index_groups g
INNER JOIN sys.dm_db_missing_index_group_stats s
    ON s.group_handle = g.index_group_handle
INNER JOIN sys.dm_db_missing_index_details d
    ON d.index_handle = g.index_handle
WHERE d.database_id = db_id()
ORDER BY TableName, TotalCostSavings DESC;

This query works by joining together the sys.dm_db_missing_index_groupssys.dm_db_missing_index_group_stats, and sys.dm_db_missing_index_details views.  It also limits is results to just the current database by virtue of line 18.  If you wanted to get results database wide, you would simply remove this WHERE criteria.

These views give us information on the index SQL Server believes should be created.  This includes the table name (like 2 - no idea why this is called statement in the DMV), the columns that should be part of the index key (lines 3 and 4) and any possible include columns for the index (line 5).  Between these columns, we could construct our create index statement.

We also get some statistics around why SQL Server believes the index should be created.  The user_scans and user_seeks values (lines 6 and 7) represent how many times this index could have been used in one of the operations had the index existed.  So if you see a high number, especially in the user_seeks columns, this means there are large numbers of statements executing that could benefit from this index.

The avg_total_user_cost column (line 8) gives us the average cost of statements that have been run against this table that would benefit from the index.  The avg_user_impact column (line 9) tells us the percent that SQL Server believes this cost would be reduced by creating this index.  From these two values, you can calculate some the average cost savings per statement (line 10) and the total cost savings for all statement executions (line 11) to give you an idea of how significant the cost savings might be.

I would, however, encourage you to not simply go and create an index for every row in this table.  What you want to do is look through the recommendations in this table and look for patterns.  You will find a number of recommendations for each table and many of these will be similar.  So what you want to do is analyze these together and figure out which indexes make sense.  This may also mean modifying an existing index rather than creating a new index.  With this view, you can see all the recommendations that exist for a table and come up with the right set of indexes for the table.



Index Usage Statistics


Let's now tackle the problems of making sure our indexes are being used and identifying indexes that may need to be dropped.


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
SELECT
    [DatabaseName] = DB_Name(db_id()),
    [TableName] = OBJECT_NAME(i.object_id),
    [IndexName] = i.name, 
    [IndexType] = i.type_desc,
    [TotalUsage] = IsNull(user_seeks, 0) + IsNull(user_scans, 0) + IsNull(user_lookups, 0),
    [UserSeeks] = IsNull(user_seeks, 0),
    [UserScans] = IsNull(user_scans, 0), 
    [UserLookups] = IsNull(user_lookups, 0),
    [UserUpdates] = IsNull(user_updates, 0)
FROM sys.indexes i 
INNER JOIN sys.objects o
    ON i.object_id = o.object_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
WHERE 
    (OBJECTPROPERTY(i.object_id, 'IsMsShipped') = 0)
ORDER BY [TableName], [IndexName];


This query uses the sys.dm_db_index_usage_stats table and joins it with the sys.indexes system view.  Again, we are only getting data for the current database, this time because the sys.objects and sys.indexes views only contain data for the current database

We have some information about the index, including the table it is on, the index name and the index type.  In the index type column, two of the most common values you will see are CLUSTERED and NONCLUSTERED.  CLUSTERED refers to a clustered index, which is how SQL Server typically lays out the data for a table.  We really want to focus on NONCLUSTERED indexes, because these are indexes we have created on the table for some specific purpose, and we have control over the columns in those indexes and if they should exist or not.

There are three types of ways an index can be used:

  • user_seeks - SQL Server is traversing the b-tree structure of the index to perform a lookup on a key.  This is the operation we want to see.  SQL Server is using the index as intended.
  • user_scans - SQL Server is reading the entire index to find the value(s) it wants.  If this is a clustered index, that means SQL Server is reading the entire table.  If this is a Nonclustered index, all of the index keys have to be read.  This is much more expensive than a seek operation
  • user_lookups - These are lookup operations against the table, typically when SQL Server is looking up a row in the clustered index structure of a table.
The user_updates column (line 10) gives the maintenance cost of the index -- that is how many time a DML statement has caused this index to need to be updated.

What you want to look for are indexes with very few or zero user_seeks.  These are indexes that you are paying to maintain, but for whatever reason SQL Server is not able to use.  And then you want to investigate why that index is not being used.  Maybe the index is a unique index and is there to only to enforce a constraint.  Maybe something in the application has changed so an index is no longer used.  Maybe the columns are in the wrong order or the index is not selective enough.  You want to figure this out and then either A) modify the index so it can be used or B) drop the index.  These stats give you the visibility into what indexes you need to look at.





4 comments:

  1. That's very interesting, although I'm not entirely sure if I get it right. Is it any similar to microsoft dynamics 365 for operations, or not really?
    All the best!

    ReplyDelete
  2. How to unravel if MS SQL Server Queries takes longer than Unusual? Contact to Microsoft SQL Server Support
    Keep in mind that, if you’re ordinary inquiries take longer than unordinary then most likely it could be because of asset dispute from locking. The primary purpose for this issue is, if any new procedures have been added to your framework. It is safe to say that you are ready to take care of this issue? If not then rapidly contact to Remote Infrastructure Management Support for Microsoft SQL Server or Online MS SQL Server Support to determine this inquiry issue. Our expert specialists will legitimately take mind your issue and gives you propelled support.
    For More Info: https://cognegicsystems.com/
    Contact Number: 1-800-450-8670
    Email Address- info@cognegicsystems.com
    Company’s Address- 507 Copper Square Drive Bethel Connecticut (USA) 06801

    ReplyDelete
  3. Watched your session on this from Pluralsight. Very helpful! Thanks!

    ReplyDelete
  4. I feel SQL server and other aspects actually help one look for more ways to find out solutions to some complex IT problems.

    SQL Server Load Rest API

    ReplyDelete