Muller Jannie

About myself and everyone else.

Disk Reads, pun intended.

sys.dm_io_virtual_file_stats

In 2005 and 2008 we saw an old function converted into a handy dmv. It’s probably still a function.. so it’s been renamed.

I don’t think it’s more handy now than it was previously but it certainly provides a nice addition to a DBA dmv toolbox.

I’m not going to cover what the columns does as it is really very straight forward. Why would you use this dmv. In a previous post I’ve mentioned how to look at how many times an index is accessed. This is great but you cannot use that data to gauge how much data is being read or written. It is thus common to see a small lookup table as being accessed the most (which is a good sign btw).

This particular dmv gives us an idea how much data is going to and from disk. It gives us a nice breakdown of the log and data files. It also gives us an idea on the amount of time waiting. When I was first starting to query this view I couldn’t understand how one second elapsed could translate into ten seconds waiting time. It didn’t take too long to figure out this was because you have multiple process waiting at the same time. “duh..”

With this view we can find out what is our benchmark, it’s that handy thing to say when you are at the next tech ed or IT conference. We write 10gb in 30 seconds. Then again, the size is questionable as to why you are writing that much in the first place, can you write less, can you optimize. Hence, I usually keep quiet and listen to the people boasting about volume. Your figures should be as low as possible with minimum waits end of story.

I’m going to join sys.dm_io_virtual_file_stats to the sys.master_files (sysaltfiles 2k) in order to get log VS data IO and the drive letter. In my query I’m going to use the rank function to give me a rank of IO over the server as I’m interested in overall server activity. You can filter for a number of databases etc. I’m also using the sample_ms(uptime in milliseconds) column to find out roughly a avg IO. There is a better way to do this which I’ll cover in another post but lets start at the top in drill deeper.

With CTE
As
(select     db_name(a.database_id) as DBname,
            b.name As [File],
            b.type_desc,
            left(b.physical_name,1) As Drive,
            num_of_bytes_read / 1024./ 1024. As num_of_mbytes_read,
            num_of_bytes_written /1024./1024. As num_of_mbytes_written,
            rank() over (partition by (@@servername) order by (num_of_bytes_read ) desc) As Read_Rank,
            rank() over (partition by (@@servername) order by (num_of_bytes_written) desc) As Write_Rank,
            (num_of_bytes_read / ((sample_ms / 1000 / 60))) / 1024. / 1024. As mbytes_read_per_minute,
            (num_of_bytes_written / ((sample_ms / 1000 / 60)))  / 1024. / 1024.  As mbytes_written_per_minute
from sys.dm_io_virtual_file_stats (null,null) a
      inner join sys.master_files b
            on a.database_id = b.database_id
     And a.file_id = b.file_id)

Select * from CTE  order by write_rank

I don’t like how the code looks in the blog. Edited the html looks better now.

Output

How to read the output

 I’ve ordered by writes, thus database “a” is the most intensive on my server for writing. Since the server has been restarted it has written nearly 1.5tb if I have the numbers right.  Ask yourself is the database you see here what you expect to see. Is the data file or log file a top ranked. The avg IO per minute is calculated by the sample_ms. This is an idea but not spot on accurate. I’ll do another post how to get that done, that is also the reason why I’m using a CTE to do this, we would be able to reuse the code later.

You now have the data to find out if you have a write or read intensive server. NOTE: Very valuable info that I didn’t include in this is the waiting times. When you know what your workload is you can adapt accordingly , you can also work out a percentage of database activity. You have enough data to look at which drives are getting hammered and how to distribute the load. It’s best to use this in conjunction with wait stats if you want to see if you can up the throughput.

This data is more easily available through the view than through perf counters. If you want to quickly do a spot check on a server I’d suggest to use resource monitor. I find that extremely handy. It will also drill down to the file the same as this dmv. I can see that my tempdb is using all files nicely as the IO is nearly the same for both data files. Ah! You can also use this to verify  if you filegroup strategy is working. You can also use this data to determine if you can move your archive data to a slower drive .

In terms of the query it’s really worth the effort because of the excellent value you get from the data. It’s also not complicated to understand.

Other fun things to do,

Hook this up to multiserver query and get a quick picture on how your hardware match your relative system. If you have large integrated systems where dataflow occurs through replication etc. you can also quickly spot bottlenecks between systems.

Add the db size do this query and find out how the IO compares to your database size. In a warehouse with ETL you can have a 100% turnover on data. On the HR system’s database, well you should read the data multiple times over. This could indicate poor \ missing indexes.

NOTE: I have not tested if this takes into account the buffer pool. i.e. if it’s only IO from  disk  or if it’s buffer pool. If it does include data in the buffer that stats would be vastly different that you OS counters as that will only do physical reads. From Bol it says data from file, so I assume it’s only physical reads. Arg I have to test this.

It won’t change the results but it’s good to be thorough and sure that you know exactly how things work. How to test this, you can drop clean buffers, free cache, or restart (but you will lose that sample_ms). Need I say use a test server. Run the query above, query a large table (which will be loaded into cache), run above query again. Then run a query on large table again. Run the above query and compare values. Did they change. If your table was cached (it needs to fit in your memory if you test box has 1GB ram and you have a 4GB table you are going to read from disk no matter how hard you try). This should give you results.

Happy testing.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Information

This entry was posted on August 18, 2011 by in SQL Server and tagged , , .
%d bloggers like this: