Every once in a while a seemingly simple question pops up on an email distribution list or Yammer thread that ends up spawning a lively and productive conversation.  We had one recently that started out with the question:


For SQL Server running on XtremIO, is there any current guidance we should be giving our customers to best take advantage of the 4K element size on XtremIO?


One of the thoughts behind the question was motivated by the existence of a 4k disk allocation unit (DAU) option when formatting a disk with NTFS.  Since a 4K allocation size significantly increases the size of the allocation structures managed by NTFS and the frequency with which allocations are made there better be a good reason for using them.  With the relatively large files sizes typical of most SQL Server applications, we aren't worried about wasted space even when using the largest available allocation size of 64K so there is really no good reason to use anything but 64K allocation units for SQL Server.  This just opened the flood gates to many more SQL I/O questions that went beyond NTFS allocation unit size and XtremIO.  I gathered together a summary of the question and best responses in the table below.



Are SQL Server write sizes independent of DAU size?

  1. Yes.

How much does the additional overhead of 4K DAU impact
  performance? Will SQL Server still maintain its write sizes?

Most of the impact of
  DAU is felt as you create new space in a data or log file.  If you let
  SQL zero initialize the data file then all the overhead for allocation occurs
  at once.  Log files always initialize upon creation.  If you use
  instant file initialization with a data file then you get a small added
  overhead every time Windows allocates new space to the file.  When I
  have tested file creation with a new data file with 4K vs 64K DAU the
  difference in time to create is only a couple of percent difference.  It
  is almost the same thing if you run 8K or 64K random reads and writes with
  SQLIO against a disk with 4K vs 64K allocation units, a couple of percent
  difference but always favors the 64K DAU.

Does SQL typically write to the log in 4KB and DBs in
  32-64KB I/Os?

Log writes can be a
  small as 512 bytes (one sector) up to 60K (the maximum SQL will allow before
  forcing a flush to disk).  See the Kendra Little article referenced
  below for lots more details about this topic.
  DB writes can be 8K (one page) up to 256K (the maximum size the gather
  write operation will bundle before flushing to disk).

SQL Server uses a lazy writer and will try to write whole
  extents (if possible)

The lazy writer is
  constantly scanning the buffer pool for pages that are not being referenced
  often.  When the lazy writer flushes a page, the buffer manager will try
  to do a gather write exactly like if the page is being written by a check
  point so that can be 8-256K.

If an extent in RAM is only partially dirty (various pages
  dirty) then will only those dirty pages get written.  That would result in a smaller write

The gather write
  operation is looking for adjacent pages from a single data file up to 32
  pages.  An extent is 8 pages. There is no consideration for “wanting to”
  or “having to” to write entire extents from the pool.  The buffer manager
  will always read an entire extent from disk even if only one page from that
  extent is required for the operation.

Typically SQL write performance is governed by TLOG write
  speed and not by DB write speed, unless doing a bulk input.

You could have a nice
  wide RAID10 set for the logs with blistering sequential write speed and a
  RAID 6 SATA set for the data file.  It is hard to make general
  statements like this but if all you are doing is writing data (no reads) then
  often times the log file ideal of 5ms per write will get exceeded before the
  20ms per write to the data file(s) that usually indicates that I/O wait time
  are going to start impacting performance.


Another great outcome from this discussion was the sharing of favorite white papers and blog posts.  I have collected those links and added them here.


SQL Server Best Practices 2012 http://msdn.microsoft.com/en-us/sqlserver/bb671430 and,


SQL Server Best Practices Article in 2007 http://technet.microsoft.com/en-us/library/cc966412.aspx


There is also an interested blog here from Kendra Little:


http://www.brentozar.com/archive/2012/05/how-big-your-log-writes-spying-on-sql-server-transaction-log/ …read the
comments also


The best IO size information I’ve seen came for the SQL CAT team a few years back.  I think it’s still relatively accurate, check out the appendix of the presentation from the link below. 




For the 10 years I worked at Microsoft I found the SQL Server technical community a constant source of knowledge and challenge that made me a better SQL Server professional.  The community at EMC is certainly smaller but no less dedicated to sharing information and promoting technical excellence.  I hear the same thing from anyone involved with PASS and local users groups.  Let us know how you participate in a SQL Server community and what it means to you.  Thanks for reading.