Rebuilding a Heap–Don’t Do It

I saw someone mention recently that you can run a REBUILD on a heap table. I hadn’t realized that, but when I looked in BOL, I saw that indeed you can run this on a heap. That’s interesting, in that you can now move the pages in a heap around to remove fragmentation. At first glance that sounds good, but I wasn’t sure.  A little more searching was required.

The authoritative source for me on many things like this is SQLskills, and sure enough, Paul Randal has a myth piece on this. Paul has a great explanation, but basically if you rebuild the HEAP structure, you are creating work in that every non-clustered index also has to be rebuilt. Why? The pointers back to the heap pages, which are locations, will change.

What about adding a clustered index and dropping it? Nooooooo, and again, I learned something new. This causes two rebuilds of the non-clustered indexes as they are rebuilt with the cluster addition and then rebuilt when the table changes back to a heap (to get the heap locations). That’s crazy, and certainly not what we want.

The short answer here is that you want a clustered index, for the maintenance reasons here, as well as others. If you don’t have a fundamental reason not to create a clustered index, just add one.

Just do it.

About way0utwest

Editor, SQLServerCentral
This entry was posted in Blog and tagged , , . Bookmark the permalink.

3 Responses to Rebuilding a Heap–Don’t Do It

  1. I think there are some exceptions to a flat out don’t rebuild a heap rule, and that if you need to rebuild make it a cluster. If you’re using a heap for insert performance purposes (no contention and page split on the last page of the table in very high concurrency inserts, for example, Uwe Ricken has done great seminar sessions demoing this) then adding a clustered index might end up having a serious impact on your performance. And then once you have a heap, there’s something far more important to run a rebuild for than fragmentation: rebuild row redirections after updates. These are created when a row is updated but the update would not fit on the original page of the row. Instead of performing a page split and having to rebuild the equivalent part of all the NCIs, a new row is created ona new page, and the original row is changed to be a pointer to the new row. So if you imagine a non clustered index on a heap, this is a pointer to the original row, which is now another pointer (which in turn when updated if its too big to fit will become another pointer).

    Obviously this is a specific case, but the rebuild, during a maintenance window can keep this in check, and will be well worth doing.

    • way0utwest says:

      There are exceptions to every rule. Staging tables, perhaps, very high concurrency, there are reasons to break the rule.

      However, it’s a rule. If you don’t have a great exception, don’t break this. I want to note to people that are learning, that aren’t sure, this is a rule. If you learn more and discover why this causes an issue in specific situations, and there aren’t many, then break it.

  2. Pingback: Don’t Rebuild Heaps – Curated SQL

Comments are closed.