sql-servernewid

Changing to newsequentialid() on an existing table with 4 million records


My Question is almost the same like Changing newid() to newsequentialid() on an existing table

if I change to newsequentialid() than the index should be more compact, correct? And what happens if the sequence is hitting exists ID's while inserting new records? Will the database check that before?


Solution

  • It will have less fragmentation so you could say its' more compact. But it will still be the same size per key (16 bytes + overhead) for a guid key. The benefit of using a sequential guid vs a nonsequential guid is that you have less chances for a page split. A page split is where a logical page has to have a record inserted, but would be more than the page is allowed to hold, so the page is "split"; half to one page and half to another. Sometimes a page split causes another page to split, and theoretically you can have a cascading and costly page split by just inserting one new record. When you use a sequential key, it's less likely that you'll randomly triggger a page split somewhere in the middle of your index, so you reduce the likelihood of those happening. Using a sequential guid also helps optimize range scans (e.g selecting between one value and another value) but with a GUID, it's very unlikely that you'll end up doing many range based scans, since the value is basically meaningless.

    What happens when the sequence hits an existing iD? You get a PK violation. SQL doesn't ensure that a GUID can only be used once. Sequential ID's start at a new seed every time the server is restarted so, in theory, you could skip back in the sequence and then wind up covering the same value twice. However, as with GUIDs in general, the liklihood of this happening is so astronomically small as to be statistically insignificant.

    As with everything, the cost and benefits depend on your specific scenario. If you're looking to replace a GUID key with a sequential key, see if it's possible to use an int or bigint surrogate key instead of a GUID, because generally, all things being equal, an integer will outperform a guid in every case. 4 Million records will trivially fit into an INT data type and even more trivially into a bigint.

    Hope this helps.