According to Microsoft's documentation on NEWSEQUENTIALID
, the output of NEWSEQUENTIALID is predictable. But how predictable is predictable? Say I have a GUID that was generated by NEWSEQUENTIALID
, how hard would it be to:
/order?id=842
tells me that there are 842 orders in the application.Below is some background information about what I am doing and what the various tradeoffs are.
One of the security benefits of using GUID's over integers as primary keys is that GUID's are hard to guess. E.g. say a hacker sees a URL like /user?id=845
he might try to access /user?id=0
, since it is probable that the first user in the database is an administrative user. Moreover, a hacker can iterate over /user?id=0..1..2
to quickly gather all users.
Similarly, a privacy downside of integers is that they leak information. /order?id=482
tells me that the web shop has had 482 orders since its implementation.
Unfortunately, using GUID's as primary keys has well-known performance downsides. To this end, SQL Server introduced the NEWSEQUENTIALID
function. In this question, I would like to learn how predictable the output of NEWSEQUENTIALID
is.
The underlying OS function is UuidCreateSequential
. The value is derived from one of your network cards MAC address and a per-os-boot incremental value. See RFC4122. SQL Server does some byte-shuffling to make the result sort properly. So the value is highly predictable, in a sense. Specifically, if you know a value you can immediately predict a range of similar value.
However one cannot predict the equivalent of id=0
, nor can it predict that 52DE358F-45F1-E311-93EA-00269E58F20D
means the store sold at least 482 items.
The only 'approved' random generation is CRYPT_GEN_RANDOM
(which wraps CryptGenRandom
) but that is obviously a horrible key candidate.