sql sql-server hash sql-server-2014 guid

SQL Server - Generate unique ID to compare several columns altogether


In SQL Server, is it possible to generate a GUID using a specific piece of data as an input value? For example:

-- Desired behaviour only: NEWID() does not actually accept an argument
DECLARE @seed1 VARCHAR(10) = 'Test'
DECLARE @seed2 VARCHAR(10) = 'Testing'
SELECT NEWID(@seed1) -- should always return the same output value
SELECT NEWID(@seed2) -- should always return the same output value, different from the one above

I know this completely goes against the point of GUIDs, in that the ID would not be unique. I'm looking for a way to detect duplicate records based on certain criteria (the @seed value).

I've tried generating a VARBINARY value using the HASHBYTES function, but joining between tables on VARBINARY seems extremely slow. I'm hoping to find a similar alternative that is more efficient.

Edit: for more information on why I'm looking to achieve this.

I'm looking for a fast and efficient way of detecting duplicate information that exists in two tables. For example, I have first name, last name and email. When concatenated, these can be used to check whether a record exists in both table A and table B.

Simply joining on these fields is possible and gives the correct result, but it is quite slow. I was therefore hoping to find a way of transforming the data into something such as a GUID, which would make the joins much more efficient.


Solution

  • I think you can use the CHECKSUM function, which returns an int computed from the values you pass it.
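
A minimal sketch of how this might look, assuming two hypothetical tables dbo.TableA and dbo.TableB that each have FirstName, LastName and Email columns (the table, column and index names here are made up for illustration). The idea is to store the CHECKSUM as a computed column, index it, and join on it. Different inputs can produce the same CHECKSUM, so the join still compares the original columns to rule out false matches.

```sql
-- Hypothetical schema: dbo.TableA and dbo.TableB with FirstName, LastName, Email.
-- GO is the batch separator used by SSMS and sqlcmd; the new column must
-- exist before later batches reference it.

-- Add an int checksum over the columns that define a duplicate.
ALTER TABLE dbo.TableA ADD RowChecksum AS CHECKSUM(FirstName, LastName, Email);
ALTER TABLE dbo.TableB ADD RowChecksum AS CHECKSUM(FirstName, LastName, Email);
GO

-- Index the computed column so the join can seek on a narrow int key.
CREATE INDEX IX_TableA_RowChecksum ON dbo.TableA (RowChecksum);
CREATE INDEX IX_TableB_RowChecksum ON dbo.TableB (RowChecksum);
GO

-- Match on the checksum first, then confirm with the real columns,
-- because two different rows can share the same checksum value.
SELECT a.FirstName, a.LastName, a.Email
FROM dbo.TableA AS a
JOIN dbo.TableB AS b
  ON  a.RowChecksum = b.RowChecksum
  AND a.FirstName   = b.FirstName
  AND a.LastName    = b.LastName
  AND a.Email       = b.Email;
```

Whether this is actually faster than joining on the three columns directly depends on the data and existing indexes, so it is worth comparing execution plans before and after.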