Alright here comes the million dollar question
Assume that i have the following table
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[tblUsersProfile](
[personId] [int] IDENTITY(1,1) NOT NULL,
[personName] [varchar](16) NOT NULL,
[personSurName] [varchar](16) NOT NULL,
CONSTRAINT [PK_tblUsersProfile] PRIMARY KEY CLUSTERED
(
[personId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
Now lets say this table has 2 million records
So the next identity id
is 2,000,001
However i am making mass delete and the number of records becomes 45,321
However next identity id
will be 2,000,001
still
So reordering table with recreation would make any difference
In the first case there will be 45,321 records however identity id will be 2,000,001
In the second case there will be again 45,321 records however identity id will be 45,322
So would there be any performance, storage etc difference between these 2 cases?
Thank you
SQL Server 2014
Expanding on my comment to explain a bit further. For clarity the comment was:
No, there will be no impact on performance. Since this is your clustering key, whether or not the seed is 45,322, or 2,000,0001, the record will still be entered onto the next free space on the clustered index after the record 45,321. The value of an Identity column is intended to be meaningless, if it is not you are probably not using correctly. After a big delete like that you may end up with some index fragmentation, but the identity seed is completely unrelated to this.
With regard to the fragmentation, on a very simplified example you might have a table with 5 pages, with 100 records per page:
Now, if you do your delete, and remove all records where the last digit is not 1, and all records where the ID is over 300 you get:
When we now insert to this table, whether the next identity is 291, or 501, it doesn't change anything. The pages must remain in the correct order, so the highest ID is 291, so the next record must be inserted after that, on the same page if there is space, otherwise a new page is created. In this case there are 9 empty slots on the 3rd page, so the next record can be inserted there. Since both 292, and 500 are higher than 291, the behaviour is the same.
In both cases the issue remains that after the delete you have 3 pages with lots of free space (only 10% full), you now only have 30 records, which will happily fit on one page, so you could rebuild your index to do this, so that now you only need to read a single page to get all your data.
Again, I stress, that this is a very simple example, and I would not suggest rebuilding a clustered index to free up 2 pages!
It is also important to stress that this behaviour is because the ID column is the clustering key, NOT the primary key. They are not necessarily one and the same, however, if you were clustering on something other than your identity column it would still make no difference to performance if you were to reseed it or not after the delete. Identity columns are there for purely that, identity, the actual value is irrelevant as long as you can uniquely identify a row.
SAMPLE TEST CODE
-- CREATE TABLE AND FILL WITH 100,000 ROWS
IF OBJECT_ID(N'dbo.DefragTest', 'U') IS NOT NULL
DROP TABLE dbo.DefragTest;
CREATE TABLE dbo.DefragTest (ID INT IDENTITY(1, 1) PRIMARY KEY, Filler CHAR(1) NULL);
INSERT dbo.DefragTest (Filler)
SELECT TOP 100000 NULL
FROM sys.all_objects AS a, sys.all_objects AS b;
-- CHECK PAGE STATISTICS
SELECT Stage = 'After Initial Insert',
IdentitySeed = IDENT_CURRENT(N'dbo.DefragTest'),
p.rows,
a.total_pages,
a.data_pages,
AvgRecordsPerPage = CAST(p.rows / CAST(a.data_pages AS FLOAT) AS DECIMAL(10, 2))
FROM sys.partitions AS p
LEFT JOIN sys.allocation_units AS a
ON a.container_id = p.partition_id
WHERE p.[object_id] = OBJECT_ID(N'dbo.DefragTest', 'U')
AND p.index_id IN (0, 1); -- CLUSTERED OR HEAP
-- DELETE RECORDS
DELETE dbo.DefragTest
WHERE ID % 10 != 1
OR ID > 50000;
-- CHECK PAGE STATISTICS
SELECT Stage = 'After Delete',
IdentitySeed = IDENT_CURRENT(N'dbo.DefragTest'),
p.rows,
a.total_pages,
a.data_pages,
AvgRecordsPerPage = CAST(p.rows / CAST(a.data_pages AS FLOAT) AS DECIMAL(10, 2))
FROM sys.partitions AS p
LEFT JOIN sys.allocation_units AS a
ON a.container_id = p.partition_id
WHERE p.[object_id] = OBJECT_ID(N'dbo.DefragTest', 'U')
AND p.index_id IN (0, 1); -- CLUSTERED OR HEAP
-- RESEED (REMOVED FOR ONE RUN)
DBCC CHECKIDENT ('dbo.DefragTest', RESEED, 50000);
--INSERT ROWS TO SEE EFFECT ON PAGE
INSERT dbo.DefragTest (Filler)
SELECT TOP 10000 NULL
FROM sys.all_objects AS a;
-- CHECK PAGE STATISTICS
SELECT Stage = 'After Second Insert',
IdentitySeed = IDENT_CURRENT(N'dbo.DefragTest'),
p.rows,
a.total_pages,
a.data_pages,
AvgRecordsPerPage = CAST(p.rows / CAST(a.data_pages AS FLOAT) AS DECIMAL(10, 2))
FROM sys.partitions AS p
LEFT JOIN sys.allocation_units AS a
ON a.container_id = p.partition_id
WHERE p.[object_id] = OBJECT_ID(N'dbo.DefragTest', 'U')
AND p.index_id IN (0, 1); -- CLUSTERED OR HEAP
-- CHECK READS REQUIRED FOR FULL TABLE SCAN
SET STATISTICS IO ON;
SELECT COUNT(Filler)
FROM dbo.DefragTest;
-- REBUILD INDEX
ALTER INDEX PK_DefragTest__ID ON dbo.DefragTest REBUILD;
-- CHECK PAGE STATISTICS
SELECT Stage = 'After Index Rebuild',
IdentitySeed = IDENT_CURRENT(N'dbo.DefragTest'),
p.rows,
a.total_pages,
a.data_pages,
AvgRecordsPerPage = CAST(p.rows / CAST(a.data_pages AS FLOAT) AS DECIMAL(10, 2))
FROM sys.partitions AS p
LEFT JOIN sys.allocation_units AS a
ON a.container_id = p.partition_id
WHERE p.[object_id] = OBJECT_ID(N'dbo.DefragTest', 'U')
AND p.index_id IN (0, 1); -- CLUSTERED OR HEAP
-- CHECK READS REQUIRED FOR FULL TABLE SCAN
SELECT COUNT(Filler)
FROM dbo.DefragTest;
SET STATISTICS IO OFF;
Output with Reseed:
Stage IdentitySeed rows total_pages data_pages AvgRecordsPerPage
After Initial Insert 100000 100000 178 174 574.71
After Delete 100000 5000 90 87 57.47
After Second Insert 52624 7624 98 91 83.78
After Index Rebuild 52624 7624 18 14 544.57
Table 'DefragTest'. Scan count 1, logical reads 93 (Count before rebuild)
Table 'DefragTest'. Scan count 1, logical reads 16 (Count after rebuild)
Output without Reseed:
Stage IdentitySeed rows total_pages data_pages AvgRecordsPerPage
After Initial Insert 100000 100000 178 174 574.71
After Delete 100000 5000 90 87 57.47
After Second Insert 102624 7624 98 91 83.78
After Index Rebuild 52624 7624 18 14 544.57
Table 'DefragTest'. Scan count 1, logical reads 93 (Count before rebuild)
Table 'DefragTest'. Scan count 1, logical reads 16 (Count after rebuild)
As you can see, in each case there is no difference, in the way the data is stored or read, it is only the value of IDENT_INCR()
that changes, and in both cases rebuilding the clustered index drastically reduces the number of pages, which in turn improves query performance since there are less logical reads to get the same amount of data.