sqlsql-serverindexingmergesqlperformance

Slow MERGE on millions of rows


I'm merging millions (1-10) of rows into a table that contains 20+ millions of rows.

Target table has 16 columns, 1 PK column (NormalID).

Problem I found is that some records have their PK column Ids changed. So for example record "A" with PK id of "12345678", will have its id changed to "4567890". Hence why I've put in a delete in the merge. I've limited the source pull of data to be 2 months at the time.

Source Table (this usually holds a month or two worth of data. For example year 2024 with September (9) and October (10):

CREATE TABLE [dbo].[slowmergetbl_source](
    [sEntity] [varchar](50) NOT NULL,
    [wYear] [smallint] NOT NULL,
    [wPeriod] [smallint] NOT NULL,
    [sAccount] [varchar](50) NOT NULL,
    [wBracket] [smallint] NOT NULL,
    [sCurrency] [varchar](3) NULL,
    [dValue] [float] NULL,
    [bDirty] [bit] NULL DEFAULT 1,
    [dFactValue] [float] NULL,
    [wEntityId] [int] NULL,
    [wAccountId] [int] NULL,
    [wTimeId] [int] NULL,
    [wExtDimId1] [int] NOT NULL DEFAULT 0,
    [NormalID] [int]  NOT NULL,
CONSTRAINT [PK_SourceNormalID] PRIMARY KEY CLUSTERED ([NormalID] ASC)
);

Table script for the main table that is being merged into:

CREATE TABLE [reporting].[slowmergetbl_target](
    [sEntity] [varchar](50) NOT NULL,
    [wYear] [smallint] NOT NULL,
    [wPeriod] [smallint] NOT NULL,
    [sAccount] [varchar](50) NOT NULL,
    [wBracket] [smallint] NOT NULL,
    [sCurrency] [varchar](3) NULL,
    [dValue] [float] NULL,
    [bDirty] [bit] NULL DEFAULT 1,
    [dFactValue] [float] NULL,
    [wEntityId] [int] NULL,
    [wAccountId] [int] NULL,
    [wTimeId] [int] NULL,
    [wExtDimId1] [int] NOT NULL DEFAULT 0,
    [NormalID] [int]  NOT NULL,
    [ModifiedDate] [datetime] NULL,
    [ModifyType] [varchar](10) NULL,
CONSTRAINT [PK_NormalID] PRIMARY KEY CLUSTERED ([NormalID] ASC),
INDEX [IDX_Year] NONCLUSTERED ([wYear] ASC),
INDEX [IDX_Period] NONCLUSTERED ([wPeriod] ASC),

INDEX [IX_ReportQuery] NONCLUSTERED
(
[wYear],
[dFactValue]
)
INCLUDE ([sEntity],[wPeriod],[sAccount],[wBracket],[wAccountId]),


INDEX [IX_ReportQuery2] NONCLUSTERED
(
[sAccount],
[wBracket])
INCLUDE ([sEntity],[wYear],[wPeriod],[dValue],[wEntityId]),

INDEX [IDX_Normal_ModifyType] NONCLUSTERED ([ModifyType] ASC)

);

Merge statement:

merge reporting.slowmergetbl_target as a
using dbo.slowmergetbl_source as b on a.normalid = b.normalid

WHEN NOT MATCHED BY SOURCE and a.wyear in (
2024
) and a.wPeriod in (9,10 )

THEN UPDATE SET
a.modifieddate=getdate(),
a.modifytype='DELETED'

when not matched by target
then insert (
[sEntity],
[wYear],
[wPeriod],
[sAccount],
[wBracket],
[sCurrency],
[dValue],
[bDirty],
[dFactValue],
[wEntityId],
[wAccountId],
[wTimeId],
[wExtDimId1],
[NormalID],
[ModifiedDate],
[ModifyType]
) 
values (
b.[sEntity],
b.[wYear],
b.[wPeriod],
b.[sAccount],
b.[wBracket],
b.[sCurrency],
b.[dValue],
b.[bDirty],
b.[dFactValue],
b.[wEntityId],
b.[wAccountId],
b.[wTimeId],
b.[wExtDimId1],
b.[NormalID],
getdate(),'INSERT')

when matched
and (
isnull(a.[sAccount],'')<>isnull(b.[sAccount],'')
or isnull(a.[sEntity],'')<>isnull(b.[sEntity],'')
or isnull(a.[wYear],0)<>isnull(b.[wYear],0)
or isnull(a.[wPeriod],0)<>isnull(b.[wPeriod],0)
or isnull(a.[wBracket],0)<>isnull(b.[wBracket],0)
or isnull(a.[wEntityId],0)<>isnull(b.[wEntityId],0)
or isnull(a.[wAccountId],0)<>isnull(b.[wAccountId],0)
or isnull(a.[sCurrency],'')<>isnull(b.[sCurrency],'')
or isnull(cast(a.[dValue] as float),0.00)<>isnull(cast(b.[dValue] as float),0.00)
or isnull(a.[bDirty],0)<>isnull(b.[bDirty],0)
or isnull(cast(a.[dFactValue] as float),0.00)<>isnull(cast(b.[dFactValue] as float),0.00)
or isnull(a.[wTimeId],0)<>isnull(b.[wTimeId],0)
or isnull(a.[wExtDimId1],0)<>isnull(b.[wExtDimId1],0)
)
then update set
a.[sAccount]=b.[sAccount],
a.[sEntity]=b.[sEntity],
a.[wYear]=b.[wYear],
a.[wPeriod]=b.[wPeriod],
a.[wBracket]=b.[wBracket],
a.[wEntityId]=b.[wEntityId],
a.[wAccountId]=b.[wAccountId],
a.[sCurrency]=b.[sCurrency],
a.[dValue]=b.[dValue],
a.[bDirty]=b.[bDirty],
a.[dFactValue]=b.[dFactValue],
a.[wTimeId]=b.[wTimeId],
a.[wExtDimId1]=b.[wExtDimId1],
a.modifieddate=getdate(),
a.modifytype='UPDATE';

Test data would look something like this:

insert into [dbo].[slowmergetbl_source] values ('1234',2024,9,'Test1',0,'USD',500,1,500,4567,1788,202409,0,505489215);
insert into [dbo].[slowmergetbl_source] values ('6795',2024,10,'Test2',0,'USD',100,1,100,0986,8897,202410,0,515456210);

Seems to be running very slowly, sometimes few hours.

Any recommendations on how to make this better and speed up the merge?

Thanks.


Solution

  • A couple of suggestions: