sqlsql-serverperformancet-sqlcross-apply

When should I use CROSS APPLY over INNER JOIN?


What is the main purpose of using CROSS APPLY?

I have read (vaguely, through posts on the Internet) that cross apply can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind)

I also know that CROSS APPLY doesn't require a UDF as the right-table.

In most INNER JOIN queries (one-to-many relationships), I could rewrite them to use CROSS APPLY, but they always give me equivalent execution plans.

Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?


Edit:

Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where cross apply is faster/more efficient)

create table Company (
    companyId int identity(1,1)
,   companyName varchar(100)
,   zipcode varchar(10) 
,   constraint PK_Company primary key (companyId)
)
GO

create table Person (
    personId int identity(1,1)
,   personName varchar(100)
,   companyId int
,   constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
,   constraint PK_Person primary key (personId)
)
GO

insert Company
select 'ABC Company', '19808' union
select 'XYZ Company', '08534' union
select '123 Company', '10016'


insert Person
select 'Alan', 1 union
select 'Bobby', 1 union
select 'Chris', 1 union
select 'Xavier', 2 union
select 'Yoshi', 2 union
select 'Zambrano', 2 union
select 'Player 1', 3 union
select 'Player 2', 3 union
select 'Player 3', 3 


/* using CROSS APPLY */
select *
from Person p
cross apply (
    select *
    from Company c
    where p.companyid = c.companyId
) Czip

/* the equivalent query using INNER JOIN */
select *
from Person p
inner join Company c on p.companyid = c.companyId

Solution

  • Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?

    See the article in my blog for detailed performance comparison:

    CROSS APPLY works better on things that have no simple JOIN condition.

    This one selects 3 last records from t2 for each record from t1:

    SELECT  t1.*, t2o.*
    FROM    t1
    CROSS APPLY
            (
            SELECT  TOP 3 *
            FROM    t2
            WHERE   t2.t1_id = t1.id
            ORDER BY
                    t2.rank DESC
            ) t2o
    

    It cannot be easily formulated with an INNER JOIN condition.

    You could probably do something like that using CTE's and window function:

    WITH    t2o AS
            (
            SELECT  t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn
            FROM    t2
            )
    SELECT  t1.*, t2o.*
    FROM    t1
    INNER JOIN
            t2o
    ON      t2o.t1_id = t1.id
            AND t2o.rn <= 3
    

    , but this is less readable and probably less efficient.

    Update:

    Just checked.

    master is a table of about 20,000,000 records with a PRIMARY KEY on id.

    This query:

    WITH    q AS
            (
            SELECT  *, ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM    master
            ),
            t AS 
            (
            SELECT  1 AS id
            UNION ALL
            SELECT  2
            )
    SELECT  *
    FROM    t
    JOIN    q
    ON      q.rn <= t.id
    

    runs for almost 30 seconds, while this one:

    WITH    t AS 
            (
            SELECT  1 AS id
            UNION ALL
            SELECT  2
            )
    SELECT  *
    FROM    t
    CROSS APPLY
            (
            SELECT  TOP (t.id) m.*
            FROM    master m
            ORDER BY
                    id
            ) q
    

    is instant.