sql-serverssissql-server-2017

Long loading time when opening OLE DB Source in SSIS Package


I have an SSIS (SQL Server Integration Services) package in Visual Studio that contains a Data Flow Task. The Data Flow Task reads data from an OLE DB Source and loads it into an OLE DB Destination. The OLE DB Source is connected to a view in my database, and the SELECT statement for that view takes around 3 minutes and 20 seconds to execute.

The issue I'm facing is that when I try to open the OLE DB Source component in Visual Studio, it takes an extremely long time, sometimes over an hour, to load. This makes it very frustrating to edit the source and work on my SSIS package.

I have tried a few things to resolve this issue, but none of them have provided a satisfactory solution.

At this point, I'm not sure what else I can do to address this issue. I'm open to any suggestions or advice on how to improve the loading time when opening the OLE DB Source component in Visual Studio.

Is there anything else I can try to speed up the loading time? Are there any additional settings or optimizations that I might have missed? Any guidance would be greatly appreciated.


Solution

  • I'm facing the same issue as the OP and found the underlying cause in my case.

    Summary

    Before finding this post and diving deeper in the investigation, I simply updated my query with TOP 1 while developing the ETL. Although, @Tom Thomson's workaround of using a SQL Command for the Data access mode and prefixing it with SET ROWCOUNT 0 works and is less of an assle, investigation and testing showed that the SET ROWCOUNT is simply not needed. The simple fact of using SQL Command is enough for SSIS to issue a SP_PREPARE command instead of the SELECT * query and retrieve the metadata quickly.

    So the shortest answer if you're having an issue with a long loading time when opening an OLE DB data source is: Use 'SQL Command' instead of 'Table or view'.

    For more details, continue reading.

    Investigation details

    But to understand the issue I did look at the activities behind the validation. It's executing:

    SET ROWCOUNT 1 
    SELECT * FROM myView
    

    It's not performing the sys.dm_exec_describe_first_result_set query as stated by another user.

    The SSIS default method works quite well with most of the tables and views I've tested. However, I'm having issues with my recent views.

    Without ROWCOUNT my view takes 1 second, with it, it takes forever (I cancelled before completing). The SET ROWCOUNT 1 is affecting the execution plan.

    But the biggest difference is that, with ROWCOUNT, a Table Spool is inserted in the plan and just keep chugging in rows:

    Execution plan showing Table Spool step

    Without ROWCOUNT, there's no Table Spool in the plan and no excessive row processing: Execution plan showing absence of Table Spool step

    With that information I removed the Linked Server part of my query and tested again and this time both executions returned quickly, with and without ROWCOUNT.

    I have other complex queries as source and didn't have issues with them in the past, and did remember seeing SSIS performing sys.dm_exec_describe_first_result_set commands so I took a look into them and noticed that it's because I'd wrapped them in Stored Procedure (SP) and use SQL command as the Data access mode in my OLE DB Source. The validation step issues a sp_prepare statement on the SP followed by a sys.dm_exec_describe_first_result_set command. I embedded my problematic query in a SP and it worked just fine. It loaded very quickly.

    After further testing, I found out that the underlying reason it works well with a SP is simply because SSIS issues a SP_PREPARE statement when using ANY SQL Command as the Data access mode. Therefore, the other answer that adds SET ROWCOUNT 0 is not totally accurate, as that prefix is simply not needed.

    With that, I'm wondering why, if other more effecient methods like sys.dm_exec_describe_first_result_set or sp_prepare exists and works to retrieve the metadata, SSIS uses SET ROWCOUNT 1 to retrieve the information when "Table or view" is selected?