In an SSIS Data Flow there is a Lookup component that looks up against a table with 18 million records. I have configured the Lookup with full cache.
DefaultBufferSize: 20485760
DefaultBufferMaxRows: 100000
The lookup join is based on an ID column of type varchar(13).
It gives the error shown below. What lookup configuration is suitable for caching this many records?
Error: The buffer manager cannot write 8 bytes to file "C:\Users\usrname\AppData\Local\Temp\16\DTS{B98CD347-1EF1-4BC1-9DD9-C1B3AB2B8D73}.tmp". There was insufficient disk space or quota.
What would be the difference in performance if I used a lookup with no cache? I understand that in full cache mode the data is cached during the pre-execute phase, so the lookup does not have to go back to the database, but the full cache takes a large amount of memory and adds additional startup time to the data flow. My question is: what configuration do I need to set up to handle a large amount of data in full cache mode?
What is the solution if the lookup table has millions of records and they don't fit in a full cache?
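For scale, this is a rough way to measure how much raw data the full cache would have to hold (table and column names are illustrative; the actual per-row overhead of the SSIS lookup cache is higher than the raw data length):

```sql
-- Rough size of the data the full-cache Lookup has to hold in memory.
-- dbo.ReferenceTable, ID and ReturnedValue are placeholder names for the
-- real 18-million-row table, the varchar(13) join key, and the returned column.
SELECT
    COUNT(*)                                                     AS row_count,
    SUM(DATALENGTH(ID) + DATALENGTH(ReturnedValue)) / 1048576.0  AS raw_data_mb
FROM dbo.ReferenceTable;
```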
Use a Merge Join transformation instead. Sort both inputs on the join key and choose an inner, left, or full join depending on what you need; with a left outer join you can check for NULLs downstream (for example with a Conditional Split) to reproduce the Lookup component's match/no-match behaviour.
Merge Join usually performs better on larger datasets.
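A minimal sketch of the two source queries for the Merge Join approach, assuming illustrative table and column names. Both sources sort on the join key in the database, and each source's output has to be marked as sorted in the Advanced Editor (IsSorted = True on the output, SortKeyPosition = 1 on the ID column) so the Merge Join accepts the inputs without an extra Sort transform:

```sql
-- Left input of the Merge Join: the rows flowing through the data flow.
SELECT ID, Col1, Col2
FROM dbo.SourceTable
ORDER BY ID;

-- Right input: the 18-million-row reference table.
-- Select only the key and the columns you actually need downstream.
SELECT ID, ReturnedValue
FROM dbo.ReferenceTable
ORDER BY ID;
```

One caveat with a varchar(13) key: make sure the database ORDER BY produces the same ordering SSIS expects for string comparisons (collation differences can matter here), otherwise the Merge Join can miss matches.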