There are many aspects of what I want to do but I think learning one piece will let me derive the rest.
I have an SSIS package that uses powershell to download a publicly available zip file, an execute script to unzip with 7zip and then data flows to load the unzipped files to corresponding tables.
What I want to do is add the file name (and eventually other aspects of the file like creation date, record counts and so on) from any one of the unzipped files to a log table that keeps track of the summary level details of the files.
How do I dynamically store this type of information as part of the package? Derived columns? But what's the input? Thanks!
There are many options for dynamically working with files through SSIS. Below is an overview of one method. Of course this can vary, depending on your specific needs and requirements.
Directory
expression. Searching sub folders can also be set by checking the "Traverse subfolders" checkbox or using the Recurse
expression like the Folder field.*
will match any number of characters. For
example, *.csv
will get all csv files regardless of name and
Test*.txt
will return all .txt files with names beginning Test,
regardless of how many or which characters follow. To limit this to
a single character, use ?
. The FileSpec
expression will allow
this to be set dynamically similar to the directory by variable or parameter.Fully Qualified
will hold the complete file path, Name and Extension
will return the file name with extension, and Name Only
is just the file name.FileInfo
class provides options for finding details about the file such as the creation date, last time the file was accessed, and when the file was most recently written to. Additonal information on this can be found here.ConnectionString
expression of the Connection Manager will allow you to dynamically loop through each file. Make sure that the Fully Qualified
option is used for the "Retrieve file name" field as earlier if you decide to do this. You will also want to configure the correct columns and data types for the Connection Manager ahead of time. This same process can be followed for Excel files, however the variable with the file name will be used on the ExcelFilePath
expression instead.?
is the parameter marker (assuming this is an OLE DB connection). After this, map the same variable/parameter that stores the file name using the Parameter Mapping pane. Set the direction to Input, appropriate data type (likely VARCHAR
/NVARCHAR
), 0 in the Parameter Name field to indicate this is the first parameter in the SQL statement (additional ?
can be used for subsequent parameters in the SQL statement, then increment this field in accordance), and the default Parameter Size can be left at -1. Again, this is a simple example and you'll probably want store more about the files and their contents, but this can get you started.Sample SQL Insert:
INSERT INTO YourDataBase.YourSchema.YourTable (ColumnToHoldFileName)
VALUES (?)