[SOLVED] How to skip first n rows in U-SQL job?

I want to run a U-SQL job to load data from a .txt file in Azure Data Lake Store into a SQL table. I have already created the database, schema, and table in Azure Data Lake Analytics.

The data in the .txt file is tab-delimited, and I need to skip the first 2 rows. I think I should use the built-in Extractors.Text() extractor, but how do I pass the skipFirstNRows parameter to it when extracting the data?

Solution

You just pass it to the extractor like this:

@searchlog =
 EXTRACT UserId          int,
         Start           DateTime,
         Region          string,
         Query           string,
         Duration        int?,
         Urls            string,
         ClickedUrls     string
 FROM "/Samples/Data/SearchLog.tsv"
 USING Extractors.Tsv(skipFirstNRows: 2);

I based the example on the TSV extractor because it defaults to a tab as the delimiter.
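
If you would rather stay with Extractors.Text(), as in the question, the same skipFirstNRows parameter should work there too; you just need to set the delimiter explicitly, because Text() defaults to a comma. Below is a minimal sketch that reuses the sample schema above and then loads the rowset into a table. The table name dbo.SearchLog is hypothetical, and I am assuming it already exists in the current database with matching columns:

```usql
// Extract tab-delimited data, skipping the first 2 rows of the file.
@searchlog =
    EXTRACT UserId          int,
            Start           DateTime,
            Region          string,
            Query           string,
            Duration        int?,
            Urls            string,
            ClickedUrls     string
    FROM "/Samples/Data/SearchLog.tsv"
    USING Extractors.Text(delimiter: '\t', skipFirstNRows: 2);

// Load the extracted rows into an existing table.
// dbo.SearchLog is a placeholder name; substitute your own schema and table.
INSERT INTO dbo.SearchLog
SELECT UserId,
       Start,
       Region,
       Query,
       Duration,
       Urls,
       ClickedUrls
FROM @searchlog;
```

The column names and types in the EXTRACT clause have to line up with what is actually in your file, so adjust them to your own data before running the job.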