sql-serversqlcmd

sqlcmd, single row product multiple lines


I am using sqlcmd command to export a SQL Server table to CSV file.

sqlcmd -S EC2AMAZ-5UNBD90 -d   miadmfggp_live  -Q "SELECT SDESC FROM dbo.pmdocs WHERE HMY=95613900;" -s "," -o "C:\pmdocs_ms_v1.csv" -E -y0

I noticed an issue here. Some of the row in the table can produce two lines in the CSV file. I noticed that it always happen on the SDESC field. such as if the column value is "Tier 547 door is broken Tier 543 does not lock latch broken 5883_1179695_0.jpg", then the CSV file may looks like:

Tier 547 door is broken
Tier 543 does not lock latch broken 5883_1179695_0.jpg

What I want is one CSV line for each row in the table. Please advise how to fix the issue.


Solution

  • Please try the following solution. It will remove whitespaces, i.e. all invisible TAB, Carriage Return, and Line Feed characters.

    SQL #1

    UDF udf_tokenize() and its test harness.

    USE tempdb;
    GO
    
    DROP FUNCTION IF EXISTS dbo.udf_tokenize;
    GO
    
    /*
    1. All invisible TAB, Carriage Return, and Line Feed characters will be replaced with spaces.
    2. Then leading and trailing spaces are removed from the value. 
    3. Further, contiguous occurrences of more than one space will be replaced with a single space.
    */
    CREATE FUNCTION dbo.udf_tokenize(@input VARCHAR(MAX))
       RETURNS VARCHAR(MAX)
    AS
    BEGIN 
       RETURN (SELECT CAST('<r><![CDATA[' + @input + ' ' + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)'));
    END
    GO
    
    -- DDL and sample data population, start
    DECLARE @mockTbl TABLE (ID INT IDENTITY(1,1), col_1 VARCHAR(100), col_2 VARCHAR(100));
    INSERT INTO @mockTbl (col_1, col_2) VALUES 
    (CHAR(13) + '  FL   ' + CHAR(9), CHAR(13) + CHAR(10) + '  Miami'),
    ('  FL   ', '  Fort       Lauderdale   '),
    ('  NY   ', '  New           York   '),
    ('  NY   ', ''),
    ('  NY   ', NULL);
    -- DDL and sample data population, end
    
    -- before
    SELECT *, LEN(col_2) AS [col_2_len] 
    FROM @mockTbl;
    
    -- remove invisible white space chars
    UPDATE @mockTbl
    SET col_1 = dbo.udf_tokenize(col_1)
    , col_2 = dbo.udf_tokenize(col_2);
    
    -- after
    SELECT *, LEN(col_2) AS [col_2_len] 
    FROM @mockTbl;
    

    SQL #2

    So, your SQL statement will become as follows:

    SELECT dbo.udf_tokenize(SDESC) FROM dbo.pmdocs WHERE HMY=95613900;