I am working on writing a UDF that searches a string of text for different combinations of white space delimiters (i.e. Carriage Returns, Line Feeds, Spaces, etc.) and replaces them with a different non-white space delimiter.
I inserted all the possible combinations of delimiters in a table variable along with the new replacement delimiter; however, when I used the replace()
function it wasn't matching my delimiters as I expected. It turns out if one row on my delimiters table had a delimiter that was 3 characters long, and other rows only had delimiters that were 2 characters long, it was inserting a trailing space on the rows with only 2 character delimiters.
Also I noticed if I added just one TRIM()
around any single or combination of CHAR()
s on just one row during the insert, it would remove the trailing space for all rows.
I am just trying to understand why the trailing spaces are being added during the insert, as well as why using the TRIM()
on just one row during the insert is affecting all of the rows with trailing whitespace.
Using MSSQL Server 14.0.3445 RTM with SQL_Latin1_General_CP1_CI_AS Collation
Here is an example with comments that explains my issue:
DECLARE @delimiters TABLE (Id INT, delim VARCHAR(10), newDelim CHAR)
INSERT INTO @delimiters
VALUES
(1, CHAR(10) + CHAR(32) + CHAR(13), '$'),
(2, CHAR(10) + CHAR(13), '#'),
(3, CHAR(13) + CHAR(10), '@')
DECLARE @strings TABLE (Id INT, [str] VARCHAR(100))
INSERT INTO @strings
VALUES
(1, 'This is paragraph 1.' + CHAR(10) + CHAR(32) + CHAR(13) + 'This is paragraph 2.'),
(2, 'This is paragraph 1.' + CHAR(10) + CHAR(13) + 'This is paragraph 2.'),
(3, 'This is paragraph 1.' + CHAR(13) + CHAR(10) + 'This is paragraph 2.')
-- 2 & 3 dont match, but they should.
-- For some reason a trailing space was added to Id 2 & 3
-- in the @delimiters table at the time of the insert clause.
SELECT
REPLACE(s.[str], d.delim, d.newDelim) AS NewStr
FROM @strings s
LEFT JOIN @delimiters d ON s.Id = d.Id
-- Adding a TRIM() fixes it
SELECT
REPLACE(s.[str], TRIM(d.delim), d.newDelim) AS NewStr
FROM @strings s
LEFT JOIN @delimiters d ON s.Id = d.Id
-- Clear the @delimiters table
DELETE FROM @delimiters WHERE delim IS NOT NULL
-- Now Im adding a TRIM() during the insert.
-- You can literally add it around any
-- single or combination of CHAR()s in the delim column.
INSERT INTO @delimiters
VALUES
(1, TRIM(CHAR(10)) + CHAR(32) + CHAR(13), '$'),
(2, CHAR(10) + CHAR(13), '#'),
(3, CHAR(13) + CHAR(10), '@')
-- Now they all match, I only put the TRIM()
-- on the first CHAR() for ID=1 during the insert,
-- but it trimmed the trailing space for all of them.
SELECT
REPLACE(s.[str], d.delim, d.newDelim) AS NewStr
FROM @strings s
LEFT JOIN @delimiters d ON s.Id = d.Id
Result set I get:
The VALUES
clause essentially represents a table, that is, all values in each column must be of the same type - the one that covers the widest of the provided values.
The type of CHAR(10) + CHAR(32) + CHAR(13)
is CHAR(3)
. It is the widest of the provided values, thus, the entire column is typed as char(3)
.
char
, unlike varchar
, requires padding with spaces, so your first VALUES
clause inserts the second and the third delimiters padded with a space to the length of 3.
TRIM()
, on the other hand, returns a varchar
, which does not require padding with spaces. So including the TRIM
in the VALUES
makes the entire column varchar(3)
, and no spaces are added to the end of the last two lines.