We got a requirement in the redshift table to split the full name column into first_name, middle_name, last_name columns with the help of space in the string using the SQL script. Below is the sample data and expected output for the columns. The first_name and last_name are deriving fine but getting issues for the middle_namewe with the below SQL, it removes the strings in middle name which is the common in other two columns hence it is not working for a few scenarios below are examples
Can you please help us fix this issue?
SQL Query: "Select fullname , SUBSTRING(fullname , 1, CHARINDEX(' ', fullname) - 1) as FirstName, RTRIM(LTRIM(REPLACE(REPLACE(fullname,SUBSTRING(fullname , 1, CHARINDEX(' ', fullname) - 1),''), REVERSE( LEFT( REVERSE(fullname), CHARINDEX(' ', REVERSE(fullname))-1 ) ),'')))as MiddleName, REVERSE( LEFT( REVERSE(fullname), CHARINDEX(' ', REVERSE(fullname))-1 ) ) as LastName From (select 'john johnson' fullname)"
I'd use REGEXP_SUBSTR
for this. In my opinion this gets this whole thing more readable:
select
fullname,
regexp_substr(full_name, '^[^ ]+') as first_name,
trim(' ' from regexp_substr(full_name, ' .* ')) as middle_name,
regexp_substr(full_name, '[^ ]+$') as last_name
from mytable;
Explanation of the regular expressions:
'^[^ ]+'
= non-blank characters directly after string start' .* '
= first blank and last blank and all characters inbetween'^[^ ]+'
= non-blank characters directly before string endhttps://docs.aws.amazon.com/de_de/redshift/latest/dg/REGEXP_SUBSTR.html https://docs.aws.amazon.com/de_de/redshift/latest/dg/r_TRIM.html