sqlsql-serverstring-matchingstreet-address

Split full address to contain only street name


I have a table with address1, city, state, and postal code. However, some address1 will also contains city, state and postal code (separated by either comma or space or both). Example:

Address1: 9999 western Rd, Los Angeles, CA, 90001
City: Los Angeles
State: CA
Postal: 90001
I want Address1 to be: 9999 western Rd

I've tried the following SQL to fix those addresses: (To simplify here I assume all fields are not null or empty, in the real system for countries without states it will be either empty or same as the country name.)

SELECT LEFT(address1, PATINDEX('%[, ]'+city+'%', billingAddress) - 1)
FROM addresses
WHERE address1 like '%[, ]'+city+'%'+state+'%'+postal_code+'%'
    AND PATINDEX('%[, ]'+City+'%', address1) < 12

The problem here is many street names include the city they are in. For example, with 9999 KIRKLAND WAY in city KIRKLAND, this query will cause the remaining street name from this query to be 9999.

Is there anyway to solve this problem in SQL?


Solution

  • First, I will agree with other comments that there is no flawless system - working with addresses can be a giant pain. That being said...

    If your Address1 values are sanitized enough to only use commas to separate the various address parts, you might try using string_split to break it apart and then filter out all but the street information. If it's not that clean, maybe this can at least be a starting point. And this definitely doesn't comprehensively cover international information or any other weirdness that may crop up.

    The fiddle example provided below uses some dummy addresses where I've overpopulated a few of the Address1 values and then filter them down to streets with string_split.

    SQL Fiddle example

    CREATE TABLE [dbo].[addresses](
        [Address1] [nvarchar](max) NULL,
        [City] [nvarchar](max) NULL,
        [State] [char](2) NULL,
        [Postal] [varchar](20) NULL
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'204 Cowley Avenue', N'Corpus Christi', N'FL', N'94075-8586')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'273 Green Milton Road', N'Albuquerque', N'ND', N'46038-4384')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'662 Clarendon Street', N'Detroit', N'RI', N'27706-7936')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'57 Oak Boulevard', N'St. Petersburg', N'UT', N'25607-1143')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'53 South Rocky Fabien Blvd., Des Moines, IL, 53316-8523', N'Des Moines', N'IL', N'53316-8523')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'38 Shreveport Way, Shreveport, NC, 72394-4783', N'Shreveport', N'NC', N'72394-4783')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'938 South First Road', N'Arlington', N'MN', N'83467-8269')
    GO
    INSERT [dbo].[addresses] ([Address1], [City], [State], [Postal]) VALUES (N'674 South Cowley Avenue, Grand Rapids, KY, 51374-4173', N'Grand Rapids', N'KY', N'51374-4173')
    GO
    
    SELECT 
        LTRIM(RTRIM(t.value)) AS fixedAddress1,
        a.* from addresses a
    CROSS APPLY string_split(a.Address1, ',') t
    WHERE LTRIM(RTRIM(t.value)) <> a.City
        AND LTRIM(RTRIM(t.value)) <> a.State
        AND LTRIM(RTRIM(t.value)) <> a.Postal
    

    enter image description here