excel-formula

Excel formula to remove unwanted character


How to remove unwanted characters on the most right that the back of the word that contain number (from left side) to the right side using Excel formula? sample word

I have tried using Left(cellA, Len(cellA)-7). It works fine to fulfill my needs but it also remove some character when the sentence does not contain unwanted. If using the Find and Replace method will take much time to check line by line and copy to do.


Solution

  • Assuming that the problem word will:

    with Office 365, you can do:

    =LET( sentence, B2,
            lastWord, RIGHT( sentence, 7 ),
            w, CODE( MID( lastWord, SEQUENCE(,7),1)),
            IF( AND((w >= 65)*( w <= 90 )*{1,1,1,0,0,0,0} + (w >= 48)*( w <= 57 )*{0,0,0,1,1,1,1} ),
                  TRIM( LEFT( sentence, LEN( sentence ) - 7 ) ),
                  sentence ) )
    

    Where the sentence is in B2.

    enter image description here

    Another Approach I just realized it could have been a little simpler:

    =LET( sentence, B2,
            lastWord, RIGHT( sentence, 7 ),
            w, MID( lastWord, SEQUENCE(,7),1),
            IF( AND((w >= "A")*( w <= "Z" )*{1,1,1,0,0,0,0} + (w >="0")*( w <= "9" )*{0,0,0,1,1,1,1} ),
                  TRIM( LEFT( sentence, LEN( sentence ) - 7 ) ),
                  sentence ) )
    

    How it works (in response to the comments)

    The formula first breaks the last 7 characters into a 7 element array of characters using:

    MID( lastWord, SEQUENCE(,7),1)

    So, "ABC0123" becomes an array of: enter image description here

    Then each character is tested to see if it is a capital letter with this part: ( w >= "A")*( w <= "Z" ) which produces an array of 1's and 0's through boolean arithmetic. enter image description here You can see that the first three characters were indeed capital letters and the last four were not. But what if we had ABC012A? That would put a 1 in the last array cell. We don't want that one because it needs to be a number.

    enter image description here

    There are a number of ways to fix this, but I chose to apply a mask to force it back to zero. I will later test if that character was a number, but for now, I don't want that one in there. When you multiply the mask times the result, it forces the last four cells to zero. Like this:

    enter image description here

    NB: Boolean arithmetic works like this =TRUE*TRUE will result in 1 and =TRUE*FALSE results in 0. Multiplying is the same as AND. To OR, you add so that =TRUE+TRUE results in 2 (which is the same as TRUE), =TRUE+FALSE results in 1 and =FALSE+FALSE results in 0.

    Going back to the original string of ABC0123, if we now apply the number test, you can see the result.

    enter image description here

    And when we apply it to ABC012A, we get:

    enter image description here

    Now we need to know that both sides are all true. So we can add the arrays to each other. If we just added the arrays together without the masks, it would produce some unintended results. The case of ABC012A would pass because it is adding 1110001 to 0001110 which would be 1111111. By masking the two sides, we get 1111110 for ABC012A. So the last character failed the test which is what we want. ABC0123 is adding 1110000 to 0001111 which would be 1111111. This is what we want. We want all characters to be 1 or TRUE.

    So to find that out, we just apply AND to the array. So here is the end result: enter image description here