excelexcel-formula

Formula in Excel to count occurrences of substrings within strings


I am trying to count the number of times a sub-string appears within a column of string data in Excel. Please see the below example.

The column of string data (tweets) looks like this:

   A
1  An example string with @username in it
2  RT @AwesomeUser says @username is awesome

The column with "substrings" (Twitter screen names) looks like this:

   B
1  username
2  AwesomeUser

I want to use a formula to count the number of times that a substring from B1, B2, etc. appears in the strings in column A. For example: a formula searching for B1 would return "2" and a search for B2 would return "1".

I can't do it this way:

=COUNTIF(A:A, "username")

because COUNTIF only looks for strings, not substrings. This formula would always return "0".

Here's a formula I thought might do it:

=SUMPRODUCT((LEN(A:A)-(LEN(SUBSTITUTE(A:A,"username",""))))/LEN("username"))

Unfortunately, I have 16,000 entries in column B and tens of thousands in A, so counting characters won't work even on a high power PC (also, the result returned by the function is suspect).

I thought about using:

=COUNTIF(A:A, "*username*")

but COUNTIF requires a string with the star operators; I need to use cell references due to the volume of data.

My question: does anyone know how I can use a formula for this? If using COUNTIF, how do I get a cell reference in the conditional part of the statement (or use a function to substitute the string in the cell referenced within the conditional part of a COUNTIF statement)?

I know that I could parse the data, but I would like to know how to do it in Excel.


Solution

  • You are nearly there, use

    =COUNTIF(A:A, "*"&B1&"*")
    

    Caveat:
    This counts the number of cells in A:A that contain the string one or more times. It does not count the total number of instances of the string.