textexcel-formulacountrangeextract

Excel: Count text occurrences within a Range, when two or more text items are present


Excel screenshot I have a range (A1:G6) with 5 transactions, one in each row. Each row has multiple columns with specifics alphabets entered, representing categories of products bought, for e.g. F means fruits, M means Milk, etc.

Now am trying to find the count (in column G) of occurrences of these categories in the range.

To find only one category of product i.e. to find 1 alphabet, it is easy: =SUMPRODUCT(($B$2:$G$2=B10)+($B$3:$G$3=B10)+($B$4:$G$4=B10)+($B$5:$G$5=B10)+($B$6:$G$6=B10)).

But when we have to find two categories together i.e. to we have to find 2 or more alphabets together (And, not OR) , then I am unable to figure out the code. Could someone help.

Data and expected output is as follows:

Transction ID Purchase History of Products
T01 C V M
T02 P K D M B F
T03 M B J V N
T04 P V F M
T05 V K M F
Itemset Identification
No. Item 1 Item 2 Item 3 Item 4 Item 5 Count
1 B 2
2 C 1
3 D 1
4 F 3
5 J 1
6 K 2
7 M 5
8 N 1
9 P 2
10 V 4
11 B F 1
12 B K 1
13 B M 2
14 B P 1
15 B V 1
16 F K 2
17 F M 3
18 F P 2
19 F V 2
20 K M 2
21 K P 1
22 K V 1
23 M P 2
24 M V 4
25 P V 1
25 B M F 1

Solution

  • Instead of sumproduct I use XMATCH combined with BYROW in this approach:

    =LET(values,FILTER(B10:E10,B10:E10<>""),
    SUM(
        BYROW($B$2:$G$6,LAMBDA(rowrange,
    IFERROR(
            N(
              SUM(
                  XMATCH(values,rowrange))>0),
            0)))))
    

    enter image description here It first takes the non-empty values from B10:E10 and names that values.

    Than it uses range B2:G6 to perform a row by row calculation: the row by row range being named rowrange.

    It then checks for a match of values in rowrange and sums the match results of all the values in values. If one or more of the values has no match it returns an error, therefore the sum would cause an error and the IFERROR returns 0. Otherwise it checks if the sum is greater than 0 (which is the case if no errors where found) and returns 1 (TRUE converted to number is 1) for each succesfull row in rowrange.

    This would spill the result for each row from rowrange, but we want to know the total number of rows, so we wrap it in another SUM.

    Edit: Of course we could also spill it:

    =BYROW(B10:F35,LAMBDA(a,LET(b,FILTER(a,a<>""),SUM(BYROW(B2:G6,LAMBDA(c,N(IFERROR(SUM(XMATCH(b,c))>0,0))))))))