I have data like
type | plan_name
918 | Plan 10
918 | PLAN 10
918 | PLAN10
918 | plan 10
918 | Plan10
918 | Plan 11
918 | PLAN 11
918 | PLAN11
918 | plan 11
918 | Plan11
920 | Plan 10
920 | PLAN 10
920 | PLAN10
920 | plan 10
920 | Plan10
920 | Plan 11
920 | PLAN 11
920 | PLAN11
920 | plan 11
920 | Plan11
Now i want to know the ids and count where plan_name number matches.
eg: Plan10,Plan 10,PLAN10,PLAN 10,plan10,plan 10
similar to 11
How to use group by and get this
the out put will be
type plan_no count
918 10 5
918 11 5
920 10 5
920 11 5
I think a regex can help you as long as you only have one number grouping
Okay I made a bucket "b1" and loaded your data into the default scope and collection
UPSERT INTO b1 (KEY,VALUE)
VALUES ( "001", { "type": 918, "plan_name": "Plan 10" } ),
VALUES ( "002", { "type": 918, "plan_name": "PLAN 10" } ),
VALUES ( "003", { "type": 918, "plan_name": "PLAN10" } ),
VALUES ( "004", { "type": 918, "plan_name": "plan 10" } ),
VALUES ( "005", { "type": 918, "plan_name": "Plan10" } ),
VALUES ( "006", { "type": 918, "plan_name": "Plan 11" } ),
VALUES ( "007", { "type": 918, "plan_name": "PLAN 11" } ),
VALUES ( "008", { "type": 918, "plan_name": "PLAN11" } ),
VALUES ( "009", { "type": 918, "plan_name": "plan 11" } ),
VALUES ( "010", { "type": 918, "plan_name": "Plan11" } ),
VALUES ( "011", { "type": 920, "plan_name": "Plan 10" } ),
VALUES ( "012", { "type": 920, "plan_name": "PLAN 10" } ),
VALUES ( "013", { "type": 920, "plan_name": "PLAN10" } ),
VALUES ( "014", { "type": 920, "plan_name": "plan 10" } ),
VALUES ( "015", { "type": 920, "plan_name": "Plan10" } ),
VALUES ( "016", { "type": 920, "plan_name": "Plan 11" } ),
VALUES ( "017", { "type": 920, "plan_name": "PLAN 11" } ),
VALUES ( "018", { "type": 920, "plan_name": "PLAN11" } ),
VALUES ( "019", { "type": 920, "plan_name": "plan 11" } ),
VALUES ( "020", { "type": 920, "plan_name": "Plan11" } );
Now try either of these queries:
SELECT
type, REGEXP_MATCHES(plan_name, "[0-9]+")[0] as plan_no, count(*) as count
FROM b1 group by REGEXP_MATCHES(plan_name, "[0-9]+")[0], type
or
Select t.plan_no, t.type, count(*) as count
FROM (
SELECT
type, REGEXP_MATCHES(plan_name, "[0-9]+")[0] as plan_no
FROM b1 ) as t
group by t.plan_no, type
In either case you should get something like
count plan_no type
5 "10" 918
5 "11" 918
5 "10" 920
5 "11" 920
If needed you can even create your index on the regex