How can I leverage bloomfilters on array(string) in Clickhouse.
For example, I have this data structure:
create table test_schema.test_table (
partition_time DateTime64(9
,keys Array(String)
) ENGINE = ReplicatedReplacingMergeTree('some/path', '{replica}')
PARTITION BY toYYYYMMDD(partition_time)
TTL toDateTime(partition_time) + toIntervalDay(1)
SETTINGS index_granularity = 8192;
and I build a token bloomfilter on keys
alter table test_schema.test_table INDEX test_bf_idx `keys` TYPE tokenbf_v1(4096, 2, 0) GRANULARITY 1;
When I use the condition has(keys,'some_key')
, I can see it leveraging the bloom filter. But I could not use the bloomfilter for like queries. I have tried array filter and array join, but still it is not using bloomfilter:
where notEmpty(arrayFilter(x -> x LIKE '%some_key%', keys) )
and
select * from
(select arrayJoin(keys) as val from schema.table)
where val like '%some_key%'
Can someone show me how I can use the token bloomfilter built on an array for LIKE statements, (i.e. use bloomfilter for like statement on an array)
maybe an index over arrayStringConcat ?
create table test (
x Int64,
k Array(String),
INDEX test_bf_idx arrayStringConcat(k, ',') TYPE tokenbf_v1(4096, 2, 0) GRANULARITY 1
) ENGINE = MergeTree
order by x as
select number, ['a', 'b'] k
from numbers(1e6);
select * from test
where hasToken( arrayStringConcat(k, ',') , 'xxx')