I have a column family called odds_api containing a number of rows whose row keys follow the pattern /bt=??/bm=??/mk=??/se=??:
/bt=1/bm=MN/mk=344/se=23394/odds_api
/bt=1/bm=BY/mk=344/se=23394/odds_api
/bt=1/bm=SN/mk=344/se=23394/odds_api
/bt=1/bm=BB/mk=344/se=23394/odds_api
/bt=1/bm=SF/mk=344/se=23394/odds_api
/bt=1/bm=XY/mk=344/se=23394/odds_api
I want to filter on a list of bm values, e.g. return only the rows where bm is SF, BB or MN.
To do this I created a FilterList with MUST_PASS_ONE containing one or more RowFilters (one per bm value the user requests):
public ResultScanner scan(String id, List<String> bms) throws BigTableGetException {
    try {
        Table table = connection.getTable(TableName.valueOf(this.tableName));
        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("odds_api"));
        FilterList mainFilterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
        bms.stream()
            .forEach(bm -> {
                mainFilterList.addFilter(new RowFilter(CompareOp.EQUAL, new RegexStringComparator("/bt=" + id + "/bm=" + bm + ".*")));
            });
        System.out.println("this is the filter list " + mainFilterList.toString());
        scan.setFilter(mainFilterList);
        return table.getScanner(scan);
    } catch (IOException ex) {
        throw new BigTableGetException("Failed to get rows in BigTable", ex);
    }
}
(I know -- pointless stream for a forEach!)
This works fine when only one bm is specified. The print statement shows:
this is the filter list FilterList OR (1/1): [RowFilter (EQUAL, /bt=1/bm=B4.*)]
However, if more than one bm is specified, the scan returns every row. The print statement shows:
this is the filter list FilterList OR (2/2): [RowFilter (EQUAL, /bt=1/bm=B4.*), RowFilter (EQUAL, /bt=1/bm=PP.*)]
In fact, if I enter two or more 'incorrect' bm values it still returns everything!
I have read: https://www.oreilly.com/library/view/hbase-the-definitive/9781449314682/ch04.html
I have also tried moving the bt=? filter out into a separate MUST_PASS_ALL filter list, combined with a nested MUST_PASS_ONE list of row filters for bm=?:
this is the filter list FilterList AND (2/2): [FilterList AND (1/1): [RowFilter (EQUAL, .*bt=1)], FilterList OR (2/2): [RowFilter (EQUAL, .*/bm=B4.*), RowFilter (EQUAL, .*/bm=PP.*)]]
Same problems.
I must be missing something obvious, any help would be greatly appreciated.
Bigtable HBase client versions:
<dependency>
    <groupId>com.google.cloud.bigtable</groupId>
    <artifactId>bigtable-hbase-1.x</artifactId>
    <version>1.4.0</version>
</dependency>
<dependency>
    <groupId>com.google.cloud.bigtable</groupId>
    <artifactId>bigtable-hbase-1.x-hadoop</artifactId>
    <version>1.4.0</version>
</dependency>
Have you considered using a single regex? "/bt=FOO/bm=(A|B|C|D).*" should work.
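A minimal sketch of how that single regex could be built from the requested bm values. The class name BmRowKeyRegex and the method build are hypothetical. Two things here are my own assumptions rather than part of the suggestion above: the values are restricted to letters and digits so they cannot inject regex syntax, and a "/" is placed after the alternation so that a value like SF does not also match a longer value that merely starts with SF. The sketch only uses java.util.regex to check the pattern locally; it does not touch Bigtable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class BmRowKeyRegex {
    // Assumption: bt and bm values are plain alphanumerics, as in the sample row keys.
    private static final Pattern SAFE_VALUE = Pattern.compile("[A-Za-z0-9]+");

    // Builds one regex of the form /bt=<bt>/bm=(A|B|C)/.* covering all requested bm values.
    static String build(String bt, List<String> bms) {
        if (bms.isEmpty()) {
            throw new IllegalArgumentException("at least one bm value is required");
        }
        requireSafe(bt);
        for (String bm : bms) {
            requireSafe(bm);
        }
        return "/bt=" + bt + "/bm=(" + String.join("|", bms) + ")/.*";
    }

    private static void requireSafe(String value) {
        if (!SAFE_VALUE.matcher(value).matches()) {
            throw new IllegalArgumentException("unexpected characters in: " + value);
        }
    }

    public static void main(String[] args) {
        String regex = build("1", Arrays.asList("SF", "BB", "MN"));
        System.out.println(regex); // /bt=1/bm=(SF|BB|MN)/.*

        // Local sanity check against row keys from the question (whole-key match).
        System.out.println(Pattern.matches(regex, "/bt=1/bm=SF/mk=344/se=23394/odds_api")); // true
        System.out.println(Pattern.matches(regex, "/bt=1/bm=XY/mk=344/se=23394/odds_api")); // false
    }
}
```

In the scan method from the question, the resulting string would replace the whole FilterList: scan.setFilter(new RowFilter(CompareOp.EQUAL, new RegexStringComparator(regex))). Whether this sidesteps the behaviour seen with MUST_PASS_ONE is something to verify against your own table.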
Also, please raise an issue in https://github.com/GoogleCloudPlatform/cloud-bigtable-client. There does seem to be a bug somewhere in the client code.