I have a person table in HBase like the one below:
ROW_KEY COLUMN+CELL
dinesh column='details:code',value=dr-01
dinesh column='status:is_error',value=false
dinesh column='time:date_created',value=1553747864740
dinesh column='time:last_updated',value=1553747864740
alex column='details:code',value=al-01
alex column='time:date_created',value=1553747786521
alex column='time:last_updated',value=1553747786521
I want to fetch only the records where the is_error field is false. This attribute is present only in certain rows. I tried to fetch them using SingleColumnValueFilter, but it's giving me all the records.
Query:
scan 'person', {FILTER=>"SingleColumnValueFilter('status','is_error',=,'binary:false')"}
Output:
ROW_KEY COLUMN+CELL
dinesh column='details:code',value=dr-01
dinesh column='status:is_error',value=false
dinesh column='time:date_created',value=1553747864740
dinesh column='time:last_updated',value=1553747864740
alex column='details:code',value=al-01
alex column='time:date_created',value=1553747786521
alex column='time:last_updated',value=1553747786521
The expected output is only the one row matching the condition, but the scan returns both rows, including the one where the is_error field is not present.
You need to use a different constructor for your filter:
protected SingleColumnValueFilter(byte[] family,
                                  byte[] qualifier,
                                  CompareOperator op,
                                  ByteArrayComparable comparator,
                                  boolean filterIfMissing,
                                  boolean latestVersionOnly)
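Setting filterIfMissing to true ensures that rows not containing your column aren't returned. It defaults to false, so a row that lacks the column passes the filter. I have no idea why this isn't the default behaviour. In the shell's filter string, the fifth argument is filterIfMissing and the sixth is latestVersionOnly. Since this constructor is protected, from the Java client you would call setFilterIfMissing(true) on the filter instead.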
Your scan should be:
scan 'person', {FILTER=>"SingleColumnValueFilter('status','is_error', =, 'binary:false', true, true)"}
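To see why the original scan returned both rows, here is a minimal plain-Java sketch of the filterIfMissing semantics described above. It is only a model, not HBase code: the class FilterIfMissingSketch and the methods keepRow, scan, and samplePersonTable are hypothetical names made up for illustration, and each row is simplified to a map from "family:qualifier" to a string value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the filterIfMissing behaviour; this is not the HBase API.
public class FilterIfMissingSketch {

    // Decide whether a row passes an equality check on a single column.
    static boolean keepRow(Map<String, String> row, String column,
                           String expected, boolean filterIfMissing) {
        String actual = row.get(column);
        if (actual == null) {
            // Column absent: the row passes unless filterIfMissing is true.
            return !filterIfMissing;
        }
        return actual.equals(expected);
    }

    // Return the row keys that pass the check, in table order.
    static List<String> scan(Map<String, Map<String, String>> table, String column,
                             String expected, boolean filterIfMissing) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : table.entrySet()) {
            if (keepRow(entry.getValue(), column, expected, filterIfMissing)) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    // The two rows from the question, reduced to the columns that matter here.
    static Map<String, Map<String, String>> samplePersonTable() {
        Map<String, String> dinesh = new LinkedHashMap<>();
        dinesh.put("details:code", "dr-01");
        dinesh.put("status:is_error", "false");

        Map<String, String> alex = new LinkedHashMap<>();
        alex.put("details:code", "al-01");

        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("dinesh", dinesh);
        table.put("alex", alex);
        return table;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = samplePersonTable();
        // filterIfMissing = false: alex has no status:is_error, so it passes too.
        System.out.println(scan(table, "status:is_error", "false", false)); // [dinesh, alex]
        // filterIfMissing = true: rows without the column are dropped.
        System.out.println(scan(table, "status:is_error", "false", true));  // [dinesh]
    }
}
```

The first call mirrors the scan from the question and the second mirrors the corrected scan with the extra arguments.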