bigtable, google-cloud-bigtable

Unable to count the number of rows in BigTable


https://cloud.google.com/bigtable/docs/go/cbt-reference

Following this reference, I tried the command

cbt count <table>

for three different tables.

For one of them I got what I expected: the number of rows, a bit shy of 1M.

For the second table, I got the following error:

[~]$ cbt count prod.userprofile
2016/10/23 22:47:48 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.userprofile'
[~]$ cbt count prod.userprofile
2016/10/23 23:00:23 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.userprofile'

I tried it several times, but I got the same error every time.

For the last one, I got a different error (the error code is the same as above, but the description is different):

[~]$ cbt count prod.appprofile
2016/10/23 22:45:17 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.appprofile' : Response was not consumed in time; terminating connection. (Possible causes: row size > 256MB, slow client data read, and network problems)
[~]$ cbt count prod.appprofile
2016/10/23 23:11:10 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.appprofile' : Response was not consumed in time; terminating connection. (Possible causes: row size > 256MB, slow client data read, and network problems)

I also tried this one several times, and nothing changed.

I googled and searched Stack Overflow for 'rpc error code 4', but did not find anything useful.

I'm really curious why this command fails, and what I can do to resolve it. (By the way, these two tables are used in production 24/7 and we have several dozen Bigtable nodes working just fine, so I don't think it is a bandwidth or QPS issue.)


Solution

  • Getting a count on a large table requires reading something from every single row in Bigtable; there is no way to just ask for a single value that represents the count. That is also why the command can fail on large tables: rpc error code 4 is DEADLINE_EXCEEDED, i.e. the full scan behind cbt count is not completing in time.

    This type of problem calls for something like a map/reduce, unfortunately. Fortunately, it's quite straightforward to do a count with Dataflow; a single-client sketch of the same full-scan idea is shown below.
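
    If you just need a count from a single client, the "read something from every row" idea can also be done directly against the table. The following is a minimal sketch, assuming the Go client (cloud.google.com/go/bigtable); the project, instance, and table names are placeholders. Stripping cell values and limiting each row to one cell keeps the payload tiny, which may help avoid the "response was not consumed in time" failure, although a huge table can still take a long time to scan.

    package main

    import (
        "context"
        "fmt"
        "log"

        "cloud.google.com/go/bigtable"
    )

    func main() {
        ctx := context.Background()

        // Placeholder project and instance names.
        client, err := bigtable.NewClient(ctx, "my-project", "my-instance")
        if err != nil {
            log.Fatalf("NewClient: %v", err)
        }
        defer client.Close()

        tbl := client.Open("prod.userprofile") // placeholder table name

        count := 0
        err = tbl.ReadRows(ctx, bigtable.InfiniteRange(""),
            func(r bigtable.Row) bool {
                count++
                return true // keep scanning
            },
            // Return at most one cell per row and drop its value,
            // so essentially only row keys travel over the wire.
            bigtable.RowFilter(bigtable.ChainFilters(
                bigtable.CellsPerRowLimitFilter(1),
                bigtable.StripValueFilter(),
            )))
        if err != nil {
            log.Fatalf("ReadRows: %v", err)
        }
        fmt.Println("rows:", count)
    }

    For a big production table, the Dataflow approach is still the better fit, since the scan is parallelized across many workers rather than funneled through a single client.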