google-cloud-bigtable

What is the appropriate way to store a Python string for the column qualifier in Google Bigtable?


The Bigtable Python sample for "Perform a simple write" (https://cloud.google.com/bigtable/docs/samples/bigtable-quickstart#bigtable_quickstart-python) shows, for example:

row.set_cell(column_family_id, "connected_cell", 1, timestamp)
row.set_cell(column_family_id, "connected_wifi", 1, timestamp)
row.set_cell(column_family_id, "os_build", "PQ2A.190405.003", timestamp)

The column qualifiers are passed as plain Python strings, without wrapping them in bytes() or specifying an encoding (e.g. .encode("utf-8")).

In the Python reference (https://cloud.google.com/python/docs/reference/bigtable/latest/row#setcellcolumnfamilyid-column-value-timestampnone), under "Parameters" we see that column is of type bytes.

Does the API handle storing the string as bytes? Or do I need to convert the column qualifier to bytes?


Solution

  • According to the documentation on class ColumnFamily, the column_family_id parameter is a string, and its value must match [_a-zA-Z0-9][-_.a-zA-Z0-9]*.

    For the column qualifier, @Adam C. Scott added that the client library runs column = _to_bytes(column), which converts a string to bytes. So the API handles the conversion, and you do not need to convert the column qualifier to bytes yourself.
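The str-to-bytes conversion described above can be illustrated with a small standalone sketch. Note that qualifier_to_bytes is a hypothetical helper written for this answer, not the library's _to_bytes, and the UTF-8 default is an assumption made here; the library's own default encoding may differ, so encode explicitly if your qualifiers contain non-ASCII characters.

```python
def qualifier_to_bytes(column, encoding="utf-8"):
    """Hypothetical helper: return a column qualifier as bytes.

    str input is encoded (UTF-8 by default, an assumption of this sketch);
    bytes input is returned unchanged; anything else is rejected.
    """
    if isinstance(column, bytes):
        return column
    if isinstance(column, str):
        return column.encode(encoding)
    raise TypeError(
        f"column qualifier must be str or bytes, got {type(column).__name__}"
    )


# A str qualifier and its bytes form end up as the same bytes.
print(qualifier_to_bytes("os_build"))
print(qualifier_to_bytes(b"os_build"))
```

If you prefer to be explicit, you can pass the result straight to set_cell, e.g. row.set_cell(column_family_id, qualifier_to_bytes("os_build"), "PQ2A.190405.003", timestamp).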