c++ postgresql libpqxx

Efficiently iterating through a database row (libpqxx), assigning values to a struct


I'm grabbing a row from a database using libpqxx and assigning the fields within the pqxx::row to a struct specifically designed to hold those values:

struct driveOperationRecord
{
    long int id = 0;
    long int drive_operator_id = 0;
    long int operator_operation_index = 0;
    long int specific_operation_id = 0;
    bool operation_outcome = false;
    std::string error_code = "";
    long int operation_seconds = 0;
    std::string operation_date = "";
    std::string serial_number = "";
    std::string model = "";
    long long int size = 0;
    std::string firmware = "";
};

To do this I'm currently using the following code:

void driveOperationEntryToRecord(pqxx::row entry, driveOperationRecord& record)
{
    try
    {
        if (!entry["id"].is_null()) record.id = entry["id"].as<long int>();
        if (!entry["drive_operator_id"].is_null()) record.drive_operator_id = entry["drive_operator_id"].as<long int>();
        if (!entry["operator_operation_index"].is_null()) record.operator_operation_index = entry["operator_operation_index"].as<long int>();
        if (!entry["specific_operation_id"].is_null()) record.specific_operation_id = entry["specific_operation_id"].as<long int>();
        if (!entry["operation_outcome"].is_null()) record.operation_outcome = entry["operation_outcome"].as<bool>();
        if (!entry["error_code"].is_null()) record.error_code = entry["error_code"].as<std::string>();
        if (!entry["operation_seconds"].is_null()) record.operation_seconds = entry["operation_seconds"].as<long int>();
        if (!entry["operation_date"].is_null()) record.operation_date = entry["operation_date"].as<std::string>();
        if (!entry["serial_number"].is_null()) record.serial_number = entry["serial_number"].as<std::string>();
        if (!entry["model"].is_null()) record.model = entry["model"].as<std::string>();
        if (!entry["size"].is_null()) record.size = entry["size"].as<long long int>();
        if (!entry["firmware"].is_null()) record.firmware = entry["firmware"].as<std::string>();
    }
    catch (const std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }
}

Which is absolutely horrible and I hate it. I've thought about using a switch with an appropriate enum but iterating through a switch 12 times feels even worse. Ideally I'd be able to use something like:

for (auto field : entry)
{
    if (!field.is_null()) record.<if only i could dynamically reference using field.name()> = field.as<somehow magically find appropriate type>();
}

But these things are impossible. Any ideas?

EDIT (In response to Useless):

The current database contains roughly 4,000 entries, which will need to pass through this function on a fairly regular basis. It is expected to grow exponentially from here, so to future-proof the code, runtime efficiency is also important. I am fairly new to C++ and I know that there is a lot that I don't know! I was just really hoping to find something slightly more elegant and have really been struggling to do so.


Solution

  • for (auto field : entry)
    {
        if (!field.is_null()) record.<if only i could dynamically reference using field.name()> = field.as<somehow magically find appropriate type>();
    }
    

    This is entirely feasible, although not quite like this.

    Let's assume we have a table describing every field that driveOperationRecord (henceforth just Record, for brevity) supports: its name, its type, and the data member it corresponds to. Then we can walk over that table, like:

    for (auto& field : record_fields)
    {
      field.assign(row, record);
    }
    

    Now we need some field-describing type like

    struct FieldDescriptor
    {
      std::string name_;
    
      void assign(pqxx::row const& row, Record& record)
      {
        auto entry = row[name_];
        if (!entry.is_null()) {
          // do something
        }
      }
    };
    

    So far so good. The "do something" step is where we need some form of runtime dispatch (virtual functions, or a type-erased callable), because each field has a different type.

    We could enumerate all the types we care about and write a big switch statement, but it's tedious. I'd start with the simplest thing first, and do something more complex if it turns out you really need it to be faster:

    struct FieldDescriptor
    {
      std::string name_;
      std::function<void(pqxx::field const&, Record&)> assign_;
    
      void assign(pqxx::row const& row, Record& record)
      {
        auto entry = row[name_];
        if (!entry.is_null()) {
          assign_(entry, record);
        }
      }
    };
    

    OK, so now we just need some way to build a function or callable object similar to

    void assignFieldName(pqxx::field const& entry, Record& record) {
        record.fieldName = entry.as<decltype(record.fieldName)>();
    }
    

    For example, we could write

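    // Build a descriptor that copies one column into one data member.
    // RecordType and FieldType are deduced from the member pointer; the lambda
    // takes Record& so that it matches the signature of FieldDescriptor::assign_.
    template <typename RecordType, typename FieldType>
    FieldDescriptor describe(const char *name, FieldType RecordType::*member)
    {
        return
        { std::string{name},
          [member](pqxx::field const& entry, Record& record)  // capture the member pointer by value
          {
            record.*member = entry.as<FieldType>();
          }
        };
    }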
    

    and even wrap it up in a macro to save a little more typing:

    #define DESCRIBE_RECORD_FIELD(F) describe(#F, &Record:: F)
    
    FieldDescriptor record_fields[] = {
        DESCRIBE_RECORD_FIELD(id),
        DESCRIBE_RECORD_FIELD(drive_operator_id),
        DESCRIBE_RECORD_FIELD(operator_operation_index),
        DESCRIBE_RECORD_FIELD(specific_operation_id),
        DESCRIBE_RECORD_FIELD(operation_outcome),
        DESCRIBE_RECORD_FIELD(error_code),
        DESCRIBE_RECORD_FIELD(operation_seconds),
        DESCRIBE_RECORD_FIELD(operation_date),
        DESCRIBE_RECORD_FIELD(serial_number),
        DESCRIBE_RECORD_FIELD(model),
        DESCRIBE_RECORD_FIELD(size),
        DESCRIBE_RECORD_FIELD(firmware)
    };
    

    Now, we have sort of shadowed the whole definition of Record again, which is unfortunate. It's possible to avoid this ... but only by replacing the whole definition with a bunch of macros that do double duty declaring the members, and registering their descriptors.
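As a rough illustration of that "double duty" idea, here is a minimal X-macro sketch. The macro names (RECORD_FIELDS, DECLARE_MEMBER, FIELD_NAME) and the reduced three-field Record are hypothetical, chosen only for this example. To stay self-contained, the second expansion collects only the field names; in the real code it could instead expand to describe(#name, &Record::name).

```cpp
#include <cstddef>
#include <string>

// Single source of truth: one X(type, name, default) entry per field.
// (Hypothetical, shortened field list for illustration.)
#define RECORD_FIELDS(X)               \
    X(long, id, 0)                     \
    X(bool, operation_outcome, false)  \
    X(std::string, model, "")

// First expansion: declare the data members.
struct Record
{
#define DECLARE_MEMBER(type, name, init) type name = init;
    RECORD_FIELDS(DECLARE_MEMBER)
#undef DECLARE_MEMBER
};

// Second expansion: build a table from the same list, so it cannot
// drift out of sync with the struct definition.
static const char* const record_field_names[] = {
#define FIELD_NAME(type, name, init) #name,
    RECORD_FIELDS(FIELD_NAME)
#undef FIELD_NAME
};

constexpr std::size_t record_field_count =
    sizeof(record_field_names) / sizeof(record_field_names[0]);
```

The cost is readability: the struct definition is no longer plain C++, which is why this is probably only worth it once there are many record types.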

    For a small number of records, I'd probably stick with what we have here. If there are a lot, it's worth making more effort to automate (and to make sure the definition can't get out of sync with the registered descriptors).
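To experiment with the descriptor approach without a database connection, the following sketch puts the pieces together. It assumes C++17, and MockField and MockRow are hypothetical stand-ins for pqxx::field and pqxx::row that mimic only is_null() and as<T>(); they are not part of libpqxx. Record is cut down to four fields, and to_record is a helper name invented for this example.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in for pqxx::field: a column value as text, or NULL.
struct MockField
{
    std::optional<std::string> text;  // empty optional models SQL NULL

    bool is_null() const { return !text.has_value(); }

    // Very small imitation of field.as<T>(); the caller must check is_null() first.
    template <typename T>
    T as() const
    {
        if constexpr (std::is_same_v<T, std::string>) {
            return *text;
        } else if constexpr (std::is_same_v<T, bool>) {
            return *text == "t" || *text == "true";
        } else {
            T value{};
            std::istringstream in(*text);
            in >> value;
            return value;
        }
    }
};

// Hypothetical stand-in for pqxx::row: column name -> field.
using MockRow = std::map<std::string, MockField>;

struct Record
{
    long id = 0;
    bool operation_outcome = false;
    std::string model;
    long long size = 0;
};

struct FieldDescriptor
{
    std::string name;
    std::function<void(MockField const&, Record&)> assign_value;

    // Copy the column into the record unless it is missing or NULL.
    void assign(MockRow const& row, Record& record) const
    {
        auto it = row.find(name);
        if (it != row.end() && !it->second.is_null())
            assign_value(it->second, record);
    }
};

// FieldType is deduced from the member pointer, so as<>() gets the right type.
template <typename FieldType>
FieldDescriptor describe(const char* name, FieldType Record::*member)
{
    return { std::string{name},
             [member](MockField const& field, Record& record)
             {
                 record.*member = field.as<FieldType>();
             } };
}

static const FieldDescriptor record_fields[] = {
    describe("id", &Record::id),
    describe("operation_outcome", &Record::operation_outcome),
    describe("model", &Record::model),
    describe("size", &Record::size),
};

// Hypothetical helper: build a Record from one row.
Record to_record(MockRow const& row)
{
    Record record;
    for (auto const& field : record_fields)
        field.assign(row, record);
    return record;
}
```

NULL or missing columns leave the member at its default, which matches the behaviour of the original if-chain. Swapping the mock types for pqxx::field and pqxx::row should give the version described above.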