I am trying to use MonetDBLite C in an application. According to the PDF (https://arxiv.org/pdf/1805.08520.pdf), I would get a speed boost when loading massive amounts of data by using the monetdb_append function. From the PDF:
In addition to issuing SQL queries, the embedded process can efficiently bulk append large amounts of data to the database using the monetdb_append function. This function takes the schema and the name of a table to append to, and a reference to the data to append to the columns of the table. This function allows for efficient bulk insertions, as there is significant overhead involved in parsing individual INSERT INTO statements, which becomes a bottleneck when the user wants to insert a large amount of data.
This is the declaration in embedded.h:
char* monetdb_append(monetdb_connection conn, const char* schema, const char* table, append_data *data, int ncols);
Does anybody have an example of how to use this function? I assume that the batid field of the append_data structure identifies a BAT structure, but it is not clear how that can be used with the existing API.
The binary append indeed requires constructing as many BAT structures as you have columns to append. Some additional MonetDBLite headers need to be included (monetdb_config.h and gdk.h). The important parts are:
1. Create the BATs using COLnew with the correct type and count.
2. Fill them with values through pointer access, e.g. bat->theap.base[i] cast to the correct type.
3. Set some BAT properties (BATsetcount, BATsettrivprop and BBPkeepref) for the append.
4. Populate the append_data data structure.
5. Call monetdb_append.
Below is a short example of how to append 42 values to a one-column table containing integers (CREATE TABLE test (my_column INTEGER);):
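// needs monetdb_config.h, gdk.h and embedded.h
// startup, connect etc. before; conn is the open monetdb_connection
size_t n = 42;
BAT* b = COLnew(0, TYPE_int, n, TRANSIENT);
if (b == NULL) { /* handle allocation failure */ }
int *values = (int*)b->theap.base; // typed view of the BAT's value heap
for (size_t i = 0; i < n; i++) {
    values[i] = (int)i; // or whatever
}
BATsetcount(b, n); // record how many values were written
BATsettrivprop(b);
BBPkeepref(b->batCacheid);
// one append_data entry per column
append_data *ad = malloc(1 * sizeof(append_data));
if (ad == NULL) { /* handle allocation failure */ }
ad[0].colname = "my_column";
ad[0].batid = b->batCacheid;
// monetdb_append returns NULL on success, an error message otherwise
char *err = monetdb_append(conn, "sys", "test", ad, 1);
if (err != NULL) { /* handle error */ }
free(ad);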