I have a batch DuckDB script which sets a custom memory_limit. When it OOMs on one statement, instead of exiting, it continues and tries the rest of the statements in the batch (which all fail, or make no sense, because of the failed statement). This leaves very confusing messages in my log file and could even lead to data errors if, say, it failed on an UPDATE but then continued with the rest of the file.
Is there a way to force DuckDB to exit immediately when a statement fails (e.g. on OOM, as above)?
Example script:
SET memory_limit = '100GB';
SET threads TO 64;

CREATE TABLE summary AS
SELECT
    point_id,
    COUNT(pred_occ) AS ensemble_support,
    AVG(pred_range) AS pat_mean,
    QUANTILE_CONT(pred_occ, [0.1, 0.5, 0.9]) AS occurrence_quantiles,
    QUANTILE_CONT(pred_count, [0.1, 0.5, 0.9]) AS count_quantiles,
    QUANTILE_CONT(pred_occ * pred_count, [0.1, 0.5, 0.9]) AS abundance_quantiles
FROM predictions
GROUP BY point_id;

CREATE TABLE erd AS
SELECT
    CAST(SPLIT_PART(point_id, '-', 2) AS BIGINT) AS checklist_id,
    SPLIT_PART(point_id, '-', 1) AS type,
    ensemble_support, pat_mean,
    occurrence_quantiles[2] AS occurrence_median,
    occurrence_quantiles[1] AS occurrence_lower,
    occurrence_quantiles[3] AS occurrence_upper,
    count_quantiles[2] AS count_median,
    count_quantiles[1] AS count_lower,
    count_quantiles[3] AS count_upper,
    abundance_quantiles[2] AS abundance_median,
    abundance_quantiles[1] AS abundance_lower,
    abundance_quantiles[3] AS abundance_upper
FROM summary
WHERE point_id LIKE 'test-%';

COPY (
    SELECT
        s.*,
        e.latitude, e.longitude, e.year, e.day_of_year, e.observer_id
    FROM erd AS s
    INNER JOIN '{input_erd_pq}' AS e
        ON s.checklist_id = e.checklist_id
) TO '{predictions_erd_pq}' (FORMAT 'parquet');

COPY (
    SELECT
        CAST(SPLIT_PART(point_id, '-', 2) AS BIGINT) AS srd_id,
        CAST(NULLIF(SPLIT_PART(point_id, '-', 3), '') AS INTEGER) AS day_of_year,
        ensemble_support, pat_mean,
        occurrence_quantiles[2] AS occurrence_median,
        occurrence_quantiles[1] AS occurrence_lower,
        occurrence_quantiles[3] AS occurrence_upper,
        count_quantiles[2] AS count_median,
        count_quantiles[1] AS count_lower,
        count_quantiles[3] AS count_upper,
        abundance_quantiles[2] AS abundance_median,
        abundance_quantiles[1] AS abundance_lower,
        abundance_quantiles[3] AS abundance_upper
    FROM summary
    WHERE point_id LIKE 'srd-%'
) TO '{predictions_srd_pq}' (FORMAT 'parquet');
Error log:
Out of Memory Error: Failed to allocate block of 2048 bytes (bad allocation)
Catalog Error: Table with name summary does not exist!
Did you mean "temp.information_schema.schemata"?
LINE 15: FROM summary
^
Catalog Error: Table with name erd does not exist!
Did you mean "temp.information_schema.tables"?
LINE 5: FROM erd as s
^
Catalog Error: Table with name summary does not exist!
Did you mean "temp.information_schema.schemata"?
LINE 15: FROM summary
^
To me this looks like it tried to run all 4 statements even though each of the preceding ones had failed!
Use the -bail flag in the duckdb CLI:
$ duckdb -help 2>&1 | grep bail
-bail stop after hitting an error
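If you would rather not change the command line, the DuckDB shell also appears to support SQLite's .bail dot command, so you can turn the same behavior on from inside the script itself. Treat this as an assumption and check the dot-command list for your DuckDB version:

```sql
-- Assumed alternative: enable bail-on-error inside the batch script itself,
-- equivalent to passing -bail on the command line.
.bail on
SET memory_limit = '100GB';
-- ... rest of the batch ...
```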
Tested on DuckDB 1.0.0 with a.sql:
set memory_limit = '1mb';
select 1;
select * from range(1000) a, range(1000) b order by a.range + b.range;
select 2;
(using -csv only for terse output)
$ duckdb -csv < a.sql
1
1
Out of Memory Error: could not allocate block of size 256.0 KiB (784.0 KiB/976.5 KiB used)
2
2
$ duckdb -csv -bail < a.sql
1
1
Out of Memory Error: could not allocate block of size 256.0 KiB (784.0 KiB/976.5 KiB used)
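Since -bail makes the CLI exit with a non-zero status on the first error, a wrapper script can stop the rest of a pipeline when the batch fails. A minimal sketch of that pattern; here `false` stands in for the real `duckdb -bail my.db < a.sql` invocation so the snippet runs anywhere:

```shell
# Stand-in for: duckdb -bail my.db < a.sql
# (replace the body with the real invocation in your pipeline)
run_batch() { false; }

# Check the exit status and stop before any downstream steps run.
if run_batch; then
    echo "batch succeeded"
else
    echo "batch failed; aborting downstream steps"
fi
```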