multithreading multiprocessing gdal autocad dxf

GDAL ogr2ogr alternative that uses multiprocessing


I am using GDAL's ogr2ogr to convert AutoCAD DXF files to GeoJSON. Some of the files I need to convert are up to 1.4 GB. Functionally, ogr2ogr does the job fine, but each of these large DXF files can take minutes to convert.

Is there an alternative to ogr2ogr that supports multiprocessing or multithreading? I imagine this would speed up the conversion significantly.

Thanks


Solution

  • If you call ogr2ogr through the GDAL Python bindings, you can use the standard Python multiprocessing options, for example to convert several files in parallel.

    A basic sample using gdal.VectorTranslate, the Python binding for ogr2ogr:

    from osgeo import gdal
    gdal.UseExceptions()  # raise Python exceptions on GDAL errors
    
    src_ds = gdal.OpenEx("input.dxf")
    options = gdal.VectorTranslateOptions(format="GeoJSON")
    ds = gdal.VectorTranslate("output.geojson", src_ds, options=options)
    ds = None  # dereference the dataset so the output is flushed and closed
    

    Note also that GeoJSON is a slow format for writing large files. GPKG, for example, would be a faster and more logical option, but I suppose there are reasons for that choice...
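
To make the multiprocessing suggestion concrete, here is a minimal sketch that runs one gdal.VectorTranslate call per file in a pool of worker processes. It assumes there are several DXF files to convert; a single large file would still be handled by one process. The helper names (output_path_for, convert_one, convert_all) and the directory layout are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def output_path_for(dxf_path, out_dir):
    """Hypothetical helper: map 'plans/site.dxf' to '<out_dir>/site.geojson'."""
    return Path(out_dir) / (Path(dxf_path).stem + ".geojson")


def convert_one(dxf_path, out_dir):
    """Convert a single DXF file to GeoJSON; runs inside a worker process."""
    # Imported here so each worker process sets up GDAL on its own.
    from osgeo import gdal
    gdal.UseExceptions()

    out_path = output_path_for(dxf_path, out_dir)
    src_ds = gdal.OpenEx(str(dxf_path))
    options = gdal.VectorTranslateOptions(format="GeoJSON")
    ds = gdal.VectorTranslate(str(out_path), src_ds, options=options)
    ds = None      # flush and close the output
    src_ds = None  # close the input
    return str(out_path)


def convert_all(dxf_paths, out_dir, max_workers=None):
    """Convert many files in parallel, one file per task."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(convert_one, p, out_dir) for p in dxf_paths]
        # result() re-raises any exception that occurred in a worker.
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Assumed layout: the input files sit in the current directory.
    files = sorted(Path(".").glob("*.dxf"))
    for out in convert_all(files, "."):
        print(out)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the script. Since each worker holds its own file in memory, it may be worth setting max_workers below the CPU count when the inputs are as large as 1.4 GB.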