I installed and configured GeoMesa using Docker containers. The versions I used for the various applications are:
To ingest a file I use the following command:
geomesa-accumulo ingest --force -i accumulo -z zookeeper -u username -p myPassword -c myCatalog -f myFeature /path/to/shapefile
I have been dealing with this error for days while trying to ingest a file.
2023-07-26 06:50:24,914 ERROR [org.locationtech.geomesa.tools.ingest.LocalConverterIngest] Fatal error running local ingest worker on /data/hdfs/file_geomesa/CD_Toscana/Gasdotto-45840-CD-Maggio-centroidi.shp
java.io.IOException: Error occurred trying to reproject data
at org.geotools.data.store.ContentFeatureSource.getReader(ContentFeatureSource.java:723)
at org.locationtech.geomesa.convert.shp.ShapefileConverter.parse(ShapefileConverter.scala:74)
at org.locationtech.geomesa.convert2.AbstractConverter.process(AbstractConverter.scala:151)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$4(LocalConverterIngest.scala:172)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$4$adapted(LocalConverterIngest.scala:168)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:575)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:573)
at org.locationtech.geomesa.utils.collection.CloseableIterator$CloseableSingleIterator.foreach(CloseableIterator.scala:85)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$3(LocalConverterIngest.scala:168)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$3$adapted(LocalConverterIngest.scala:167)
at org.locationtech.geomesa.utils.io.package$WithClose$.apply(package.scala:64)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$2(LocalConverterIngest.scala:167)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$2$adapted(LocalConverterIngest.scala:166)
at org.locationtech.geomesa.utils.io.CloseablePool$CommonsPoolPool.borrow(CloseablePool.scala:68)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.run(LocalConverterIngest.scala:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Nothing to be reprojected! (check before using wrapper)
at org.geotools.data.crs.ReprojectFeatureReader.<init>(ReprojectFeatureReader.java:152)
at org.geotools.data.crs.ReprojectFeatureReader.<init>(ReprojectFeatureReader.java:117)
at org.geotools.data.store.ContentFeatureSource.getReader(ContentFeatureSource.java:719)
... 19 more
Initially I thought the error was due to the large number of features, because on an older GeoMesa installation (version 3.1.1) ingesting the same file failed with an error that seemed quite explicitly related to the number of columns (java.lang.ArrayIndexOutOfBoundsException). Suspecting that the error on GeoMesa 4.0.1 had a different cause, I ran numerous tests, changing the environment configuration and modifying the file with other tools.
I eventually found another file with very few features that triggers the same error, which confirms that the error is not caused by the large number of features. The real surprise was that the older version of GeoMesa ingests this file correctly: it works up to GeoMesa 3.5.2 and fails with the above error from version 4.0.0 onward.
This leads me to the following question: was a bug introduced in the newer versions of GeoMesa that was not present in the older ones, or is it more likely that the newer versions require some configuration that I am missing?
Finally, I found a way around the error above: deleting the file with the .prj extension was enough. The file with few columns is now ingested correctly in GeoMesa 4.0.1, while the one with many columns fails with the same error seen in GeoMesa 3.1.1. I am not authorized to share the original shapefile, but I created one of my own with many columns that triggers the same error, and I can share it if you need it to reproduce the problem. From what I could see, the error should occur with any file that has more than 600 columns. The corresponding stack trace is as follows.
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter.writeFeature(GeoMesaFeatureWriter.scala:56)
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter.writeFeature$(GeoMesaFeatureWriter.scala:46)
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$TableFeatureWriter.writeFeature(GeoMesaFeatureWriter.scala:151)
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$GeoMesaAppendFeatureWriter.write(GeoMesaFeatureWriter.scala:239)
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$GeoMesaAppendFeatureWriter.write$(GeoMesaFeatureWriter.scala:235)
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter$$anon$3.write(GeoMesaFeatureWriter.scala:111)
at org.locationtech.geomesa.utils.geotools.FeatureUtils$.write(FeatureUtils.scala:147)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$8(LocalConverterIngest.scala:181)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$8$adapted(LocalConverterIngest.scala:179)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:575)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:573)
at org.locationtech.geomesa.utils.collection.CloseableIterator$FlatMapCloseableIterator.foreach(CloseableIterator.scala:132)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$7(LocalConverterIngest.scala:179)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$7$adapted(LocalConverterIngest.scala:173)
at org.locationtech.geomesa.utils.io.CloseablePool$CommonsPoolPool.borrow(CloseablePool.scala:68)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$6(LocalConverterIngest.scala:173)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$6$adapted(LocalConverterIngest.scala:172)
at org.locationtech.geomesa.utils.io.package$WithClose$.apply(package.scala:64)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$4(LocalConverterIngest.scala:172)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$4$adapted(LocalConverterIngest.scala:168)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:575)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:573)
at org.locationtech.geomesa.utils.collection.CloseableIterator$CloseableSingleIterator.foreach(CloseableIterator.scala:85)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$3(LocalConverterIngest.scala:168)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$3$adapted(LocalConverterIngest.scala:167)
at org.locationtech.geomesa.utils.io.package$WithClose$.apply(package.scala:64)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$2(LocalConverterIngest.scala:167)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.$anonfun$run$2$adapted(LocalConverterIngest.scala:166)
at org.locationtech.geomesa.utils.io.CloseablePool$CommonsPoolPool.borrow(CloseablePool.scala:68)
at org.locationtech.geomesa.tools.ingest.LocalConverterIngest$LocalIngestWorker.run(LocalConverterIngest.scala:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2136
at com.esotericsoftware.kryo.io.Output.writeByte(Output.java:226)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.locationtech.geomesa.features.serialization.WkbSerialization.serializeWkb(WkbSerialization.scala:44)
at org.locationtech.geomesa.features.serialization.WkbSerialization.serializeWkb$(WkbSerialization.scala:42)
at org.locationtech.geomesa.features.kryo.serialization.KryoGeometrySerialization$.serializeWkb(KryoGeometrySerialization.scala:14)
at org.locationtech.geomesa.features.kryo.impl.KryoFeatureSerialization$KryoGeometryWkbWriter$.apply(KryoFeatureSerialization.scala:229)
at org.locationtech.geomesa.features.kryo.impl.KryoFeatureSerialization.writeFeature(KryoFeatureSerialization.scala:71)
at org.locationtech.geomesa.features.kryo.impl.KryoFeatureSerialization.serialize(KryoFeatureSerialization.scala:43)
at org.locationtech.geomesa.features.kryo.impl.KryoFeatureSerialization.serialize$(KryoFeatureSerialization.scala:41)
at org.locationtech.geomesa.features.kryo.KryoFeatureSerializer$MutableActiveSerializer.serialize(KryoFeatureSerializer.scala:75)
at org.locationtech.geomesa.index.api.WritableFeature$FeatureLevelWritableFeature.$anonfun$values$2(WritableFeature.scala:153)
at org.locationtech.geomesa.index.api.package$KeyValue.value$lzycompute(package.scala:183)
at org.locationtech.geomesa.index.api.package$KeyValue.value(package.scala:183)
at org.locationtech.geomesa.accumulo.data.AccumuloIndexAdapter$AccumuloIndexWriter.$anonfun$write$1(AccumuloIndexAdapter.scala:397)
at org.locationtech.geomesa.accumulo.data.AccumuloIndexAdapter$AccumuloIndexWriter.$anonfun$write$1$adapted(AccumuloIndexAdapter.scala:396)
at scala.collection.immutable.Vector.foreach(Vector.scala:1895)
at org.locationtech.geomesa.accumulo.data.AccumuloIndexAdapter$AccumuloIndexWriter.write(AccumuloIndexAdapter.scala:396)
at org.locationtech.geomesa.index.api.IndexAdapter$BaseIndexWriter.write(IndexAdapter.scala:153)
at org.locationtech.geomesa.index.geotools.GeoMesaFeatureWriter.writeFeature(GeoMesaFeatureWriter.scala:50)
... 34 more
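For reference, the .prj workaround I described before this stack trace can be scripted so that the file is renamed rather than deleted, which makes it easy to restore. This is only a minimal sketch: set_aside_prj and the ".bak" suffix are names I made up for illustration, and it assumes the sidecar file uses a lowercase .prj extension next to the .shp file.

```python
from pathlib import Path
from typing import Optional


def set_aside_prj(shapefile: str, suffix: str = ".bak") -> Optional[Path]:
    """Rename the .prj sidecar of `shapefile` instead of deleting it.

    Returns the path of the renamed file, or None if no .prj file exists.
    (Hypothetical helper, not part of GeoMesa.)
    """
    # Same directory and base name as the .shp file, with a .prj extension.
    prj = Path(shapefile).with_suffix(".prj")
    if not prj.exists():
        return None
    backup = prj.with_name(prj.name + suffix)
    prj.rename(backup)
    return backup
```

Calling set_aside_prj("/path/to/shapefile.shp") before running the ingest command would leave a shapefile.prj.bak file that can be renamed back afterwards.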
From a quick look at the code, it seems that the reprojection code thinks either that you don't have any geometry columns or that they are already in the correct projection, and so concludes there is nothing for it to do.
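To check the "already in the correct projection" possibility, you could look at what the .prj file actually declares. The sketch below is only an illustration: describe_prj is a hypothetical helper, it assumes the .prj contains WKT text whose outermost element is GEOGCS or PROJCS, and it reports nothing beyond that element's kind and name. As far as I know GeoMesa stores geometries in EPSG:4326, so a GEOGCS entry naming WGS 84 would suggest the data is already in the target projection.

```python
import re
from typing import Optional, Tuple


def describe_prj(wkt: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (kind, name) of the outermost CRS element in a .prj WKT string.

    kind is "GEOGCS" (geographic) or "PROJCS" (projected); both values are
    None when the text does not start with either keyword.
    """
    match = re.match(r'\s*(GEOGCS|PROJCS)\s*\[\s*"([^"]*)"', wkt)
    if match is None:
        return None, None
    return match.group(1), match.group(2)


# Usage (the path is a placeholder):
# from pathlib import Path
# print(describe_prj(Path("/path/to/shapefile.prj").read_text()))
```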
An ArrayIndexOutOfBoundsException would not be related to the number of features, as GeoTools never reads the whole file into memory. It is more likely to be a mismatch between the expected and observed number of attributes, but without the actual error log it's hard to say for sure.
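If you want to see how many attributes the shapefile itself declares, you can count the field descriptors in the .dbf header and compare that with what you expect. This is a rough sketch under assumptions: dbf_field_count is a hypothetical helper, and it assumes the usual dBASE layout used by shapefiles (a 32-byte file header with the header length at byte offset 8, followed by 32-byte field descriptors ending in a 0x0D terminator byte).

```python
import struct
from typing import BinaryIO


def dbf_field_count(stream: BinaryIO) -> int:
    """Count the attribute (field) descriptors declared in a .dbf header."""
    header = stream.read(32)
    if len(header) < 32:
        raise ValueError("truncated .dbf header")
    # Bytes 8-9: total header length in bytes, little-endian unsigned short.
    (header_len,) = struct.unpack_from("<H", header, 8)
    count = 0
    # Walk the 32-byte descriptors until the 0x0D terminator or end of header.
    while 32 + 32 * count < header_len:
        descriptor = stream.read(32)
        if not descriptor or descriptor[0] == 0x0D:
            break
        count += 1
    return count


# Usage (the path is a placeholder):
# with open("/path/to/shapefile.dbf", "rb") as f:
#     print(dbf_field_count(f))
```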