I am using the rgeo Ruby library to parse GeoJSON polygons. When decode is called on a polygon containing duplicate consecutive points, it returns nil, as in the following example:
geom = {:geom=>{"type"=>"Polygon", "coordinates"=>[[[-82.5721, 28.0245], [-82.5721, 28.0245] ... }
geo_factory = RGeo::Cartesian.factory(:srid => 4326)
rgeo_geom = RGeo::GeoJSON.decode(geom, json_parser: :json, geo_factory: geo_factory)
Due to the repeated point at the beginning, rgeo_geom will be nil after this code is executed.
What is the most efficient way to clean this polygon? Is there a built in rgeo feature or should I roll my own?
To be clear, I would like to remove only consecutive duplicate points, as this is what causes the library to return nil for the above code. I am also not looking for in-database solutions such as PostGIS ST_RemoveRepeatedPoints, but am essentially looking for this behavior executed in Ruby.
I'm not familiar with rgeo, but from a pure Ruby standpoint I would think you could do the following.
h = { :geom=>{
        "type"=>"Polygon",
        "coordinates"=>[
          [-80.1234, 28.1234], [-82.5721, 28.0245], [-82.5721, 28.0245],
          [-83.1234, 29.1234], [-82.5721, 28.0245], [-83.1234, 29.1234],
          [-83.1234, 29.1234], [-83.1234, 29.1234]
        ]
      }
    }
The question shows "coordinates"=>[[[-82.5721, 28.0245],... with no right bracket matching the middle left bracket. I've assumed there should be only two left brackets. If that is not the case, my answer would have to be modified.
The following does not mutate h. To show that's true, first compute the hash of h.
hhash = h.hash
#=> -4413716877847662410
h.merge({ :geom=>(h[:geom].merge("coordinates"=>
h[:geom]["coordinates"].chunk_while(&:==).map(&:first))) })
#=> { :geom=>{
# "type"=>"Polygon",
# "coordinates"=>[
# [-80.1234, 28.1234], [-82.5721, 28.0245], [-83.1234, 29.1234],
# [-82.5721, 28.0245], [-83.1234, 29.1234]
# ]
# }
# }
h.hash == hhash
#=> true
See Hash#merge, Enumerable#chunk_while and Enumerable#map.
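If the question's data really does have the standard GeoJSON nesting (three left brackets, i.e. "coordinates" is an array of rings), the same chunk_while idea can be applied per ring. A sketch under that assumption, not tested against rgeo itself:

```ruby
# Hypothetical polygon with the standard GeoJSON nesting: "coordinates"
# is an array of rings, each ring an array of [lon, lat] points.
geom = {
  "type" => "Polygon",
  "coordinates" => [
    [
      [-82.5721, 28.0245], [-82.5721, 28.0245],  # consecutive duplicate
      [-83.1234, 29.1234], [-80.1234, 28.1234],
      [-82.5721, 28.0245]                        # closing point
    ]
  ]
}

# Collapse runs of consecutive equal points within each ring,
# keeping the first point of each run.
cleaned = geom.merge(
  "coordinates" => geom["coordinates"].map do |ring|
    ring.chunk_while(&:==).map(&:first)
  end
)

p cleaned["coordinates"].first
#=> [[-82.5721, 28.0245], [-83.1234, 29.1234],
#    [-80.1234, 28.1234], [-82.5721, 28.0245]]
```

Because only consecutive duplicates are collapsed, the ring's closing point (which legitimately repeats the first point) is preserved. The cleaned hash can then be handed to RGeo::GeoJSON.decode as in the question.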