I have an output from an ArcPy arcpy.Union_analysis operation and for the overlap polygons that were created I need to select out the oldest feature (along with the attributes). The result is 75740 rows.
Normally, I would just use FindIdentical_management, but I found the results unreliable for 10.2.2 (and I can't change the version I'm on). After rolling my own in arcpy.da and not getting the performance I wanted, I am trying to select out via a PostGIS query, using this answer to Delete duplicate geometry in postgis tables as template.
While this is a big improvement, I am not as experienced with PostGIS + window functions and I think there might be a faster approach?
So far, I have:
-- dtime , numeric lower values are olderWITH unique_geoms ( rowid ,geom )AS ( SELECT row_number() OVER ( PARTITION BY geom ORDER BY dtime ASC ) AS rowid ,geom ,oid ,dtime ,dval FROM x_dupes )SELECT rowid ,geom ,oid ,dtime ,dvalFROM unique_geomsWHERE rowid=1;Explain: http://explain.depesz.com/s/9cV
"CTE Scan on unique_geoms (cost=10312.46..12016.61 rows=379 width=104) (actual time=701.221..2049.404 rows=40389 loops=1)"" Filter: (rowid = 1)"" Rows Removed by Filter: 35351"" CTE unique_geoms"" -> WindowAgg (cost=8797.66..10312.46 rows=75740 width=154) (actual time=701.200..1577.202 rows=75740 loops=1)"" -> Sort (cost=8797.66..8987.01 rows=75740 width=154) (actual time=701.171..970.353 rows=75740 loops=1)"" Sort Key: x_dupes.geom, x_dupes.dtime"" Sort Method: quicksort Memory: 24671kB"" -> Seq Scan on x_dupes (cost=0.00..2659.40 rows=75740 width=154) (actual time=0.011..272.713 rows=75740 loops=1)"http://explain.depesz.com/s/9cV
PostGIS Version: USE_GEOS=1 USE_PROJ=1 USE_STATS=1
أكثر...
Normally, I would just use FindIdentical_management, but I found the results unreliable for 10.2.2 (and I can't change the version I'm on). After rolling my own in arcpy.da and not getting the performance I wanted, I am trying to select out via a PostGIS query, using this answer to Delete duplicate geometry in postgis tables as template.
While this is a big improvement, I am not as experienced with PostGIS + window functions and I think there might be a faster approach?
So far, I have:
-- dtime , numeric lower values are olderWITH unique_geoms ( rowid ,geom )AS ( SELECT row_number() OVER ( PARTITION BY geom ORDER BY dtime ASC ) AS rowid ,geom ,oid ,dtime ,dval FROM x_dupes )SELECT rowid ,geom ,oid ,dtime ,dvalFROM unique_geomsWHERE rowid=1;Explain: http://explain.depesz.com/s/9cV
"CTE Scan on unique_geoms (cost=10312.46..12016.61 rows=379 width=104) (actual time=701.221..2049.404 rows=40389 loops=1)"" Filter: (rowid = 1)"" Rows Removed by Filter: 35351"" CTE unique_geoms"" -> WindowAgg (cost=8797.66..10312.46 rows=75740 width=154) (actual time=701.200..1577.202 rows=75740 loops=1)"" -> Sort (cost=8797.66..8987.01 rows=75740 width=154) (actual time=701.171..970.353 rows=75740 loops=1)"" Sort Key: x_dupes.geom, x_dupes.dtime"" Sort Method: quicksort Memory: 24671kB"" -> Seq Scan on x_dupes (cost=0.00..2659.40 rows=75740 width=154) (actual time=0.011..272.713 rows=75740 loops=1)"http://explain.depesz.com/s/9cV
PostGIS Version: USE_GEOS=1 USE_PROJ=1 USE_STATS=1
أكثر...