[postgis-users] Delete duplicated records

Gregory Williamson Gregory.Williamson at digitalglobe.com
Mon Jun 18 20:26:36 PDT 2007


Alvaro asked:

> Hello all.
> 
> Hope somebody could help me. I have edited a huge amount of data into a
> table and suddenly I have realized that there is quite a lot of
> duplicated data (I mean duplicated entities: Linestrings) into de table.
> Does anybody could give a hint in how could I delete this duplicated
> geometries/records?.
> 
> Thanks a lot.
> 

If you have OIDs in the table you can use them to distinguish otherwise identical records.

If you don't have them, you might be able to use a variant of the code below, which I used recently to remove duplicate rows after someone managed to do a partial re-load ... in these cases only a small number of rows (~1-5%) were duplicates.

In this case, gid is a int key. The db in question was in light use so I put the core operations inside a transaction ... if you're the only user you might still want to do so as a safety belt, but consistancy issues with other users wouldn't be a problem.

HTH,

Greg Williamson
Senior DBA
Globexplorer, a DigitalGlobe company

-- sql for 8.1.x follows

create table foo as select g_id,count(g_id) as count from data_table group by 1 order by 2 desc,1;
delete from foo where count < 2;
select * from foo where count > 2; -- nothing we hope!
create table foowrk as select * from data_table where g_id in (select g_id from foo) order by g_id;
alter table foowrk add serid serial;
update foowrk set serid = nextval('foowrk_serid_seq');
delete from foowrk where (serid % 2) = 0;
create table data_table_bu as select * from data_table order by g_id;
alter table foowrk drop serid;
begin;
delete from data_table where g_id in (select g_id from foowrk);
insert into data_table (<your column list here>) select <your column list here> from foowrk;
reindex table data_table;
commit;
vacuum verbose analyze data_table;
drop table foo;
drop table foowrk;
-- drop the data_table_bu when all is well.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20070618/4379d24e/attachment.html>


More information about the postgis-users mailing list