[postgis-users] Raster time-series data

Tom Cook tom.k.cook at gmail.com
Wed Aug 31 09:46:08 PDT 2016


I'm trying to import (what I think of as) a large time series of gridded data
into PostGIS 2.0 (on PostgreSQL).  The data comes as HDF-EOS files (basically
HDF4).  Each file has the whole grid, which is 22680 points, and 24 bands,
one for each hour of the day.  Each file covers one day, and I'm trying to
import 16 years (5,844 files).

My strategy at present is to put each day into a raster with 24 bands,
really because this is the easiest to implement.  So for the first file:

raster2pgsql -I -t auto -c 'HDF4_EOS:EOS_GRID:"file1.hdf":EOSGRID:SWGDN' \
    swgdn | psql
psql -c "alter table swgdn add raster_date date;"
psql -c "update swgdn set raster_date = '20000101';"

And then for subsequent files:

raster2pgsql -I -t auto -a 'HDF4_EOS:EOS_GRID:"fileX.hdf":EOSGRID:SWGDN' \
    swgdn | psql
psql -c "update swgdn set raster_date = 'XXXXXXXX' where raster_date is null;"
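
To make the per-file steps concrete, here is a minimal dry-run sketch of what the import script does for each file.  It only prints the commands rather than running them.  The helper name emit_import and its three arguments are hypothetical, chosen for illustration; how the date is derived from each file name is not shown here and is left to the caller.

```shell
# Dry-run sketch: print the commands for one daily file instead of running them.
# $1 = HDF file name, $2 = date as YYYYMMDD, $3 = "-c" (first file) or "-a" (later files).
# emit_import is a hypothetical helper name, not part of PostGIS.
emit_import() {
    file=$1 day=$2 mode=$3
    # Load the 24-band raster for this day into the swgdn table.
    printf '%s\n' "raster2pgsql -I -t auto $mode 'HDF4_EOS:EOS_GRID:\"$file\":EOSGRID:SWGDN' swgdn | psql"
    # The date column only needs to be added once, right after the table is created.
    if [ "$mode" = "-c" ]; then
        printf '%s\n' "psql -c \"alter table swgdn add raster_date date;\""
    fi
    # Tag the rows that were just loaded (the only ones with a null date).
    printf '%s\n' "psql -c \"update swgdn set raster_date = '$day' where raster_date is null;\""
}
```

For example, `emit_import file1.hdf 20000101 -c` prints the three commands for the first file, and piping that output to `sh` would run them.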

In a word, it's slow.  I've so far been running the import script for about
an hour and it's processed 146 input files.  I can't really quantify this,
but it feels like it's getting slower.
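
As a rough projection from the figures above, and assuming the rate stays constant (which it may not, if the import really is slowing down), the total time works out as follows.  The variable names are only for illustration.

```shell
# Rough projection from the figures in this message, assuming a constant rate.
files_total=5844      # 16 years of daily files
files_done=146        # files processed so far
minutes_elapsed=60    # "about an hour"

# Integer arithmetic: total minutes if every file takes as long as the first 146 did.
minutes_total=$(( files_total * minutes_elapsed / files_done ))
hours_total=$(( minutes_total / 60 ))

echo "projected import time: about ${hours_total} hours"
```

That is a lower bound if the per-file time keeps growing.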

Is this a reasonable strategy for storing this data, and are my performance
expectations just unrealistic?  Or is there a better structure to use for
this?

Thanks for any suggestions,
Tom