Tips and Tricks: Managing GeoWebCache via ReST
Dear Readers,
Today we would like to talk about our experience with GeoWebCache and how we use it in some of our deployments. Before moving on, we would like to remind you about this post we wrote a couple of years ago that contains a tweaks checklist for GeoWebCache; if you have not read it yet, we recommend doing so, as it is quite useful.
GeoWebCache is a powerful way to speed up your Web Map Services. By not rendering the same data over and over again, you can make your infrastructure much more responsive and reduce the load on your servers while increasing the throughput of your map services. It is also worth pointing out that you can reduce the response time of individual requests by an order of magnitude, or even more for complex rendering operations.
In case you wonder, GeoWebCache does no magic to achieve such improvements: it simply caches, on disk or in memory, the response to every mapping request that matches its configuration (in terms of layer, style, coordinate reference system, etc.) and reuses it to serve subsequent identical requests without hitting the rendering pipeline of your web map service. To do this, it requires your clients to use specific protocols that offer fewer degrees of freedom than pure WMS, in order to maximize the cacheability of requests; examples of supported protocols are OSGeo TMS and OGC WMTS. A complete discussion of what tiled mapping is falls outside the scope of this post, so we assume you know the basics and are interested in getting some tips about GeoWebCache (a.k.a. GWC).
Like GeoServer, GWC provides a ReST API that exposes features such as seeding and truncating, which let you populate, update and delete the cache in batches programmatically. This ReST API is extremely useful in an automated data publishing environment, since you can easily automate tasks like:
- Seeding tiles for new data published (even for time series)
- Deleting obsolete tiles (e.g. after an update)
- Reseeding tiles after updates (by zone or other parameters)
- Reseeding only the area with updates
Let us now provide an example using curl where we seed a layer that covers the entire world in EPSG:4326 for levels 1 to 6 (this is a typical situation when serving a background layer).
We can do it like this:
curl -u admin:password -XPOST -i \
  -H "Content-type: text/xml" -H "Accept: text/xml" \
  -d "<seedRequest> <name>topp:states</name> <type>seed</type> <gridSetId>epsg:4326</gridSetId> <bounds> <coords> <double>-180</double> <double>-90</double> <double>180</double> <double>90</double> </coords> </bounds> <zoomStart>1</zoomStart> <zoomStop>6</zoomStop> <format>image/png8</format> <threadCount>06</threadCount> </seedRequest>" \
  "http://localhost:8080/geoserver/gwc/rest/seed/topp:states.xml"
In this example we use the XML body to pass the options to GWC, which will start 6 threads for this seeding operation. We believe the parameters are fairly self-explanatory; however, you can find more information at the following link:
http://docs.geoserver.org/stable/en/user/geowebcache/rest/index.html
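If you issue this kind of request often, it can help to wrap it in a couple of small shell functions. The following is only a minimal sketch, not part of our scripts: the function names, the GWC_URL and GWC_AUTH variables and their default values are assumptions made for this example, and the request always covers the whole world. It posts to the per-layer seed endpoint (seed/<layer>.xml) described in the documentation linked above.

```shell
#!/bin/sh
# Sketch: build a GWC seedRequest body and POST it with curl.
# GWC_URL and GWC_AUTH are placeholder values; adjust them to your deployment.
GWC_URL="${GWC_URL:-http://localhost:8080/geoserver/gwc/rest}"
GWC_AUTH="${GWC_AUTH:-admin:password}"

# build_seed_request <layer> <gridset> <zoomStart> <zoomStop> [type] [threads]
# Prints the XML body for a whole-world request in geographic coordinates.
build_seed_request() {
  printf '<seedRequest><name>%s</name><type>%s</type><gridSetId>%s</gridSetId>' \
    "$1" "${5:-seed}" "$2"
  printf '<bounds><coords><double>-180</double><double>-90</double>'
  printf '<double>180</double><double>90</double></coords></bounds>'
  printf '<zoomStart>%s</zoomStart><zoomStop>%s</zoomStop>' "$3" "$4"
  printf '<format>image/png8</format><threadCount>%s</threadCount></seedRequest>\n' \
    "${6:-1}"
}

# seed_layer <layer> <gridset> <zoomStart> <zoomStop> [type] [threads]
# Sends the body on stdin (-d @-) to <GWC_URL>/seed/<layer>.xml.
seed_layer() {
  build_seed_request "$@" |
    curl -sS -f -u "$GWC_AUTH" -XPOST -H "Content-type: text/xml" \
      -d @- "$GWC_URL/seed/$1.xml"
}
```

With these functions loaded, "seed_layer topp:states epsg:4326 1 6 seed 6" would send the same request as the curl command above, and passing truncate as the fifth argument would ask GWC to delete the tiles instead.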
Here at GeoSolutions we use GWC a lot to speed up map serving for our clients, so we wrote a few bash scripts to automate certain operations (they are available here). The main script is gwc.sh, which runs various ReST operations against a single server. This is the usage guide:
usage: gwc.sh <type> <layerName> <gridsetName> [options]
usage: gwc.sh masstruncate <layerName> [options]
- type: the operation type. One of "seed", "reseed", "truncate", "masstruncate"
- layerName: the name of the layer
- gridsetName: the name of the gridSet (mandatory, except for the masstruncate type)
e.g. gwc.sh seed layer epsg:4326
This script launches seeding and truncate tasks for GeoWebCache
OPTIONS:
-a | --auth : Administrator credentials (user:password)
-b | --bounds : the bounds in this format: -b minX minY maxX maxY
-f | --format : format (default image/png8)
-p | --parameter : -p name value (allow multiple values)
-t | --threadCount : default 01
-zs | --zoomStart : lower zoom level to seed/truncate (default 00)
-ze | --zoomEnd : greater zoom level to seed/truncate (default 04)
-v | --verbose : show debug messages
-u | --url : GWC rest url
-h | --help : display this message
Using this script, we can perform the same operation as in the initial example with the following command line:
./gwc.sh seed topp:states epsg:4326 \
  -b -180 -90 180 90 \
  -zs 1 \
  -ze 6
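Before launching a full-extent seed like this one, it is worth estimating how many tiles it will produce. The sketch below assumes the default EPSG:4326 gridset layout, where level 0 is made of 2 x 1 tiles and every further level doubles both dimensions; the count_tiles function is ours, written just for this example, and it ignores metatiling.

```shell
# Estimate the number of tiles in a whole-world seed of an EPSG:4326 gridset.
# Assumption: level 0 has 2 x 1 tiles and each level doubles both dimensions.
# count_tiles <zoomStart> <zoomStop>
count_tiles() {
  z=$1
  total=0
  while [ "$z" -le "$2" ]; do
    cols=$(( 2 << z ))                 # 2 * 2^z columns at level z
    rows=$(( 1 << z ))                 # 2^z rows at level z
    total=$(( total + cols * rows ))
    z=$(( z + 1 ))
  done
  echo "$total"
}

count_tiles 1 6   # prints 10920
```

Keep in mind that this figure applies to each combination of format, style and other parameters that you seed, so the number of cached tiles grows quickly with the number of variants.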
Another useful example is seeding a newly added time granule in a time series mosaic. You can pass parameters such as style, time, etc. to the script using the option -p <PARAM_NAME> <PARAM_VALUE>:
./gwc.sh seed topp:states epsg:4326 -p TIME 2015-07-12T18:00:00.0Z
In the same way, you can truncate obsolete data using the truncate operation (e.g. removing old granules from a raster time series):
./gwc.sh truncate topp:states epsg:4326 -p TIME 2001-12-12T18:00:00.0Z
You can also run the script with multiple parameters, as in the following example where we also seed a specific style in a vector time series:
./gwc.sh seed topp:states epsg:4326 -p TIME 2015-07-12T18:00:00.0Z -p STYLES polygon
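If you talk to the ReST API directly instead of going through the script, name/value pairs like these go into a parameters element of the seed request, one entry per pair, as described in the GeoServer documentation. The helper below is a hypothetical sketch of how such a fragment could be assembled; we are not claiming that gwc.sh builds it in exactly this way.

```shell
# build_parameters NAME VALUE [NAME VALUE ...]
# Prints a <parameters> element with one <entry> per name/value pair.
# Values are inserted as-is: XML-escape them first if they contain <, > or &.
build_parameters() {
  printf '<parameters>'
  while [ "$#" -ge 2 ]; do
    printf '<entry><string>%s</string><string>%s</string></entry>' "$1" "$2"
    shift 2
  done
  printf '</parameters>\n'
}

build_parameters TIME 2015-07-12T18:00:00.0Z STYLES polygon
```

The resulting fragment is meant to be placed inside the seedRequest element, next to the name, gridSetId and zoom level elements.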
This is only a sample of what you can do; for instance, you can schedule seeding operations using cron, truncate or seed only specific areas of a layer, automate the publishing of time series by removing the old caches and seeding the newly published data, and so on.
You can find a higher-level example of using this script in a publishing infrastructure in the clearcache.sh script. The objective of this script is to truncate the local cache of a list of GWC instances (this is a fairly special use case in which we duplicate the cache on each node of a clustered environment rather than sharing it; we usually prefer this over distributed storage for performance reasons).
To play with it, maintain the list of layers to work on in a file (e.g. updated_layers.txt) and run the script to clear the cache:
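As an illustration of the scheduling case, here is a minimal sketch of a wrapper that cron could call. The nightly_reseed function, the /opt/gwc paths, the log file and the zoom range are placeholders chosen for this example; only the gwc.sh options come from the usage guide above.

```shell
#!/bin/sh
# nightly_reseed.sh (hypothetical): reseed a list of layers in EPSG:4326.
# GWC_SH is the path to gwc.sh; /opt/gwc/gwc.sh is a placeholder.
GWC_SH="${GWC_SH:-/opt/gwc/gwc.sh}"

# nightly_reseed <layer> [<layer> ...]
nightly_reseed() {
  for layer in "$@"; do
    "$GWC_SH" reseed "$layer" epsg:4326 -zs 0 -ze 6 ||
      echo "reseed failed for $layer" >&2
  done
}

# Example crontab entry (every day at 02:00), assuming this file ends with a
# call such as: nightly_reseed topp:states
# 0 2 * * * /opt/gwc/nightly_reseed.sh >> /var/log/gwc-reseed.log 2>&1
```

Failures are reported on standard error without stopping the loop, so one problematic layer does not prevent the others from being reseeded.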
cat updated_layers.txt | ./clearcache.sh
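To give an idea of the logic involved, the following is a simplified sketch of such a loop. It is not the actual script: the clear_cache function, the GWC_INSTANCES variable and the node URLs are assumptions made for this example.

```shell
#!/bin/sh
# Sketch of a cache-clearing loop over several GWC instances.
# GWC_SH and GWC_INSTANCES are placeholders for this example.
GWC_SH="${GWC_SH:-./gwc.sh}"
GWC_INSTANCES="${GWC_INSTANCES:-http://node1:8080/geoserver/gwc/rest http://node2:8080/geoserver/gwc/rest}"

# Reads layer names from stdin, one per line, and truncates each layer
# on every GWC instance listed in GWC_INSTANCES.
clear_cache() {
  while IFS= read -r layer; do
    [ -n "$layer" ] || continue        # skip blank lines
    for url in $GWC_INSTANCES; do      # unquoted on purpose: split on spaces
      # </dev/null keeps the command from consuming the list of layers
      "$GWC_SH" masstruncate "$layer" -u "$url" </dev/null ||
        echo "truncate failed: $layer on $url" >&2
    done
  done
}
```

It would be used in the same way as the script, for example "cat updated_layers.txt | clear_cache".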
A more complex script could read both the layers and the parameters to clear from a file, for instance to update the tiles for a particular time (e.g. updates to forecasts).
In a future blog post we will discuss in more detail the various deployment layouts for GWC and why we choose one or the other. It would also be interesting to know more about your own experience and tricks, so if you want, let us know.
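A possible shape for such a script is sketched below. The three-column file format (layer, gridset, time) and the refresh_times function are hypothetical, invented for this example; each line triggers a truncate followed by a seed for the given TIME value.

```shell
#!/bin/sh
# Sketch: refresh the cached tiles for specific times listed on stdin.
# Each input line: <layer> <gridset> <time>, for example
#   topp:states epsg:4326 2015-07-12T18:00:00.0Z
GWC_SH="${GWC_SH:-./gwc.sh}"   # placeholder path to gwc.sh

refresh_times() {
  while read -r layer gridset tstamp; do
    case "$layer" in ''|'#'*) continue ;; esac   # skip blanks and comments
    # Drop the stale tiles first, then seed again only if that succeeded.
    "$GWC_SH" truncate "$layer" "$gridset" -p TIME "$tstamp" </dev/null &&
      "$GWC_SH" seed "$layer" "$gridset" -p TIME "$tstamp" </dev/null
  done
}
```

A file with one line per updated granule could then be piped into it, for example "cat updated_times.txt | refresh_times", where updated_times.txt is again just a placeholder name.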
The GeoSolutions Team,