Georeferencing QGIS output rasters

Some of the design packages I rely on use very crude GIS facilities. In fact, all they can support is a georeferenced raster as a background image, so it’s more of a rough map than GIS. It helps if these rasters are at a decent resolution, typically 1m/pixel or better.

A while back, I asked on the QGIS forum if the package could output high resolution georeferenced rasters. I received a rather terse response that no, it couldn’t (and I inferred from the tone that the poster thought that it shouldn’t, and I was wrong to want such a thing). I shelved the idea at the time.

After having to fix a lot of paths in a QGIS project file I’d moved to a new system, I noticed that all the map composer attributes are rather neatly defined in the XML file structure. Some messing around with Perl, XML::Simple and Data::Dumper::Simple, and I had a little script that would spit out an ESRI World File for the map composer raster defined in the project.

To run this, you have to create a project with just one Print Composer page, save the composed map as an image, save the project, then run the script like this:

./geoprint project.qgs > image.pngw

There are some caveats:

  • This probably won’t work for projects with multiple print composers
  • It doesn’t quite get the scale right, but it’s within a pixel or so. I may not have corrected for image borders.

Though there’s some fairly hideous XML-mungeing in the code, what the script does is entirely trivial. If you feel you can use it, good; if you feel you can improve it, be my guest.

#!/usr/bin/perl -w
# geoprint - georef a QGIS output image by creating a world file
# one arg: qgis project file (xml)
# $Id:,v 1.3 2012/04/06 03:32:01 scruss Exp $

use strict;
use XML::Simple;
use constant MM_PER_INCH => 25.4;

my $qgis = $ARGV[0];
die "$qgis must exist\n" unless ( -f $qgis );

my $q = XMLin($qgis) or die "$!\n";
my %composer = %{ $q->{Composer} };
my $image_width =
  int( $composer{Composition}->{paperWidth} *
    $composer{Composition}->{printResolution} /
    MM_PER_INCH );
my $image_height =
  int( $composer{Composition}->{paperHeight} *
    $composer{Composition}->{printResolution} /
    MM_PER_INCH );

# we need xpixelsize, ypixelsize, ulx and uly
my $xpixelsize =
  ( $composer{ComposerMap}->{Extent}->{xmax} -
    $composer{ComposerMap}->{Extent}->{xmin} ) /
  int( $composer{ComposerMap}->{ComposerItem}->{width} *
    $composer{Composition}->{printResolution} /
    MM_PER_INCH );
my $ypixelsize =
  -1.0 *
  ( $composer{ComposerMap}->{Extent}->{ymax} -
    $composer{ComposerMap}->{Extent}->{ymin} ) /
  int( $composer{ComposerMap}->{ComposerItem}->{height} *
    $composer{Composition}->{printResolution} /
    MM_PER_INCH );
my $ulx =
  $composer{ComposerMap}->{Extent}->{xmin} -
  $xpixelsize *
  int( $composer{ComposerMap}->{ComposerItem}->{x} *
    $composer{Composition}->{printResolution} /
    MM_PER_INCH ) - $xpixelsize;
my $uly =
  $composer{ComposerMap}->{Extent}->{ymax} -
  $ypixelsize *
  int( $composer{ComposerMap}->{ComposerItem}->{y} *
    $composer{Composition}->{printResolution} /
    MM_PER_INCH ) - $ypixelsize;

printf( "%.12f\n%.12f\n%.12f\n%.12f\n%.12f\n%.12f\n",
  $xpixelsize, 0.0, 0.0, $ypixelsize, $ulx, $uly );

# FIXME? pixel scale seems a tiny bit off - allow for border?
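
To check what the script produced, it helps to see how the six numbers in a world file are used. What follows is a minimal Python sketch, not part of the original script. The line order (x pixel size, two rotation terms, negative y pixel size, then the x and y of the upper-left pixel) is the standard world file layout. The function names and the sample values are invented for illustration.

```python
def parse_world_file(text):
    """Return the six world file numbers in file order: A, D, B, E, C, F."""
    values = [float(v) for v in text.split()]
    if len(values) != 6:
        raise ValueError("a world file holds exactly six numbers")
    return tuple(values)


def pixel_to_map(params, col, row):
    """Convert a pixel (column, row) to map coordinates."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Invented example: 1 m pixels, no rotation, upper-left pixel at (500000, 4900000)
sample = "1.0\n0.0\n0.0\n-1.0\n500000.0\n4900000.0\n"
params = parse_world_file(sample)
print(pixel_to_map(params, 10, 20))
```

Feeding it the corner pixels of the saved image and comparing the results with the map extent in QGIS is a quick way to see how far off the scale really is.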

image georeferencing with QGIS

Quantum GIS (QGIS) has a very powerful image georeferencing module. What that allows you to do is convert screen pixels to a map of an area. The pixels could be from a scanned map or from a screen image. Scanned maps can sometimes be a little distorted, but QGIS’s georeferencer can handle that, within limits.

For an example, say I live in Redickville, ON (I don’t, but some folks do). I’ve heard from North Dufferin Agricultural & Community Taskforce (NDACT) that a large quarry is planned in my area. How close am I going to be to it?

NDACT had a helpful map (which I’ve kept here so you can work through this: MelancthonAerial), but it has no scale. It’s clearly derived from something like Google Maps, so it’s not exactly a free image I can throw about. One thing about georeferencing is that both the source image and the map from which you take your control points affect the licensing of the final map. You’re ending up with a derived work.

There are lots of sources of coordinates for your control points. You could always use Google Maps, but then you’re well down the derived work sinkhole. GeoGratis has a bunch of good data sources, and they are free to use. I’m going to use Toporama’s digital image maps, as they’re clear and fairly accurate.

Redickville is on Toporama sheet 041A01. I’ve downloaded it as UTM, as it’s easier to measure distances that way. You want to set up your project so it uses the coordinate reference system (CRS) that the control point map uses. Toporama uses EPSG 26917 in the area (easily checked with gdalinfo; it’ll come up with something like AUTHORITY["EPSG","26917"]]), so you should set the project to use that CRS:

And here’s the raster map loaded into QGIS;

Opening up the georeferencer plugin (which is now in the Raster menu, as of QGIS 1.8) gives you a whole lot of blank:

If we open the raster, you can zoom into the area where you want to put control points. You want the highest zoom at which the map’s still clear, as the accuracy of your final map depends on how well you placed control points. I’ve chosen road intersections as my control points. Here are the ones I’m going to use:

Point   E-W Road        N-S Road        Note
1       County Rd 21    2nd Line W      Honeywood
2       County Rd 21    4th Line       
3       20th Side Rd    5th Line       
4       15th Side Rd    County Rd 124   just S of where 124 straightens
5       15th Side Rd    Mulmur Townline
6       4th Line        5th Line        4th Line really runs SE-NW

These intersections are fairly well spaced apart, and are clear on both maps. So I choose a pixel on the map in the georeferencer, and a dialogue comes up:

(If you’ve downloaded the MelancthonAerial archive from here, the georeferencing points should load automatically from the “Melancthon Aerial July 22 09.png.points” file. If you want to go through the exercise of manually adding reference points, delete the points file.)

Here I’ve already clicked on the corresponding point on the Toporama map, and the coordinates have been filled in. If you know the coordinates, you can enter them in the boxes. Once you click OK, you have your first control point:

We can add the rest one by one. QGIS’s “Zoom Previous” is really useful for flipping between scales on both the plugin window and the main map. It’s probably a good idea to “Save GCP Points As …” every now and again so you don’t lose your work. Here are all of the points in the georeferencer plugin:

Now you want to modify the Transformation Settings; it’s the little wrench/spanner in the toolbar:

There are lots of options here:

  • Transformation Type: Linear is useful if you’re just adding georeference data to a map that’s already computer generated. Helmert is a simple shift/scale/rotate transformation. The various Polynomial types will correct more gross distortions, but need many control points and can be processor intensive. Thin Plate Spline will distort your map locally to match GCPs; this can work well if your map’s a bit “vernacular”, but if your control points are wrong, your map will end up hilariously melty.
  • Resampling Method: This controls how the output pixels are calculated. Nearest Neighbour is quick and blocky, but useful if you’re mapping an image that has sharp transitions. Cubic applies some smoothing, at the cost of some detail loss and a fair bit of processing power. The other options can look nice, but eat CPU. It’s worth experimenting with several options here, as there’s no one solution for all maps.
  • Compression: This controls the file size of the output GeoTiff file. JPEG and Deflate can result in small files, but there’s a chance that other GIS systems can’t read the data. Note that JPEG is lossy, and will lose some detail.

Don’t forget to set the output raster file, and make sure that the target SRS matches the CRS (EPSG 26917) you chose earlier.

Helmert is a useful transformation type to check your work. The plugin plots the residual difference between the two sets of map control points as red lines. Here, I’ve clearly made an error in point ID 2:

If you unselect the bad point, the plugin quickly calculates the residuals again. When this one bad point is removed, the residuals drop down to less than 1.0 for each point. I’ve got enough points for a Helmert transformation with 5 GCPs, but I’d probably want some more points for more complex transformations.
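
To see why one bad point stands out like this, here is a minimal Python sketch of a least-squares Helmert fit (shift, uniform scale and rotation) and its per-point residuals. It is an illustration under stated assumptions, not the plugin’s own code. The control points are invented, both point sets are treated as ordinary x/y coordinates (it ignores the fact that image rows count downwards), and the names fit_helmert and residuals are hypothetical.

```python
import math


def fit_helmert(src, dst):
    """Least-squares fit of X = a*x - b*y + tx, Y = b*x + a*y + ty."""
    n = len(src)
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    m_big_x = sum(p[0] for p in dst) / n
    m_big_y = sum(p[1] for p in dst) / n
    s = sa = sb = 0.0
    for (x, y), (big_x, big_y) in zip(src, dst):
        dx, dy = x - mx, y - my
        d_big_x, d_big_y = big_x - m_big_x, big_y - m_big_y
        s += dx * dx + dy * dy
        sa += dx * d_big_x + dy * d_big_y
        sb += dx * d_big_y - dy * d_big_x
    a, b = sa / s, sb / s
    tx = m_big_x - a * mx + b * my
    ty = m_big_y - b * mx - a * my
    return a, b, tx, ty


def residuals(params, src, dst):
    """Distance between each transformed source point and its target."""
    a, b, tx, ty = params
    return [
        math.hypot(a * x - b * y + tx - big_x, b * x + a * y + ty - big_y)
        for (x, y), (big_x, big_y) in zip(src, dst)
    ]


# Invented control points: five source points and an exact Helmert image of them
src = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 3)]
clean_dst = [(2 * x - y + 100, x + 2 * y - 50) for x, y in src]

# Deliberately misplace the second point, as if it had been clicked wrongly
bad_dst = list(clean_dst)
bad_dst[1] = (bad_dst[1][0] + 30, bad_dst[1][1])

res = residuals(fit_helmert(src, bad_dst), src, bad_dst)
worst = max(range(len(res)), key=res.__getitem__)
print("largest residual at index", worst, [round(r, 1) for r in res])
```

The misplaced point drags the whole fit towards itself, so every residual grows a little, but its own residual stays the largest. Refitting without it brings the rest back to near zero, which is the behaviour described above.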

Once you hit “Start Transformation”, QGIS will create your referenced image. If you’ve chosen a simple linear transformation, it won’t create a GeoTiff file, but just a world file for your image.

So here’s the georeferenced image overlaid on the Toporama map, with a bit of transparency. It’s quite a good match:

(the above’s an image saved straight from QGIS. It helpfully creates a world file too, so here it is: redickville.jpgw)

So to answer the hypothetical question, Redickville’s pretty well surrounded by this proposed quarry.


current tools of my trade

Mark asked: What kind of GIS software are you using?
Well, since you asked:-

  • SpatiaLite: spatial awesome built on SQLite. I love it because I don’t need to play DBA.
  • QGIS: for maps
  • ogr: for file format futzing
  • proj: for scrupulously correct (well, if I knew what I was doing …) conversion between projected and otherwise.
  • OpenOffice: for those tedious calculations
  • … and about 20 years of unix experience to mash all the results together.

All of the above are free. I’m doing this because I want to learn. Asking elsewhere hasn’t turned up anything useful.


ward maps: kinda working, sorta

Now I’ve sorted out formatting the labels and scraping the data, I should be almost ready to produce a pretty map.

Well, almost. The DBF component of a shapefile seems somewhat resistant to adding a column, and SQLite doesn’t seem very happy with its ALTER TABLE ADD COLUMN ... syntax.

As usual, I needed to create the database table from the shapefile. I’m not bothered about CRS, so I used -1.

.read init_spatialite-2.3.sql ASCII

.loadshp TCL3_ICITW Wards CP1252 -1

alter table wards add column candidates integer;

I had mixed success getting data to load into this new column. So I improvised.


(Sensitive readers are advised to look away)

There’s a seemingly unused numeric column SHAPE_LEN in the table. As my new candidates column was coming up with occasional nulls, I cheated:

UPDATE Wards set shape_len=3 where scode_name='1';

UPDATE Wards set shape_len=1 where scode_name='2';

UPDATE Wards set shape_len=0 where scode_name='3';


UPDATE Wards set shape_len=3 where scode_name="44";
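
For comparison, the less sneaky route (a real candidates column, filled from a table of counts) can be sketched with Python’s sqlite3 module. This is only a sketch under assumptions: it uses a throwaway in-memory table standing in for the real Wards table, with no geometry and no SpatiaLite extension loaded, and it makes no claim to explain the nulls mentioned above. The three counts are the ones used in the UPDATE statements.

```python
import sqlite3

# Stand-in for the real Wards table: just the ward code and SHAPE_LEN
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Wards (scode_name TEXT PRIMARY KEY, shape_len REAL)")
conn.executemany(
    "INSERT INTO Wards (scode_name, shape_len) VALUES (?, ?)",
    [("1", 0.0), ("2", 0.0), ("3", 0.0)],
)

# Add the new column, then fill it with one parameterised UPDATE per ward
conn.execute("ALTER TABLE Wards ADD COLUMN candidates INTEGER")
counts = {1: 3, 2: 1, 3: 0}
conn.executemany(
    "UPDATE Wards SET candidates = ? WHERE scode_name = ?",
    [(n, str(ward)) for ward, n in counts.items()],
)
conn.commit()

rows = conn.execute(
    "SELECT scode_name, candidates FROM Wards "
    "ORDER BY CAST(scode_name AS INTEGER)"
).fetchall()
print(rows)
```

Note that scode_name is compared as text here, matching the quoted values in the statements above. If the real column held numbers, the str() conversion would need to go.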

I then added SHAPE_LEN as the label, and defined a range based colour gradient for the wards in QGIS’s layer properties:

And this is how it looks:

Another partial success, as Professor Piehead would say.


closer to ward maps: scraping the data

Toronto publishes its candidates here in a kind of tabular format. All I want to do is count the number of candidates per ward, remembering that some wards have no candidates yet.

Being lazy, I’d far rather have another program parse the HTML, so I work from the formatted output of W3M. It’s relatively easy to munge the output using Perl. From there, I hope to stick the additional data either into a new column in the shapefile, or use SpatiaLite. I’m undecided.

My dubious Perl script:

#!/usr/bin/perl -w
# ward_candidates - mimic mez ward map
# created by scruss on 02010/03/01
# RCS/CVS: $Id$

use strict;
my $URL  = '...';    # address of the city's candidate list page
my $stop = 1;        # 1 = skip lines, 0 = count candidates

my %wards;
for ( 1 .. 44 ) {
  $wards{$_} = 0;    # initialise count to zero for each ward
}

open( IN, "w3m -dump \"$URL\" |" ) or die "can't run w3m: $!\n";
while (<IN>) {
  next if (/^$/);
  $stop = 1 if (/^Withdrawn Candidate/);
  unless ( 1 == $stop ) {
    my ($ward) = /(\d+)$/;    # ward number is the last thing on the line
    $wards{$ward}++ if ( defined($ward) );    # increment candidate for this ward
  }
  $stop = 0 if (/^City Councillor/);
}
close(IN);

foreach ( sort { $a <=> $b } ( keys(%wards) ) ) {
  printf( "%2d\t%2d\n", $_, $wards{$_} );
}


which outputs the following (header added for clarity):

Ward Candidates
==== ==========
 1     3
 2     1
 3     0
 4     0
 5     1
 6     1
 7     7
 8     3
 9     2
10     3
11     2
12     3
13     1
14     4
15     3
16     1
17     2
18     4
19     6
20     2
21     1
22     1
23     1
24     0
25     2
26     3
27    12
28     3
29     6
30     3
31     3
32     2
33     1
34     0
35     5
36     2
37     2
38     2
39     1
40     2
41     1
42     5
43     3
44     3
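
The counting logic is easier to test away from the live page. Here is a minimal Python sketch of the same idea: start counting after the “City Councillor” heading, stop at “Withdrawn Candidate”, and take the trailing number on each line as the ward. The sample text is invented and only guesses at the shape of the w3m dump, and count_candidates is a hypothetical name. Unlike the Perl, it also skips lines that contain only whitespace.

```python
import re


def count_candidates(lines, n_wards=44):
    """Count candidate lines per ward in a text dump of the candidate list."""
    counts = {ward: 0 for ward in range(1, n_wards + 1)}
    counting = False
    for raw in lines:
        line = raw.rstrip()
        if not line:
            continue
        if line.startswith("Withdrawn Candidate"):
            counting = False
        if counting:
            match = re.search(r"(\d+)$", line)
            if match and int(match.group(1)) in counts:
                counts[int(match.group(1))] += 1
        # Checked last, so the heading line itself is never counted
        if line.startswith("City Councillor"):
            counting = True
    return counts


# Invented sample, loosely shaped like a text dump of the page
sample = """\
Mayor
Some Candidate
City Councillor
Alice Example          1
Bob Example            1
Carol Example          2

Withdrawn Candidate
Dave Example           1
"""
counts = count_candidates(sample.splitlines(), n_wards=3)
print(counts)
```

Wards that never appear keep their initial zero, which is what lets the empty wards show up in the output table.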

Labelling: harder than it looks

I’m quite taken with Mez’s rather neat Toronto ward candidate maps. I wonder if I could reproduce them (semi-)automatically?

As a start, here’s the Toronto Wards layer, rendered in QGIS with the ward number as a label:

You’ll notice that something is quite off. It looks like QGIS uses the centre of the minimum bounding rectangle of a polygon as the label point. While this is okay for nice regular shapes, weird glaikit shapes end up with the label outside the boundary. Not good.
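
The geometry behind this is easy to demonstrate. The Python sketch below takes an invented L-shaped “ward”, computes the centre of its bounding rectangle, and tests whether that point is inside the polygon with a standard ray-casting check. It says nothing about what QGIS actually does internally; it only shows how a bounding-box centre can land outside an awkward shape. The function names are hypothetical.

```python
def bbox_centre(ring):
    """Centre of the axis-aligned bounding rectangle of a polygon ring."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2


def point_in_polygon(point, ring):
    """Ray-casting test: count edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Invented L-shaped polygon: a long base with a thin upright on the left
l_shape = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]

centre = bbox_centre(l_shape)
print(centre, point_in_polygon(centre, l_shape))
print((1, 1), point_in_polygon((1, 1), l_shape))
```

The bounding-box centre falls in the empty corner of the L, while a point in the base is reported as inside.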

I was about to give up on this completely, when I saw QGIS’s “Labeling” [sic] plugin. What it does is work out a variety of better visual positions for your labels. Here’s the setting I chose:

The result is much more pleasing:

Much better.


tale of two cities: coordinate reference systems, and what on earth is the maywood tot lot?

For reasons that are not particularly clear, the|Open data is in two different coordinate reference systems (CRS), MTM 3 Degree Zone 10, NAD27 (EPSG 2019) and UTM 6 Degree Zone 17N NAD27 (EPSG 26717). This confuses QGIS even if you’ve input the proper SRIDs into SpatiaLite. The image above shows two apparent Torontos, one in each of the CRSs.

What you have to do is go to the Project Properties, select the Coordinate Reference System (CRS) tab, and “Enable ‘on the fly’ CRS transformation”. This will line those city layers right back up.

Once we do that, things align as they should. Here’s my neighbourhood, with its parks:

But things are still off if you’re querying the SQL directly:

select Distance(Parks.geometry, Neighbourhoods.geometry)/1000
 as Distance_km
 from Parks, Neighbourhoods
 where Parks.name='Corvette Park'
 and Neighbourhoods.hood='Kennedy Park'

which returns a distance of over 314 km. That’s not right.

So we need to transform the geometries to the same CRS.

!!! NB: I might be doing the next bit wrong. CRS transformation is subtle. I’m not, particularly.

The OGR Simple Feature Library is your friend. It can convert pretty much any geo format to another, and can transform coordinates between systems. In exchange for this power, it is rather complex.

I’ve chosen to use NAD83(CSRS) / UTM zone 17N (EPSG 2958) for my Toronto maps. It’s fairly accurate over the whole city area. To convert the Parks and Neighbourhoods shape files:

ogr2ogr -s_srs EPSG:2019 -t_srs EPSG:2958 dest/2958/parks/TCL3_UPARK.shp src/2019/parks/TCL3_UPARK.shp
ogr2ogr -s_srs EPSG:26717 -t_srs EPSG:2958 dest/2958/neighbourhoods/Neighbourhoods.shp src/26717/neighbourhoods/Neighbourhoods.shp

Note that it wants the destination file first, then the source. Haven’t seen that order since PIP under CP/M 2.2. I was also a bit nerdly, and arranged the files in directories by SRID:
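
That layout, and the destination-first argument order, can be captured in a small helper. This is a hedged Python sketch that only builds the ogr2ogr argument list. It does not run anything, and it assumes the src/<SRID>/<theme>/ and dest/<SRID>/<theme>/ directory pattern visible in the two commands above. The function name is hypothetical.

```python
from pathlib import PurePosixPath


def ogr2ogr_command(src_srid, dst_srid, theme, filename):
    """Build an ogr2ogr reprojection command: destination first, then source."""
    src = PurePosixPath("src") / str(src_srid) / theme / filename
    dst = PurePosixPath("dest") / str(dst_srid) / theme / filename
    return [
        "ogr2ogr",
        "-s_srs", f"EPSG:{src_srid}",
        "-t_srs", f"EPSG:{dst_srid}",
        str(dst),
        str(src),
    ]


for args in (
    ogr2ogr_command(2019, 2958, "parks", "TCL3_UPARK.shp"),
    ogr2ogr_command(26717, 2958, "neighbourhoods", "Neighbourhoods.shp"),
):
    print(" ".join(args))
```

Once the destination directories exist, each list could be handed to subprocess.run(args, check=True) to do the conversion.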


If we load the transformed shapefiles into SpatiaLite, and run that query again, it comes out with the correct distance: 0.0 km, as Corvette Park is in the Kennedy Park Neighbourhood.

Now we can run a proper query: what parks are in Kennedy Park, and what are their areas?

select tp.name, round(Area(tp.geometry)/10000,1) as Area_ha
 from Parks as tp, Neighbourhoods as tn
where tn.hood='Kennedy Park'
 and within(tp.geometry, tn.geometry)
order by Area_ha
NAME Area_ha

(note how I sneakily used the round() function to avoid too many decimal places?)