Category Archives: GIS

maps are just one way of looking at it

Sometimes, you just have to roll your own…

It was Doors Open Toronto last weekend, and the city published the locations as open data: Doors Open Toronto 2013. I thought I’d try to geocode it after Richard suggested we take a look. OpenStreetMap has the Nominatim geocoder, which you can use freely as long as you accept restrictions on bulk queries.

As a good and lazy programmer, I first tried to find pre-built modules. Mistake #1; they weren’t up to snuff:

  • Perl’s Geo::Coder::Many::OSM would only read from OSM‘s server. MapQuest run their own mirror as part of their great MapQuest Open Platform Web Services suite, and they have almost no limitation on query volume. OSM runs their operation on a shoestring, and too many queries gets you the disapproval face, or worse.
  • Python’s geopy gave spurious results amid copious whiny error messages.
    (Standard operational procedure for python, then… ☺)

So I rolled my own, using nowt but the Nominatim Search Service Developer’s Guide, and good old simple modules like URI::Escape, LWP::Simple, and JSON::XS. Much to my surprise, it worked!

Much as I love XML, it’s a bit hard to read as a human, so I smashed the Doors Open data down to simple pipe-separated text: dot.txt. Here’s my code, ever so slightly specialized for searching in Toronto:

#!/usr/bin/perl -w
# - geocode pipe-separated addresses with nominatim
# created by scruss on 02013/05/28

use strict;
use URI::Escape;
use LWP::Simple;
use JSON::XS;

# the URL for OpenMapQuest's Nominatim service
use constant BASEURI => '';

# read pipe-separated values from stdin
# two fields: Site Name, Street Address
while (<>) {
    my ( $name, $address ) = split( '\|', $_, 2 );
    my %query_hash = (
        format  => 'json',
        street  => cleanaddress($address),    # decruft address a bit
                                              # You'll want to change these ...
        city    => 'Toronto',                 # fixme
        state   => 'ON',                      # fixme
        country => 'Canada',                  # fixme
        addressdetails => 0,                  # just basic results
        limit          => 1,                  # only want first result
             # it's considered polite to put your e-mail address in to the query
             # just so the server admins can get in touch with you
        email => '',    # fixme

        # limit the results to a box (quite a bit) bigger than Toronto
        bounded => 1,
        viewbox => '-81.0,45.0,-77.0,41.0'    # left,top,right,bottom - fixme

    # get the result from Nominatim, and decode it to a hashref
    my $json = get( join( '?', BASEURI, escape_hash(%query_hash) ) );
    my $result = decode_json($json);
    if ( scalar(@$result) > 0 ) {             # if there is a result
        print join(
            '|',    # print result as pipe separated values
            $name, $address,
    else {          # no result; just echo input
        print join( '|', $name, $address ), "\n";

sub escape_hash {

    # turn a hash into escaped string key1=val1&key2=val2...
    my %hash = @_;
    my @pairs;
    for my $key ( keys %hash ) {
        push @pairs, join( "=", map { uri_escape($_) } $key, $hash{$key} );
    return join( "&", @pairs );

sub cleanaddress {

    # try to clean up street addresses a bit
    # doesn't understand proper 'Unit-Number' Canadian addresses tho.
    my $_ = shift;
    s/Unit.*//;     # shouldn't affect result
    s/Floor.*//;    # won't affect result
    s/\s+/ /g;      # remove extraneous whitespace
    s/ $//;
    s/^ //;
    return $_;

It quickly became apparent that the addresses had been entered by hand, and weren’t going to geocode neatly. Here are some examples of the bad ones:

  • 200 University Ave St W — it’s an avenue, not a street, and it runs north-south
  • 2087 Davenport Road (Rear House) Rd Unit: Rear — too many rears
  • 21 Colonel Sameul Smith Park Dr — we can’t fix typos
  • 0 Construction Trailer:Lower Simcoe at Lakeshore Blvd — 0? Zero??? What are you, some kinda python programmer?

Curiously, some (like the address for Black Creek Pioneer Village) were right, but just not found. Since the source was open data, I put the right address into OpenStreetMap, so for next year, typos aside, we should be able to find more events.

Now, how accurate were the results? Well, you decide:

QGIS: Getting your ClickFu back

Since QGIS 1.8 took the rather boneheaded move of hiding all the plugin sources except the official one, here’s how to get the other repositories back:

Now you can add all the plugins!

ClickFu is kind of magic. It allows you to choose an online map service, click anywhere on your GIS viewport, and the correct location link opens in your browser. Very handy for sharing locations with people who don’t have a GIS setup. It properly takes coordinate reference systems into account, too, so no messing about with datum shifts and the like.

Another plugin that hasn’t made it to the official repo is Luiz Motta’s Zip Layers. It still lurks at

Click Fu runs just fun under QGIS 2.2 – as long as you remember to delete any pre-2.0 versions and then reinstall.

GeoURIs for fun and profit

Behold the GeoURI:


It’s just a simple lat/long pair which some browsers (mostly mobile) will parse as locations. It probably won’t work in your desktop browser, but here’s an example as a link: Askinosie Chocolate — Springfield, MO.

You can encode case-insensitive URIs very efficiently as QR Codes. If you’re okay with ~10m resolution on your locations, you can make very tiny QR Codes indeed:

qrencode -i -o Askinosie.png 'geo:37.2298,-93.2866'

Askinosie locationI’ve half a mind to make a little GPS-enable box that prints QR Code stickers with the current location, and the caption You are here.

Styling GeoBase land cover data

GeoBase provides the free Land Cover, Circa 2000 – Vector product which contains 1:250,000 land cover categories (water, ice, crops, urban, trees, …) for all of Canada. It has some use in wind energy development, as you can use it to classify the roughness length of terrain for flow modelling.

The data are provided as very large shape files, indexed by National Topographic System (NTS) codes. These aren’t the easiest to remember, so I keep finding myself going back to the Vector Indexes of the National Topographic System of Canada, especially the handy Google Earth version. Toronto is in NTS 030M, so I downloaded that. Here is what a cropped part looks like loaded over the local Toporama WMS tiles:

The fun stuff is hidden in the attribute table:

So the COVTYPE field is clearly holding something about the land cover type. But what?

GeoBase helpfully provide a PDF key and a style file — which only works with ArcGIS products. As a dedicated (okay, yes, cheap) user of Open GIS systems, this could not stand! So I dug through the Styled Layer Descriptor (SLD) format, and came up with style files (in both English & French) that work with QGIS:

To use them with QGIS, open up the layer properties, and go to the Style tab. You want to use the New Symbology option, then Load Style … From here, you can open the SLD file (changing the file type from QGIS’s default QML to SLD), which should appear like this:

This makes the layer look much better:

Making weird composite shapes with Shapely

Let’s say you operate microwave radio links, and you want to see if you could build a link between, say, the huge comms tower at Clear Creek, ON and your tower just south of Aylmer. You know that Erie Shores Wind Farm is in the middle of the route. Do the 77m diameter turbines attenuate your signal to nothing?

This is a real issue I have to deal with every day, except in reverse: when I’m planning wind farms, I don’t want to be blocking existing licensed links. I use Spectrum Direct to find these links (and precisely two years ago, I showed you how to write Spectrum Direct data as CSV), but up until now I worked out affected zones (called “consultation zones”) by hand.

The Radio Advisory Board of Canada and the Canadian Wind Energy Association developed a joint protocol on sizing consultation zone for wind turbines, documented in the RABC CANWEA Guidelines. The one for a microwave link comprises:

  • A 1km buffer around the start and end points of the link;
  • A distance-, frequency- and wind-turbine diameter-specific path width between the points.

Here I’m only going to consider the 2D geometry of the path. Lots of people in the radio industry get paid money to do line-of-sight terrain analysis, and I’m ignoring this for now. All I’m doing here is using Python’s Shapely library to make the dumb-bell shaped buffer around the path and endpoints. I learned of Shapely from Erik Westra’s book Python Geospatial Development. Here’s some code that outputs pipe-separated well-known text that imports nicely into QGIS:

# -*- coding: utf-8 -*-
# Calculate RABC/CanWEA Microwave Link Consultation Zone
# scruss - 2012-10-29

import pyproj
import shapely.geometry
from shapely.wkt import dumps

# hard-coded details (sorry)
startlat = 42.582892  # Clear Creek
startlong = -80.603875
endlat = 42.731776  # South Aylmer
endlong = -80.98768
frequency = 6e9  # 6 GHz
rotor = 77.0  # wtg rotor diameter, m

# calculate distance between points
g = pyproj.Geod(ellps='WGS84')
(az12, az21, dist) = g.inv(startlong, startlat, endlong, endlat)

# Width of microwave link fresnel zone
Lc = rotor + 52 * ((dist / 1000) / (frequency / 1e9)) ** 0.5

# calculate line string along path with segments <= 1 km
lonlats = g.npts(startlong, startlat, endlong, endlat,
                 1 + int(dist / 1000))

# npts doesn't include start/end points, so prepend/append them
lonlats.insert(0, (startlong, startlat))
lonlats.append((endlong, endlat))

# translate line string into projected coordinates
utms = []
srcp = pyproj.Proj(init='epsg:4326')  # WGS84
destp = pyproj.Proj(init='epsg:32617')  # WGS84 UTM Zone 17N
for (lon, lat) in lonlats:
    (utmx, utmy) = pyproj.transform(srcp, destp, lon, lat)
    utms.append((utmx, utmy))

# turn start and end points into Shapely point objects
startpt = shapely.geometry.Point(utms[0])
endpt = shapely.geometry.Point(utms[-1])

# draw a 1km buffer around start/end points
startb = startpt.buffer(1000)
endb = endpt.buffer(1000)

# convert microwave path to a Shapely LineString
path = shapely.geometry.LineString(utms)
pathb = path.buffer(Lc / 2) # buffer it to fresnel zone width

# join the buffers into one shape
zone = startb.union(pathb).union(endb)

# output the shape as well known text
print 'id|wkt'
print '|'.join(('1', dumps(zone)))

That little snippet creates the buffer in projected coordinates from two geographic coordinates as inputs. Grabbing the Erie Shores coordinates from OpenStreetMap and the USGS WMS, let’s see if the beam will pass:

Uh, no. That’s three turbines in the path in that little section alone. Back to the drawing board.

Awesome USGS orthoimagery … in Canada?

Update, 2015: I don’t think this coverage works any more. The WMS data was carefully trimmed to the US border last time I looked.

Detail of Erie Shores Wind Farm showing USGS orthophotos from WMS

I was trying to do some OpenStreetMap edits in rural southern Ontario, but the default Bing background aerial imagery is very poor. Clicking around in semi-plokta mode to find better images, I happened upon “OSM US USGS Large Scale Aerial Imagery”, which loaded up some beautiful and recent pictures. I guess that since we’re close to the border, the USGS doesn’t bother to trim its images too close.

Ian Dees, who admins the particular OSM server, confirmed that these are from The National Map‘s TNM_Large_Scale_Imagery service; thanks, Ian! These images are at least as good for my purposes as the SWOOP data — except not $50/km2, and without elaborate usage restrictions.’s excellent database provides a crowdsourced postal code geocoder under the ODbL. You can download the database as CSV directly. Here’s a bash script to convert that text file into a (very large) point shapefile:

# - convert CSV to a shape file
# NB: input CSV is UTF-8; it is passed through unchanged
# Needs >= v1.7 of GDAL
# scruss - 2012/04/15

if [ ! -f Canada.csv.gz ]
    echo ""
    echo " " Download \"Canada.csv.gz\" into the current directory from
    echo "  "
    echo " " and try again.
    echo ""
    exit 1

# make input file with header
echo PostalCode,Latitude,Longitude,City,Province > Canada2.csv
gunzip -c Canada.csv.gz >> Canada2.csv

# create GDAL VRT file
cat > Canada2.vrt <<EOF
  <!-- note that OGRVRTLayer name must be basename of source file -->
  <OGRVRTLayer name="Canada2">
    <GeometryField encoding="PointFromColumns" x="Longitude" y="Latitude"/>
    <Field name="PostalCode" type="String" width="6" />
    <Field name="Latitude" type="Real" />
    <Field name="Longitude" type="Real" />
    <Field name="City" type="String" width="60" />
    <Field name="Province" type="String" width="2" />

# create shapefile
ogr2ogr PostalCodes.shp Canada2.vrt

# clean up
rm -f Canada2.csv	Canada2.vrt

Though the script is a bit Unix-centric, it’s just a simple list of instructions which could be run on any command line. What it does is add some headers to the file, then sets up an OGR Virtual Format to convert the text into a fairly well-defined shapefile. When you use this shapefile, you should credit as the ODbL requires.

Eek! has been sued by Canada Post! (News responses: Michael Geist, Boing Boing, CBC) I’ve donated to defend this useful service.