geocoder.ca’s excellent database

geocoder.ca provides a crowdsourced postal code geocoder under the ODbL. You can download the database as CSV directly. Here’s a bash script to convert that text file into a (very large) point shapefile:

#!/bin/bash
# geocoder2shp.sh - convert geocoder.ca CSV to a shape file
# NB: input CSV is UTF-8; it is passed through unchanged
# Needs >= v1.7 of GDAL
# scruss - 2012/04/15

if [ ! -f Canada.csv.gz ]
then
    echo ""
    echo " " Download \"Canada.csv.gz\" into the current directory from
    echo "  " http://geocoder.ca/onetimedownload/Canada.csv.gz
    echo " " and try again.
    echo ""
    exit 1
fi

# make input file with header
echo PostalCode,Latitude,Longitude,City,Province > Canada2.csv
gunzip -c Canada.csv.gz >> Canada2.csv

# create GDAL VRT file
cat > Canada2.vrt <<EOF
<OGRVRTDataSource>
  <!-- note that OGRVRTLayer name must be basename of source file -->
  <OGRVRTLayer name="Canada2">
    <SrcDataSource>Canada2.csv</SrcDataSource>
    <GeometryType>wkbPoint</GeometryType>
    <LayerSRS>EPSG:4326</LayerSRS>
    <GeometryField encoding="PointFromColumns" x="Longitude" y="Latitude"/>
    <Field name="PostalCode" type="String" width="6" />
    <Field name="Latitude" type="Real" />
    <Field name="Longitude" type="Real" />
    <Field name="City" type="String" width="60" />
    <Field name="Province" type="String" width="2" />
  </OGRVRTLayer>
</OGRVRTDataSource>
EOF

# create shapefile
ogr2ogr PostalCodes.shp Canada2.vrt

# clean up
rm -f Canada2.csv	Canada2.vrt

Though the script is a bit Unix-centric, it’s just a simple list of instructions which could be run on any command line. What it does is add some headers to the geocoder.ca file, then sets up an OGR Virtual Format to convert the text into a fairly well-defined shapefile. When you use this shapefile, you should credit geocoder.ca as the ODbL requires.

Eek! geocoder.ca has been sued by Canada Post! (News responses: Michael Geist, Boing Boing, CBC) I’ve donated to defend this useful service.

Leave a Reply

Your email address will not be published. Required fields are marked *