closer to ward maps: scraping the data

Toronto publishes its list of council candidates at http://app.toronto.ca/vote2010/findByOffice.do?officeType=2&officeName=Councillor in a kind of tabular format. All I want to do is count the number of candidates per ward, remembering that some wards have no candidates yet.

Being lazy, I’d far rather have another program parse the HTML, so I work from the formatted text output of w3m. It’s relatively easy to munge that output using Perl. From there, I hope either to stick the additional data into a new column in the shapefile, or to load it into SpatiaLite. I’m undecided.

My dubious Perl script:


#!/usr/bin/perl
# ward_candidates - mimic mez ward map
# created by scruss on 02010/03/01
# RCS/CVS: $Id$

use strict;
use warnings;

my $URL =
'http://app.toronto.ca/vote2010/findByOffice.do?officeType=2&officeName=Councillor';
my $stop = 1;    # skip everything until the councillor listing starts

my %wards;
for ( 1 .. 44 ) {
    $wards{$_} = 0;    # initialise count to zero for each ward
}

# list-form pipe open: no shell is involved, so the URL needs no quoting
open( my $in, '-|', 'w3m', '-dump', $URL ) or die "can't run w3m: $!";
while (<$in>) {
    chomp;
    s/^\s+//;
    next if (/^$/);
    $stop = 1 if (/^Withdrawn Candidate/);    # listing ends here
    unless ( 1 == $stop ) {
        my ($ward) = /(\d+)$/;    # ward number is the last thing on the line
        $wards{$ward}++ if ( defined($ward) );    # skip lines with no ward
    }
    $stop = 0 if (/^City Councillor/);    # listing starts after this heading
}
close($in);

foreach ( sort { $a <=> $b } ( keys(%wards) ) ) {
    printf( "%2d\t%2d\n", $_, $wards{$_} );
}

exit;

which outputs the following (header added for clarity):

Ward Candidates
==== ==========
 1     3
 2     1
 3     0
 4     0
 5     1
 6     1
 7     7
 8     3
 9     2
10     3
11     2
12     3
13     1
14     4
15     3
16     1
17     2
18     4
19     6
20     2
21     1
22     1
23     1
24     0
25     2
26     3
27    12
28     3
29     6
30     3
31     3
32     2
33     1
34     0
35     5
36     2
37     2
38     2
39     1
40     2
41     1
42     5
43     3
44     3
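
For the shapefile-or-SpatiaLite step, a CSV is probably the easiest thing to join against. The following is only a sketch of that conversion, not something from the original script: the sub name `counts_to_csv` and the column names `ward` and `candidates` are my own assumptions, and the key field in the shapefile may well be called something else. The sample lines are the first four rows of the output above, in the tab-separated form the script prints.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn "ward<TAB>count" lines (the script's output format) into CSV text.
# The header names "ward" and "candidates" are assumptions for illustration;
# nothing in the shapefile is known to require them.
sub counts_to_csv {
    my @lines = @_;
    my @rows  = ('ward,candidates');
    foreach my $line (@lines) {

        # tolerate the leading padding that printf's %2d produces,
        # and quietly skip anything that isn't two numbers
        next unless $line =~ /^\s*(\d+)\s+(\d+)\s*$/;
        push @rows, "$1,$2";
    }
    return join( "\n", @rows ) . "\n";
}

# the first four rows of the output above, as the script prints them
my @sample = ( " 1\t 3", " 2\t 1", " 3\t 0", " 4\t 0" );
print counts_to_csv(@sample);
```

To run it on the real counts, replace `@sample` with lines read from standard input and pipe the ward script's output into it. Wards with zero candidates still get a row, so a join against the ward polygons should not leave any of them without a value.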
