countculture

Open data and all that

Archive for the ‘mapping’ Category

Planning Alerts: first fruits

with 13 comments

PlanningAlerts is coming soon

Well, that took a little longer than planned…

[I won’t go into the details, but suffice to say our internal deadline got squeezed by the combination of a fast-growing website, the usual issues of large datasets, and that tricky business of finding and managing coders who can program in Ruby, get data, and be really good at scraping tricky websites.]

But I’m pleased to say we’re now well on our way to not just resurrecting PlanningAlerts in a sustainable, scalable way but a whole lot more too.

Where we’re heading: an open database of UK planning applications

First, let’s talk about the end goal. From the beginning, while we wanted to get PlanningAlerts working again – the simplicity of being able to put in your postcode and email address and get alerts about nearby planning applications is both useful and compelling – we also knew that if the service was going to be sustainable, and serve the needs of the wider community, we’d need to do a whole lot more.

Particularly with the significant changes in the planning laws and regulations that are being brought in over the next few years, it’s important that everybody – individuals, community groups, NGOs, other websites, even councils – has good and open access to not just the planning applications in their area, but in the surrounding areas too.

In short, we wanted to create the UK’s first open database of planning applications, free for reuse by all.

That meant not just finding when there was a planning application, and where (though that’s really useful), but also capturing all the other data too, and keeping that information updated as the planning application went through the various stages (the original PlanningAlerts just scraped the information once, when it was found on the website, and even then pretty much just got the address and the description).

Of course, were local authorities to publish the information as open data, for example through an API, this would be easy. As it is, with a couple of exceptions, it means an awful lot of scraping, and some pretty clever scraping too, not to mention upgrading the servers and making OpenlyLocal more scalable.

Where we’ve got to

Still, we’ve pretty much overcome these issues and now have hundreds of scrapers working, pulling the information into OpenlyLocal from well over a hundred councils, with well over half a million planning applications in there already.

There are still some things to be sorted out – some of the council websites seem to shut down for a few hours overnight, meaning they appear to be broken when we visit them, others change URLs without redirecting to the new ones, and still others are just, well, flaky. But we’ve now got to a stage where we can start opening up the data we have, for people to play around with, find issues with, and start to use.

For a start, each planning application has its own permanent URL, and the information is also available as JSON or XML:

There’s also a page for each council, showing the latest planning applications, and the information here is available via the API too:

There’s also a GeoRSS feed for each council, allowing you to keep up to date with its latest planning applications. It also means you can easily create maps or widgets showing the council’s latest applications.
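As a rough idea of how a feed like this could drive a map or widget, here is a minimal Ruby sketch that pulls titles, links and coordinates out of a GeoRSS feed. The sample feed, its URLs and the `map_points` helper are made up for illustration, and it assumes the feed uses the GeoRSS Simple `georss:point` element ("lat long"); the real OpenlyLocal feed may be structured differently.

```ruby
require 'rexml/document'

# Made-up sample feed, for illustration only; the real feed's fields may differ.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0" xmlns:georss="http://www.georss.org/georss">
    <channel>
      <title>Example council planning applications</title>
      <item>
        <title>EXAMPLE/001: Single storey rear extension</title>
        <link>http://example.com/planning_applications/1</link>
        <georss:point>51.5530 -0.2960</georss:point>
      </item>
      <item>
        <title>EXAMPLE/002: Change of use to cafe</title>
        <link>http://example.com/planning_applications/2</link>
        <georss:point>51.5560 -0.2800</georss:point>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: returns one hash per feed item that carries a point.
def map_points(feed_xml)
  doc = REXML::Document.new(feed_xml)
  points = []
  doc.elements.each('rss/channel/item') do |item|
    point = item.elements['georss:point']
    next unless point # skip items without a location

    # GeoRSS Simple stores a point as "latitude longitude"
    lat, lng = point.text.split.map(&:to_f)
    points << { title: item.elements['title'].text,
                link: item.elements['link'].text,
                lat: lat, lng: lng }
  end
  points
end

map_points(SAMPLE_FEED).each do |pt|
  puts "#{pt[:title]} (#{pt[:lat]}, #{pt[:lng]})"
end
```

Each hash could then be handed to whatever mapping library you prefer as a marker.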

Finally, Andrew Speakman, who’d coincidentally been doing some great stuff in this area, has joined the team as Planning editor, to help coordinate efforts and liaise with the community (more on this below).

What’s next

The next main task is to reinstate the original PlanningAlert functionality. That’s our focus now, and we’re about halfway there (and aiming to get the first alerts going out in the next 2-3 weeks).

We’ve also got several more councils and planning application systems to add, and this should bring the number of councils we’ve got on the system to between 150 and 200. This will be an ongoing process, over the next couple of months. There’ll also be some much-overdue design work on OpenlyLocal so that the increased amount of information on there is presented to the user in a more intuitive way – please feel free to contact us if you’re a UX person/designer and want to help out.

We also need to improve the database backend. We’ve been using MySQL exclusively since the start, but MySQL isn’t great at spatial (i.e. geographic) searches, restricting the sort of functionality we can offer. We expect to sort this in a month or so, probably moving to PostGIS, and after that we can start to add more features, finer grained searches, and start to look at making the whole thing sustainable by offering premium services.
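To give a feel for what a spatial search means here, this is a minimal sketch of the basic "applications within N km of a point" query done by brute force in plain Ruby, using the haversine great-circle formula. The `Application` struct, the references and the coordinates are invented for illustration; a spatial database would answer the same question with an index rather than checking every row.

```ruby
# Sketch only: brute-force radius search over an in-memory list.
# The struct and sample data below are hypothetical.
Application = Struct.new(:reference, :lat, :lng)

EARTH_RADIUS_KM = 6371.0 # mean Earth radius

# Great-circle distance in km between two lat/long points (in degrees).
def haversine_km(lat1, lng1, lat2, lng2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlng = (lng2 - lng1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlng / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Checks every application in turn, which is what makes this slow at scale.
def applications_near(applications, lat, lng, radius_km)
  applications.select { |app| haversine_km(lat, lng, app.lat, app.lng) <= radius_km }
end

applications = [
  Application.new("EXAMPLE/001", 51.5530, -0.2960),
  Application.new("EXAMPLE/002", 51.5560, -0.2800),
  Application.new("EXAMPLE/003", 52.6830, -1.8260)
]
nearby = applications_near(applications, 51.5525, -0.2965, 2.0)
puts nearby.map(&:reference).inspect
```

With half a million applications, running this for every alert subscriber is exactly the sort of work you want the database to do for you.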

We’ll be working too on liaising with councils who want to offer their applications via an API – as the ever pioneering Lichfield council already does – or a nightly data dump. This not only does the right thing in opening up data for all to use, but also means we don’t have to scrape their websites. Lichfield, for example, uses the Idox system, and the web interface for this (which is what you see when you look at a planning application on Lichfield’s website) spreads the application details over 8 different web pages, but the API makes this available on a single URL, reducing the work the server has to do.

Finally, we’re going to be announcing a bounty scheme for the scraper/developer community to write scrapers for those areas that don’t use one of the standard systems. Andrew will be coordinating this, and will be blogging about this sometime in the next week or so (and you can contact him at planning at openlylocal dot com). We’ll also be tweeting progress at @planningalert.

Thanks for your patience.

PlanningAlerts is dead, long live PlanningAlerts

with 29 comments

Planning Alerts screengrab

One of the first and best examples of how data could make a difference to ordinary people’s lives was the inspirational PlanningAlerts.com, built by Richard Pope, Mikel Maron, Sam Smith, Duncan Parkes, Tom Hughes and Andy Armstrong.

In doing one simple thing – allowing ordinary people to subscribe to an email alert when there was a planning application near them, regardless of council boundaries – it showed that data mattered, and more than that had the power to improve the interaction between government and the community.

It did so many revolutionary things and fought so many important battles that everyone in the open data world (and not just the UK) owes all those who built it a massive debt of gratitude. Richard Pope and Duncan Parkes in particular put masses of hours writing scrapers, fighting the battle to open postcodes and providing a simple but powerful user experience.

However, over the past year it had become increasingly difficult to keep the site going, with many of the scrapers falling into disrepair (aka scraper rot). Add to that the demands of a day job, and the cost of running a server, and it’s a tribute to both Richard and Duncan that they kept PlanningAlerts going for as long as they did.

So when Richard reached out to OpenlyLocal and asked if we were interested in taking over PlanningAlerts we were both flattered and delighted. Flattered and delighted, but also a little nervous. Could we take this on in a sustainable manner, and do as good a job as they had done?

Well after going through the figures, and looking at how we might architect it, we decided we could – there were parts of the problem that were similar to what we were already doing with OpenlyLocal – but we’d need to make sustainability a core goal right from the get-go. That would mean a business plan, and also a way for the community to help out.

Both of those had been given thought by both us and by Richard, and we’d come to pretty much identical ideas, using a freemium model to generate income, and ScraperWiki to allow the community to help with writing scrapers, especially for those councils that didn’t use one of the common systems. But we also knew that we’d need to accelerate this process using a bounty model, such as the one that’s been so successful for OpenCorporates.

Now all we needed was the finance to kick-start the whole thing, and we contacted Nesta to see if they were interested in providing seed funding by way of a grant. I’ve been quite critical of Nesta’s processes in the past, but to their credit they didn’t hold this against us, and more than that showed they were capable of, and eager to work in, a fast, lightweight & agile way.

We didn’t quite manage to get the funding or do the transition before Richard’s server rental ran out, but we did save all the existing data, and are now hard at work building PlanningAlerts into OpenlyLocal, and gratifyingly making good progress. The PlanningAlerts.com domain is also in the middle of being transferred, and this should be completed in the next day or so.

We expect to start displaying the original scraped planning applications over the next few weeks, and have already started work on scrapers for the main systems used by councils. We’ll post here, and on the OpenlyLocal and PlanningAlert twitter accounts as we progress.

We’re also liaising with PlanningAlerts Australia, who were originally inspired by PlanningAlerts UK, but have since considerably raised the bar. In particular we’ll be aiming to share a common data structure with them, making it easy to build applications based on planning applications from either source.

And, finally, of course, all the data will be available as open data, using the same Open Database Licence as the rest of OpenlyLocal.

Ward maps on OpenlyLocal (& how I did it)

with 6 comments

Yesterday we added a feature to OpenlyLocal that I’ve been wanting to do since the beginning: ward maps. Why is this important? Simple, because hardly anyone knows what council ward they’re in, and nobody knows where the boundary lies, and the ward is the most basic unit of democratic accountability.

True, some councils have outlines of the wards (but without placing them on a map) and the boundaries are visible on the ONS’s Neighbourhood Statistics site, though it’s a pretty challenging user experience:

ONS Neighbourhood Statistics Ward Page

But to me, this should just be something that is there on the OpenlyLocal ward page, where the links to relevant councillors, stats and other data sit… and now it is. Here’s the page for Wembley Central ward in north London, for example, complete with zoomable maps with all the usual Google extras (satellite view etc):

Ward details and map for Wembley Central

Since tweeting about this, I’ve had a few people ask me how I did it, and so here are the details.

Much of the credit goes to the guys at #maptitude (writeup here), who put me in a room with the excellent Stuart Harrison from Lichfield District Council and asked us to do something over the course of an afternoon, and to Stuart himself, who did most of the work.

Over that afternoon (on the suggestion of @danslee and with input from everyone else at the event) we did a ward comparison proof-of-concept using some boundaries Stuart had already imported into a database. The key thing, I think for both of us, was getting a few hours of focus just on the ward mapping problem, checking up on the various bits of code and interface with Google maps, and working out how to draw the outlines. Stuart used a Windows command line program and some PHP to get the boundaries; when I did it for the wards I used Ruby, the language OpenlyLocal is programmed in.

How I did it

The data comes from the newly opened-up OS BoundaryLine dataset (easiest to download from MySociety here), specifically the May 2010 data. The first problem is that this comes in the form of ESRI Shapefiles, which are standard for geo geeks, but not for data mashups, or mixing with online maps. The first stage was to import these into the database, and for this I used the GeoRuby library. Specifically I did this:

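shpfile = File.join(RAILS_ROOT, "db/boundary_line/district_borough_unitary_ward_region")
GeoRuby::Shp4r::ShpFile.open(shpfile) do |shp|
  # go through each shape record in the file
  shp.each do |shape|
    geom = shape.geometry         # the boundary geometry
    attribute_data = shape.data   # associated attributes (IDs, area size etc)
    #do something with the data
  end
end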

This just reads the shapefile (and associated data files), goes through the shapes, extracts the geometries and associated attributes (e.g. IDs, area size etc), and lets you do something with them.

If you’re using the Ruby Spatial Adapter library you should then be able to store them in the database easily. Except…

Except, the geometries are polygons made up not of latitude/longitude points but of OS northings/eastings. More than that, they use a different model for the shape of the Earth (OSGB36) rather than the more normal (although arguably US-centric) WGS84. Now doing these two conversions is not trivial, and while there are quite a few libraries for other languages I didn’t find one for Ruby, and so I wrote one, basically converting a JavaScript version line by line into Ruby (gist here).

Like all direct code conversions, it’s not pretty, it isn’t fast, but I have tested it and it seems to work well. After that it was a simple matter to add this to the loop, put it in a Ruby command line script called a rake task and let it run (took about an hour or so). Specifically this is what I put in place of the ‘do something with the data’ line:

wgs84_lat_long_groupings = geom.geometries.first.rings.collect do |ring|
   # each pair is reversed because polygons are created from collections of
   # coords supplied as x,y (where x is long, y lat)
   ring.points.collect{|pt| OsCoordsNewUtilities.convert_os_to_wgs84(pt.x,pt.y).reverse }
end
wgs84_polygon = Polygon.from_coordinates(wgs84_lat_long_groupings)
# if you're using the Spatial Adapter you should be able to save this polygon in a 'polygon'
# type field in the database.
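For anyone curious what the two conversions mentioned above actually involve, here is a rough, self-contained sketch. It is not the code from the gist: the module and method names are hypothetical, and it assumes the standard published constants (Airy 1830 ellipsoid, National Grid projection, and the usual seven-parameter Helmert shift, which is only good to a few metres). Step one unprojects the easting/northing to OSGB36 latitude/longitude; step two shifts that to WGS84.

```ruby
# Sketch only: OS National Grid easting/northing -> WGS84 latitude/longitude.
# Names are hypothetical; constants are the commonly published OS values.
module OsGridSketch
  module_function

  AIRY  = { a: 6_377_563.396, b: 6_356_256.909 }.freeze  # Airy 1830 ellipsoid
  WGS84 = { a: 6_378_137.0,   b: 6_356_752.3142 }.freeze
  F0   = 0.9996012717            # scale factor on the central meridian
  LAT0 = 49 * Math::PI / 180     # true origin latitude
  LON0 = -2 * Math::PI / 180     # true origin longitude
  N0   = -100_000.0              # northing of true origin (metres)
  E0   = 400_000.0               # easting of true origin (metres)
  # Helmert OSGB36 -> WGS84: metres, arc-seconds, parts per million
  HELMERT = { tx: 446.448, ty: -125.157, tz: 542.060,
              rx: 0.1502, ry: 0.2470, rz: 0.8421, s: -20.4894 }.freeze

  # Meridional arc from the true origin to the given latitude.
  def meridional_arc(lat, n, b)
    d = lat - LAT0
    s = lat + LAT0
    n2 = n * n
    n3 = n2 * n
    ma = (1 + n + 1.25 * n2 + 1.25 * n3) * d
    mb = (3 * n + 3 * n2 + 2.625 * n3) * Math.sin(d) * Math.cos(s)
    mc = (1.875 * n2 + 1.875 * n3) * Math.sin(2 * d) * Math.cos(2 * s)
    md = (35.0 / 24) * n3 * Math.sin(3 * d) * Math.cos(3 * s)
    b * F0 * (ma - mb + mc - md)
  end

  # Step 1: unproject grid coordinates to OSGB36 [lat, lon] in radians.
  def grid_to_osgb36(easting, northing)
    a = AIRY[:a]
    b = AIRY[:b]
    e2 = 1 - (b * b) / (a * a)
    n = (a - b) / (a + b)
    lat = LAT0
    m = 0.0
    loop do # iterate until the arc length matches the northing
      lat += (northing - N0 - m) / (a * F0)
      m = meridional_arc(lat, n, b)
      break if (northing - N0 - m).abs < 0.00001
    end
    sin2 = Math.sin(lat)**2
    nu = a * F0 / Math.sqrt(1 - e2 * sin2)
    rho = a * F0 * (1 - e2) / (1 - e2 * sin2)**1.5
    eta2 = nu / rho - 1
    t = Math.tan(lat)
    t2 = t * t
    t4 = t2 * t2
    t6 = t4 * t2
    sec = 1 / Math.cos(lat)
    de = easting - E0
    vii  = t / (2 * rho * nu)
    viii = t / (24 * rho * nu**3) * (5 + 3 * t2 + eta2 - 9 * t2 * eta2)
    ix   = t / (720 * rho * nu**5) * (61 + 90 * t2 + 45 * t4)
    x    = sec / nu
    xi   = sec / (6 * nu**3) * (nu / rho + 2 * t2)
    xii  = sec / (120 * nu**5) * (5 + 28 * t2 + 24 * t4)
    xiia = sec / (5040 * nu**7) * (61 + 662 * t2 + 1320 * t4 + 720 * t6)
    [lat - vii * de**2 + viii * de**4 - ix * de**6,
     LON0 + x * de - xi * de**3 + xii * de**5 - xiia * de**7]
  end

  # Step 2: Helmert datum shift, OSGB36 -> WGS84 (radians in, radians out).
  def osgb36_to_wgs84(lat, lon)
    a = AIRY[:a]
    b = AIRY[:b]
    e2 = 1 - (b * b) / (a * a)
    nu = a / Math.sqrt(1 - e2 * Math.sin(lat)**2)
    x1 = nu * Math.cos(lat) * Math.cos(lon)
    y1 = nu * Math.cos(lat) * Math.sin(lon)
    z1 = (1 - e2) * nu * Math.sin(lat)
    rx, ry, rz = %i[rx ry rz].map { |k| HELMERT[k] / 3600 * Math::PI / 180 }
    s1 = 1 + HELMERT[:s] / 1e6
    x2 = HELMERT[:tx] + x1 * s1 - y1 * rz + z1 * ry
    y2 = HELMERT[:ty] + x1 * rz + y1 * s1 - z1 * rx
    z2 = HELMERT[:tz] - x1 * ry + y1 * rx + z1 * s1
    a = WGS84[:a]
    b = WGS84[:b]
    e2 = 1 - (b * b) / (a * a)
    pr = Math.sqrt(x2**2 + y2**2)
    lat2 = Math.atan2(z2, pr * (1 - e2))
    10.times do # a handful of iterations is plenty for convergence
      nu = a / Math.sqrt(1 - e2 * Math.sin(lat2)**2)
      lat2 = Math.atan2(z2 + e2 * nu * Math.sin(lat2), pr)
    end
    [lat2, Math.atan2(y2, x2)]
  end

  # Both steps together; returns [lat, long] in degrees.
  def os_to_wgs84(easting, northing)
    osgb36_to_wgs84(*grid_to_osgb36(easting, northing)).map { |r| r * 180 / Math::PI }
  end
end

lat, long = OsGridSketch.os_to_wgs84(651_409.903, 313_177.270)
puts "#{lat}, #{long}"
```

The sketch treats every point as being at zero height, which is fine for drawing boundaries on a map. Skipping the second step would leave everything offset by something like a hundred metres, which is very noticeable on a ward boundary.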

So far I’ve done all the district wards; I’ll do the Unitary and County Councils next. But when you show them as polygons on a map you quickly start to add a lot of bulk to the page, which is not good for people using screenreaders or mobile devices. So I’m investigating other options, including Google Fusion tables (though I’d like to steer clear of Google-only solutions), and running my own tile server. Suggestions, comments welcome.

In the meantime, hope this helps, and if people would find it useful, I’ll make the boundaries available via the API as a KML file.

Written by countculture

June 2, 2010 at 7:25 pm