countculture

Open data and all that

Posts Tagged ‘politics’

An open letter to Vince Cable

Dear Mr Cable

I read with interest yesterday your letter to the Prime Minister about some of the issues facing the UK in the future, and in particular the need for a vision and for a connected approach across government. This struck me as timely and useful, as it hopefully signalled the intention of a change in policy at one of the main roadblocks to improving government and fostering innovation.

I am referring to the policy of your own department – the Department for Business, Innovation and Skills – of restricting access to core reference datasets, such as the Ordnance Survey mapping data, postcodes, and company data, thus not just stifling innovation and growth but preventing a consistent and connected approach across government.

Though much about the future is unclear, one thing is certain: we are increasingly living in a data world. In that world innovation – and democracy – depends on the ability to access and reuse data, particularly the core reference data on which other data is based: what area a postcode refers to, where something is located, who runs and owns the companies for which we work or which receive government money.

In fact, opening non-personal government data forms part of the government’s growth agenda, and it has already published a considerable amount. Yet much of this data is almost useless without the core reference data to tie it together – data which is under the control of your department.

When I met with your then junior minister Ed Davey a couple of months ago on this subject, I asked him point blank whether the government was going to publish huge amounts of data under a licence which allowed free reuse, but restrict access to the core datasets which tied these together, and that were in fact the core infrastructure for our digital world. He said, ‘We’ve got some ideas for innovative charging models.’

Let’s put aside the fact that government departments aren’t the right people to come up with ‘innovative charging models’ – they don’t have the right skills or experience, and unlike entrepreneurs like myself they aren’t risking their personal money, but the nation’s future. Let’s focus instead on a ‘connected approach across government’. This would seem a perfect example of a relatively minor source of revenue (maybe as little as £50 million, according to the report published yesterday by Policy Exchange) preventing such an approach, and with it a route to how the UK will ‘earn our living in the future’.

In my own area, OpenCorporates has in a year grown to be the largest open database of corporate data in the world – without, I should add, any help, encouragement or cooperation from BIS. We have just released a new feature that allows search for directors across multiple jurisdictions, massively increasing the ability of journalists, fraud investigators, investors, civil society, customers and suppliers to understand companies. Needless to say, UK companies aren’t included in this list because this data is restricted to those who pay.

One vision for the future would include making the UK a genuinely open and transparent place to do business, for example making UK Companies House as open as that in New Zealand, where all data is available openly and without charge. It would include making the UK leaders in the field of open data, not just generating a world-leading ecosystem of companies such as we have in motorsport, but pioneering the use of open data by companies of all types and sizes. And it would include the government being able to reuse and publish its own data without the corrosive and restrictive licences placed upon it by the likes of Ordnance Survey, and thus have a truly connected approach.

You have it within your power to help enable that vision – I hope you will act on it.

Chris Taggart

Co-founder & CEO, OpenCorporates, founder OpenlyLocal.com
Member of Local Public Data Panel

Written by countculture

March 7, 2012 at 5:00 pm

The economics of open data & the big society

with 8 comments

Yesterday I received an email from a Cabinet Office civil servant in preparation for a workshop tomorrow about the Open Data in Growth Review, and in it I was asked to provide:

an estimation of the impact of Open Data generally, or a specific data set, on UK economic growth…  an estimation of the economic impact of open data on your business (perhaps in terms of increase in turnover or number of new jobs created) of Open Data or a specific data set, and where possible the UK economy as a whole

My response:

How many Treasury economists can I borrow to help me answer these questions? Seriously.

Because that’s the point. Like the faux Public Data Corporation consultation that refuses to allow the issue of governance to be addressed, this feels very much like a stitch-up. Who, apart from economists, or those large companies and organisations who employ economists, has the skill, tools, or ability to answer questions like that?

And if I say, as an SME, that we may be employing 10 people in a year’s time, what will that count for against Equifax, for example (who are also attending), who may say that their legacy business model (and staff) depends on restricting access to company data? If this view is allowed to prevail, we can kiss goodbye to the ‘more open, more fair and more prosperous’ society the government says it wants.

So the question itself is clearly loaded, perhaps unintentionally (or perhaps not). Still, the question was asked, so here goes:

I’m going to address this in a somewhat reverse way (a sort of proof-by-contradiction). That is, rather than work out the difference between an open data world and a closed data one by estimating the increase from the current closed data world, I’m going to work out the costs to the UK incurred by having closed data.

Note that extensive use is made of Fermi estimates and backs of envelopes.

  • Increased costs to the UK of delays and frustrations. Twice this week I have waited around for more than 10 minutes for buses, time when I could have stayed in the coffee shop I was working in and carried on working on my laptop had I known when the next bus was coming.
    Assuming I’m fairly unremarkable here and the situation happens to say 10 per cent of the UK’s working population through one form of transport or another, that means that there’s a loss of potential productivity of approx 0.04% (10 minutes lost out of 2400 mins x 10%).
    Similar factors apply to a whole number of other areas, closely tied to public sector data, from roadworks (not open data) to health information to education information (years after a test dump was published we still don’t have access to Edubase) – just examine a typical week and think of the number of times you were frustrated by something which linked to public information (strength of mobile signal?). So, assuming that the transport is a fairly significant 10% of the whole (making the overall loss roughly 0.4%), and applying it to the UK $2.25 trillion GDP we get £9000 million. Not included: loss of activity due to stress, anger, knock-on effects (when I am late for a meeting I make attendees who are on time unproductive too), etc.
  • Knock-on cost of data to public sector and associated administration. Taking the Ordnance Survey as an example of a Shareholder Executive body, of its £114m in revenue (and roughly equivalent costs), £74m comes from the public sector and utilities.
    Although there would seem to be a zero cost in paying money from one organisation to another, this ignores the public sector staff and administration costs involved in buying, managing and keeping separate this info, which could easily be 30% of these costs, say £22 million. In addition, the Ordnance Survey has had to run a sales and marketing operation costing probably 14% of its turnover (based on staff numbers), and presumably it costs money collecting and formatting data which is only wanted by the private sector, say 10% of its costs.
    This leads to extra costs of £22m + £16m + £14m = £52 million or 45%. Extrapolating that over the Shareholder Executive turnover of £20 billion, and discounting by 50% (on the basis that it may not be representative) leads to additional costs of £4500 million. Not included: additional costs of margin paid on public sector data bought back from the private (i.e. part of the costs when public sector buys public-sector-based data from the private sector is the margin/costs associated with buying the public sector data).
  • Significant decreases in exchange of information, and duplication of work within the public sector (not directly connected with purchase of public sector data). Let’s say that duplication, lack of communication, lack of data exchange increases the amount of work for the civil service by 0.5%. I have no idea of the total cost of the local & central govt civil service, but there’s apparently 450,000 of them, earning, costing say £60,000 each to employ, on the basis that a typical staff member costs twice their salary. That gives us an increased cost of £1350 million. Not included: cost of legal advice, solving licence chain problems, inability to perform its basic functions properly, etc.
  • Increased fraud, corruption, poor regulation. This is a very difficult one to guess, as by definition much goes undetected. However, I’d say that many of the financial scandals of the past 10 years, from mis-selling to the FSA’s poor supervision of the finance industry, had a fertile breeding ground in the closed data world in which we live (and just check out the FSA’s terms & conditions if you don’t believe me). Not to mention phoenix companies, one hand of government closing down companies that another is paying money to, and so on. You could probably justify any figure here, from £500 million to £50 billion. Why don’t we say a round billion? Not included: damage to society, trust, the civic realm.
  • Increased friction in the private sector world. Every time we need a list of addresses from a postcode, information about other companies, or any other public sector data that is routinely sold, we not only pay for it in the original cost, but for the markups on that original cost from all the actors in the chain. More than that, if the dataset is of a significant capital cost, it reduces the possible players in the market, and increases costs. This may or may not appear to increase GDP, but it does so in the same way that pollution does, and ultimately makes doing business in the UK more problematic and expensive. Difficult to put a cost on this, so I won’t.
  • I’m also going to throw in a few billion to account for all the companies, applications and work that never get started because people are put off by the lack of information, high barriers to entry, or plain inaccessibility of the data (I’m here taking the lead from the planning reforms, which are partly justified on the basis that many planning applications are not made because of the hassle in doing them or because they would be refused, or otherwise blocked by the current system.)
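The arithmetic behind the first two estimates in the list above can be sketched roughly as follows. This is only an illustration using the figures quoted in the post; the variable names are made up for the example, and the 2400-minute denominator is read as a 40-hour working week, which is an assumption.

```python
# Back-of-envelope sketch using the post's own figures.
# All variable names are illustrative, not the author's.

# Item 1: delays and frustrations
minutes_lost = 10          # one wait of roughly 10 minutes
week_minutes = 2400        # the post's denominator (assumed: 40 hours x 60 minutes)
share_affected = 0.10      # "say 10 per cent" of the working population

transport_loss = minutes_lost / week_minutes * share_affected
transport_loss_rounded = round(transport_loss, 4)   # the post quotes approx 0.04%

transport_share = 0.10     # transport assumed to be 10% of all such frustrations
overall_loss = transport_loss_rounded / transport_share

gdp_millions = 2_250_000   # the post's 2.25 trillion GDP figure, in millions
delay_cost = overall_loss * gdp_millions

# Item 2: Ordnance Survey overheads, extrapolated to the Shareholder Executive
os_turnover = 114                      # millions
extra_costs = 22 + 16 + 14             # the post's three rounded components
extra_share = extra_costs / os_turnover
shex_turnover = 20_000                 # millions
discount = 0.5                         # "may not be representative"
shex_cost = 0.45 * shex_turnover * discount   # the post rounds the share to 45%

print(f"transport loss:  {transport_loss:.4%}")
print(f"overall loss:    {overall_loss:.2%}")
print(f"delay cost:      {delay_cost:,.0f} million")
print(f"OS extra share:  {extra_share:.1%}")
print(f"ShEx extra cost: {shex_cost:,.0f} million")
```

Skipping the rounding to 0.04% pushes the first figure up to roughly 9,375 million, so the rounded number is, if anything, on the cautious side.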

What I haven’t included is reduced utilisation of resources (e.g. empty buses, public sector buildings – the location of which can’t be released due to Ordnance Survey restrictions, etc), the poor incentives to invest in data skills in the public sector and in schools, the difficulty of SMEs understanding and breaking into new markets, and the inability of the Big Society to argue against entrenched interests on anything like an equal footing.

And this last point is crucial if localism is going to mean more rather than less power for the people.

So where does that leave us? A total of something like:

£17,850 million.
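For anyone wanting to check the tally, here is a minimal sketch of how the line items add up. The labels are shorthand invented for the example, and the 2000 for the last item is an assumption: the post only says "a few billion", and 2000 is the value implied by the stated total.

```python
# Line items from the estimates above, in millions of pounds.
# Labels are shorthand for this sketch, not the author's wording.
items = {
    "delays and frustrations": 9000,
    "knock-on cost of data to the public sector": 4500,
    "duplication within the public sector": 1350,
    "fraud, corruption, poor regulation": 1000,
    # Assumption: "a few billion" is taken as 2000, the value implied by the total.
    "work that never gets started": 2000,
}

total = sum(items.values())
print(f"Total: {total:,} million")  # prints "Total: 17,850 million"
```

Friction in the private sector is left out, as in the post, because no figure is put on it.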

That, back of the envelope-wise, is what closed data is costing us, the loss through creating artificial scarcity by restricting public sector data to only those who pay. Like narrowing an infinitely wide crossing to a small gate just so you can charge – hey, that’s an idea, why not put a toll booth on every bridge in London, that would raise some money – you can do it, but would that really be a good idea?

And for those who say the figures are bunk, that I’ve picked them out of the air, not understood the economics, or simply made mistakes in the maths – well, you’re probably right. If you want me to do better give me those Treasury economists, and the resources to use them, or accept that you’re only getting the voice of those that do, and not innovative SMEs, still less the Big Society.

Footnote: On a similar topic, but taking a slightly different tack is the ever excellent David Eaves on the economics of Toronto’s transport data. Well worth reading.

Update 15/10/2011: Removed line from 3rd para: ” (it’s also a concern that we’re actually the only company attending that’s consuming and publishing open data)” . In the event it turned out there were a couple other SMEs too working with open data day-to-day, but we were massively outnumbered by parts of government and companies whose existing models were to a large degree based on closed data. Despite this there wasn’t a single good word to be heard in favour of the Public Data Corporation, and many, many concerns that it was going down the wrong route entirely. 

Written by countculture

October 13, 2011 at 5:39 pm

The Public Data Corporation vs Good Governance

with one comment

As I feared back when it was first announced, the proposed UK Public Data Corporation has got nothing to do with open data, and everything to do with protecting the interests of a few civil servants, turning back the open data clock to the dark ages of derived data and privileged access for the few.

However, the issue I’d like to focus on here, having last week attended a workshop on the PDC consultation, is governance. [It's worth mentioning that I was the only one at the workshop without a stake in the existing public sector information structure, telling in itself.] And far from it being a dry, academic, wonkish subject, it is critical to the future of public data in the UK.

The reason this is so contentious is twofold:

  • The consultation on the PDC has been drawn very narrowly, trying to get respondents to choose between a set of options that are all bad for open data, and ultimately democracy. “So, open data, would you like a bullet to the back of the head, or to be slowly drained of blood?”
  • There are clear conflicts of interest between the wider interests of society, and those of the Shareholder Executive – the trading funds such as the Ordnance Survey and Land Registry who are the very roadblock that open data is supposed to clear, but yet who crucially seem to be driving the PDC.
    Now, from their perspective, I can see the appeal of keeping everything cosy and tight, particularly if there’s a chance of the organisations being floated off, and with it considerable personal enrichment. But public policy shouldn’t be driven by the personal interests of civil servants, but by what is in the interests of society as a whole.

In fact, the governance of the Public Data Corporation, and the rules by which it operates, were the one thing that everyone at the workshop I attended agreed upon. More than that, it was agreed that the delivery of its duties should be separate both from the principles by which it operates (which should be for the benefit of society) and from the independent body that needs to ensure it sticks to those principles.

But here’s the kicker: the Transition Board for the PDC (which will oversee its membership, structure and governance) is, I understand, meeting on October 25, two days before the consultation ends.

When I asked about this meeting, and whether the consultation was a done deal, I was told, “The governance of the PDC is not being consulted on.”

This is both rather shocking, and shameful, and for me means there’s only one viable option if the UK is serious about open data: to send the whole PDC concept back to the drawing board, and this time to come up with a solution that is focused not on civil servants’ narrow personal interests, but on building a ‘more open, more fair and more prosperous’ society (to quote the Chancellor).

Written by countculture

October 10, 2011 at 11:53 am

Open Data: A threat or saviour for democracy?

with 2 comments

This is my presentation to the superb OKCON2011 conference in Berlin last week. It’s obviously openly licensed (CC-BY), so feel free to distribute widely. Comments also welcome.

Written by countculture

July 4, 2011 at 10:30 am

George Osborne’s open data moment: it’s the Treasury, hell yeah

with 2 comments

As a bit of an outsider, reading the government’s pronouncements on open data feels rather like reading official Kremlin statements during the Cold War. Sometimes it’s not what they’re saying, it’s who’s saying it that’s important.

And so it is, I think, with George Osborne’s speech yesterday morning at Google Zeitgeist, at which he stated, “Our ambition is to become the world leader in open data, and accelerate the accountability revolution that the internet age has unleashed”, and “The benefits are immense. Not just in terms of spotting waste and driving down costs, although that consequence of spending transparency is already being felt across the public sector. No, if anything, the social and economic benefits of open data are even greater.”

This is strong, good stuff, and it matters that it comes from Osborne, who has not previously taken a high-profile position on open data and open government, leaving that variously to the Cabinet Office Minister, Francis Maude, Nick Clegg & even David Cameron himself.

It’s also intriguing that it comes amid the apparent burying of the Public Data Corporation, which got just a holding statement in the budget, and no mention at all in Osborne’s speech.

But more than that it shows the Treasury taking a serious interest for the first time, and that’s both to be welcomed, and feared. Welcomed, because with open data you’re talking about sacrificing the narrow interests of small short-term fiefdoms (e.g. some of the Trading Funds in the Shareholder Executive) for the wider interest; you’re also talking about building the essential foundations for the 21st century. And both of these require muscle and money.

The Treasury also oversees a number of datasets which have hitherto been very much closed data, particularly the financial data overseen by the Financial Services Authority, the Bank of England and even perhaps some HMRC data. I’ve started the ball rolling by scraping the FSA’s Register of Mutuals, which we’ve just imported into OpenCorporates, and tying these to the associated entries in the UK Register of Companies.

Feared, because the Treasury is not known for taking prisoners, still less working with the community. And the fear is that rather than leverage the potential that open data allows for a multitude of small distributed projects (many of which will necessarily and desirably fail), rather than use the wealth of expertise the UK has built up in open data, they will go for big, highly centralised projects.

I have no doubt, the good intentions are there, but let’s hope they don’t do a Team America here (and this isn’t meant as a back-handed reference to Beth Noveck, who I have a huge amount of respect for, and who’s been recruited by Osborne), and destroy the very thing they’re trying to save.

Written by countculture

May 17, 2011 at 2:27 pm

Open data meme suggestion: Enabler or blocker?

with 2 comments

Are you a blocker or enabler?

Earlier today I gave a presentation at the Open Knowledge Conference on open local data, OpenlyLocal and the Open Election Data project. It was a slight update of the talk I gave to the Manchester Social Media Cafe earlier in the month, and one of the key additions was a simple idea I added on the final page, which was about where we should go from here.

I’d been using the idea in conversation for the past few months (and I’m sure I didn’t invent it), but it seemed to resonate with the audience, and so I thought it was worth repeating as a short blog post, and it’s this:

When dealing with government, with organizations, with public officials, with outsourcing companies we need to develop the meme:

Are you an enabler or a blocker?

It’s a blunt and somewhat unsophisticated weapon, but in the past few months of doing the Open Election Data project, it seems to have been far more effective than any other I’ve tried: better than appealing to the public good, better than engaging on an intellectual level, better than asking for it nicely, better even than talking about potential savings.

Maybe it’s because, as someone suggested to me after the first meeting of the UK government’s Local Public Data Panel on which I sit, civil servants and other public officials only do things because there’s a benefit to them (or a downside if they don’t). [I'm not sure they're any different than most people working in the private sector in this respect, by the way.] I don’t know, and I don’t really care. What I do care about is getting things done, and this seems to be working for me.

So, I offer it out there, not as an original idea (I’m sure it isn’t), but as a suggestion both for engaging with public bodies and for dealing with problems.

When you come across people or organisations, give them the option: do you want to be an enabler or a blocker? If you’re an enabler, great, let’s see how we can make this work; if you’re a blocker, fine also: now we know, and we’ll just go around you and get on with it anyway.

Written by countculture

April 24, 2010 at 2:23 pm

How often do MPs turn up for work (Part 4): the ministerial effect

[Note: Voting attendance is an imperfect proxy for actual attendance, as the figure may be depressed by silent abstentions (i.e. not voting in a division, rather than voting both ‘aye’ and ‘no’) and by just turning up to vote, but failing to attend the debate. However, until Parliament provides a better measure for attendance, or more transparency of MPs’ actions, this is the only one we have.]

A frequent argument for low attendance at voting divisions by MPs is that the figure is depressed by ministers (and shadow spokespersons), whose other responsibilities prevent them from attending as many votes (as they’d like to), thus bringing down the overall average.

Seems reasonable, so let’s have a look at just how much of an influence this ‘ministerial effect’ has on the overall figures. First, let’s look at the average voting attendance for ministers and non-ministers (calculation details below):

Attendance rates May 97 – July 08
All MPs 65.1%
Non-Ministers 64.4%
Ministers 67.2%

Er, wait a minute, so the average voting attendance rate for ministers is higher than non-ministers? That’s not what we expected. However, basic averages (i.e. the mean) can hide a multitude of sins, so let’s have a look at the distribution of those attendance figures.

As you can see, while the peak of the ministerial attendance is around the 65% mark (less than that for the non-ministerial one), there were far more divisions in which 90%+ of ministers voted than there were for which 90%+ of non-ministers voted.

This makes sense, in a way, as ministers are far more likely than backbenchers to turn up en masse for votes their party sees as important. It’s this that largely accounts for the figures we saw in the table above. However, what the graph also shows is that when you take the ministers out of the equation, attendance definitely does not shoot up. There is, in short, no ‘ministerial effect’ to account for the low attendance of MPs.

[It's worth mentioning that the ministerial office records are slightly incomplete -- the record of Parliamentary Private Secretaries is missing during some periods -- so I've run the figures for ministers both including and excluding PPSs. As you can see, it doesn't make a lot of difference.]

The party lines

Having looked at the big picture, it’s time to look at the ministerial vs non-ministerial attendance by party, specifically the three main parties in Parliament.

As you can see, the relationship between ministerial and non-ministerial attendance is noticeably different for each of the parties. Labour ministers do indeed have noticeably lower attendance rates than their backbenchers, though not as much as I’d expected and not enough to alter the distribution massively.

However, for the Tories and LibDems, the surprising thing — for me, at least — was the attendance rates for their spokespersons are actually noticeably better than their backbenchers, raising rather than lowering the overall figures. What, I wonder, is the reason for this?

Finally, a couple of quick graphs to wrap this post up. One shows, perhaps not surprisingly, that Labour ministerial attendance rates are lower than those for the shadow spokespersons: presumably the time commitment for a governmental position is greater than that for the equivalent shadow position.

The other shows the distribution of backbenchers’ attendance figures, by party. I’ll leave that one without making any further comment.

C.


Notes on calculations

  • The Ministerial/non-ministerial attendance rates were calculated by looking at every Commons division between May 1997 and July 2008, and working out the number of ministers/non-ministers who could have voted in that division, and the number who actually did vote. The average attendance figures in the table were calculated by dividing the aggregate number of votes by the aggregate number of possible votes.
    To calculate the distribution of attendance rates I calculated the ministerial/non-ministerial attendance rate for each division, and plotted these on a graph to show how those attendance rates are distributed (as usual, I’ve made the underlying figures available as a spreadsheet here and here if you want to examine them further).
  • Ministers are those holding any sort of ministerial office as per the PublicWhip database, including whips, but excluding select committee members (although it wouldn’t be hard to run the figures to include select committee members). The Parliamentary Private Secretaries record at the Public Whip is incomplete for several periods, and unfortunately (and ridiculously) there is no historical record of ministers available from Parliament’s own website.
  • The above calculations were derived from the voting record freely available from the Public Whip project, and cover the period from May 1997 to July 22, 2008 (when the house rose for the summer recess). The data can be downloaded in the form of a MySQL database, and this was used together with custom MySQL queries to generate the figures.
  • The graphs are visual representations of the density of the distribution, and were plotted in R using the kernel densityplot function.
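As a rough illustration of the calculations described in these notes, here is a minimal Python sketch. It is not the MySQL and R code actually used: the division data is a made-up toy example, the function and variable names are hypothetical, and the hand-rolled Gaussian kernel with a fixed bandwidth simply stands in for whatever settings the R plot used.

```python
import math

# Hypothetical toy data: (votes cast, members who could have voted) per division.
divisions = [(60, 100), (95, 100), (130, 200), (40, 80)]

# Aggregate attendance: total votes cast divided by total possible votes,
# as used for the figures in the table.
aggregate_rate = sum(v for v, _ in divisions) / sum(e for _, e in divisions)

# Per-division attendance rates, as used for the distribution graphs.
per_division = [v / e for v, e in divisions]


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` at a point x."""
    norm = 1 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Bandwidth of 0.05 is an arbitrary choice for the example.
density = gaussian_kde(per_division, bandwidth=0.05)

print(f"aggregate attendance: {aggregate_rate:.1%}")
for rate in (0.5, 0.65, 0.9):
    print(f"density at {rate:.0%}: {density(rate):.2f}")
```

The aggregate figure weights each division by how many members could have voted, so it will generally differ a little from a plain average of the per-division rates.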

Written by countculture

November 3, 2008 at 5:17 pm
