countculture

Open data and all that

The economics of open data & the big society

with 8 comments

Yesterday I received an email from a Cabinet Office civil servant in preparation for a workshop tomorrow about the Open Data in Growth Review, and in it I was asked to provide:

an estimation of the impact of Open Data generally, or a specific data set, on UK economic growth…  an estimation of the economic impact of open data on your business (perhaps in terms of increase in turnover or number of new jobs created) of Open Data or a specific data set, and where possible the UK economy as a whole

My response:

How many Treasury economists can I borrow to help me answer these questions? Seriously.

Because that’s the point. Like the faux Public Data Corporation consultation that refuses to allow the issue of governance to be addressed, this feels very much like a stitch-up. Who, apart from economists, or those large companies and organisations who employ economists, has the skill, tools, or ability to answer questions like that?

And if I say, as an SME, that we may be employing 10 people in a year’s time, what will that count against Equifax, for example (who are also attending), who may say that their legacy business model (and staff) depends on restricting access to company data? If this view is allowed to prevail, we can kiss goodbye to the ‘more open, more fair and more prosperous’ society the government says it wants.

So the question itself is clearly loaded, perhaps unintentionally (or perhaps not). Still, the question was asked, so here goes:

I’m going to address this in a somewhat reverse way (a sort of proof-by-contradiction). That is, rather than work out the difference between an open data world and a closed data one by estimating the increase from the current closed data world, I’m going to work out the costs to the UK incurred by having closed data.

Note that extensive use is made of Fermi estimates and backs of envelopes.

  • Increased costs to the UK of delays and frustrations. Twice this week I have waited around for more than 10 minutes for buses, time when I could have stayed in the coffee shop I was working in and carried on working on my laptop had I known when the next bus was coming.
    Assuming I’m fairly unremarkable here and the situation happens to, say, 10 per cent of the UK’s working population through one form of transport or another, that means there’s a loss of potential productivity of approx 0.04% (10 minutes lost out of a 2,400-minute working week, x 10%).
    Similar factors apply to a whole number of other areas, closely tied to public sector data, from roadworks (not open data) to health information to education information (years after a test dump was published we still don’t have access to Edubase) – just examine a typical week and think of the number of times you were frustrated by something which linked to public information (strength of mobile signal?). So, assuming that transport is a fairly significant 10% of the whole (making the overall loss ten times the transport figure, or roughly 0.4%), and applying that to the UK’s $2.25 trillion GDP, we get £9000 million. Not included: loss of activity due to stress, anger, knock-on effects (when I am late for a meeting I make attendees who are on time unproductive too), etc.
  • Knock-on cost of data to public sector and associated administration. Taking the Ordnance Survey as an example of a Shareholder Executive body, of its £114m in revenue (and roughly equivalent costs), £74m comes from the public sector and utilities.
    Although there would seem to be a zero cost in paying money from one organisation to another, this ignores the public sector staff and administration costs involved in buying, managing and keeping separate this info, which could easily be 30% of these costs, say £22 million. In addition, it has had to run a sales and marketing operation costing probably 14% of its turnover (based on staff numbers), and presumably it costs money collecting and formatting data which is only wanted by the private sector, say 10% of its costs.
    This leads to extra costs of £22m + £16m + £14m = £52 million or 45%. Extrapolating that over the Shareholder Executive turnover of £20 billion, and discounting by 50% (on the basis that it may not be representative) leads to additional costs of £4500 million. Not included: additional costs of margin paid on public sector data bought back from the private (i.e. part of the costs when public sector buys public-sector-based data from the private sector is the margin/costs associated with buying the public sector data).
  • Significant decreases in exchange of information, and duplication of work within the public sector (not directly connected with purchase of public sector data). Let’s say that duplication, lack of communication and lack of data exchange increase the amount of work for the civil service by 5%. I have no idea of the total cost of the local & central govt civil service, but there are apparently 450,000 of them, costing say £60,000 each to employ, on the basis that a typical staff member costs twice their salary. That gives us an increased cost of £1350 million (450,000 x £60,000 x 5%). Not included: cost of legal advice, solving licence chain problems, inability to perform its basic functions properly, etc.
  • Increased fraud, corruption, poor regulation. This is a very difficult one to guess, as by definition much goes undetected. However, I’d say that many of the financial scandals of the past 10 years, from mis-selling to the FSA’s poor supervision of the finance industry, had a fertile breeding ground in the closed data world in which we live (and just check out the FSA’s terms & conditions if you don’t believe me). Not to mention phoenix companies, one hand of government closing down companies that another is paying money to, and so on. You could probably justify any figure here, from £500 million to £50 billion. Why don’t we say a round billion? Not included: damage to society, trust, the civic realm.
  • Increased friction in the private sector world. Every time we need a list of addresses from a postcode, information about other companies, or any other public sector data that is routinely sold, we not only pay for it in the original cost, but for the markups on that original cost from all the actors in the chain. More than that, if the dataset is of a significant capital cost, it reduces the possible players in the market, and increases costs. This may or may not appear to increase GDP, but it does so in the same way that pollution does, and ultimately makes doing business in the UK more problematic and expensive. Difficult to put a cost on this, so I won’t.
  • I’m also going to throw in a few billion to account for all the companies, applications and work that never get started because people are put off by the lack of information, high barriers to entry, or plain inaccessibility of the data (I’m here taking the lead from the planning reforms, which are partly justified on the basis that many planning applications are not made because of the hassle in doing them or because they would be refused, or otherwise blocked by the current system.)
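
The arithmetic behind the first two bullets can be sketched roughly as follows. This is only an illustration using the figures stated above: the variable and function names are made up for the example, and the rounding steps mirror the rounding in the text rather than any more careful method.

```python
# All inputs are the post's own back-of-envelope guesses, in millions.
# Names are illustrative only.

# Bullet 1: delays and frustrations
MINUTES_LOST = 10                 # one wait for a bus
WORKING_WEEK_MINUTES = 2400       # 40 hours, as in the post's "2400 mins"
SHARE_OF_WORKERS_AFFECTED = 0.10  # 10% of the working population
TRANSPORT_SHARE = 0.10            # transport taken as 10% of all such delays
GDP = 2_250_000                   # the post's 2.25 trillion figure (quoted in
                                  # dollars, with the result reported in pounds)


def delay_cost(round_as_in_post: bool = True) -> float:
    """Productivity lost to avoidable delays across all areas."""
    transport_loss = MINUTES_LOST / WORKING_WEEK_MINUTES * SHARE_OF_WORKERS_AFFECTED
    if round_as_in_post:
        transport_loss = round(transport_loss, 4)  # 0.0004, i.e. 0.04%
    return transport_loss / TRANSPORT_SHARE * GDP


# Bullet 2: knock-on costs, extrapolated from Ordnance Survey
OS_REVENUE = 114
OS_EXTRA_COSTS = 22 + 16 + 14     # the three extras as stated in the post
SHAREHOLDER_EXEC_TURNOVER = 20_000
REPRESENTATIVENESS_DISCOUNT = 0.5


def knock_on_cost(round_as_in_post: bool = True) -> float:
    """Extra administration cost scaled up to the Shareholder Executive."""
    ratio = OS_EXTRA_COSTS / OS_REVENUE
    if round_as_in_post:
        ratio = 0.45              # the post rounds 52/114 down to 45%
    return ratio * SHAREHOLDER_EXEC_TURNOVER * REPRESENTATIVENESS_DISCOUNT


if __name__ == "__main__":
    print(f"Delays: about £{delay_cost():,.0f} million")
    print(f"Knock-on costs: about £{knock_on_cost():,.0f} million")
```

Calling either function with `round_as_in_post=False` skips the rounding; the same inputs then give roughly 9,375 and 4,561 million, which shows how sensitive these Fermi estimates are to small changes.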

What I haven’t included is reduced utilisation of resources (e.g. empty buses, public sector buildings – the location of which can’t be released due to Ordnance Survey restrictions, etc), the poor incentives to invest in data skills in the public sector and in schools, the difficulty of SMEs understanding and breaking into new markets, and the inability of the Big Society to argue against entrenched interests on anything like an equal footing.

And this last point is crucial if localism is going to mean more rather than less power for the people.

So where does that leave us? A total of something like:

£17,850 million.
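
As a quick check, the total can be reproduced by summing the figures stated in the list above. The sketch below assumes the "few billion" for businesses that never get started is £2,000 million, which is the value implied by the stated total rather than a number given explicitly; private sector friction is left out because no figure was put on it. The labels are shorthand for this example.

```python
# Stated estimates from the list above, in £ millions.
ESTIMATES = {
    "delays and frustrations": 9000,
    "knock-on public sector costs": 4500,
    "duplication within the public sector": 1350,
    "fraud, corruption, poor regulation": 1000,
    # "A few billion" in the post; 2000 is the value implied by the total.
    "businesses never started": 2000,
}


def total_cost(estimates: dict[str, int]) -> int:
    """Sum the individual estimates (all in £ millions)."""
    return sum(estimates.values())


if __name__ == "__main__":
    for label, cost in ESTIMATES.items():
        print(f"{label:<40} £{cost:>6,} million")
    print(f"{'total':<40} £{total_cost(ESTIMATES):>6,} million")
```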

That, back of the envelope-wise, is what closed data is costing us: the loss through creating artificial scarcity by restricting public sector data to only those who pay. Like narrowing an infinitely wide crossing to a small gate just so you can charge – hey, that’s an idea, why not put a toll booth on every bridge in London, that would raise some money – you can do it, but would that really be a good idea?

And for those who say the figures are bunk, that I’ve picked them out of the air, not understood the economics, or simply made mistakes in the maths – well, you’re probably right. If you want me to do better give me those Treasury economists, and the resources to use them, or accept that you’re only getting the voice of those that do, and not innovative SMEs, still less the Big Society.

Footnote: On a similar topic, but taking a slightly different tack is the ever excellent David Eaves on the economics of Toronto’s transport data. Well worth reading.

Update 15/10/2011: Removed line from 3rd para: “(it’s also a concern that we’re actually the only company attending that’s consuming and publishing open data)”. In the event it turned out there were a couple of other SMEs too working with open data day-to-day, but we were massively outnumbered by parts of government and companies whose existing models were to a large degree based on closed data. Despite this there wasn’t a single good word to be heard in favour of the Public Data Corporation, and many, many concerns that it was going down the wrong route entirely.

Written by countculture

October 13, 2011 at 5:39 pm

8 Responses


  1. One must also look to organisations such as Google and how and why they have a “more” opendata policy. For example I can do a Geocode or a Reverse Geocode API call to turn street into location. I can download whole datasets via Fusion Tables….. with 3 clicks and 5 lines of code(ish).

    Trying to obtain the Department of Communities and Local Government Public Assets dataset took an OS PSMA licence requiring a “wet signature”, signed in duplicate, with two senior civil servants to countersign it, just to get access to the location data. (Because of OS licensing restrictions… and that’s the state now!)

    If Google, a private company, can not only make the opendata available but also make the process of obtaining the opendata accessible, why can’t a government?!

    Rick Seymour

    October 17, 2011 at 8:41 am

    • I agree about the ‘friction’, but I think the comparison with Google is important, but not just for the reason you state.

      First, the PSMA is an example of the additional costs that don’t appear in OS’s accounts, but do end up being paid for by the state: how much did it cost the state (DCLG civil servants’ time etc) to manage this process, and how many times does this sort of thing happen? That’s one reason why I think that it’s easy to justify saying that the OS pricing regime adds at least 50% of the ‘purchase price’ in costs to the public sector, meaning there’s really no net ‘profit’ to the public sector from selling this data.

      Second, the Google transaction may have been ‘easy’, but it’s almost as problematic. Yes, it takes just a few clicks of the mouse, but those clicks include tying you in to T&Cs that mean you can’t use the geocoded lat/longs for anything other than Google maps. One of the reasons for having a national mapping agency is, I would think, to avoid all the location data being in the hands of an entity over which the democratic process has no control. Unfortunately the current regime, which seems likely to be strengthened by the PDC unless we can stop it, pushes many people down that route…

      countculture

      October 17, 2011 at 9:11 am

  2. Lol. Absolutely fantastic! I thought you were going to conclude it’s an absurd question full stop! Great answer, certainly the theory. Have you received any response from any civil servants about your conclusions?

    Rahim Hankin

    December 21, 2011 at 3:20 pm

    • Privately been praised by some of the more senior ones, but nothing publicly.
      C

      countculture

      December 21, 2011 at 3:54 pm

      • Lol. Great stuff! You should be praised for sure. I can’t imagine too many civil servants were expecting such an answer. I hope by congratulating you that means they agree!

        Rahim Hankin

        December 22, 2011 at 7:14 pm

  3. Dear Chris,
    I heard you speaking at the Rotterdam Open Data conference last Friday. You made a similar remark there, that the true revenues of open data are actually in the opportunity costs of *not* opening up data. I stated this a couple of years ago at the OECD in Paris (“[…] I suggest that we change the perspective and look at the cost of not giving it away to the civil society”), see http://books.nap.edu/openbook.php?record_id=12687&page=25 (the quote is actually on page 28).
    I do share your frustration about the big firms being able to hire the best brains (“Who, apart from economists, or those large companies and organisations who employ economists, has the skill, tools, or ability to answer questions like that.”) — although ironically i am actually one of those highly paid research consultants. But I would rather work for the other side of the table, if you get my drift.
    One thing you might consider about OpenCorporates (although I am not sure it is feasible) is to cut out the middle layer of government entirely by enabling firms to *directly* register at OpenCorporates. We just have to solve the problem of establishing the true identity of the registrants. But this is a general problem in IT security and has already been solved in other places (I am also working on cyber security together with Delft Uni. of Technology). Anyway, please hang on there because you looked quite under pressure in Rotterdam. All the best from Utrecht, the Netherlands

    Robbin te Velde

    March 18, 2012 at 5:05 pm

    • Robbin
      Thanks for your kind comments — hadn’t come across your book, and will certainly investigate. Doesn’t surprise me someone got there before me — but glad now to have a source 😉

      We couldn’t skip out the middle layer for OpenCorporates, for a couple of reasons, mainly that corporate legal entities only exist when they are given legal personality by the state, which of course is why also this must be public information.

      Definitely hanging in there — mostly I feel it is the open data movement that is under threat by those in government and with existing legacy business models who don’t want it to succeed for their own selfish reasons, and not a few of these are from the UK (cough! Ordnance Survey cough!)

      countculture

      March 18, 2012 at 5:20 pm

      • Dear Chris,
        1. It is not my book; too much honour. But in 2003/2004 I actually tried to convince (in vain) the money hungry EC (after Peter Weiss’ big US numbers) that most economic benefits are in the indirect effects of opening up data, not in the direct effects.
        2. I get your point on the legal monopoly of the government. I guess the essence is to always have a copy of government datasets in the public sphere, and thus never to have exclusive agreements with private firms. This is the true danger of Google, as you rightly mention. Yet exclusivity and scarcity are the foundation of the modern economy. So you are being pretty revolutionary here.
        3. About your Ordnance Survey: I recently conducted a study about the Dutch Cadastre, commissioned by the trade association of geo info firms (which has a love-hate relationship with the Cadastre). The results of the study about the financial performance of the Cadastre (or rather the lack thereof) were so clear that at the end of the day they did not dare to make it public. And I had to sign an NDA. Isn’t it high time to critically audit the OS?

        wortel

        March 18, 2012 at 9:47 pm

