Public Data Corporation

STOP PRESS: The Cabinet Office has started an open consultation process here [1]

Despite repeated attempts to get clarification on what the Public Data Corporation will look like, we still have no idea. It may simply be that there is no settled idea, as different priorities within government vie with each other to shape the agenda.

That's why we at ORG have decided to invite everyone with an interest in this topic to propose ideas for a VISION for the Public Data Corporation.

Please help shape this important new body so it can increase welfare, innovation, transparency and participation.

All contributions are welcome: great and small, insights and caveats, gossip and experience, etc.

The original Cabinet Office business plan set April 2011 as the deadline, but the public announcement says simply "2011" [2]

Some background reading with links to other articles [3]

Tony Hirst has set up a dashboard for viewing responses to the consultation here.

For further info contact Javier AT openrightsgroup.org


A Vision for the Public Data Corporation

What is the main purpose and priority?

The Public Data Corporation should focus on ensuring the sustainable supply of public data in technically and legally open forms. It should promote the re-use of public data for public benefit.

It should focus on the development of the public data infrastructure for the UK, investing in the maintenance of key datasets of public value.

Where datasets are primarily of commercial value, the PDC may develop market mechanisms that support pooling of private investment in the creation, maintenance or improvement of public data resources, subject to the condition that all such data remains open public data both legally and technically.

Possible purposes and priorities

  • Earn revenue for the Exchequer from sales of data which cannot be collected without the authority of the State
  • Enable UK businesses to export to the world PSI-related services developed and demonstrated here (e.g. services based on semantic web technologies which turn raw data into useful information; translation; visualisation...)
  • Make data available from all [200,000?] public bodies which are subject to the Freedom of Information Act (which OPSI does for the few thousand central government PSI holders)
  • Avoid creating a bureaucratic bottleneck by monopolising the distribution channels from holders to re-users
  • Advocacy of Open Data within government. This task is currently carried out by the Cabinet Office, but this does not look like a sustainable solution, as the next government may have other priorities to promote. David Eaves from Canada has written a job description for an Open Data advocate which may fit this institution [4]:
    • Getting the laggards up and running
    • Getting governments to use standardized licenses that are truly open (be it the PDDL, CC-0 or one of the other available licenses out there)
    • Cultivating/fostering an eco-system of external data users
    • Cultivating/fostering an eco-system of internal government users (and vendors) for open data (this is what will really make open data sustainable)
    • Pushing jurisdictions and vendors towards adopting standard structures for similar types of data (e.g. wouldn't it be nice if restaurant inspection data from different jurisdictions were structured similarly?)

Discussion

Allowing private investment: Key datasets are often 'owned' by particular departments, yet in an open data world many different actors inside and outside government use them. Datasets can be seen as public goods in need of continued state investment, creating positive externalities (and, in Pollock's analysis, increased tax revenues when used commercially) that justify the investment. However, (a) not all datasets necessarily have the right balance of costs to positive externality/increased tax revenue benefits to justify this; and (b) investment in datasets primarily valuable for commercial return could be seen as an unfair subsidy to particular commercial entities. Providing mechanisms for private investment in maintaining certain datasets may be useful here.

It is also the case that some datasets that the market will generate in any case may be of use to government/society as 'public data'. Finding ways for such data to be adopted into an open data regime, without the state being overcharged for it, may be relevant.

Charging for data re-use: Given the above, are there legitimate cases where a charging regime for data use can exist? The openness conventionally argued for in open data movements involves no restrictions on commercial use, but this risks the unfair subsidy point raised above. Does a vision for the PDC need to include (a) a general opposition to charging in most cases, but (b) a set of principles that any charging regime must take into account (e.g. no charge for non-commercial uses; a sliding scale for commercial but social-benefit use; etc.)?

Which existing trading funds should be covered by the PDC?

The PDC should cover all main trading funds plus private bodies with public administration duties: (please add)

Met Office

Ordnance Survey

Land Registry

Companies House

Network Rail

Others

For a complete list see Wikipedia: Trading fund

Who would run it?

Governance

Who should be on the board?

  • Business and Innovation
  • HM Treasury
  • Cabinet Office
Question: Information Commissioner

Does the ICO have a role, for instance in promoting best practice to minimise / pre-empt the need for FOI requests in some areas?

Regulation

There should be independent regulation and a fast appeals process.

Who should regulate?

How would it function?

General notes

  • Exploiting data resources
    • Corporation would be subject to FOI
    • Profit target setting, if any?
    • Monopoly licensing as cost recovery is not an option.
    • PDC may be making decisions on datasets to be released, pricing and
    • Commercial partners?


  • Availability of data
    • The PDC should create simple APIs, linked data and simple means to facilitate the use of data by those with limited resources. Dumping large files for download is not enough.
      • It is not clear that APIs get widely used or are the best approach. They also shift the cost of providing services onto the API provider rather than the user, which in some cases can drive the considerable cost of providing a public service onto the state. Providing people with the tools to make sense of data is important, however, as is opening up the whole workflow of data re-use, so that potential re-users are pointed to existing value-added versions of that data (e.g. API, browse interface, etc.).


  • Licensing
    • Free data to be provided under the Open Government Licence, which allows reuse


  • Services
    • PDC could be providing services to government to support planning for, and sustainable release of, large and complex datasets
    • What incentives will there be for civil servants to take the data job seriously?
      • At present there is no interest in, or motivation for, releasing data properly.


  • Fostering innovation
    • The PDC should help small and medium enterprises innovate as part of a wider drive to break the hold of large contractors on the public sector.
    • The PDC should not be used to take ideas from independent innovators and feed them to large companies for large profits to be made by scaling.
    • Economic innovation cannot be the main criterion for data release (+1)

We have to avoid creating a new data divide

Single copyright for UK public sector data

At the moment the UK public sector includes Crown bodies whose works are subject to Crown copyright and non-Crown bodies. There is no real logic to this. There is an opportunity to create a single simple Crown copyright with right of re-use.

Discussion

Scaling: There is no easy mechanism at the moment for government to 'adopt' services of public value built on top of open data which don't have a strong commercial business case but are of social value, and whose developers would be interested to see that social value realised at scale. Can the PDC play a role in investing in the infrastructure needed to scale up such value-added services built on datasets?

What business models can be proposed?

The trading funds are currently operating at a profit.

How could trading funds be guaranteed income without direct sale of products and a target profit margin?

Free flow idea bucket

Things you can do

Open Tasks

Wikipedia page

A neutral person should create a Wikipedia page

Let people know

Please advertise and circulate this page from tomorrow (Tuesday 8 Feb)

External links

Blogs and pages about the PDC

Tony Hirst [5]

Sam Smith [6]

Michael Grimes [7]

The Register [8]

Shane O'Neill [9]