Category Archives: Openness

Shuttleworth Fellowship

This month, I’m starting a year-long Shuttleworth Foundation Fellowship. Thanks to the Shuttleworth Foundation’s support I’ll be able to dedicate myself full-time to open knowledge and the Open Knowledge Foundation.

I’ll be working to promote open knowledge and open data around the world — open knowledge being any kind of content or data from sonnets to statistics, genes to geodata, that can be freely used, reused and redistributed.

Specifically I’ll be:

  • Promoting open knowledge in different domains such as the governmental, scientific, economic and bibliographic. This will involve working to develop communities of advocates and practitioners – organising regular meetings, bringing people together for events, working on standards and consensus building. Initiating and sustaining independent, active communities that use and promote open data in different fields is key to advancing open knowledge around the world.

  • Helping to grow the open data ecosystem, for example by adapting the tools and methodologies of the free/open source software community for use with open data. In particular, I’ll be working heavily to develop CKAN, an open source registry for datasets. CKAN, which I helped initiate as an Open Knowledge Foundation project, is being used by the UK in its official data catalogue, data.gov.uk, and already has community instances in many other countries around the world – including Austria, Canada, Finland, France, Germany, Hungary, Italy, New Zealand, and Norway. There’s a lot of interesting work to do, both to extend CKAN and to improve associated tools like datapkg which enable “data developers” to automate working with datasets.

  • Working on specific projects that exemplify the open knowledge development process from end to end – going from opening up the raw data, to cleaning and aggregation, to re-exporting for reuse or integration into end user applications that explore, analyze and present the data. Examples include Where Does My Money Go?, a project to allow users to explore and visually represent UK public spending, and Open Biblio, which will bring together a substantial amount of open bibliographic data as well as tools for its use and reuse.

Public Sector Transparency Board

As announced on Friday on the UK Government’s data.gov.uk, I am one of the members of the UK Government’s newly formed Public Sector Transparency Board.

From the announcement:

The Public Sector Transparency Board, which was established by the Prime Minister, met yesterday for the first time.

The Board will drive forward the Government’s transparency agenda, making it a core part of all government business and ensuring that all Whitehall departments meet the new tight deadlines set for releasing key public datasets. In addition, it is responsible for setting open data standards across the whole public sector, listening to what the public wants and then driving through the opening up of the most needed data sets.

Chaired by Francis Maude, the Minister for the Cabinet Office, the other members of the Transparency Board are Sir Tim Berners-Lee, inventor of the World Wide Web, Professor Nigel Shadbolt from Southampton University, an expert on open data, Tom Steinberg, founder of mySociety, and Dr Rufus Pollock from Cambridge University, an economist who helped found the Open Knowledge Foundation.

In the words of Francis Maude:

“In just a few weeks this Government has published a whole range of data sets that have never been available to the public before. But we don’t want this to be about a few releases, we want transparency to become an absolutely core part of every bit of government business. That is why we have asked some of the country’s and the world’s greatest experts in this field to help us take this work forward quickly here in central government and across the whole of the public sector.”

UK Government Plans to Open Up Data

Yesterday, in a speech on “Building Britain’s Digital Future”, UK Prime Minister Gordon Brown announced wide-ranging plans to open up UK government data. In addition to a general promise to extend the existing commitments to “make public data public” the PM announced:

  • The opening up of a large and important set of transport data (the NaPTAN dataset)
  • A commitment to open up a significant amount of Ordnance Survey data from the 1st April (though which datasets has not yet been specified)
  • By the Autumn an online e-“domesday” book giving “an inventory of all non-personal datasets held by departments and arms-length bodies”
  • A new “institute” for web science headed by Tim Berners-Lee and Nigel Shadbolt and with an initial £30m in funding

This speech is a significant indication of a further commitment to the “making public data public” policy announced in the Autumn.

It’s great to see this because a year ago it seemed as if government policy was set to largely ignore the research in the Models of Public Sector Information Provision by Trading Funds report (authored by myself, David Newbery and Professor Bently back in 2008), whose basic conclusion was that government data which was digital, bulk and ‘upstream’ should be made available at marginal cost.

More detailed excerpts (with emphasis added)

Opening up data

In January we launched data.gov.uk, a single, easy-to-use website to access public data. And even in the short space of time since then, the interest this initiative has attracted – globally – has been very striking. The site already has more than three thousand data sets available – and more are being added all the time. And in the past month the Office for National Statistics has opened up access for web developers to over two billion data items right down to local neighbourhood level.

The Department for Transport and the transport industry are today making available the core reference datasets that contain the precise names and co-ordinates of all 350 thousand bus stops, railway stations and airports in Britain.

Public transport timetables and real-time running information is currently owned by the operating companies. But we will work to free it up – and from today we will make it a condition of future franchises that this data will be made freely available.

And following the strong support in our recent consultation, I can confirm that from 1st April, we will be making a substantial package of information held by Ordnance Survey freely available to the public, without restrictions on re-use. Further details on the package and government’s response to the consultation will be published by the end of March.

e-Domesday Book

And I can also tell you today that in the autumn the Government will publish online an inventory of all non-personal datasets held by departments and arms-length bodies – a “domesday book” for the 21st century.

The programme will be managed by the National Archives and it will be overseen by a new open data board which will report on the first edition of the new domesday book by April next year. The Government will then produce its detailed proposals including how this work can be extended to the wider public sector.

To inform the continuing development of making public data public, the National Archives will produce a consultation paper on a definition of the “public task” for public data, to be published later this year.

The new domesday book will for the first time allow the public to access in one place information on each set of data including its size, source, format, content, timeliness, cost and quality. And there will be an expectation that departments will release each of these datasets, or account publicly for why they are not doing so.

Any business or individual will be free to embed this public data in their own websites, and to use it in creative ways within their own applications.

Mygov

So our goal is to replace this first generation of e-government with a much more interactive second generation form of digital engagement which we are calling Mygov.

Companies that use technology to interact with their users are positioning themselves for the future, and government must do likewise. Mygov marks the end of the one-size-fits-all, man-from-the-ministry-knows-best approach to public services.

Mygov will constitute a radical new model for how public services will be delivered and for how citizens engage with government – making interaction with government as easy as internet banking or online shopping. This open, personalised platform will allow us to deliver universal services that are also tailored to the needs of each individual; to move from top-down, monolithic websites broadcasting public service information in the hope that the people who need help will find it – to government on demand.

And rather than civil servants being the sole authors and editors, we will unleash data and content to the community to turn into applications that meet genuine needs. This does not require large-scale government IT Infrastructure; the ‘open source’ technology that will make it happen is freely available. All that is required is the will and willingness of the centre to give up control.

Talking at Cambridge University Library on Openness and Libraries

This Wednesday (27th of January) at 1pm I’m giving one of Cambridge University Library’s regular lunch-time talks on Openness and Libraries. Attendance is free and anyone can come along!

Update (28th Jan): talk is done and slides are now up.

Blurb

Over the past few years, open licensing (http://www.opendefinition.org/) has facilitated the explosive growth of a ‘knowledge commons’. To give a few prominent examples: Open Access journals, Open Educational Resources and Open Data in scientific research have all been enabled by licenses which permit material to be freely re-used and re-distributed. This outpouring of support for openness has led to an incredible rise in community-led development and innovative uses.

Bibliographic records are a key part of our shared cultural heritage and essential to anyone working with cultural materials (books, music, films etc). Opening up those records for access and re-use would offer a variety of benefits.

First, it would allow libraries to share records more efficiently and improve quality more rapidly through better, easier feedback. Second, easier access to catalogue data would spur development of the multifarious services, technologies and research that use that data, including, for example, search engines, book or music websites, researchers working on information production, journalists writing on orphan works, as well as many other areas we cannot even imagine in advance.

With a growing number of Government agencies and public institutions making data open, is it now time for the library community to do likewise?

Open Notebook Social Science

The other day I posted up some work-in-progress on the subject of patterns of knowledge production.

That material is still in a fairly preliminary state. However, my decision to release it in this form was a conscious one and part of an ongoing attempt on my part to practice a more open “release early, release often” approach to research.

In doing this I’m drawing direct inspiration from the open source and open notebook (science) communities and seeking to engage in what might be termed open notebook social science!

I think most researchers (including myself) feel a reluctance to put out material that isn’t at a reasonable level of maturity. While there are some good reasons for this, I think the main motivations are less positive, and are primarily to do with fear: be it of criticism or that your ideas are “taken” by others. While such fears can have some basis, it seems to me the benefits of an open approach — in terms of visibility, dissemination, and potential for collaboration — significantly outweigh any of the associated risks.

Over the last year, I’ve already been making some effort to move in this direction but from this point on I’m aiming to do this more thoroughly and methodically. A first step in this will be to put all the “patterns” and data online.

The Knowledge Commons is Different

I was looking again recently at “Understanding the Knowledge Commons” which I had perused previously.

While reading the introductory chapter by Hess and Ostrom I came across:

People started to notice behaviors and conditions on the web – congestion, free riding, conflict, overuse, and “pollution” – that had long been identified with other types of commons. They began to notice that this new conduit of distributing information was neither a private nor strictly a public resource.

I think they are absolutely right to consider the analogies of “knowledge commons” with traditional commons. However, and at the same time, I think it essential to emphasize that “knowledge commons” are also fundamentally different.

The key difference here is in the nature of the underlying good that makes up the commons: in traditional cases the good is some physical resource — seas, rivers, land — to which usage is shared (either de facto or de jure), while in the knowledge case, well, it’s knowledge!

Now physical resources are by their nature ‘rival’ (or ‘subtractable’ as the authors put it), that is your usage and my usage are substitutes — your usage reduces the amount available for me to use and, when we are close to capacity, is strictly rival — either I use it or you use it. Knowledge, however, is a classic example of a non-rival resource: when you learn something from me I’ve lost nothing but you’ve gained something.

This means, for example, that the classic ‘tragedy’ of the commons where overuse leads to destruction of the resource is simply not possible for a knowledge commons. In fact, knowledge is like some magical food from a fairytale where the more it’s used the more of it there is!

The more useful ‘commons’ analogy for knowledge is not in relation to use but to production and the ‘free-rider’ problems that can arise where something must be done by a team or community. The issue here is that a separation appears between your effort (private) and the resulting outcome (shared), which may lead to an under-supply of effort and ‘free-riding’ on the efforts of others (if there are ten people on guard duty late at night, one can probably take a nap without endangering the city, but if all ten of them do it then it could be disastrous).

Postscript

1. Before any misunderstanding arises I should make clear that the authors also acknowledge the role of the rival/non-rival distinction (Ostrom, in fact, was one of the ‘coiners’ of the term rivalry). However, the article’s overall focus is on the analogies with the traditional commons.

2. Jamie Boyle has talked about the “second enclosure movement”. Though it is interesting to make this analogy, I think the reference to the original enclosure movement is unfortunate for two reasons. First, it reinforces the mistaken analogy between knowledge and physical goods. Second, the evidence that the original enclosure movement was bad isn’t very compelling (in fact, it probably delivered net benefits).

2009 Open Knowledge Conference (OKCon) This Saturday

The Open Knowledge Foundation’s 2009 Open Knowledge Conference (OKCon), which I help organize, will take place next Saturday 28th March – less than a week away.

Full details including programme can be found either in this blog post or on the OKCon home page.

As usual this will be a fun and informal day so if you’re free this Saturday and interested in “Open” stuff come along to UCL and take part.

I should also add that the two days before (Thursday and Friday) see the 5th COMMUNIA Workshop on Accessing, Using, Reusing Public Sector Content and Data. It is being co-organized by the Open Knowledge Foundation together with the London School of Economics and takes place at LSE (all thanks to the tireless work of Jonathan Gray and Prodromos Tsiavos!).

New Open Access Journals from the Econometric Society

As a member of the Econometric Society I received the following announcement yesterday:

The Council and the Fellowship of the Econometric Society have both voted in favor of a plan for the Society to publish two open-access journals: Quantitative Economics (QE) and Theoretical Economics (TE). All voting Council members were in favor of the proposal. Among the active Fellows, 277 (66.4% of the total) cast their ballots, with 240 votes (86.6%) in favor, 30 (10.8%) against, and 7 (2.5%) abstentions. An announcement together with a description of the new journals may be found in http://www.econometricsociety.org/news1.asp?ref=81 .

QE will be started from scratch and its first issue is planned for 2010. TE has been published by the Society for Economic Theory (http://econtheory.org/ ), but is to be adopted by the Econometric Society later this year. The first issue in 2010 will be the first one as a Society journal.

This is great news.

Recent Work on Open Economics

Over the Christmas break I had a chance to make some substantial improvements/additions to our Open Economics project, including:

  1. Improved javascript graphing.
  2. Extended Millennium Development Goals package and added web interface.
  3. First efforts at ‘Where Does My Money Go’

More details on each of these can be found below. Also we’d be delighted to hear from anyone interested in getting involved in this, especially with the last item, so if interested do get in touch.

1. Updated javascript graphing package to use flot.

This also allows us to use javascript to make the graphing stuff more interactive, in particular to select the chart type and the series to plot. See e.g. the data on Daily Wages of Thatchers in the Middle Ages or Wheat, barley, oat, mutton and wool prices, and agricultural wages, 1500-1849.

2. Improved Millennium Development Goals package/dataset and added a web interface.

Extended ‘packagization’ of the MDG data by creating a mini-domain model and an associated sql version of the data in addition to the existing csv normalized-tabular version:

http://knowledgeforge.net/econ/svn/trunk/econdata/mdg/db.py
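As a rough illustration of the kind of query a sql version of the data makes easy, here is a minimal sketch using Python's built-in sqlite3 module. The table name `observation`, its columns, and the helper names are all assumptions made up for this example (they are not taken from db.py), and the rows are invented placeholders rather than real MDG figures. The series id 559 and the numeric country codes are borrowed from the example link further down.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        "series_id INTEGER, country_code INTEGER, year INTEGER, value REAL)"
    )
    rows = [
        (559, 4, 1997, 48.0),    # placeholder value, in range
        (559, 356, 1999, 47.0),  # placeholder value, in range
        (559, 840, 1990, 1.0),   # placeholder value, year outside 1995-2005
        (600, 156, 2002, 10.0),  # placeholder value, second series
        (700, 840, 2000, 5.0),   # placeholder value, series not requested below
    ]
    conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)
    return conn


def countries_with_data(conn, series_ids, start_year, end_year):
    """Country codes with at least one value for any of the series in the year range."""
    placeholders = ",".join("?" for _ in series_ids)
    sql = (
        "SELECT DISTINCT country_code FROM observation "
        f"WHERE series_id IN ({placeholders}) "
        "AND year BETWEEN ? AND ? "
        "AND value IS NOT NULL "
        "ORDER BY country_code"
    )
    params = list(series_ids) + [start_year, end_year]
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    demo = make_demo_db()
    print(countries_with_data(demo, [559, 600, 601], 1995, 2005))
```

With the csv version the same question would mean loading and filtering the whole file by hand; with a sql version it is a single SELECT.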

This is much more convenient for analysis (e.g. finding all countries which have at least one entry for any of these 3 series between 1995 and 2005 …). It is also essential for:

New web interface for Millennium Development Goals

Using the sql version of the data it was easy to build a quick-and-dirty web interface that enables one to browse and view the data quickly:

http://www.openeconomics.net/mdg/

For example here’s a chart and data showing “Children under 5 moderately or severely underweight, percentage” for Afghanistan, China, India, United States:

http://www.openeconomics.net/mdg/view?commit=Show+Values&series=559&countries=4&countries=156&countries=356&countries=840
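Links like the one above can also be put together programmatically. The sketch below uses Python's standard urllib.parse.urlencode. The parameter names (`commit`, `series`, `countries`) are simply read off the example link, and the function name `build_view_url` is made up for illustration, so treat this as a guess at how the query string is structured rather than a documented interface.

```python
from urllib.parse import urlencode

# Base address taken from the example link; everything after "?" is built below.
BASE_URL = "http://www.openeconomics.net/mdg/view"


def build_view_url(series_id, country_codes):
    """Build a view link for one series and several countries (hypothetical helper)."""
    # A list of pairs keeps the parameter order and allows "countries" to repeat.
    params = [("commit", "Show Values"), ("series", series_id)]
    params += [("countries", code) for code in country_codes]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Series 559 with the four country codes used in the link above.
    print(build_view_url(559, [4, 156, 356, 840]))
```

Passing a list of pairs rather than a dict is what lets the `countries` key appear once per country.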

3. First efforts at ‘Where Does My Money Go’

There are two parts to this project: (a) getting the data on government revenue/expenditure and (b) displaying it nicely in a web interface.

Part (a) is encapsulated in a new ukgovfinances dataset:

http://knowledgeforge.net/econ/svn/trunk/econdata/ukgovfinances/

Using this data we have made a (small) start on the web interface:

http://www.openeconomics.net/wdmmg/