All of our lock-in fears prove justified – Twitter

Having acquired Gnip, Twitter is cutting off bulk access (the “firehose”) for everyone else – see e.g. the Datasift announcement and the piece on Re/code.

Twitter have also been gradually shutting off / increasing control of access over the last few years. E.g. RSS shut down, then they changed API terms of use and got increasingly aggressive about that use.

It was always likely what the direction of travel would be for these “free” services – after all, somehow they’ve got to make money whilst providing “web-scale” service. But there’s nothing like an existence proof to give a distant predictable reality an immediacy that justifies action.

Of course, the tough thing is that the very reason we all use Facebook or Twitter or even Google is the immense direct and indirect network effects. That’s what makes it so hard for us individually to do much. However, as the need to monetise and protect their monopolies grows, I think we are nearing the tipping point where we get some interesting innovation and disruption.

For a good review see the following, whose final paragraphs I especially like:

Twitter’s story in many respects makes me think of Google: both companies started out benefiting greatly from openness and the power of both connecting users to what they were interested in and opening up powerful APIs to developers. The monetization model is even similar: note the AdSense reference above. Over time, though, Google has pulled more and more of its utility onto its own pages (and the revenue balance in the company has followed), just as Twitter focused on its own apps, and now Google is even starting to eat its best customers like travel websites and insurance agents (members-only), just like Twitter ate Datasift.

Frankly, the arc of both companies is simultaneously understandable and saddening to me. I’ve loved them both for the ways they have connected me to truly new ideas and new people, and it’s frustrating to see the growth imperative push both companies to turn increasingly inwards. One does wonder if they might find salvation in each other.

Grey dawn, you welcome not my spirit to the day

Grey dawn, you welcome not my spirit to the day.
Locked deep in winter’s embrace, the depths of January
Are moribund of hope, and I can but think on Spring
To keep from despair and an endless sojourn in the soft arms of sleep.

The day does not begin but seeps in, in sluggish batches from the East.
The watery light of a half-begotten sun
Has barely strength enough to banish night and makes us only think
Ever of indoors, indoors!

Why weighs my spirit so this season’s lack?
There is good to take in it I’m sure, yet here,
Stood here, this Janus’d morn, with heaven swathed in grey
I cannot find it, and must survive with heavy heart
             these bleak mid-winter days.

Enlightened [TV Series]

I have nearly finished the first series of Enlightened, a TV Series created by Laura Dern and Mike White. The series is extraordinary – even in a world where TV series have become over the last ten years a leading entertainment and art form.

It is not an easy or “fun” series, which probably accounts for its cancellation after just two seasons – I’m sort of amazed it got made in the first place – I imagine Laura Dern had something to do with it.

In fact, it is often profoundly sad – and darkly funny – as we watch the small tragedies and ironies that attend upon Amy (Laura Dern) and those around her. Amy herself is a great tragi-comic creation who remains all too human and un-enlightened despite her initial “enlightenment” at the meditation retreat at the start of episode one.

The best way to describe the series is to imagine it is what Raymond Carver might have produced had he switched from writing short story miniatures of the small desolations and tragedies of suburban America and made TV instead.

Wanted – Data Curators to Maintain Key Datasets in High-Quality, Easy-to-Use and Open Form

Wanted: volunteers to join a team of “Data Curators” maintaining “core” datasets (like GDP or ISO-codes) in high-quality, easy-to-use and open form.

  • What is the project about: Collecting and maintaining important and commonly-used (“core”) datasets in high-quality, standardized and easy-to-use form - in particular, as up-to-date, well-structured Data Packages.
    The “Core Datasets” effort is part of the broader Frictionless Data initiative.
  • What would you be doing: identifying and locating core (public) datasets, cleaning and standardizing the data and making sure the results are kept up to date and easy to use
  • Who can participate: anyone can contribute. Details on the skills needed are below.
  • Get involved: read more below or jump straight to the sign-up section.

What is the Core Datasets effort?

Summary: Collect and maintain important and commonly-used (“core”) datasets in high-quality, reliable and easy-to-use form (as Data Packages).

Core = important and commonly-used datasets e.g. reference data (country codes) and indicators (inflation, GDP)

Curate = take existing data and provide it in high-quality, reliable, and easy-to-use form (standardized, structured, open)
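As a rough illustration of what “providing data as a Data Package” involves, here is a minimal sketch in Python that builds a descriptor (the content of a datapackage.json file) for a single CSV file. The dataset name, file path and field names are hypothetical, and the descriptor shows only a few common properties; the Data Package specification is the authority on what is required.

```python
import json


def make_descriptor(name, title, csv_path, fields):
    """Build a minimal Tabular Data Package descriptor as a plain dict.

    `fields` is a list of (column name, type) pairs describing the CSV columns.
    """
    return {
        "name": name,
        "title": title,
        "resources": [
            {
                "path": csv_path,
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


# Hypothetical example values, for illustration only.
descriptor = make_descriptor(
    "gdp-example",
    "Example GDP data",
    "data/gdp.csv",
    [("country_code", "string"), ("year", "integer"), ("value", "number")],
)

# This string is what would be saved as datapackage.json next to the data.
datapackage_json = json.dumps(descriptor, indent=2)
print(datapackage_json)
```

The point of the sketch is that the metadata lives alongside the data in a small, machine-readable file, which is what makes the curated result easy to reuse.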

What Roles and Skills are Needed

We need a variety of roles from identifying new “core” datasets to packaging the data to performing quality control (checking metadata etc).

Core Skills - at least one of these skills will be needed:

  • Data Wrangling Experience. Many of our source datasets are not complex (just an Excel file or similar) and can be “wrangled” in a Spreadsheet program. What we therefore recommend is at least one of:
    • Experience with a Spreadsheet application such as Excel or (preferably) Google Docs including use of formulas and (desirably) macros (you should at least know how you could quickly convert a cell containing ‘2014’ to ‘2014-01-01’ across 1000 rows)
    • Coding for data processing (especially scraping) in one or more of python, javascript, bash
  • Data sleuthing - the ability to dig up data on the web (specific desirable skills: you know how to search by filetype in google, you know where the developer tools are in chrome or firefox, you know how to find the URL a form posts to)
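For those who prefer code to spreadsheets, the ‘2014’ to ‘2014-01-01’ conversion mentioned above could be sketched as follows. This is one possible approach using only the Python standard library; the column name and sample data are made up for the example.

```python
import csv
import io


def year_to_date(value):
    """Turn a bare four-digit year such as '2014' into '2014-01-01'.

    Anything that is not exactly four digits is returned unchanged.
    """
    text = value.strip()
    if len(text) == 4 and text.isdigit():
        return f"{text}-01-01"
    return value


def convert_column(csv_text, column):
    """Apply year_to_date to one column of a CSV document held in a string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = year_to_date(row[column])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data; a real file could have 1000 rows or more.
sample = "year,value\n2013,1.5\n2014,2.5\n"
converted = convert_column(sample, "year")
print(converted)
```

The same loop works unchanged whether the file has ten rows or a hundred thousand, which is the main advantage over editing cells by hand.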

Desirable Skills (the more the better!):

  • Data vs Metadata: know the difference between data and metadata
  • Familiarity with Git (and Github)
  • Familiarity with a command line (preferably bash)
  • Know what JSON is
  • Mac or Unix is your default operating system (will make access to relevant tools that much easier)
  • Knowledge of Web APIs and/or HTML
  • Use of curl or similar command line tool for accessing Web APIs or web pages
  • Scraping using a command line tool or (even better) by coding yourself
  • Know what a Data Package and a Tabular Data Package are
  • Know what a text editor is (e.g. notepad, textmate, vim, emacs, …) and know how to use it (useful for both working with data and for editing Data Package metadata)

Get Involved - Sign Up Now!

We are looking for volunteer contributors to form a “curation team”.

  • Time commitment: Members of the team commit to at least 8-16h per month (though this will be an average - if you are especially busy with other things one month and do less that is fine)
  • Schedule: There is no schedule so you can contribute at any time that is good for you - evenings, weekends, lunch-times etc
  • Location: all activity will be carried out online so you can be based anywhere in the world
  • Skills: see above

To register your interest fill in the following form. Any questions, please get in touch directly.

Want to Dive Straight In?

Can’t wait to get started as a Data Curator? You can dive straight in and start packaging the already-selected (but not packaged) core datasets. Full instructions here:

Thank You to Our Outgoing CEO

This is a joint blog post by Open Knowledge CEO Laura James and Open Knowledge Founder and President Rufus Pollock.

In September we announced that Laura James, our CEO, is moving on from Open Knowledge and we are hiring a new Executive Director.

From Rufus: I want to express my deep appreciation for everything that Laura has done. She has made an immense contribution to Open Knowledge over the last 3 years and has been central to all we have achieved. As a leader, she has helped take us through a period of incredible growth and change and I wish her every success in her future endeavours. I am delighted that Laura will be continuing to advise and support Open Knowledge, including joining our Advisory Council. I am deeply thankful for everything she has done to support both Open Knowledge and me personally during her time with us.

From Laura: It’s been an honour and a pleasure to work with and support Open Knowledge, and to have the opportunity to work with so many brilliant people and amazing projects around the world. It’s bittersweet to be moving on from such a wonderful organisation, but I know that I am leaving it in great hands, with a smart and dedicated management team and a new leader joining shortly. Open Knowledge will continue to develop and thrive as the catalyst at the heart of the global movement around freeing data and information, ensuring knowledge creates power for the many, not the few.

Amazon Twitch Acquisition – Paying 70x Sales

Just an aside from reading the recent Amazon 10-Q. In Note 4 on acquisitions they state:

On September 25, 2014, we acquired Twitch Interactive, Inc. (“Twitch”) for approximately $842 million in cash, as adjusted for the assumption of options and other items. During the nine months ended September 30, 2014, we acquired certain other companies for an aggregate purchase price of $20 million. Acquisition activity for the nine months ended September 30, 2013 was not material. We acquired Twitch because of its community and the live streaming experience it provides. The primary reasons for our other 2014 acquisitions were to acquire technologies and know-how to enable Amazon to serve customers more effectively.

and then in the pro-forma they add:

The acquired companies were consolidated into our financial statements starting on their respective acquisition dates. The aggregate net sales and operating loss of the companies acquired was $12 million and $3 million for the nine months ended September 30, 2014.

This means that Amazon acquired Twitch for approximately 70x sales ($842 million against $12 million)! (The earnings multiple is negative since, it would appear, Twitch was losing money.)
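The arithmetic behind that multiple is simple enough to check directly. The sketch below uses the figures quoted from the 10-Q and, like the text above, assumes the $12 million of aggregate net sales and $3 million operating loss are attributable essentially to Twitch; the variable names are my own.

```python
# Figures in millions of US dollars, as quoted from the 10-Q.
purchase_price = 842   # cash paid for Twitch
net_sales = 12         # aggregate net sales of acquired companies (nine months)
operating_loss = 3     # aggregate operating loss of acquired companies (nine months)

# Price-to-sales multiple.
price_to_sales = purchase_price / net_sales

# A loss is negative income, so the price-to-operating-income multiple is negative.
price_to_operating_income = purchase_price / -operating_loss

print(f"Price / sales: {price_to_sales:.1f}x")
print(f"Price / operating income: {price_to_operating_income:.1f}x")
```

Note that the sales figure covers nine months and all 2014 acquisitions together, so the multiple is a rough indication rather than a precise valuation metric.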

A Data Revolution that Works for All of Us

Many of today’s global challenges are not new. Economic inequality, the unfettered power of corporations and markets, the need to cooperate to address global problems and the unsatisfactory levels of accountability in democratic governance – these were as much problems a century ago as they remain today.

What has changed, however – and most markedly – is the role that new forms of information and information technology could potentially play in responding to these challenges.

What’s going on?

The incredible advances in digital technology mean we have an unprecedented ability to create, share and access information. Furthermore, these technologies are increasingly not just the preserve of the rich, but are available to everyone – including the world’s poorest. As a result, we are living in a (veritable) data revolution – never before has so much data – public and personal – been collected, analysed and shared.

However, the benefits of this revolution are far from being shared equally.

On the one hand, some governments and corporations are already using this data to greatly increase their ability to understand – and shape – the world around them. Others, however, including much of civil society, lack the necessary access and capabilities to truly take advantage of this opportunity. Faced with this information inequality, what can we do?

How can we enable people to hold governments and corporations to account for the decisions they make, the money they spend and the contracts they sign? How can we unleash the potential for this information to be used for good – from accelerating research to tackling climate change? And, finally, how can we make sure that personal data collected by governments and corporations is used to empower rather than exploit us?

So how should we respond?

Fundamentally, we need to make sure that the data revolution works for all of us. We believe that key to achieving this is to put “open” at the heart of the digital age. We need an open data revolution.

We must ensure that essential public-interest data is open, freely available to everyone. Conversely, we must ensure that data about me – whether collected by governments, corporations or others – is controlled by and accessible to me. And finally, we have to empower individuals and communities – especially the most disadvantaged – with the capabilities to turn data into the knowledge and insight that can drive the change they seek.

In this rapidly changing information age – where the rules of the game are still up for grabs – we must be active, seizing the opportunities we have, if we are to ensure that the knowledge society we create is an open knowledge society, benefiting the many not the few, built on principles of collaboration not control, sharing not monopoly, and empowerment not exploitation.

Announcing a Leadership Update at Open Knowledge

Today I would like to share some important organisational news. After 3 years with Open Knowledge, Laura James, our CEO, has decided to move on to new challenges. As a result of this change we will be seeking to recruit a new senior executive to lead Open Knowledge as it continues to evolve and grow.

As many of you know, Laura James joined us to support the organisation as we scaled up, and stepped up to the CEO role in 2013. It has always been her intention to return to her roots in engineering at an appropriate juncture, and we have been fortunate to have had Laura with us for so long – she will be sorely missed.

Laura has made an immense contribution and we have been privileged to have her on board – I’d like to extend my deep personal thanks to her for all she has done. Laura has played a central role in our evolution as we’ve grown from a team of half-a-dozen to more than forty. Thanks to her commitment and skill we’ve navigated many of the tough challenges that accompany “growing-up” as an organisation.

There will be no change in my role (as President and founder) and I will be here both to continue to help lead the organisation and to work closely with the new appointment going forward. Laura will remain in post, continuing to manage and lead the organisation, assisting with the recruitment and bringing the new senior executive on board.

For a decade, Open Knowledge has been a leader in its field, working at the forefront of efforts to open up information around the world and see it used to empower citizens and organisations to drive change. Both the community and original non-profit have grown – and continue to grow – very rapidly, and the space in which we work continues to develop at an incredible pace with many exciting new opportunities and activities.

We have a fantastic future ahead of us and I’m very excited as we prepare Open Knowledge to make its next decade even more successful than its first.

We will keep everyone informed in the coming weeks as our plans develop, and there will also be opportunities for the Open Knowledge community to discuss. In the meantime, please don’t hesitate to get in touch with me if you have any questions.

A Data API for Data Packages in Seconds Using CKAN and its DataStore

dpm, the command-line ‘data package manager’, now supports pushing (Tabular) Data Packages straight into a CKAN instance (including pushing all the data into the CKAN DataStore):

dpm ckan {ckan-instance-url}

This allows you, in seconds, to get a fully-featured web data API – including JSON and SQL-based query APIs:

dpm ckan demo

Once you have a nice web data API like this you can very easily create data-driven applications and visualizations. As a simple demonstration, there’s the CKAN Data Explorer (example with IMF data - see below).
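To give a feel for what the JSON and SQL query APIs look like from the client side, here is a small sketch that builds request URLs for CKAN’s datastore_search and datastore_search_sql actions. The CKAN URL and resource id are placeholders rather than a real instance, and the sketch only constructs the URLs; fetching them (e.g. with curl or urllib.request) is left out so that it runs without network access.

```python
from urllib.parse import urlencode


def datastore_search_url(ckan_url, resource_id, limit=5):
    """URL for the JSON query API: fetch the first `limit` rows of a resource."""
    query = urlencode({"resource_id": resource_id, "limit": limit})
    return f"{ckan_url.rstrip('/')}/api/3/action/datastore_search?{query}"


def datastore_sql_url(ckan_url, sql):
    """URL for the SQL query API: run a read-only SQL statement."""
    query = urlencode({"sql": sql})
    return f"{ckan_url.rstrip('/')}/api/3/action/datastore_search_sql?{query}"


# Placeholder values (assumptions): substitute your own instance and resource id.
CKAN_URL = "https://ckan.example.org"
RESOURCE_ID = "00000000-0000-0000-0000-000000000000"

search_url = datastore_search_url(CKAN_URL, RESOURCE_ID, limit=5)
# In the SQL API the resource id is used as the (double-quoted) table name.
sql_url = datastore_sql_url(CKAN_URL, f'SELECT * FROM "{RESOURCE_ID}" LIMIT 5')

print(search_url)
print(sql_url)
```

Both endpoints return JSON, so the results can be fed directly into a charting library or a small web application.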

Where Can I Find a CKAN instance to Upload to?

If you’re looking for a CKAN site to upload your Data Packages to, we recommend the DataHub, which is community-run and free. To upload to the DataHub you’ll want to:

  1. Configure the DataHub CKAN instance in your .dpmrc

    url =
    apikey = your-api-key
  2. Upload your Data Package

    dpm ckan datahub --owner_org=your-organization

    You have to set the owner organization as all datasets on the DataHub need an owner organization.

One I Did Earlier

Here’s a live example of one “I did earlier”:

Context: a big motivation (personally) for doing this is that I’d like to see a nice web data API available for the “Core” Data Packages we’re creating as part of the Frictionless Data effort. If you’re interested in helping, get in touch.

Books Recently Read

Read recently (i.e. last couple of months)

  • The Silent Woman: Sylvia Plath and Ted Hughes by Janet Malcolm. (Feb 2014). Brilliant, thought-provoking and insightful. A fascinating narrative, raising interesting and complex questions and all delivered in lapidary prose.
  • The Unwinding by George Packer (May 2014). I picked this up by accident in a bookstore in DC Union Station. It hit me like a bolt between the eyes. The personal stories read like novels, the vignettes on famous personalities are rapiers that cut open the follies and predilections of the age, and the overall sweep is poignant, powerful and profound. An extraordinary, special achievement. Read this book.
  • This Time is Different – Eight Centuries of Financial Folly by Reinhart and Rogoff. (May 2014?) Quite interesting. Felt more like an extended series of papers than a book. Quite a lot of good data.
    • 3 types of crises: External debt, domestic debt and banking.
    • Financial crises (default episodes) recur repeatedly but are often spaced substantially apart leading to “this time is different”
    • Takes a long time for a country to graduate to having a good borrowing rating and this counts for a lot.
    • Countries with a high rating don’t default (that much).
    • 2008 crisis was most severe since 30s in terms of amount of debt in default.
    • Generally see substantial increase in debt through a crisis and this is paid down via inflation
  • Lords of Finance. The Bankers who Broke the World by Liaquat Ahamed. (June 2014) Much more enjoyable than I had imagined. More of a narrative history of the period with lots of great anecdotes and excellent analysis of what went wrong. Much better (in many ways) than “This Time is Different”.
  • The Assassins’ Gate – America in Iraq by George Packer. (July 2014) Brilliant – almost as good as “The Unwinding”.
  • The Master Switch – The Rise and Fall of Information Empires by Tim Wu. (July 2014). Pretty good though ultimately felt a bit disappointed – the earlier sections felt a lot more detailed and solid than the latter.
  • Under the Skin by Michel Faber. (July 2014) Disturbing allegory that is surprisingly powerful. Very well executed and more powerful for its (in)humanity.
  • The Crimson Petal and the White by Michel Faber. (July 2014). Engrossing but not that substantive. His misanthropy, which infects the start, gradually mellows. The ending is quite dissatisfying, as if he simply ran out of steam rather than reached a necessary conclusion.