Category Archives: Code

Shuttleworth Fellowship Bi-Annual Review

As part of my Shuttleworth Fellowship I’m preparing bi-annual reviews of what I, and projects I’m involved in, have been up to. So, herewith are some highlights from the last 6 months.

CKAN and theDataHub

OpenSpending

  • Two major point releases of the OpenSpending software, v0.10 and v0.11 (v0.11 just last week!). The system has matured and developed hugely, and the backend architecture is now finalized after a major refactor and reworking.
  • Community has grown significantly, with almost 50 OpenSpending datasets now on theDataHub.org and a growing group of core “data wranglers”
  • Spending Stories was a winner of the Knight News Challenge. Spending Stories will build on and extend OpenSpending.

Open Bibliography and the Public Domain

Open Knowledge Foundation and the Community

  • In September we received a 3 year grant from the Omidyar Network to help the Open Knowledge Foundation sustain and expand its community especially in the formation of new chapters
  • Completed a major recruitment process (Summer-Autumn 2011) to bring on more paid OKFN team members including community coordinators, a foundation coordinator and developers
  • The Foundation participated in the launch of the Open Government Partnership and the CSO events surrounding the meeting
  • Working groups continuing to develop. Too much activity to summarize it all here but some highlights include:
    • WG Science Coordinator Jenny Molloy travelling to OSS2011 in SF to present Open Research Reports with Peter Murray-Rust
    • Open Economics WG developing an Open Knowledge Index in August
    • Open Bibliography working group’s work on a Metadata guide.
    • Open Humanities / Open Literature working group winning Inventare Il Futuro competition with their idea to use the Annotator
  • Development of new Local Groups and Chapters
    • Lots of ongoing activities in existing local groups and chapters such as those in Germany and Italy
    • In addition, interest from a variety of areas in the establishment of new chapters and local groups, for example in Brazil and Belgium
  • Start of work on OKFN labs

Meetups and Events

Talks and Events

  • Attended Open Government Partnership meeting in July in Washington DC and launch event in New York in September
  • Attended Chaos Computer Camp with other OKFNers in August near Berlin
  • September: Spoke at PICNIC in Amsterdam
  • October: Code for America Summit in San Francisco (plus meetings) – see partial writeup
  • October: Open Government Data Camp in Warsaw (organized by Open Knowledge Foundation)
  • November: South Africa – see this post on Africa@Home and Open Knowledge meetup in Cape Town

General

Tabular Data Formats

As part of recent work on the DataExplorer I’ve been looking into formats / schemas for tabular data and have just posted this info on the wiki:

http://wiki.ckan.org/Data_Formats#Formats_-_Tabular

The list is quite short and if anyone out there has useful links or comments I’d love to know more (as one example, I hear very positive things about R and its data frames but have not yet tracked down a really good overview of their interface or how they are designed).

Background: why are we looking at this? The immediate reason is that we want to define a lightweight intermediate format for DataExplorer (and possibly the Webstore) into which one can convert incoming data from different sources (e.g. Webstore, Google docs, OData etc) before exporting to the formats needed by the display widgets (such as SlickGrid, flot, d3 etc).
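To make the idea concrete, here is a minimal Python sketch of what such an intermediate format might look like. It is only an illustration: the `fields` / `records` layout, the function names and the sample values are all my own assumptions, not the format DataExplorer actually settled on.

```python
import csv
import io


def csv_to_table(text):
    """Convert CSV text into a simple {fields, records} structure (assumed layout)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {"fields": [], "records": []}
    header, body = rows[0], rows[1:]
    fields = [{"id": name} for name in header]
    records = [dict(zip(header, row)) for row in body]
    return {"fields": fields, "records": records}


def table_to_series(table, x, y):
    """Export two columns as [x, y] pairs, the kind of shape a plotting widget might want."""
    return [[float(rec[x]), float(rec[y])] for rec in table["records"]]


# Placeholder data, purely illustrative
table = csv_to_table("x,y\n1,2\n3,4\n")
series = table_to_series(table, "x", "y")
```

The point of the sketch is that every source gets one importer into the shared structure and every widget gets one exporter out of it, rather than a converter per source/widget pair.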

Javascript Templating and Frameworks

Ongoing and incomplete review of javascript templating systems and frameworks.

Templating

Unobtrusive (HTML + JSON)

‘Standard’ Templating Browser

Listings

Testing

  • nodeunit
  • qunit
  • jasmine
  • sinon.js (mocking) – integrates with qunit well

Frameworks

Client-side

  • backbone – used quite a bit
  • knockout
  • (big) sproutcore

Node

  • express
    • tags: nodejs
  • backbone now supported pretty well

Messaging and Job Queues

HTML5

ORMs

  • For mongo: http://mongoosejs.com/
  • Backbone sort of includes one (though relationships are poorly handled at the moment)

hg-git and pushing to git from mercurial

Documenting my experience pushing mercurial repos to git (and github specifically).

Install hg-git

Follow https://bitbucket.org/durin42/hg-git/src/tip/README.md

Install dulwich >= 0.6. On ubuntu:

sudo apt-get install python-dulwich

Get the latest version of hg-git:

hg clone https://bitbucket.org/durin42/hg-git

Add it to your extensions

[extensions]
git = path/to/hg-git/hggit

Push an existing mercurial repo

Assuming you’ve got a git repo somewhere, e.g. for me (rgrp) on github:

 cd my-current-mercurial-repo
 hg push git+ssh://git@github.com/rgrp/myrepo

Really important note: do not change git before the @ sign to your username as you would in mercurial but leave it as ‘git’. This cost me around 20 minutes of googling, with errors like:

Permission denied (publickey).
abort: the remote end hung up unexpectedly

You may also want to check that your ssh setup with github really is working (see http://help.github.com/troubleshooting-ssh/).
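If you script your pushes, a tiny helper can guard against the username mistake described above. This is just a sketch: the function names are made up for illustration and are not part of hg-git.

```python
def github_push_url(owner, repo):
    """Build an hg-git push URL for a GitHub repo.

    The ssh user is always the literal 'git', never the account name.
    """
    return "git+ssh://git@github.com/{}/{}".format(owner, repo)


def has_wrong_ssh_user(url):
    """Return True if a git+ssh URL uses an ssh user other than 'git'."""
    prefix = "git+ssh://"
    if not url.startswith(prefix) or "@" not in url:
        return False
    user = url[len(prefix):].split("@", 1)[0]
    return user != "git"


good = github_push_url("rgrp", "myrepo")
bad = "git+ssh://rgrp@github.com/rgrp/myrepo"
```

Here `has_wrong_ssh_user(bad)` is true while `has_wrong_ssh_user(good)` is false, so the check could run before calling `hg push`.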

Datapkg 0.8 Released

A new release (v0.8) of datapkg, the tool for distributing, discovering and installing data is out!

There’s a quick getting started section below (also see the docs).

About the release

This release brings substantial improvements to the download functionality of datapkg including support for extending the download system via plugins. The full changelog below has more details and here’s an example of the new download system being used to download material selectively from the COFOG package on CKAN.

# download metadata and all resources from cofog package to current directory
# Resources to retrieve will be selected interactively
datapkg download ckan://cofog .

# download all resources
# Note: you need to quote the *
datapkg download ckan://name path-on-disk "*"

# download only those resources that have format 'csv' (or 'CSV')
datapkg download ckan://name path-on-disk csv

For more details see the documentation of the download command:

datapkg help download
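For the curious, the kind of glob-style, case-insensitive resource filtering described above can be sketched in a few lines of Python. This is not datapkg's actual code: the `select_resources` function, the dictionary layout and the example URLs are assumptions made purely for illustration.

```python
from fnmatch import fnmatch


def select_resources(resources, pattern):
    """Return resources whose 'format' matches a glob pattern, ignoring case."""
    wanted = pattern.lower()
    return [r for r in resources if fnmatch(r.get("format", "").lower(), wanted)]


# Hypothetical resource list; the URLs are placeholders
resources = [
    {"url": "http://example.org/a.csv", "format": "CSV"},
    {"url": "http://example.org/b.xls", "format": "xls"},
    {"url": "http://example.org/c.csv", "format": "csv"},
]
csv_only = select_resources(resources, "csv")
everything = select_resources(resources, "*")
```

Lower-casing both sides is what makes 'csv' also match 'CSV', and the `*` pattern selects every resource.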

Get started fast

# 1. Install: (requires python and easy_install)
$ easy_install datapkg
# Or, if you don't like easy_install
$ pip install datapkg
# (or even install from the raw source!)

# 2. [optional] Take a look at the manual
$ datapkg man

# 3. Search for something
$ datapkg search ckan:// gold
gold-prices -- Gold Prices in London 1950-2008 (Monthly)

# 4. Get some data
# This will result in a csv file at /tmp/gold-prices/data
$ datapkg download ckan://gold-prices /tmp

Find out more » — including how to create, register and distribute your own ‘data packages’.

Changelog

  • ResourceDownloader objects and plugin point (#964)
  • Refactor PackageDownloader to use ResourceDownloader and support Resource filtering
  • Retrieval options for package resources (#405). Support selection of resources to download (on command line or API) via glob style patterns or user interaction.

Introducing YourTopia – Development beyond GDP

The following is cross-posted from the Open Knowledge Foundation blog. It reports the results of the code-sprint announced in this previous blog post.

Today we’re announcing a simple new app (also submitted to the World Bank Apps competition) that allows anyone to say what kind of world, what ‘YourTopia’, they would like to live in:

http://yourtopia.net/

As well as having a very simple function, telling you which country is closest to your ideal, the app also has a very serious purpose: to help us develop a real empirical basis for the measures of development that are used to guide policy-making.

Is health more important than education, or GDP? Is the amount of R&D more important than the amount spent on primary education? Help us find out what the world thinks!

You can see the app in action in the following video, or head over directly to YourTopia and answer the 2-minute quiz.

More Information

Development Economics has for a long time recognised the deficiency of GDP as an indicator of human development but with little reception in policy-circles. Recently, however, the debate changed and no month passes now without a high-level report on “Development beyond GDP”.

OKFN’s new Open Economics Group has now constructed an application to test two solutions to primary problems in this debate, and it is participating in the World Bank’s competition “Applications for Development”.

Measures of human progress beyond GDP either use so-called dashboards of indicators (e.g. WDI) or composite indices (e.g. HDI or MPI). An openness-problem with the first approach has been that dashboards were so complex that the public was de facto excluded from the debate. The second approach tried to simplify through combining different dimensions into a single index but then suffered from arbitrary assumptions on the choice of weights applied to indices and choice of proxies for different development dimensions.

These are significant problems and so we’ve created YourTopia, the first application that produces a composite index of human development (OpenHDI) without arbitrary choices of indicator-weights and proxies.

We circumvent these problems simply: by letting the user participate. Rather than the researcher selecting proxies and indicator-weights we let the user choose. The resulting index of human progress is then personalised and contains no arbitrary assumptions by construction.

While the constructors of the HDI, for example, were always attacked for their assumption that human progress just depends on education, health and income and that these each carry the same importance, we now let the user decide which dimensions of progress are important and how they compare to each other.
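As a rough illustration of the idea (and not the code behind YourTopia), a user-weighted composite index can be sketched as a weighted average of normalised indicator scores. The country names, scores and weights below are invented placeholders, and the ranking step is just one simple way of reading "closest to your ideal".

```python
def composite_index(scores, weights):
    """Weighted average of normalised indicator scores (each assumed to be in 0..1)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(scores[name] * w for name, w in weights.items()) / total


def rank_countries(data, weights):
    """Order countries from highest to lowest personalised index."""
    return sorted(data, key=lambda c: composite_index(data[c], weights), reverse=True)


# Placeholder data: not real indicator values
data = {
    "Country A": {"health": 0.9, "education": 0.5, "income": 0.4},
    "Country B": {"health": 0.6, "education": 0.8, "income": 0.7},
}
health_first = rank_countries(data, {"health": 3, "education": 1, "income": 1})
equal_weights = rank_countries(data, {"health": 1, "education": 1, "income": 1})
```

With these placeholder numbers, a user who weights health heavily gets Country A on top, while equal weights (the HDI-style assumption) put Country B first. The same data gives a different answer depending on who is asked, which is exactly what we want to measure.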

Get Involved

We’d love to improve YourTopia in lots of ways and we need help with design, coding (python or javascript), and writing (from both an economist’s and a layman’s point of view!). For example, what does GNI in PPP terms mean to most people? We need translators from jargon to English!

If you’re interested in helping please either join the open-economics mailing list or just send a mail to info [at] okfn [dot] org.

Colophon

OpenHDI: Open Human Development Index

A few members of the Open Knowledge Foundation’s nascent open economics working group are having a code-sprint this Friday and Saturday to work on an app for the World Bank competition, currently called ‘Open HDI’ (Human Development Index).

The idea is to look at ‘development beyond GDP’ by collecting weightings on particular aspects of ‘development’ (health, education, GDP, inequality) from users and using that to build our own human development index.

We first talked about this a few months ago at the open economics online meetup. Dirk Heine and Guo Xu then put together an excellent demo version: http://eutopia.guoxu.org/ and now we’re working to take that to the status of a full app!

PyWordPress – Python Library for WordPress

Announcing pywordpress, a python interface to WordPress using the WordPress XML-RPC API.

Usage

Command line

Check out the commands:

wordpress.py -h 

You will need to create a config with the details (url, login) of the wordpress instance you want to work with:

cp config.ini.tmpl config.ini
# now edit away ...
vim config.ini

Python library

Read the code documentation:

>>> from pywordpress import WordPress
>>> help(WordPress)
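If you are wondering what talking to WordPress over XML-RPC involves underneath, here is a small sketch using only the Python 3 standard library. It is not pywordpress code: it just serialises a call to the standard `wp.getUsersBlogs` method, and the credentials are obvious placeholders.

```python
import xmlrpc.client


def build_request(username, password):
    """Serialise a wp.getUsersBlogs call into an XML-RPC request body."""
    return xmlrpc.client.dumps((username, password), methodname="wp.getUsersBlogs")


# Placeholder credentials, for illustration only
body = build_request("demo-user", "demo-password")

# Round-trip the payload to show what the server would receive
params, method = xmlrpc.client.loads(body)
```

To actually send a call like this you would point `xmlrpc.client.ServerProxy` at your site's `xmlrpc.php` endpoint. The sketch stops short of that so it runs without a WordPress instance.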

Datapkg 0.7 Released

A major new release (v0.7) of datapkg is out!

There’s a quick getting started section below (also see the docs).

About the release

This release brings major new functionality to datapkg especially in regard to its integration with CKAN. datapkg now supports uploading as well as downloading and can now be easily extended via plugins. See the full changelog below for more details.

Get started fast

# 1. Install: (requires python and easy_install)
$ easy_install datapkg
# Or, if you don't like easy_install
$ pip install datapkg
# (or even install from the raw source!)

# 2. [optional] Take a look at the manual
$ datapkg man

# 3. Search for something
$ datapkg search ckan:// gold
gold-prices -- Gold Prices in London 1950-2008 (Monthly)

# 4. Get some data
# This will result in a csv file at /tmp/gold-prices/data
$ datapkg download ckan://gold-prices /tmp

Find out more » — including how to create, register and distribute your own ‘data packages’.

Changelog

  • MAJOR: Support for uploading datapkgs (upload.py)
  • MAJOR: Much improved and extended documentation
  • MAJOR: New sqlite-based DB index giving support for a simple, central, ‘local’ index (ticket:360)
  • MAJOR: Make datapkg easily extendable

    • Support for adding new Index types with plugins
    • Support for adding new Commands with command plugins
    • Support for adding new Distributions with distribution plugins
  • Improved package download support (also now pluggable)

  • Reimplement url download using only python std lib (removing urlgrabber requirement and simplifying installation)
  • Improved spec: support for db type index + better documentation
  • Better configuration management (especially internally)
  • Reduce dependencies by removing usage of PasteScript and PasteDeploy
  • Various minor bugfixes and code improvements
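To give a flavour of what a simple, central, ‘local’ sqlite-based index like the one in the changelog could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and function names are my own assumptions for illustration and do not reflect datapkg's real schema or API.

```python
import sqlite3


def create_index(conn):
    """Create the (assumed) single-table index if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS package (name TEXT PRIMARY KEY, title TEXT)"
    )


def register(conn, name, title):
    """Add a package to the index, replacing any entry with the same name."""
    conn.execute(
        "INSERT OR REPLACE INTO package (name, title) VALUES (?, ?)", (name, title)
    )


def search(conn, query):
    """Return (name, title) rows whose name or title contains the query."""
    like = "%" + query + "%"
    rows = conn.execute(
        "SELECT name, title FROM package WHERE name LIKE ? OR title LIKE ? ORDER BY name",
        (like, like),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, just for the example
create_index(conn)
register(conn, "gold-prices", "Gold Prices in London 1950-2008 (Monthly)")
register(conn, "example", "Hypothetical example package")
hits = search(conn, "gold")
```

Using a file path instead of `:memory:` would give one shared index per machine, which is the sense of "central" and "local" here.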