Category Archives: Python

ANN: PyWordPress – Python WordPress Library using the WordPress XML-RPC API

Announcing pywordpress, a Python library that provides a pythonic interface to WordPress using the WordPress XML-RPC API.

Along with wrappers for the main API functions it also provides various helper methods, for example for creating many pages at once. This is something of a belated announcement, as the first version was written almost a year ago!

Usage

Command line

Check out the commands::

wordpress.py -h 

Commands::

create_many_pages: Create many pages at once (and only create pages which do not already exist).
delete_all_pages: Delete all pages (i.e. delete_page for each page in instance).
delete_page: http://codex.wordpress.org/XML-RPC_wp#wp.deletePage
edit_page: http://codex.wordpress.org/XML-RPC_wp#wp.editPage
get_authors: http://codex.wordpress.org/XML-RPC_wp#wp.getAuthors
get_categories: http://codex.wordpress.org/XML-RPC_wp#wp.getCategories
get_page: http://codex.wordpress.org/XML-RPC_wp#wp.getPage
get_page_list: http://codex.wordpress.org/XML-RPC_wp#wp.getPageList
get_pages: http://codex.wordpress.org/XML-RPC_wp#wp.getPages
get_tags: http://codex.wordpress.org/XML-RPC_wp#wp.getTags
init_from_config: Class method to initialize a `Wordpress` instance from an ini file.
new_page: http://codex.wordpress.org/XML-RPC_wp#wp.newPage

You will need to create a config with the details (url, login) of the WordPress instance you want to work with::

cp config.ini.tmpl config.ini
# now edit away ...
vim config.ini

Python library

Read the code documentation::

>>> from pywordpress import Wordpress
>>> help(Wordpress)
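
For instance, a minimal sketch of typical usage (assuming, per the method list above, that init_from_config takes the path to your ini file)::

>>> wp = Wordpress.init_from_config('config.ini')
>>> wp.get_page_list()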

License

MIT-licensed: http://www.opensource.org/licenses/mit-license.php

Datapkg 0.8 Released

A new release (v0.8) of datapkg, the tool for distributing, discovering and installing data, is out!

There’s a quick getting started section below (also see the docs).

About the release

This release brings substantial improvements to the download functionality of datapkg, including support for extending the download system via plugins. The full changelog below has more details, and here's an example of the new download system being used to download material selectively from the COFOG package on CKAN.

# download metadata and all resources from the cofog package to the current
# directory (resources to retrieve are selected interactively)
datapkg download ckan://cofog .

# download all resources (note the need to quote *)
datapkg download ckan://name path-on-disk "*"

# download only those resources that have format 'csv' (or 'CSV')
datapkg download ckan://name path-on-disk csv

For more details see the documentation of the download command:

datapkg help download

Get started fast

# 1. Install: (requires python and easy_install)
$ easy_install datapkg
# Or, if you don't like easy_install, use pip
$ pip install datapkg
# (or even install from the raw source!)

# 2. [optional] Take a look at the manual
$ datapkg man

# 3. Search for something
$ datapkg search ckan:// gold
gold-prices -- Gold Prices in London 1950-2008 (Monthly)

# 4. Get some data
# This will result in a csv file at /tmp/gold-prices/data
$ datapkg download ckan://gold-prices /tmp

Find out more » — including how to create, register and distribute your own ‘data packages’.

Changelog

  • ResourceDownloader objects and plugin point (#964)
  • Refactor PackageDownloader to use ResourceDownloader and support Resource filtering
  • Retrieval options for package resources (#405). Support selection of resources to download (on the command line or via the API) using glob-style patterns or user interaction.

Introducing YourTopia – Development beyond GDP

The following is cross-posted from the Open Knowledge Foundation blog. It reports the results of the code-sprint described in a previous blog post.

Today we’re announcing a simple new app (also submitted to World Bank Apps competition) that allows anyone to say what kind of world, what ‘YourTopia’, they would like to live in:

http://yourtopia.net/

As well as having a very simple function (to tell you which country is closest to your ideal), the app also has a very serious purpose: to help us develop a real empirical basis for the measures of development that are used to guide policy-making.

Is health more important than education or GDP? Is the amount spent on R&D more important than the amount spent on primary education? Help us find out what the world thinks!

You can see the app in action in the following video, or head over directly to YourTopia and answer the 2-minute quiz.

More Information

Development economics has long recognised the deficiency of GDP as an indicator of human development, but with little reception in policy circles. Recently, however, the debate has changed, and no month now passes without a high-level report on “Development beyond GDP”.

OKFN’s new Open Economics Group has constructed an application to test solutions to two primary problems in this debate, and it is participating in the World Bank’s competition “Applications for Development”.

Measures of human progress beyond GDP either use so-called dashboards of indicators (e.g. WDI) or composite indices (e.g. HDI or MPI). An openness problem with the first approach has been that dashboards are so complex that the public is de facto excluded from the debate. The second approach simplifies by combining different dimensions into a single index, but then suffers from arbitrary assumptions about the weights applied to the indices and the proxies chosen for the different development dimensions.

These are significant problems, and so we’ve created YourTopia, the first application that produces a composite index of human development (OpenHDI) without arbitrary choices of indicator weights and proxies.

We circumvent these problems simply: by letting the user participate. Rather than the researcher selecting proxies and indicator weights, we let the user choose. The resulting index of human progress is then personalised and, by construction, contains no arbitrary assumptions.
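
To make the idea concrete, here is an illustrative sketch (not YourTopia's actual scoring code; all numbers are made up) of a personalised composite index: a user-weighted average of normalised indicators.

  # Illustrative sketch: a personalised development index is a
  # user-weighted average of normalised indicators.
  def personal_index(indicators, weights):
      # indicators: dimension -> value in [0, 1] for one country
      # weights: user-chosen dimension -> non-negative weight
      total = sum(weights.values())
      return sum(weights[d] * indicators[d] for d in weights) / total

  # made-up example values
  country = {'health': 0.95, 'education': 0.99, 'income': 0.90}
  my_weights = {'health': 5, 'education': 3, 'income': 1}
  print(personal_index(country, my_weights))  # ~0.958

In spirit, the country closest to your ideal is then the one scoring highest under your own weights.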

While the constructors of the HDI, for example, were often attacked for their assumption that human progress depends only on education, health and income, each carrying the same importance, we now let the user decide which dimensions of progress are important and how they compare to each other.

Get Involved

We’d love to improve YourTopia in lots of ways, and we need help with design, coding (Python or JavaScript) and writing (from both an economist’s and a layman’s point of view): for example, what does GNI in PPP terms mean to most people? We need translators from jargon to English!

If you’re interested in helping, please either join the open-economics mailing list or just send a mail to info [at] okfn [dot] org.

PyWordPress – Python Library for WordPress

Announcing pywordpress, a Python interface to WordPress using the WordPress XML-RPC API.

Usage

Command line

Check out the commands::

wordpress.py -h 

You will need to create a config with the details (url, login) of the WordPress instance you want to work with::

cp config.ini.tmpl config.ini
# now edit away ...
vim config.ini

Python library

Read the code documentation::

>>> from pywordpress import Wordpress
>>> help(Wordpress)

Datapkg 0.7 Released

A major new release (v0.7) of datapkg is out!

There’s a quick getting started section below (also see the docs).

About the release

This release brings major new functionality to datapkg, especially in regard to its integration with CKAN. datapkg now supports uploading as well as downloading, and can be easily extended via plugins. See the full changelog below for more details.

Get started fast

# 1. Install: (requires python and easy_install)
$ easy_install datapkg
# Or, if you don't like easy_install, use pip
$ pip install datapkg
# (or even install from the raw source!)

# 2. [optional] Take a look at the manual
$ datapkg man

# 3. Search for something
$ datapkg search ckan:// gold
gold-prices -- Gold Prices in London 1950-2008 (Monthly)

# 4. Get some data
# This will result in a csv file at /tmp/gold-prices/data
$ datapkg download ckan://gold-prices /tmp

Find out more » — including how to create, register and distribute your own ‘data packages’.

Changelog

  • MAJOR: Support for uploading datapkgs (upload.py)
  • MAJOR: Much improved and extended documentation
  • MAJOR: New sqlite-based DB index giving support for a simple, central, ‘local’ index (ticket:360)
  • MAJOR: Make datapkg easily extendable

    • Support for adding new Index types with plugins
    • Support for adding new Commands with command plugins
    • Support for adding new Distributions with distribution plugins
  • Improved package download support (also now pluggable)

  • Reimplement url download using only the Python standard library (removing the urlgrabber requirement and simplifying installation)
  • Improved spec: support for db type index + better documentation
  • Better configuration management (especially internally)
  • Reduce dependencies by removing usage of PasteScript and PasteDeploy
  • Various minor bugfixes and code improvements

Datapkg v0.7 Beta Released

I’ve just put out a beta of a major new version of datapkg (see the changelog below).

There’s a quick getting started section below (also see docs).

About the release

This is a substantial release with a lot of new features. As this is a client app which will run on a variety of platforms, it’s been released as a beta first so there’s a chance to catch any of the cross-platform compatibility bugs that inevitably show up. (My favourite from last time was a variation between Python 2.5 and 2.6 in the way urlparse functioned for non-standard schemes …)

I’d therefore really welcome any feedback, especially regarding bugs and from people using platforms I don’t usually use, such as Windows!

Get started fast

# 1. Install: (requires python and easy_install)
$ easy_install datapkg
# Or, if you don't like easy_install, use pip
$ pip install datapkg
# (or even install from the raw source!)

# 2. [optional] Take a look at the manual
$ datapkg man

# 3. Search for something
$ datapkg search ckan:// gold
gold-prices -- Gold Prices in London 1950-2008 (Monthly)

# 4. Get some data
# This will result in a csv file at /tmp/gold-prices/data
$ datapkg download ckan://gold-prices /tmp

Find out more » — including how to create, register and distribute your own ‘data packages’.

Changelog

  • (MAJOR) Support for uploading datapkgs (upload.py)
  • (MAJOR) Much improved and extended documentation
  • (MAJOR) Make datapkg easily extendable
    • Support for adding new Index types with plugins
    • Support for adding new Commands with command plugins
    • Support for adding new Distributions with distribution plugins
  • Improved package download support (also now pluggable)
  • New sqlite-based DB index (ticket:360)
  • Improved spec: support for db type index + better documentation
  • Better configuration management (especially internally)
  • Reduce dependencies by removing the use of PasteScript and PasteDeploy
  • Various minor bugfixes and code improvements

Versioning / Revisioning for Data, Databases and Domain Models: Copy-on-Write and Diffs

There are several ways to implement revisioning (versioning) of a domain model (and of databases and data generally):

  • Copy on write – so one has a ‘full’ copy of the model/DB at each version.
  • Diffs: store diffs between versions (plus, usually, a full version of the model at a given point in time e.g. store HEAD)

In both cases one will usually want an explicit Revision/Changeset object to which one attaches metadata such as:

  • timestamp
  • author of change
  • log message

In more complex revisioning models this metadata may also be used to store key data relevant to the revisioning structure (e.g. revision parents).

Copy on write

In its simplest form copy-on-write (CoW) would copy the entire DB on each change. However, this is clearly very inefficient, and hence one usually restricts the copy-on-write to the relevant changed “objects”. The advantage of doing this is that it limits the changes we have to store (in essence, objects unchanged between revision X and revision Y get “merged” into a single object).

For example, if our domain model had Person, Address and Job objects, a change to Person X would only require a copy of Person X’s record (an even more standard example is wiki pages). Obviously, for this to work, one needs to be able to partition the data (domain model). With a normal domain model this is trivial: pick the object types, e.g. Person, Address, Job etc. However, for a graph setup (as with RDF) this is not so trivial.

Why? In essence, for copy on write to work we need:

  1. a way to reference entities/records
  2. support for putting objects in a deleted state

The (RDF) graph model has no good way of referencing triples (we could use named graphs, quads or reification, but none are great). We could move to the object level and only work with groups of triples (e.g. those corresponding to a “Person”). You’d also need to add a state triple to every base entity (be that a triple or named graph) and add that to every query statement. This seems painful.
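
To make copy-on-write concrete in the simple (non-RDF) case, here is a toy sketch in Python (illustrative only; these names are not vdm's API) of wiki-page-style CoW with Revision metadata and a deleted state as described above:

  import datetime
  import itertools

  _rev_ids = itertools.count(1)

  class Revision(object):
      def __init__(self, author, message):
          self.id = next(_rev_ids)  # revision ids are global, monotonically increasing
          self.timestamp = datetime.datetime.now()
          self.author = author
          self.message = message

  class PageStore(object):
      def __init__(self):
          self._versions = {}  # (name, revision id) -> record; unchanged pages are never copied
          self._latest = {}    # name -> revision id of the last change to that page

      def save(self, name, content, revision, deleted=False):
          # copy-on-write: store a full new record, but only for the changed page
          self._versions[(name, revision.id)] = {'content': content, 'deleted': deleted}
          self._latest[name] = revision.id

      def get(self, name, revision_id=None):
          # walk back to the most recent copy at or before the requested revision
          rev = self._latest[name] if revision_id is None else revision_id
          while rev > 0:
              record = self._versions.get((name, rev))
              if record is not None:
                  return None if record['deleted'] else record['content']
              rev -= 1  # page unchanged at this revision; fall back to an earlier one
          return None

  store = PageStore()
  r1 = Revision('alice', 'create home page')
  store.save('home', 'v1 text', r1)
  r2 = Revision('bob', 'edit home page')
  store.save('home', 'v2 text', r2)
  print(store.get('home'))         # 'v2 text'
  print(store.get('home', r1.id))  # 'v1 text'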

Diffs

The diff model involves computing diffs (forward or backward) for each change. A given version of the model is then computed by composing diffs.

Usually, for performance reasons, full representations of the model/DB at a given version are cached; most commonly HEAD is kept available. It is also possible to cache more frequently and, as with copy-on-write, to cache selectively (i.e. only cache items which have changed since the last cache period).

The disadvantage of the diff model is the need for (and cost of) creating and composing diffs (CoW is, generally, easier to implement and use). However, it is more efficient in storage terms and works better with general data (one can always compute diffs), especially data that doesn’t have such a clear domain model, e.g. the RDF case discussed above.
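
As a toy illustration of the pattern (a sketch only, not any particular system's implementation): keep HEAD in full, store backward patches, and recover an old version by composing patches back from HEAD:

  import difflib

  def make_patch(a, b):
      # edit script that rebuilds b from a, storing only the changed regions
      ops = []
      for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
          if tag != 'equal':
              ops.append((i1, i2, b[j1:j2]))  # replace a[i1:i2] with these lines
      return ops

  def apply_patch(a, ops):
      out, pos = [], 0
      for i1, i2, lines in ops:
          out.extend(a[pos:i1])  # unchanged region copied from a
          out.extend(lines)
          pos = i2
      out.extend(a[pos:])
      return out

  class DiffStore(object):
      def __init__(self, initial_lines):
          self.head = list(initial_lines)  # full copy of HEAD is cached
          self.patches = []                # patches[i] turns version i+1 into version i

      def commit(self, new_lines):
          self.patches.append(make_patch(new_lines, self.head))  # backward diff
          self.head = list(new_lines)

      def get(self, version):
          # compose backward diffs from HEAD down to the requested version
          lines = self.head
          for patch in reversed(self.patches[version:]):
              lines = apply_patch(lines, patch)
          return lines

  store = DiffStore(['line 1'])
  store.commit(['line 1', 'line 2'])
  store.commit(['line 1 (edited)', 'line 2'])
  print(store.get(0))  # ['line 1']

Real systems store compact diffs on disk and cache more aggressively, but the compose step is the same.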

Usage

  • Wikis: Many wikis implement a full copy-on-write model with a full copy of each page being made on each write.
  • Source control: diff model (usually with HEAD cached and backwards diffs)
  • vdm: copy-on-write using SQL tables as core ‘domain objects’
  • ordf: (RDF) diffs with HEAD caching

Howto Install 4store

My experiences (with the assistance of Will Waites) of installing 4store on Ubuntu Jaunty.

There are no packaged versions of the code (there is in fact one from Yves Raimond from mid-2009, but it is now out of date …), so you need to get it from github.

I recommend using Will Waites’ fork, which adds useful features like:

  • multiple connections
  • triple deletion

Note that I had to make various fixes to get this to compile on my Ubuntu machine. See the diff below.

Install standard ubuntu/debian dependencies:

  • See 4store wiki
  • rasqal needs to be latest version
    • Get it
    • ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
    • make, make install
  • Now install 4store itself

Now to start a DB:

  • 4s-backend-setup {db-name}
  • 4s-backend {db-name}

Now for the Python bindings, also created by Will Waites, which can be found here:

  • On my Jaunty I needed to convert size_t to int everywhere
  • Needed to run with the latest cython (v0.12), installed via pip/easy_install
  • To run the tests you need a backend db called py4s_test (hardcoded)

To run multiple backends at once you will probably need the avahi dev libraries (not sure which ones!).

Diff for wwaites 4store fork (updated diff as of 2010-04-28)


diff --git a/src/backend/Makefile b/src/backend/Makefile
index 51a957c..e64eb13 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -2,7 +2,7 @@ include ../discovery.mk
 include ../rev.mk
 include ../darwin.mk

-CFLAGS = -Wall -Wstrict-prototypes -Werror -g -std=gnu99 -O2 -I.. -DGIT_REV=\"$(gitrev)\" `pkg-config --cflags raptor glib-2.0`
+CFLAGS = -Wall -Wstrict-prototypes -g -std=gnu99 -O2 -I.. -DGIT_REV=\"$(gitrev)\" `pkg-config --cflags raptor glib-2.0`
 LDFLAGS = $(ldfdarwin) $(ldflinux) -lz `pkg-config --libs raptor glib-2.0` $(avahi)

 LIB_OBJS = chain.o bucket.o list.o tlist.o rhash.o mhash.o sort.o \

diff --git a/src/common/Makefile b/src/common/Makefile
index 9b33e94..60cd04f 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -21,7 +21,7 @@ ifdef dnssd
 mdns_flags = -DUSE_DNS_SD
 endif

-CFLAGS = -std=gnu99 -fno-strict-aliasing -Wall -Werror -Wstrict-prototypes -g -O2 -I../ -DGIT_REV=\"$(gitrev)\" $(mdns_flags) `pkg-config --cflags $(pkgs)`
+CFLAGS = -std=gnu99 -fno-strict-aliasing -Wall -Wstrict-prototypes -g -O2 -I../ -DGIT_REV=\"$(gitrev)\" $(mdns_flags) `pkg-config --cflags $(pkgs)`
 LDFLAGS = $(ldfdarwin) $(lfdlinux)
 LIBS = `pkg-config --libs $(pkgs)`

diff --git a/src/frontend/results.c b/src/frontend/results.c
index 485ac31..162aa3d 100644
--- a/src/frontend/results.c
+++ b/src/frontend/results.c
@@ -381,12 +381,12 @@ fs_value fs_expression_eval(fs_query *q, int row, int block, rasqal_expression *
 	return v;
     }
-    case RASQAL_EXPR_SUM:
-    case RASQAL_EXPR_AVG:
-    case RASQAL_EXPR_MIN:
-    case RASQAL_EXPR_MAX:
-    case RASQAL_EXPR_LAST:
-        return fs_value_error(FS_ERROR_INVALID_TYPE, "unsupported aggregate operation");
+    //case RASQAL_EXPR_SUM:
+    //case RASQAL_EXPR_AVG:
+    //case RASQAL_EXPR_MIN:
+    //case RASQAL_EXPR_MAX:
+    //case RASQAL_EXPR_LAST:
+    //    return fs_value_error(FS_ERROR_INVALID_TYPE, "unsupported aggregate operation");
 #endif

Diff to wwaites py4s (updated diff as of 2010-04-28)


diff --git a/_py4s.pxd b/_py4s.pxd
index 5251289..0e26250 100644
--- a/_py4s.pxd
+++ b/_py4s.pxd
@@ -110,7 +110,7 @@ cdef extern from "frontend/results.h":

 cdef extern from "frontend/import.h":
     int fs_import_stream_start(fsp_link *link, char *model_uri, char *mimety
-    int fs_import_stream_data(fsp_link *link, unsigned char *data, size_t co
+    int fs_import_stream_data(fsp_link *link, unsigned char *data, int count
     int fs_import_stream_finish(fsp_link *link, int *count, int *errors)

cdef extern from "frontend/update.h":

Using Deliverance as Middleware (with Proxying)

Deliverance is a great library that lets you easily re-theme external websites on the fly. Designed as WSGI middleware, it can easily be combined with some proxying to integrate a bunch of websites together.

You can use Deliverance plus proxying out of the box using the deliverance-proxy command. However, I was interested in using Deliverance as middleware from code. This turned out to be not entirely trivial: all the examples on the internet seemed to focus on using deliverance-proxy or configuring it in an ini file.

After much wrestling, most notably with odd issues with gzipped (deflated) content, I got it working, and you can find a demo implementation (see demo.py and README.txt) here:

http://rufuspollock.org/code/deliverance/
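
The core of the approach is wrapping your WSGI app in DeliveranceMiddleware. Purely as a rough sketch (see demo.py for the working version; the FileRuleGetter name and the rule.xml path here are assumptions from memory, and exact names vary between Deliverance versions):

  from deliverance.middleware import DeliveranceMiddleware, FileRuleGetter

  def app(environ, start_response):
      # the content app that is to be re-themed
      start_response('200 OK', [('Content-Type', 'text/html')])
      return ['<html><body><div id="content">Hello</div></body></html>']

  # rule.xml (assumed local file) holds the Deliverance theme/content rules;
  # Deliverance fetches the theme page and splices the content in per those rules
  themed_app = DeliveranceMiddleware(app, FileRuleGetter('rule.xml'))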

I should also mention several other sources which were all of help in my quest.

SQLAlchemy Migrate with Pylons

Instructions on using SQLAlchemy Migrate with Pylons, especially to convert an existing Pylons project to use sqlalchemy-migrate.

This is based on several excellent sources, including this guide and these threads.

One important point to note is that you are likely to end up with two versions of your model tables: one in yourapp/model and one in yourapp/migration/versions/*.py, with the former representing your tables at HEAD and the latter containing upgrade/downgrade scripts whose final result is HEAD. This duplication is a bit annoying, and I discuss how it can be avoided below.

1. Install sqlalchemy-migrate for your project, e.g.:

  pip -E {your-virtualenv} install sqlalchemy-migrate
  # or
  easy_install sqlalchemy-migrate

NB: the latest versions of migrate are only compatible with sqlalchemy >= 0.5 (for previous versions of sqlalchemy you need migrate <= 0.4.5).

2. Create the migrate repository (i.e. the store for upgrade scripts).

In your project directory:

  migrate create myapp/migration/ "MyApp"

Now create a temporary helper script:

  migrate manage dbmanage.py --repository=myapp/migration/ --url={your-sqlalchemy-db-uri}

3. Set up db version control

  python dbmanage.py version_control

Check the current version (should be 0)

  python dbmanage.py version

4. Create a migration script for your existing db

  python dbmanage.py script "Add existing tables"

This will create a script in myapp/migration/versions/001_add_existing_tables.py

Copy into that file the definitions of all your existing tables (and other database objects such as constraints), then create those tables in the upgrade() function (and drop them in downgrade()).
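
For illustration, such a script might look like the following sketch (the page table is a made-up example; note too that the upgrade()/downgrade() signature varies between migrate versions: newer ones pass migrate_engine as an argument, while older ones provide it as a global):

  # myapp/migration/versions/001_add_existing_tables.py (sketch)
  from sqlalchemy import MetaData, Table, Column, Integer, UnicodeText

  metadata = MetaData()

  # copies of the existing table definitions from myapp/model
  page_table = Table('page', metadata,
      Column('id', Integer, primary_key=True),
      Column('content', UnicodeText),
  )

  def upgrade(migrate_engine):
      metadata.bind = migrate_engine
      metadata.create_all()

  def downgrade(migrate_engine):
      metadata.bind = migrate_engine
      metadata.drop_all()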

That’s it! (in theory)

Additional Issues

1. Duplication of model/db code

You now have two places for model/db code:

  1. Your migration scripts
  2. Your actual model

This doesn’t have to be a problem, but it is an obvious way for bugs to creep in, especially when some people start by creating their DB from the model code and others from the migration scripts.

Warning: the method described next will not work if you do stuff in your table creation that is not persisted into the actual DB SQL (e.g. column default values based on a function, custom db types …).

One way to avoid the duplication is to confine all table creation and alteration to your migration scripts and then have your model tables set up directly from the DB using the autoload=True option. The one disadvantage of this is that you can’t see the complete DB setup in one place, as table construction may be spread over several migration scripts. One solution to this is provided by the experimental ‘create_model’ command, which dumps the current DB model as the required sqlalchemy table code.
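
In sketch form, the autoload approach looks something like this (the table name and db url are placeholders; autoload=True is the SQLAlchemy 0.5-era reflection spelling):

  # in yourapp/model: reflect the table definition from the live DB rather
  # than duplicating the DDL that lives in the migration scripts
  from sqlalchemy import MetaData, Table, create_engine

  engine = create_engine('sqlite:///yourapp.db')  # your app's db url
  metadata = MetaData(bind=engine)
  page_table = Table('page', metadata, autoload=True)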

More discussion in this migrate-users thread.

Bringing the Migration DB up to date

If you do set up your DB (from scratch) directly from your model code rather than from the migration scripts, then you need to set up the migration machinery and update the migrate version to the correct number. (I note there is an experimental update_db_to_model command which is supposed to do this for you.) You can do this as follows (assuming your migrate repository is at YOURAPP/migration):

      from migrate.versioning.api import version_control, version
      import YOURAPP.migration.versions
      v = version(YOURAPP.migration.__path__[0])
      # log.info( "Setting current version to '%s'" % v )
      # url is your sqlalchemy db url 
      version_control(url, YOURAPP.migration.__path__[0], v)

Extras

  • You should wrap upgrade/downgrade in transactions. I found one way to do this here, but testing indicated it didn’t work for me (rollback did not happen properly when there was an error).