Yesterday I was at RE:PUBLICA XI to give a talk on Open Government Data in the opening session of the “open” stream. The room, crammed to over capacity, was a nice indicator of the growing attention and interest being generated by open data, and especially open government data. Slides online here and below.
Archives for April 2011
Background: I first got involved with Creative Commons (CC) in 2004 soon after its UK chapter started. Along with Damian Tambini, the then UK ‘project lead’ for CC, and the few other members of ‘CC UK’, I spent time working to promote CC and its licenses in the UK (and elsewhere). By mid-2007 I was no longer very actively involved and to most intents and purposes was no longer associated with the organization. I explain this to give some background to what follows.
Creative Commons as a brand has been fantastically successful and is now very widely recognized. While in many ways this success has been beneficial for those interested in free/open material it has also raised some issues that are worth highlighting.
Creative Commons is not a Commons
Ironically, despite its name, Creative Commons, or more precisely its licenses, do not produce a commons. The CC licenses are not mutually compatible. For example, material with a CC Attribution-ShareAlike (by-sa) license cannot be intermixed with material licensed with any of the CC NonCommercial licenses (e.g. Attribution-NonCommercial, Attribution-NonCommercial-ShareAlike).
Given that a) the majority of CC licenses in use are ‘non-commercial’ and b) there is also large usage of ShareAlike (e.g. Wikipedia), this is an issue that affects a large set of ‘Creative Commons’ material.
Unfortunately, the presence of the word ‘Commons’ in CC’s name and the prominence of ‘remix’ in the advocacy around CC tend to make people think, falsely, that all CC licenses are in some way similar or substitutable.
The ‘Brand’ versus the Licenses
More and more frequently I hear people say (or more significantly write) things like: “This material is CC-licensed”. But as just discussed there is large, and very significant, variation in the terms of the different CC licenses. It appears that for many people the overall ‘Brand’ dominates the actual specifics of the licenses.
This is in marked contrast to the Free/Open Source software community, where even in the case of the Free Software Foundation’s licenses people tend to specify the exact license they are talking about.
Standards and interoperability are what really matter for licenses (cf the “Commons” terminology). Licensing and rights discussions are pretty dull for most people — and should be. They are important only because they determine what you and I can and can’t do, and specifically what material you and I can ‘intermix’ — possible only where licenses are ‘interoperable’.
To put it the other way round: licenses are interoperable if you can intermix freely material licensed under one of those licenses with material licensed under another. This interoperability is crucial and it is, in license terms, what underlies a true commons.
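To make the idea concrete, here is a minimal sketch in Python of a toy interoperability check. It is an illustration under simplifying assumptions, not a statement of what the licenses legally permit: each license is modelled only as the set of conditions it imposes, and the names `can_intermix` and `is_commons` are hypothetical helpers invented for this example.

```python
from itertools import combinations

# Toy model (an assumption for illustration): a license is the set of
# conditions it imposes. "by" = Attribution, "sa" = ShareAlike,
# "nc" = NonCommercial.
CC0 = frozenset()
BY = frozenset({"by"})
BY_SA = frozenset({"by", "sa"})
BY_NC = frozenset({"by", "nc"})
BY_NC_SA = frozenset({"by", "nc", "sa"})


def can_intermix(a, b):
    """Simplified rule: a ShareAlike license requires the combined work to
    carry that same license, so the other material must not impose any
    condition that the ShareAlike license lacks."""
    for share_alike, other in ((a, b), (b, a)):
        if "sa" in share_alike and not other <= share_alike:
            return False
    return True


def is_commons(licenses):
    """A set of licenses supports a commons (in this toy sense) only if
    every pair of licenses in it can be intermixed."""
    return all(can_intermix(a, b) for a, b in combinations(licenses, 2))
```

Under this model, `is_commons([CC0, BY, BY_SA])` holds, while adding `BY_NC` to the list breaks it, because by-sa and by-nc material cannot be combined.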
More broadly we are interested in a ‘license standard’, in knowing, not only that a set of licenses are interoperable, but that they all allow certain things, for example for anyone to use, reuse and redistribute the licensed material (or to put in terms of freedom, that they guarantee those freedoms to users). This very need for a standard is why we created the Open Definition for content and data building directly on the work on a similar standard (the Open Source Definition) in the Free/Open Source software community.
The existence of non-commercial
CC took a crucial decision in including NonCommercial licenses in their suite. Given the ‘Brand’ success of Creative Commons, the effect of including NC licenses has been to give them a status close to, if not identical with, the truly open, commons-supporting licenses in the CC suite.
There is a noticeable difference here with the software world, where NC is also active, but under the ‘freeware’ and ‘shareware’ names (these terms aren’t always used consistently), and with this material clearly distinguished from the Free/Open Source software community.
As the CC brand has grown, there is a desire by some individuals and institutions to use CC licenses simply because they are CC licenses (this is also encouraged by the baking in of CC licenses to many products and services). Faced with choosing a license, many people, and certainly many institutions, tend to go for the more restrictive option available (especially when the word commercial is in there — who wants to sanction exploitation for gain of their work by some third-party!). Thus, it is no surprise that non-commercial licenses appear to be by far the most popular.
Without the NC option, some of these people would have chosen one of the open CC licenses instead. Of course, some would not have licensed at all (or, at least not with a CC license), sticking with pure copyright or some other set of terms. Nevertheless, the benefit in gaining a clear dividing line, and in creating brand-pressure for a real commons, and real openness would have been substantial, and worth, in my opinion, the loss of the non-commercial option.
Structure and community
It is notable in the F/OSS community that most licenses, especially the most popular, are either not ‘owned’ by anyone (MIT/BSD) or are run by an organization with a strong community base (e.g. the Free Software Foundation). Creative Commons seems rather different. While there are public mailing lists, ultimately decisions regarding the licenses, and about crucial features thereof such as compatibility with 3rd party licenses, remain with CC central based in San Francisco.
Originally, there was a fair amount of autonomy given to country projects but over time this autonomy has gradually been reduced (there are good reasons for this, such as a need for greater standardization across licenses). This has concrete effects on the terms in licenses.
For example, for v3.0 the Netherlands were requested to remove their provisions which included things like DB rights in their share-alike provision and instead standardize on a waiver for these additional rights (rights which are pretty important if you are doing data(base) licensing). Most crucially the CC licenses reserve the right to Creative Commons as an organization to determine compatibility decisions. This is arguably the single most important aspect of licensing, at least in respect of interoperability and the Commons.
Creative Commons and Data
Update: as of September 2011 there has been further discussion between Open Data Commons and Creative Commons on these matters, especially regarding interoperability and Creative Commons v4.0.
From my first involvement in the ‘free/open’ area, I’d been interested in data licensing, both because of personal projects and requests from other people.
When first asked how to deal with this I’d recommended ‘modding’ a specific CC license (e.g. Attribution-Sharealike) to include provisions for data and data(bases). However, starting from 2006 there was a strong push from John Wilbanks, then at Science Commons but with the apparent backing of CC generally, against this practice as part of a general argument for ‘PD-only’ for data(bases) (with the associated implication that the existing CC licenses were content-only). While I respect John, I didn’t really agree with his arguments about PD-only and furthermore it was clear that there was a need in the community for open but non-PD licenses for data(bases).
In late 2007 I spoke with Jordan Hatcher and learned about the work he and Charlotte Waelde were doing for Talis to draft a new ‘open’ license for data(bases). I was delighted and started helping Jordan with these licenses, which became the Open Data Commons PDDL and the ODbL. We sought input from CC during the drafting of these licenses, specifically the ODbL, but the primary response we had (from John Wilbanks and colleagues) was just “don’t do this”.
Once the ODbL was finalized we then contacted CC further about potential compatibility issues.
The initial response then was that, as CC did not recommend use of its licenses (other than CCZero) for data(bases), there should not be an issue since, as with CC licenses and software, there should be an ‘orthogonality’ of activity — CC licenses would license content, F/OSS licenses would license code, and data(base) licenses (such as the ODC ones) would license data. We pressed about this and had a phone con about this with Diane Peters and John Wilbanks in January 2010, with a follow-up email detailing the issues a bit later.
We’ve also explained on several occasions to senior members of CC central our desire to hear from CC on this issue and our willingness to look at ways to make any necessary amendments to ODC licenses (though obviously such changes would be conditional on full scrutiny by the Advisory Council and consultation with the community).
No response has been forthcoming. To this date, over a year later, we are yet to receive any response from CC, even though we have now been promised one at least 3 times (we’ve basically given up asking).
Further to this lack of response, and without any notice to or discussion with ODC, CC recently put out a blog post in which they stated, in marked contrast to previous statements, that CC licenses were entirely suited to data. In many ways this is a welcome step (cf. my original efforts to use CC licenses for data above) but CC have made no statement about a) how they would seek to address data properly or b) how these efforts relate to existing work in Open Data Commons, especially the ODbL. One can only assume, at least in the latter case, that the omission was intentional.
All of this has led me, at least, to wonder what exactly CC’s aims are here. In particular, is CC genuinely concerned with interoperability (beyond a simple ‘everyone uses CC’) and the broader interests of the community who use and apply their licenses?
Creating a true commons for content and data is incredibly important (it’s one of the main things I work on day to day). Creative Commons have done amazing work in this area but as I outline above there is an important distinction between the (open) commons and CC licenses.
Many organisations, institutions, governments and individuals are currently making important decisions about licensing and legal tools – in relation to opening up everything from scientific information, to library catalogues to government data. CC could play an important role in the creation of an interoperable commons of open material. The open CC licenses (CC0, CC-BY and CC-BY-SA) are an important part of the legal toolbox which enables this commons.
I hope that CC will be willing to engage constructively with others in the ‘open’ community to promote licenses and standards which enable a true commons, particularly in relation to data where interoperability is especially crucial.
I’ve posted the slides online and iframed below.
Over the past few years, there has been explosive growth in open data, with significant uptake in government, research and elsewhere.
Bibliographic records are a key part of our shared cultural heritage. They too should therefore be open, that is made available to the public for access and re-use under an open license which permits use and reuse without restriction (http://opendefinition.org/). Doing this promises a variety of benefits.
First, it would allow libraries and other managers of bibliographic data to share records more efficiently and improve quality more rapidly through better, easier feedback. Second, it would increase innovation in bibliographic services and applications, generating benefits for the producers and users of bibliographic data and the wider community.
This talk will cover the what, why and how of open bibliographic data, drawing on direct recent experience such as the development of the Open Biblio Principles and the work of the Bibliographica and JISC OpenBib projects to make the 3 million records of the British Library’s British National Bibliography (BNB) into linked open data.
With a growing number of Government agencies and public institutions making data open, is it now time for the publishing and library community to do likewise?
Unobtrusive (HTML + JSON)
- tags: unobtrusive beta
- tags: unobtrusive
‘Standard’ Templating Browser
- jquery.tmpl / jqtpl
- Many of these work with browser
- sinon.js (mocking) – integrates with qunit well
- backbone – used quite a bit
- (big) sproutcore
- tags: nodejs
- backbone now supported pretty well
Messaging and Job Queues
- resque in js – https://github.com/technoweenie/coffee-resque
- For mongo: http://mongoosejs.com/
- Backbone sort of includes one (though relationships are poorly handled at the moment)