Flexible Dates in Python (including BC)
I have recently had to work a lot with “dates” that come in many shapes and sizes, including:
- Dates in distant past and future including BC/BCE dates
- Dates in a wild variety of formats: Jan 1890, January 1890, 1st Dec 1890, Spring 1890 etc
- Dates of varying precision: e.g. 1890, 1890-01 (i.e. Jan 1890), 1890-01-02
- Imprecise dates: c1890, 1890?, fl 1890 etc
Unfortunately, existing support for these in Python is fairly weak. I therefore wrote a Python FlexiDate module (now part of datautil, itself part of a new swiss (army knife) package), which focuses on supporting:
- Dates outside the period supported by Python (or the DB), esp. dates < 0 AD
- Imprecise dates (c.1860, 18??, fl. 1534, etc)
- Normalization of these dates to machine-processable versions, especially:
  - ISO 8601
  - Dates sortable in the database (in correct date order)
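To make the first point concrete, here is a small sketch of where the standard library's `datetime` stops. The helper `try_date` is a name made up for this illustration; the limits themselves come from `datetime.MINYEAR` and `datetime.MAXYEAR`.

```python
import datetime

# The standard library only covers years MINYEAR..MAXYEAR (1 to 9999).
print(datetime.MINYEAR, datetime.MAXYEAR)


def try_date(year, month=1, day=1):
    """Return a datetime.date, or an error string if it cannot be built.

    Hypothetical helper, only here to show the failure without a traceback.
    """
    try:
        return datetime.date(year, month, day)
    except ValueError as exc:
        return "unsupported: %s" % exc


print(try_date(1890))   # an ordinary date works
print(try_date(0))      # year 0 is rejected
print(try_date(-500))   # so is any BC year
```

Anything at or before year 0 raises `ValueError`, so BC dates need some other representation.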
Background
Things we would like:
- Dates outside the period supported by Python (or the DB), esp. dates < 0 AD
- Imprecise dates (c.1860, 18??, fl. 1534, etc)
- Normalization of dates to machine processable versions
- Sortable in the database (in correct date order)
- Human readability as dates will be re-edited/viewed by people
Not all of these requirements are satisfiable at once in a simple way.
Be clear about what we want:
1. Storage (and preservation) of “user” dates (both normal and non-normal)
2. Normalization of dates (e.g. to ~ ISO 8601)
3. Integration with database (sortability and serializability)
Solution for 1: Represent dates as strings.
Solution for 2: Have a parser (via an intermediate FlexiDate object).
Solution for 3: Convert to a float.
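The three solutions can be sketched together roughly as follows. This is not the actual FlexiDate code: `SimpleFlexiDate`, `parse_simple`, `normalize` and `as_float` are hypothetical names for this illustration, the regular expression only covers a few of the simpler shapes listed above (it gives up on things like "Spring 1890" or "18??"), and it does no range checking on months or days.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a FlexiDate-like object. Parts that the user
# did not supply stay None instead of being filled in with defaults.
SimpleFlexiDate = namedtuple("SimpleFlexiDate", "year month day qualifier")

_PATTERN = re.compile(
    r"^(?P<circa>c\.?\s*)?"       # optional "c" or "c." prefix
    r"(?P<year>-?\d{1,4})"        # year, negative for BC
    r"(?:-(?P<month>\d{2}))?"     # optional month
    r"(?:-(?P<day>\d{2}))?"       # optional day
    r"(?P<uncertain>\?)?$"        # optional trailing "?"
)


def parse_simple(user_date):
    """Parse a few simple shapes; return None for anything else."""
    match = _PATTERN.match(user_date.strip())
    if match is None:
        return None
    if match.group("circa"):
        qualifier = "circa"
    elif match.group("uncertain"):
        qualifier = "uncertain"
    else:
        qualifier = ""
    month = match.group("month")
    day = match.group("day")
    return SimpleFlexiDate(
        year=int(match.group("year")),
        month=int(month) if month else None,
        day=int(day) if day else None,
        qualifier=qualifier,
    )


def normalize(fd):
    """Render only the parts that were given, in an ISO 8601-like form."""
    sign = "-" if fd.year < 0 else ""
    out = "%s%04d" % (sign, abs(fd.year))
    if fd.month is not None:
        out += "-%02d" % fd.month
        if fd.day is not None:
            out += "-%02d" % fd.day
    return out


def as_float(fd):
    """Approximate numeric value for sorting: year plus a small fraction."""
    return fd.year + (fd.month or 0) / 100.0 + (fd.day or 0) / 10000.0


user_date = "c.1860"                 # 1: keep the user's string as typed
fd = parse_simple(user_date)         # 2: parse via an intermediate object
print(user_date, normalize(fd), fd.qualifier, as_float(fd))  # 3: float
```

The point of the intermediate object is that precision is kept explicit: "1890-01" normalizes back to "1890-01", not to a full date.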
Remark: no string-based date format will sort dates correctly under standard string ordering. Proof: let x, y be positive dates and X, Y their string representations. If X < Y as strings then "-X" < "-Y" as strings too, whereas as dates -x > -y, so negative dates come out in the wrong order.
Thus we need to add some other field if we wish dates to be sorted correctly (or else not worry about the sorting of negative dates …)
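The remark is easy to check. The snippet below writes a few years as signed, zero-padded strings (the padding scheme is just an assumption for the example) and compares string order with numeric order.

```python
# Years as numbers; negative means BC.
years = [-200, -100, 100, 200]

# The same years as signed, zero-padded strings.
as_strings = ["%s%04d" % ("-" if y < 0 else "", abs(y)) for y in years]

print(sorted(years))       # chronological order
print(sorted(as_strings))  # string order: the negative years are reversed
```

The positive years sort fine as strings, but "-0100" lands before "-0200", which is the wrong way round chronologically.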
- For any given date attribute have 2 actual fields:
  - user version: the version edited by users
  - normalized/parsed version: a version that is usable by machines
Store both versions in a single field but with some form of serialization.
Convert dates to long ints (unlimited in precision), put them in a separate field, and use that field for sorting.
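As a rough sketch of the separate-sort-field idea, the example below keeps the user's string in one column and an integer key in another, then lets the database order by the key. The table and column names and the `sort_key` packing are assumptions made for this illustration, and for simplicity it treats "N BC" as year -N.

```python
import sqlite3


def sort_key(year, month=0, day=0):
    """Pack a date into one integer; missing parts count as 0.

    Hypothetical scheme: negative years (BC) still order correctly
    because the month/day part is always added on top of the year.
    """
    return year * 10000 + month * 100 + day


# (user version, machine-sortable version); "N BC" is treated as year -N.
rows = [
    ("c. 300 BC", sort_key(-300)),
    ("Jan 1890", sort_key(1890, 1)),
    ("1 Dec 1890", sort_key(1890, 12, 1)),
    ("44 BC", sort_key(-44)),
    ("1890", sort_key(1890)),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (user_date TEXT, date_sort INTEGER)")
conn.executemany("INSERT INTO event VALUES (?, ?)", rows)

# Sort on the integer field, display the user's own string.
ordered = [row[0] for row in conn.execute(
    "SELECT user_date FROM event ORDER BY date_sort")]
print(ordered)
conn.close()
```

The user-facing string is never touched, so nothing typed by the user is lost, and the database only ever compares plain integers.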
Comments
Initially I thought that we should parse dates into a FlexiDate format before saving, but: a) why bother, and b) it is always hard for parsing not to be lossy. In particular, when converting to ISO 8601 using e.g. dateutil it is very difficult not to add information: parsing 1860 can easily give us 1860-01-01 …
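The same effect shows up with the standard library alone. The snippet below uses `datetime.datetime.strptime` rather than dateutil, purely as an illustration of how a parser fills in parts the user never supplied.

```python
import datetime

user_date = "1860"

# Parsing a bare year still yields a full date: month and day are invented.
parsed = datetime.datetime.strptime(user_date, "%Y")
print(parsed.date().isoformat())

# Once stored, this is indistinguishable from a genuine 1 January 1860.
exact = datetime.datetime.strptime("1860-01-01", "%Y-%m-%d")
print(parsed == exact)
```

Once the two values compare equal there is no way to recover that the user only gave a year, which is the argument for storing the user's string as typed.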