I am re-publishing the below by kind permission of the Chartered Institute of Library and Information Professionals (CILIP), as it was written for the May issue of their Update magazine.
[Image Source: Qwantz]
Research findings released late last year by the University of Illinois suggest that when conducting cardiopulmonary resuscitation on those who have suffered a cardiac arrest, even medical practitioners benefit from singing a song with a tempo of 100 beats per minute (bpm) to match the optimum 100 compressions per minute required to revive a patient. Songs matching the 100bpm rate include, appropriately enough, ‘Stayin’ Alive’ by the Bee Gees, less appropriately ‘Another One Bites the Dust’ by Queen, and ‘Connected’ by Stereo MCs, with a tempo of 100.7bpm, the lyrics of which go something like this:
‘If you make sure you’re connected, the writing’s on the wall, but if your mind’s neglected, stumble you might fall, stumble you might fall…’
One can only presume that those in command at the Guardian and BBC came across this research and decided that the approach applied equally well to information stores as to human patients, for in the six months since, both institutions have opened up semi-public discussion around how best to set their valuable information free and get themselves connected.
The BeeBCamp 2.0 unconference, held at White City on 18 February 2009, brought together a diverse ‘digital’ group from around the BBC. I was privileged to attend, despite the event being primarily for internal BBC staff (GirlyGeekdom Note only: as was another ‘outsider’, Rachel Clarke, who wrote up some great notes on her blog). Conversations held at the various 20-minute informal sessions covered such topics as harnessing user-generated content, encouraging the public to innovate, reviving the BBC Computer Literacy Project which popularised BBC Micro computers in the 1980s, and the technical and bureaucratic issues around such changes.
For information practitioners, the session of most interest was entitled ‘Semantic BBC’. During this session, BBC insiders discussed:
- a proposed iFinder to help staff find multimedia content across BBC sites
- the possibility of incorporating the BBC Press Database, which contains text transcript ‘scrapes’ from BBC news stories (obtained by converting interview recordings from audio to text), associates common keywords with those transcripts, and then maps the keywords against the BBC ontology, enabling BBC staff to search by keyword, speaker’s name, etc. The BBC Press Database isn’t available to the public for copyright reasons, but the BBC Programmes Ontology is published online
- semantic tagging trials within BBC Vision, the division responsible for broadcasting the content of BBC television channels
- ways the BBC online team might incorporate programme details stored in the BBC Programme Catalogue maintained by the BBC’s Information & Archives team. The BBC Programme Catalogue was created in the mid 1980s to replace their legacy card system, and trial public online access to the catalogue was provided from 2006-2008; however, this has been removed until further notice
- providing the public with information visualisations (visual representations of data) relating to searches made on www.bbc.co.uk, BBC iPlayer audience viewing patterns, and so on
- proposals to tag new BBC website content with GPS co-ordinates, where relevant
- the BBC field journalists’ content management systems and that group’s current migration across to their internally developed and maintained Content Production System (CPS)
- a proposed semi-closed taxonomy, in which the BBC audience could suggest new tags subject to moderation
- the semedia.org online collaborative ‘crowdsourced’ project whereby users can play BBC archive films and tag the film in real-time, with such tags cross-referenced against those proposed by other users: where different users give the same section identical tags, those tags are considered more likely to be accurate.
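To make the cross-referencing idea in the last point a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general principle (keep a tag when enough different people propose it for the same section), not a description of how the project itself works. The function name `agreed_tags`, the `(user, segment, tag)` tuple layout, the `min_users` threshold and the sample data are all my own assumptions.

```python
from collections import defaultdict


def agreed_tags(submissions, min_users=2):
    """Return {segment: sorted tags proposed by at least min_users distinct users}.

    `submissions` is an iterable of (user, segment, tag) tuples. This layout is
    hypothetical and chosen purely for illustration.
    """
    # Map each (segment, normalised tag) pair to the set of users who proposed it.
    supporters = defaultdict(set)
    for user, segment, tag in submissions:
        supporters[(segment, tag.strip().lower())].add(user)

    # Keep only the tags that enough distinct users agree on.
    result = defaultdict(list)
    for (segment, tag), users in supporters.items():
        if len(users) >= min_users:
            result[segment].append(tag)
    return {segment: sorted(tags) for segment, tags in result.items()}


# Invented sample data: three users tagging two sections of a film.
submissions = [
    ("alice", 0, "Steam train"),
    ("bob", 0, "steam train"),
    ("carol", 0, "station"),
    ("alice", 1, "harbour"),
    ("alice", 1, "harbour"),  # a repeat by the same user adds no extra support
    ("bob", 1, "harbour"),
    ("carol", 1, "boats"),
]

print(agreed_tags(submissions))
```

Using a set of users per tag means one enthusiastic tagger cannot push a tag through alone, and lower-casing the tags lets ‘Steam train’ and ‘steam train’ count as the same suggestion.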
For more information on BeeBCamp2 and to learn more about similar BBC initiatives, visit the BBC Internet Blog round-up.
BarCampLondon6 (GirlyGeekdom Note only: which was organised with military precision by the fabulous Emma Persky, Kevin Prince and co., thanks again!), held over the weekend of 28-29 March, was another ‘unconference’, open to those lucky enough to obtain a ticket via the online ticketing system. BarCamp6 was the first large conference held at the Guardian’s new offices by Kings Cross, and as BarCamps traditionally involve ‘camping’ at the venue overnight, the new facilities were well and truly tested.
One session was run by Simon Willison, a software architect who joined the Guardian in late 2008 with the remit to ‘work with both the Guardian’s existing data sources and third-party companies to prepare content for the [www.guardian.co.uk] platform, allowing the external development community to create applications using Guardian content’.
During his session, Willison shared the objectives and technological underpinnings of the Guardian’s Open Platform project. As part of this, the Guardian’s technology team has developed an application programming interface (API) providing access to every piece of content on the Guardian’s website for which the newspaper has enough relevant rights (c. 700,000 individual articles from the past 10 years), and the team intends to continue to share this content as it gradually brings other non-digital assets online. (GirlyGeekdom Note only: this figure differs slightly from that on Simon’s personal blog; however, this is what he said at BarCamp a couple of weeks later, following the official launch, so I presume 700,000 is more accurate.)
The Open Platform project includes the API Explorer, a ‘Firebird-inspired’ console which enables those interested to connect their own web services to the Guardian’s editorially chosen tags and rich metadata. The API content is largely shared through RSS, though JSON is also supported, as are Atom feeds for tools such as Yahoo Pipes, so that end-users can build on the content.
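For readers wondering what ‘building on’ tagged content might look like in practice, here is a small hedged sketch in Python. Please note that the JSON below is entirely invented for illustration: the field names `results`, `headline` and `tags` are my assumptions and are not the Guardian’s actual response format, and the function `headlines_with_tag` is hypothetical too.

```python
import json

# Invented sample payload. The structure and field names are assumptions
# made for this example, not the real Open Platform response format.
SAMPLE = """
{"results": [
  {"headline": "Example article one", "tags": ["technology", "open data"]},
  {"headline": "Example article two", "tags": ["politics"]},
  {"headline": "Example article three", "tags": ["open data"]}
]}
"""


def headlines_with_tag(payload, tag):
    """Return the headlines of all items in the JSON payload that carry `tag`."""
    data = json.loads(payload)
    return [item["headline"] for item in data["results"] if tag in item["tags"]]


print(headlines_with_tag(SAMPLE, "open data"))
```

The point is simply that once content arrives with editorially chosen tags attached, selecting and recombining it takes only a few lines.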
Another fascinating project is the Guardian Data Store, which shares the Guardian’s raw research data in the hope that the public might cross-reference it against other data, perhaps creating infographics and finding correlations. Distribution is the key focus for the Guardian, and provides the impetus for opening such content stores. Its paper weekday circulation is 400,000, while its website gets more than 33m hits per month: as the Guardian’s revenue is largely derived from advertising, continuing to share its content makes a great deal of sense.
Guardian and BBC information being set free in this way certainly makes my pulse quicken, and where practicable we might follow their example to keep our own content alive… ‘make sure you’re connected, the writing’s on the wall…’

