BBC

Archives for June 2008

Under the iPlayer hood for radio

Post categories:

James Cridland | 11:45 UK time, Friday, 27 June 2008

People have been asking about the and that we're using on national radio within the new iPlayer beta.

The quick answer is "they're different per station, they're different whether live or on-demand, and they'll change at least another two times this year". If that satisfies you, you have no requirement to read on. If you want more information, however, I'm happy to help. Note that I'm only talking about national radio, and only for listeners in the UK.

First, you'll notice that for "live" we're currently using rather than Real Player (for most of you - we still give Real to some operating systems). We're doing this because we know online radio is particularly useful in the office, and chances are that Windows Media is automatically installed on most computers, and most corporates won't let you install other software. It should, therefore, 'just work'. I should though say that if you need RealPlayer for your internet radio or your , those streams continue; we've no plans to remove them.

The future for "live" is firstly to significantly improve the bitrate (which we'll do in July). In parallel with that we're working on a way of delivering higher-quality still, using a Flash-based player and an -family stream. We're working with our distribution partners to enable this; the upshot is that it should sound even better but use less bandwidth.

For "on-demand", you'll have spotted that we're using Flash, within the lovely embedded media player that you're familiar with for TV in the iPlayer. Under the hood is a protected MP3 stream for now: again, we're shifting over to AAC-family later in the year. The real difference here is the quality - we've significantly improved the bitrates we can offer.

For on-demand content, we're launching iPlayer with four MP3 profiles based on the content of the programme: and we're using four different bitrates for these profiles.

These are the launch bitrates; we'll tweak things, and moving to the AAC family will reduce the bitrates we use (to make your listening more reliable, whilst maintaining audio quality). Again, the Real listen-again streams that your internet radio uses will still work.
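To put the bandwidth point in concrete terms, here is a rough sketch of how a stream's bitrate translates into data transferred per hour of listening. The bitrates in the loop are illustrative assumptions, not the launch figures, and the function name is made up for this example.

```python
def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Approximate data transferred in one hour of listening, in megabytes.

    Assumes 1 kbps = 1000 bits per second and 1 MB = 1,000,000 bytes,
    and ignores protocol overhead.
    """
    bits_per_hour = bitrate_kbps * 1000 * 3600
    return bits_per_hour / 8 / 1_000_000


# Hypothetical bitrates chosen only for illustration.
for kbps in (48, 96, 128):
    print(f"{kbps} kbps is roughly {megabytes_per_hour(kbps):.1f} MB per hour")
```

Halving the bitrate halves the data transferred, which is why a more efficient codec at a lower bitrate can make listening more reliable on a slow connection.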

Finally, perhaps I might be able to let you into a bit of a dirty secret. For the last six years, the online streams from BBC national radio have been taken from satellite: the same feeds you get on or . So we've been taking a lossy MP2 audio feed, and then encoding it further into even lower bitrates. As we move into higher quality audio online, clearly this has to stop. So, from July, it will - we'll be encoding everything within Broadcasting House, plugged in to the studio feeds. So better bitrate is only part of the story - it's also better sound.

If you've got feedback about radio within iPlayer beta, we're watching ; or if you're blogless, please do comment here.

The Simple Joys of Web-Scale Identifiers

Post categories:

Michael Smethurst | 13:26 UK time, Wednesday, 25 June 2008

<aside>Second post of the day is quite a record for me but this one isn't about microformats so you can probably look away now...</aside>

Bob Dylan with his MusicBrainz identifier

This post is partly a response to and partly the result of conversations with Matthew Wood, Chris Sizemore and John O'Donovan on our recent jaunt to . Now I think that most of our department would agree with Tom. After all we've been having these conversations for a few years now and when it comes to URL design .

When you're building anything it's always good to admit that cleverer people than you or I (or even ) came before. In the case of the web gave us and HTTP is stateless. It's the whole beauty of the web: everyone, everywhere gets the same thing from the same place. The moment you pick a fight with this design you're probably gonna get beat.

Which is not to say that people haven't picked this fight. Many websites (including the BBC) use to preserve state across requests. So but when you make that choice you need to be aware that all your user activity will remain uncaptured by the web - no browsability, no google goodness, no benefit to your organisation (beyond the obvious) and no caching.

So, like I say I agree with the but I'd like to try to add a fifth: if possible don't reinvent other people's web identifiers. By web identifiers I mean those fragments of URLs that uniquely identify a resource within a domain. So in the case of the entry for The Fall () that'll be d5da1841-9bc8-4813-9f89-11098090148e.

The last time we updated the /music site we made this mistake (kind of unavoidable at the time). Even though we linked our data to MusicBrainz we minted new identifiers for artists. So The Fall became /music/artist/jb9x/ where jb9x was the identifier. But jb9x doesn't exist anywhere outside of /music. We'll (hopefully) .

When we first the big attraction was 2 fold:

  • stable web-scale identifiers
  • - no separate deals to reuse data in APIs etc

So when the next version of /music goes live you'll see: /music/artists/d5da1841-9bc8-4813-9f89-11098090148e and the world will hopefully be a slightly better place.
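As a minimal sketch of what "don't reinvent other people's web identifiers" can mean in practice, the snippet below pulls a UUID-shaped identifier out of someone else's URL and reuses it unchanged in a /music path. The source URL layout and the function names are assumptions made for illustration; only the identifier and the /music/artists/ path come from the post.

```python
import re
import uuid
from typing import Optional

# MusicBrainz identifiers are UUIDs written as 8-4-4-4-12 hexadecimal groups.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_identifier(url: str) -> Optional[str]:
    """Return the first UUID-shaped identifier found in a URL, lower-cased."""
    match = UUID_PATTERN.search(url)
    if match is None:
        return None
    # uuid.UUID validates the value and normalises it to lower case.
    return str(uuid.UUID(match.group(0)))


def music_artist_path(identifier: str) -> str:
    """Build a /music path that reuses the identifier instead of minting a new one."""
    return f"/music/artists/{identifier}"


# The source URL layout is an assumption; only the identifier comes from the post.
source_url = "http://example.org/artist/d5da1841-9bc8-4813-9f89-11098090148e.html"
identifier = extract_identifier(source_url)
if identifier is not None:
    print(music_artist_path(identifier))
```

A locally minted identifier such as jb9x has no UUID in it, so `extract_identifier("/music/artist/jb9x/")` returns `None`: nothing outside /music can join on it.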

Now I can already hear my old mentor saying:

Michael noooo! URIs are just identifiers for resources. They shouldn't reflect the taxonomy of the site. The resource should define its relationships to other resources not the URI. Call them anything you like but just keep them stable.

With which I also mostly agree but - if bbc.co.uk/programmes tagged content with the same vocabulary as we'd be able to cross promote news stories from programmes and programmes from news stories by sharing APIs not databases. Tie this into personalisation and the power goes logarithmic. Read six articles on reconstruction in Iraq? Then you might like this Panorama programme.

But if the vocabulary used to tag programmes and news was web-scale then , , etc (or someone in between) could start to aggregate stories around a shared sense of topic. This is what Chris' recent post on using wikipedia / dbpedia as a controlled vocabulary begins to hint at. It's like or except the terms returned are web native or web-scale identifiers if you will.

So what's the practical benefit: well because the new /music URLs will be based on MusicBrainz identifiers and because /music will be interlinked with /programmes and because the speaks in MusicBrainz identifiers can spend a weekend at making something that takes your Last.fm user name, extracts your favourite artists, ties them to /music and recommends BBC programmes. Which is a .
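The mash-up described above boils down to a join on a shared key. Here is a hedged sketch using entirely made-up data: the identifiers, programme names and function name are placeholders, and a real version would fetch both sides from the respective services rather than from literals.

```python
from typing import Dict, List, Set

# All data below is made up for illustration; the identifiers are placeholders,
# not real MusicBrainz or /programmes identifiers.
favourite_artists: List[str] = ["artist-id-1", "artist-id-3"]

programmes_by_artist: Dict[str, Set[str]] = {
    "artist-id-1": {"programme-a", "programme-b"},
    "artist-id-2": {"programme-c"},
    "artist-id-3": {"programme-b", "programme-d"},
}


def recommend_programmes(
    favourites: List[str], index: Dict[str, Set[str]]
) -> List[str]:
    """Rank programmes by how many of the listener's favourite artists they feature."""
    counts: Dict[str, int] = {}
    for artist_id in favourites:
        for programme in index.get(artist_id, set()):
            counts[programme] = counts.get(programme, 0) + 1
    # Most matches first; ties broken alphabetically for a stable order.
    return sorted(counts, key=lambda p: (-counts[p], p))


print(recommend_programmes(favourite_artists, programmes_by_artist))
```

Because both datasets speak the same identifiers, no mapping table or fuzzy name matching is needed between them; the join is a plain dictionary lookup.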

Taking another example for those who wish to stalk Tom Scott. His blog is at which is also his OpenID, you'll find his delicious account at , his tweets at and if you want to hire him he's at on LinkedIn. So derivadow is a web-scale identifier for Tom. It's not as strong or as powerful as a set of RDF linked URIs but if you wanna aggregate Tom-ness it's a pretty good starting point. Sadly I can't find him anywhere on Last.fm but that's possibly a godsend.

The obvious question is if web-scale identifiers are so good why did the BBC mint its own for programmes? After all the b00c4wxm used in /programmes and iPlayer is a BBC invention. And the answer is there were no suitable identifiers out there. I'd like to think that if Program(me)Brainz existed with stable identifiers we'd have put in the work to use those instead. But it didn't so we couldn't... But now we have stable identifiers out there on the web free to use for anyone. It would be good for example to see these identifiers adopted by . Time will tell.

One argument against all this is that web-scale identifiers are often kinda ugly. After all if Last.fm gets away with why do we need d5da1841-9bc8-4813-9f89-11098090148e. The answer is ambiguity. MusicBrainz has . Which one(s) does the BBC play? Probably none actually but you get the point. If we want to be exact in what we point to we need to handle ambiguity. In general we follow 3 commandments:

  1. URLs should be human readable
  2. URLs should be hackable
  3. URLs should persistently point to one concept

And the greatest of these is persistence. If you can't maintain stable URLs per concept don't even bother with 1 and 2. There are others that argue that . If resolving ambiguity is not important to your business then I'd agree but if you need to differentiate stuff with the same label you need unique identifiers - better yet web-scale identifiers.
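A small sketch of why labels alone can't carry commandment 3. The records are hypothetical (placeholder identifiers and names); the point is only that a lookup by label can return several things while a lookup by identifier returns exactly one.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical records: (identifier, label). The identifiers are placeholders.
records: List[Tuple[str, str]] = [
    ("id-0001", "Example Band"),
    ("id-0002", "Example Band"),
    ("id-0003", "Another Artist"),
]

by_label: Dict[str, List[str]] = defaultdict(list)
by_identifier: Dict[str, str] = {}
for identifier, label in records:
    by_label[label].append(identifier)
    by_identifier[identifier] = label


def is_ambiguous(label: str) -> bool:
    """A label is ambiguous when more than one identifier shares it."""
    return len(by_label.get(label, [])) > 1


for label in ("Example Band", "Another Artist"):
    print(label, "->", by_label[label], "ambiguous:", is_ambiguous(label))
```

A URL built from the label would have to point at two concepts at once for "Example Band", whereas a URL built from the identifier persistently points to one.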

Now I guess the people would say do this properly in with etc and we will do. But for hackers without PhDs the possibility of instant interoperability and quick mesh-ups is irresistible. Obviously you'll still need to establish equivalency between and but luckily that's where the people have done some of our work for us. And they're damn nice people to boot.

So I guess what I'm saying echoes Tom. Cleverer people than us have come up with ways to attach web-scale identifiers to content so why waste time reinventing. Whilst the BBC or *insert your organisation here* should own their data (whilst hopefully making it free - as in beer; as in speech) we don't have to own our identifiers. If we choose to use the power of web-scale identifiers we free our content to fly and . It's not exactly profound but it does feel like a small breakthrough to an .

Microformats and RDFa and RDF

Post categories:

Michael Smethurst | 10:13 UK time, Wednesday, 25 June 2008

Improving the Acronym Karma

My original post on removing microformats from /programmes seems to have kicked off . Unfortunately some of this seems to have resulted in RDFa people criticising microformats and vice versa. Which wasn't really the intention.

The post covered 3 things:

  • the decision by the BBC to ban the use of microformats which use non-human-readable data in the title attribute of the abbreviation element (most obviously the datetime abbreviation design pattern)
  • the impact of this on /programmes
  • the possibility of using on /programmes

so it's probably best to break these things apart.

Banning some uses of the abbreviation design pattern on bbc.co.uk

This is hopefully only a temporary ban until the microformats community come up with an alternative to the that doesn't break BBC accessibility standards. It doesn't mean that hCalendar is banned or even the abbreviation design pattern is banned per se. Just that we can't use it where the title attribute contains non-human-readable data. Note that hCalendar can be used without the abbreviation design pattern but none of the alternatives fit with our needs.

The impact on /programmes

I concentrated on /programmes because:

  • it's the project I work on
  • it's probably the bit of bbc.co.uk that makes most extensive use of microformats

Obviously there are other bits of bbc.co.uk that use microformats that would break the new accessibility standards but we were aware of people screen scraping the /programmes microformats in lieu of a full API so thought we'd best flag up what was happening.

RDFa

First it's probably important to note that interest in RDFa is pretty much an Audio and Music thing. I've spoken to other people in various bits of the BBC who've expressed an interest but so far the majority of discussions have been confined to Henry Wood House. So this next bit is with A&Mi hat firmly on.

A number of A&Mi projects are being developed in accordance with the principles of . For these sites we intend to provide at separate URLs. In the case of /programmes this has resulted in the development of the - an RDF vocabulary to describe programmes. We're following the same principles with the redevelopment of /music (where we'll be using the existing ). Where we're providing full RDF it makes sense (at least to us) to reuse these ontologies and also produce RDFa.

Other projects might be data driven but might not want to go down the full RDF route. In this case they might opt for RDFa or they might choose accessible microformats.

For more lightweight, possibly hand-coded projects (still the majority of bbc.co.uk) accessible microformats would probably be most suitable.

So in short it's easy to imagine a BBC website with a mixed economy of . It certainly shouldn't be an either/or. So mostly except that I'm not sure that the accessibility of the abbreviation design pattern is a bug so much as an expected result of . Anyway it's a problem that seems to have been around for a while now - hopefully it'll get sorted soon and we can all get back to using microformats (where appropriate) with a bit more peace of mind.

Removing Microformats from bbc.co.uk/programmes

Post categories:

Michael Smethurst | 10:48 UK time, Monday, 23 June 2008

Since /programmes first went live we've been working to ensure that programme data was accessible to people and machines alike. The API design was baked in at the application design stage. Similarly we've worked on adding to HTML pages as a lightweight API. All broadcasts use the microformat to add start times, end times, broadcast channels etc.

Unfortunately there have been a over hCalendar's use of the . This uses the HTML abbreviation element to add machine data to pages. Our concerns were:

  • the effect on blind users using screen readers with abbreviation expansion turned on where abbreviations designed for machines would be read out
  • the effect on partially sighted users using screen readers where tool tips of abbreviations designed for machines would be read out
  • the effect of incomprehensible tooltips on users with cognitive disabilities
  • the potential fencing off of abbreviations to domains that need them (travel - , finance - etc)
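To make the concern concrete, here is a rough sketch of a checker that flags abbreviation elements whose title attribute looks like machine data (an ISO 8601 date or datetime) rather than text a person would want read out. The sample markup, the class name `AbbrTitleChecker` and the regular expression are assumptions for illustration, not the BBC's actual standard or tooling.

```python
import re
from html.parser import HTMLParser
from typing import List

# Rough pattern for ISO 8601 dates and datetimes such as 2008-06-23T10:48:00+01:00.
MACHINE_DATE = re.compile(r"^\d{4}-?\d{2}-?\d{2}(T[\d:+\-.Z]+)?$")


class AbbrTitleChecker(HTMLParser):
    """Collect abbr titles that look like machine data rather than readable text."""

    def __init__(self) -> None:
        super().__init__()
        self.flagged: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "abbr":
            return
        title = dict(attrs).get("title") or ""
        if MACHINE_DATE.match(title.strip()):
            self.flagged.append(title)


# Illustrative markup in the style of the datetime abbreviation design pattern.
sample = (
    '<abbr class="dtstart" title="2008-06-23T10:48:00+01:00">10:48</abbr> '
    '<abbr title="British Broadcasting Corporation">BBC</abbr>'
)
checker = AbbrTitleChecker()
checker.feed(sample)
print(checker.flagged)
```

The first abbreviation is the problem case: a screen reader expanding it would read the timestamp aloud. The second is an ordinary, human-readable expansion and is left alone.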

Until these issues are resolved the BBC semantic markup standards have been updated to prevent the use of non-human-readable text in abbreviations. As I type the revised standard has not been published - I'll update this post with a link when that happens. Updated standard is here. For this reason we've taken the decision to remove the hCalendar microformat from /programmes until:

  • either the BBC accessibility group does further testing and declares the abbreviation design pattern to be safe to use
  • or the microformats community settles on an accessible alternative to the abbreviation design pattern. has already been started by .

hCalendar will be gone from /programmes by the next deploy (probably this Thursday).

In the meantime we'll be looking at the possible use of (a slightly bigger S technology similar to microformats but without some of the more unexpected side-effects).

Apologies to who's been using hCalendar to help with screen-scraping of /programmes. We know we've been for and the /programmes development team will be campaigning to bring this up the product backlog. In the meantime schedules are already available as json and xml. Leave a comment if there are specific views / formats you'd like to see next.

Probably best to note that this only affects microformats using the abbreviation design pattern. Any and microformats will remain (at least until/if we fully embrace RDFa). And probably also best to note that this is not a decision that has come down from on high by the . The /programmes team has been concerned about this issue for a few months now and it's good to get some clarity here.

Stay tuned to radiolabs and we'll keep you updated if / as things change.

Radio Labs at Mashed 08

Post categories:

Tristan Ferne | 16:43 UK time, Friday, 20 June 2008

is just starting at Alexandra Palace and Radio Labs is sending a crack team of developers who will be building stuff with BBC radio and music metadata. And we've also pulled together some new data for you to play with for this weekend...

XMPP Now Playing feed
Live now playing data for Radios 1, 2, 1Xtra and 6Music featuring track information, MusicBrainz artist IDs and programme identifiers for /programmes

Audio archive of BBC radio
Access to audio for the past month for the 10 national BBC radio stations with the ability to request any segment of this, accurate to the second.

Live BBC radio streams
MPEG over HTTP.

Artist playcount data
Data for how many times each radio station and DJ have played an artist. Maybe you can build a recommendations engine?

RDF for bbc.co.uk/programmes
For the semantic web fans amongst you - RDF data for brands, series, and episodes based on /programmes
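As a starting point for the recommendations idea floated under the artist playcount data, here is a minimal sketch that compares stations by the artists they play, using cosine similarity over play counts. All station names, artist names and counts are made up, and the real feed's format is not assumed here.

```python
import math
from typing import Dict

# Made-up play counts per station; station and artist names are placeholders.
playcounts: Dict[str, Dict[str, int]] = {
    "station-a": {"artist-1": 12, "artist-2": 3},
    "station-b": {"artist-1": 10, "artist-2": 4, "artist-3": 1},
    "station-c": {"artist-3": 9, "artist-4": 7},
}


def cosine_similarity(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine of the angle between two sparse play-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[key] * b[key] for key in shared)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(station: str, data: Dict[str, Dict[str, int]]) -> str:
    """Return the other station whose artist mix is closest to the given one."""
    others = [name for name in data if name != station]
    return max(others, key=lambda name: cosine_similarity(data[station], data[name]))


print(most_similar("station-a", playcounts))
```

The same function works per DJ instead of per station if the counts are keyed that way, and swapping rows for columns would give artist-to-artist similarity instead.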

Hopefully will be talking briefly about these feeds at the end of /programmes talk around 11am, so grab him if you want to know more.

All the data will be found here shortly:

Also featuring our data from last year's Hackday (well, what's still available) including Top of the Pops historical data, John Peel data, now playing RSS and LiveText data.

Enjoy.

Links for 19-06-2008

Post categories:

Tristan Ferne | 12:36 UK time, Thursday, 19 June 2008

No radio links this week, just three interesting articles on open-source hardware, serendipitous recommendations and how to unleash your creativity.


The Economist gives an introduction to open-source hardware featuring devices from openmoko, Chumby, Bug Labs et al.


A good post about how typical recommendation systems on the web aren't about recommendation, but about prediction. They tend to do a good job of making consistent, safe recommendations but should be more open and serendipitous.


Capturing, challenging, broadening, surrounding and "walk out the door for 20 minutes or so and see what happens to your thinking".

More at

Wikipedia + Lucene's MoreLikeThis = useful bits about the bits?

Post categories:

Chris Sizemore | 14:49 UK time, Friday, 13 June 2008


'bits about the bits' -- those bits that describe the narrative...

My colleague Michael recently posted about Nicholas Negroponte's prescient 1995 musings into the info glut challenges traditional TV and radio broadcasters are now feeling as a result of going digital.

: "...we need those bits that describe the narrative with key words... these will be inserted by humans aided by machines... the[se] bits about the bits change broadcasting totally... they give [audiences] a handle by which to grab what interests [them], and [they] provide the [broadcaster] with a means to ship [its programmes] into any nook or cranny that wants them..."

I've been working for some years now on methods of providing audiences with access to BBC Radio and TV programmes based on genre, topic, and subject. In other words, I, and many of my colleagues, have been concentrating on the "bits about the bits" part of the chain.

Recently, I managed to hack a promising little "bits about the bits" prototype together, something that attempts to address in particular Negroponte's notion of "...bits that describe the narrative with key words..." My approach begins by treating and its articles as a or . Yes, for these purposes, suspend disbelief and assume Wikipedia is useful fodder for -- whether or not it's a trustworthy or authoritative journalistic resource is an interesting debate, but isn't relevant for the job we want to do here.

My proof-of-concept is based on vacuuming every Wikipedia article into the to build a . It's possible you may find this approach useful in your own "bits about the bits" endeavours.
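To give a flavour of the idea without a search engine, the sketch below is not Lucene's MoreLikeThis but a hand-rolled TF-IDF overlap score over a tiny made-up corpus standing in for Wikipedia. Article titles play the role of controlled-vocabulary terms: feed in a programme description and get back the titles of the most similar articles. The corpus, the blurb and all function names are assumptions for illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# Tiny made-up corpus standing in for Wikipedia articles (title -> text).
articles: Dict[str, str] = {
    "Jazz": "jazz music improvisation saxophone trumpet swing",
    "Football": "football match goal league team striker",
    "Opera": "opera music singing orchestra aria soprano",
}


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def idf_table(corpus: Dict[str, str]) -> Dict[str, float]:
    """Inverse document frequency: terms found in fewer articles weigh more."""
    document_frequency: Counter = Counter()
    for text in corpus.values():
        document_frequency.update(set(tokenize(text)))
    total = len(corpus)
    return {term: math.log(total / count) + 1.0 for term, count in document_frequency.items()}


def suggest_titles(query: str, corpus: Dict[str, str], top_n: int = 2) -> List[str]:
    """Return the titles of the articles sharing the most heavily weighted terms with the query."""
    idf = idf_table(corpus)
    query_terms = Counter(tokenize(query))
    scores: Dict[str, float] = {}
    for title, text in corpus.items():
        article_terms = Counter(tokenize(text))
        score = sum(
            query_terms[term] * article_terms[term] * idf[term] ** 2
            for term in query_terms
            if term in idf
        )
        if score > 0:
            scores[title] = score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda title: (-scores[title], title))[:top_n]


programme_blurb = "A live orchestra and a soprano perform arias from a famous opera"
print(suggest_titles(programme_blurb, articles))
```

The returned titles double as identifiers: each one names a Wikipedia article, so the "keywords" are terms other people on the web already use.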


Links for 06-06-2008

Post categories:

Tristan Ferne | 16:24 UK time, Friday, 6 June 2008


A report from the Radio Advertising Bureau suggesting that young "Digital natives" still like the radio


Some radio-based art.


A visualisation for your mobile phone that constantly evolves, showing recent call and text activity in a spiral of brightly coloured circles.


A good introduction to transmedia storytelling and the geek producers who are leading it.


"The Media Futures Conference is a one day exploration of the dynamics and trends shaping the future of media" - at Alexandra Palace, with on the following day.

And finally, some RDF and ontologies for you...


For all your harmonic descriptive needs.


For events and happenings.

Thinking Digital

Post categories:

Tristan Ferne | 09:26 UK time, Thursday, 5 June 2008

Guy Strelitz, a Technical Project Manager in our team, went to the ThinkingDigital conference in Newcastle. This is his report...

It's already a couple of weeks since I was at the conference in Newcastle, run by , for the Regional Development Agency for the NorthEast. The three-and-a-half-day event was crammed with content - even a précis of the whole thing takes too many pages of A4 and no-one wants to read that. So I thought I'd present some highlights from Day 1.


The Future of Media
The conference ran on the basis of several guests in a session each speaking very broadly to the same theme. The Future of Media brought us Matt Locke, formerly of the BBC, now Commissioning Editor at Channel 4 Education, Eric Lindstrom and Steve Jelley, partners in , an agency specialising in "online video entertainment and community websites for brand owners and video content producers" and Jeremy Silver, General Manager of Avid Education.

Matt spoke about following audience trends for educational programming at Channel 4. Channel 4 Education produces informal educational content for 14-19-year-olds, traditionally daytime TV during the school term. Hardly an ideal slot for the demographic, so they've diverted their £6-million budget from broadcast to online video. He gave an analysis of 6 different types of online social space, each fostering different types of interaction (Secret, Group, Publishing, Performing, Participation and Passive crowd), and the need to create the right type of space if you're inviting contribution from neurotic teens.

Eric and Steve spoke on the difference between online video and previous channels. 2 key lessons: 1. Defend your brand. If you want your site to appear at the top of Google's pagerank, design a kick-arse hub for original content, not an aggregator. People only go to aggregators when they don't know what they want. 2. Tell new types of story with video in the new medium. Just as television allows long-form drama with dramatically greater intricacy than cinema (think Lost vs Memento), so web-based video enables new story-telling paradigms. Eric is keen on saying that now you can tell Dickens as it was written, in bite-sized serial form.

And Jeremy Silver spoke on how digital is failing to kill music, just changing the balance of power in the industry - a process it's undergone before in previous technological shifts. He foresees "an amazing flowering at hand" in the industry...but wisely declines to predict what form it will take!

United We Stand
The wide variety of speakers started to become apparent with Darren Thwaites, Editor of Teesside's Evening Gazette newspaper, Ian Kennedy, Cisco's Head of Technical Operations, EMEA, and Tara Hunt, online community maven.

Darren Thwaites spoke compellingly about hyper-local journalism. An old-media print journal, the Gazette have trained volunteer 'citizen journalists' on a per-postcode basis to produce 20 extremely local online editions, composed entirely of UGC without pre-moderation. It's been a success to the point that it's spawned new print editions and fed features back to the parent paper.

Ian Kennedy spoke in fairly broad terms about research on collaboration technologies - essentially telepresence tools - developed in part with a view to fostering ground-up innovation. It included a video demo of using Second Life as a meeting space, but the highlight was undoubtedly footage of an apparent .

Finally Tara Hunt spoke passionately about the flowering of the BarCamp concept since its inception in 2005. The community's now expanding into co-working spaces in several cities around the world - geek-friendly venues where you can just show up, plug in your laptop and connect to the network. Key quote-oid *: "I don't make money from it, but I make money because of it at events such as this."

* words to this effect anyway, and in another session later in the day

The Singularity
Ray Kurzweil signally failed to talk about . Instead he blew us all out of the water with a talk on miniaturisation of IT and its implications for human longevity. The first thing about Kurzweil's talk - he appeared long-distance using a . Not as impressive as the Cisco technology seen earlier, it was nonetheless an appropriate meeting of medium and message. Secondly, backed up by copious historical data, he made a compelling case that the spatial density of processing power has increased over decades at an exponential rate, technological limitations be damned. It shows no sign of slowing down - he predicts for instance that we'll be using 3D chips several years before the current 2D paradigm is exhausted. He certainly had our rapt attention with the implication that within decades we will have computers small enough to run in our bloodstream, increasing our intelligence and reversing aging. Apparently there are several existing research groups working on this agenda...


And sessions on the history of Multimap from startup to Microsoft purchase, the caustic humour of and his purchase by the company he was rebelling against in the first place.

All the video is available from the .


Copyright © 2015 BBC. The BBC is not responsible for the content of external sites.