More traffic, more videos, more screens: building the BBC's Olympic site
BBC Olympics architecture overview. It shows how many components are involved!
Hi, I'm Matthew Clark, the Senior Technical Architect for BBC Online's Olympic website and apps.
Alongside colleagues Mike Brown and David Holroyd, it's been my responsibility to create the technical strategy that has allowed us to produce successful online Olympic products.
We've focused on the design and development to make sure the site and apps stay reliable and can handle high traffic loads, whilst offering more content than ever before. In this technical blog post I'll be looking at some of the challenges we faced and how we overcame them.
More traffic: Handling unprecedented audience levels
We expected the Olympics would drive far more traffic to our site than ever before, and it did. Planning for this load was not easy. There are over 60,000 dynamically generated pages, many with a significant amount of content on them, so efficient page generation is vital.
Content needs to be as 'live' as possible, so long-term caching is not an option. We use a range of caches (including Content Delivery Networks) to offload the bulk of the traffic from our Apache web servers. For dynamic content, cache lifespan (max-age) varies between a few seconds and a few minutes, depending on the context. This is particularly true for the new video player, which needs the latest data every few seconds to complement the live video stream.
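As a rough illustration of context-dependent cache lifespans, here is a minimal Python sketch that picks a max-age per type of content. The content classes, the lifetimes, and the function name are all hypothetical; the post only says that max-age ranges from a few seconds to a few minutes.

```python
# Hypothetical content classes and lifetimes (assumptions, not BBC values).
MAX_AGE_SECONDS = {
    "live_video_data": 5,     # data shown alongside a live stream
    "live_results": 30,       # scores that change during an event
    "athlete_profile": 300,   # pages that change rarely
}
DEFAULT_MAX_AGE = 60  # assumed fallback for anything unclassified


def cache_control_header(content_class: str) -> str:
    """Return a Cache-Control header value for the given content class."""
    max_age = MAX_AGE_SECONDS.get(content_class, DEFAULT_MAX_AGE)
    return f"public, max-age={max_age}"
```

A cache that honours this header will serve its stored copy for at most that many seconds before going back to the origin, which is what keeps short-lived content fresh while still absorbing most requests.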
Page generation is done using PHP, which receives all of its data through calls to an API. This API is the Java application layer, which retrieves its data from a range of data stores (including content stores). It's the most critical layer from a performance point of view: load is high, and calls can involve significant processing, potentially multiple data store calls, and limits on what can be parallelised. Caching (including mod_cache) is again used to address the bulk of the traffic.
From data stores to screens, sites and apps
We spent considerable time modelling how traffic creates load on the whole stack. This was first done theoretically (through modelling of user behaviour, load balancing, and caching). We then did it for real - we used data centres around the world to place load equivalent to over a million concurrent users on the site, to confirm everything worked during busy Games load.
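The theoretical side of this modelling can be sketched as simple arithmetic. The Python functions below are an assumed back-of-envelope model, not the one we actually used: the request interval, hit ratio, and number of cache nodes are illustrative, while the one million users and 60,000 pages come from the figures in this post.

```python
def edge_requests_per_second(concurrent_users: int,
                             seconds_between_requests: float) -> float:
    """Requests per second arriving at the outermost caches."""
    return concurrent_users / seconds_between_requests


def origin_requests_per_second(edge_rps: float, cache_hit_ratio: float) -> float:
    """Requests per second that miss the caches and reach the next tier."""
    return edge_rps * (1.0 - cache_hit_ratio)


def origin_upper_bound(distinct_urls: int, cache_nodes: int,
                       max_age_seconds: float) -> float:
    """Worst case: every cache node refetches every URL once per max-age."""
    return distinct_urls * cache_nodes / max_age_seconds


# Illustrative inputs only (the interval, hit ratio, and node count are assumptions).
edge = edge_requests_per_second(1_000_000, 10.0)
origin = origin_requests_per_second(edge, 0.95)
bound = origin_upper_bound(60_000, 10, 30.0)
```

With these made-up inputs, a million users requesting something every ten seconds produce 100,000 requests per second at the edge, of which about 5,000 per second would reach the next tier at a 95% hit ratio. A model like this only gives a starting estimate, which is why the real load test mattered.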
More video: handling 2,500 hours of coverage
A moment when all 24 streams were running at the same time. Can you identify what they are?
There's been plenty of discussion about the BBC's 24 video streams, and the challenges of creating them at the International Broadcast Centre by the Olympic Park. This is the equivalent of 24 new channels, offered over cable and satellite as well as via IP.
Once the channels are created, the challenge is to direct viewers to the right content at the right times.
Sport schedules have a habit of changing - extra time, delays due to rain, etc - and the sites and apps need to show this. When an event starts, the tools used by (human) controllers to control the video also log the start in our XML content store. This metadata is then picked up by pages and apps (via an API) so that, within a minute of the event starting, there are links to the content throughout the site, app, and red button. A similar process happens when coverage finishes. Olympic sessions can be as short as 45 minutes, so the faster a video stream can be made available, the better.
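To make the idea concrete, here is a small Python sketch of how a page or app might decide which coverage to link to, based on logged start and end times. The record shape and field names are hypothetical; the real metadata lives in the content store and is read through the API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Coverage:
    """Hypothetical record of one piece of video coverage."""
    event_id: str
    stream: int                            # which of the 24 streams carries it
    started_at: Optional[datetime] = None  # logged when coverage starts
    ended_at: Optional[datetime] = None    # logged when coverage finishes


def live_coverage(records: List[Coverage], now: datetime) -> List[Coverage]:
    """Return the coverage that has started and has not yet finished."""
    return [
        r for r in records
        if r.started_at is not None
        and r.started_at <= now
        and (r.ended_at is None or now < r.ended_at)
    ]
```

Because the decision depends only on the logged times rather than the published schedule, a session that overruns or starts late is still linked correctly.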
More screens: coverage on mobiles, tablets, computers and internet connected TV
The BBC has a four screen strategy where we develop for PCs, tablets, mobiles, and internet connected TV. For the Games we've offered an unprecedented amount of content to all four. In addition, there are Olympic apps for iOS and Android smartphones, a Facebook app, foreign language content for World Service sites, and a red button service for satellite and cable TVs.
Our architecture is the classic multi-tier approach - pushing as much logic as possible into shared components, so that the amount of development for each interface is as low as possible. This sharing applies across platforms. For web pages, a single PHP codebase creates both the desktop and mobile versions. This includes the iOS and Android apps, which 'wrap' the mobile website for most of their functionality. This has saved us from having to rewrite functionality in native code. Certain other applications, such as the Olympic Facebook app, are different enough to warrant their own codebase, but still make the same API calls to the Java application layer, where most of the 'business' logic is held.
More content: Data is power
Video aside, there is a wealth of data required to make the Olympic site. The primary source is the Olympic Data Service, which blog posts from Oli Bartlett and Dave Rogers have already covered in depth. In brief, Olympic Broadcasting Services (OBS) provide a comprehensive data feed that covers all sports and includes everything from latest scores to medal tables. This, combined with stories from journalists, and other sources such as Twitter, creates the content for tens of thousands of results, athlete, country, and event pages.
The Dynamic Semantic Publishing (DSP) model, which understands relationships (triples) between all content and concepts, is the process that ensures everything automatically appears in the right place. All created content, including stories, medals, and world records, is tagged (normally automatically) with the appropriate athletes, sports and countries. This causes the content to appear on the appropriate page without human intervention.
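A toy version of this idea in Python is shown below. It is an in-memory stand-in rather than the DSP implementation, and the identifiers, predicate names, and data are all invented for illustration. It shows how tagging a story with an athlete is enough for the story to appear on that athlete's country and sport pages.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Invented example data; real identifiers and predicates will differ.
TRIPLES: Set[Triple] = {
    ("story:1", "about", "athlete:a"),
    ("medal:9", "awardedTo", "athlete:a"),
    ("athlete:a", "competesIn", "sport:cycling"),
    ("athlete:a", "represents", "country:gb"),
    ("story:2", "about", "sport:rowing"),
}

CONTENT_PREDICATES = {"about", "awardedTo"}   # content tagged with a concept
LINK_PREDICATES = {"competesIn", "represents"}  # athlete linked to a concept


def content_for_page(concept: str, triples: Set[Triple]) -> List[str]:
    """Content tagged with the concept, or with an athlete linked to it."""
    direct = {s for (s, p, o) in triples
              if p in CONTENT_PREDICATES and o == concept}
    athletes = {s for (s, p, o) in triples
                if p in LINK_PREDICATES and o == concept}
    indirect = {s for (s, p, o) in triples
                if p in CONTENT_PREDICATES and o in athletes}
    return sorted(direct | indirect)
```

Here `content_for_page("country:gb", TRIPLES)` picks up both the story and the medal, even though neither was tagged with the country directly.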
In essence, it's this automatic curation of pages that has allowed us to offer such a broad range of product. The automation has kept maintenance to a minimum, freeing journalists to focus on writing content. It's allowed multiple products and thousands of pages to stay up-to-date without a large operational overhead.
More testing: Simulating an entire Olympics
With all this content, data, video, and technology, comes a huge engineering challenge: how do you test it? Every development area has no shortage of automated unit and component tests. But what happens when you plug everything together? How can you be sure that the right medals go to the right country, or that video works on all devices, or that results appear correctly for all 36 sports? Unlike, say, the football season, the Olympics only happen once every four years, and only last a couple of weeks.
There are no second chances.
We needed to be 100% sure that on day one of the Games, everything would work as expected.
To tackle this we set up an entire team, as big as any development team, with the job of proving everything would work when the Games started. We took a three-pronged approach:
- We used as much of the Olympic technology as we could for other sporting events, such as F1 and Wimbledon. This was often offered as a beta service - and we're grateful to all who gave these a go.
- We also sourced video and stats similar to those of the Games, and ran them on our staging environment so that they could be made to look like real Olympic events without appearing on our normal site.
- Most importantly, we created fake video and data that let us simulate the Olympics. We picked the most interesting moments of the Games (the opening ceremony, the first big day, etc) and created all the inputs as they would be at that moment. This was run in our staging environment and allowed us to see all sites and apps behave as they would when the Olympics were underway. (You can read more about how we made the simulated data in the post by my colleague Dave Rogers.)
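The third approach, replaying prepared inputs as they would be at a chosen moment, can be sketched as follows. This is a simplified Python illustration with invented names; it is not the tooling we built, which Dave Rogers' post describes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class SimulatedInput:
    """Hypothetical prepared input, e.g. one data feed message."""
    offset_seconds: int  # seconds since the start of the simulated session
    payload: str


def replay(inputs: Iterable[SimulatedInput],
           until_seconds: int,
           deliver: Callable[[str], None]) -> int:
    """Deliver inputs in time order up to the chosen moment; return the count."""
    delivered = 0
    for item in sorted(inputs, key=lambda i: i.offset_seconds):
        if item.offset_seconds > until_seconds:
            break  # everything later has not "happened" yet
        deliver(item.payload)
        delivered += 1
    return delivered
```

Pointing `deliver` at a staging system and choosing `until_seconds` lets the same prepared session be stopped at any interesting moment and inspected across sites and apps.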
This testing process lasted several months and caught a significant number of bugs and performance problems. Fortunately it paid off - I was a little nervous on the first day of the Games, but it passed without incident.
More for the future
With the end of the Games fast approaching, attention now turns to other areas of the Sport website.
Some features have already been applied throughout - for example, most live video coverage will be in HD from now on. Other services that we've offered for Olympics aren't yet ready for use elsewhere (mobile apps and video chapter points being two). Over the coming months we'll be working on bringing many of these features to the rest of Sport, and perhaps other parts of BBC Online too.
If you've any questions about the technology we've used during the Olympics, please get in touch using the comments below.
Matthew Clark is Senior Technical Architect, Knowledge & Learning, BBC Future Media
Comment number 1.
At 16th Aug 2012, icogill wrote: any chance of publishing a larger version of the architecture diagram?
Comment number 2.
At 17th Aug 2012, thepoettrap wrote: Stuff worked well done but where was the dynamic interaction with athlete form? The related athlete social media pipes as pre-race discussion points? The graphical representation of outside stadia events? The effect of real-time results on their historic pool across every sport? Quality interactive information on screen and web? Data innovation?
Comment number 3.
At 20th Aug 2012, vladtn wrote: also interested by a larger version of the architecture diagram
Comment number 4.
At 21st Aug 2012, Matthew Clark wrote: @icogill, @vladtn: Of course - you can find a full-size diagram here (PDF):
/blogs/bbcinternet/2012/08/21/2012_Architecture_Overview_Diagram_20120727.pdf
@thepoettrap: They are great ideas, and we had many of them on our list too. Unfortunately we can't do everything - there's always a limit of time and resource.
Comment number 5.
At 22nd Aug 2012, stuartwd wrote: I was very impressed with the user experience of the BBC Olympics site so well done there.
I realise that it would have been a tough thing to organise, and not just technically, but one thing that would have enhanced my experience is subtitles.
Stuart
Comment number 6.
At 6th Sep 2012, girishmuraly wrote: The overall architecture bears semblance to the BBC Weather architecture /blogs/bbcinternet/2011/12/bbc_weather_technical_architec_1.html
Comment number 7.
At 12th Nov 2012, Horation T Burns wrote: To create this type of architecture for the BBC London 2012 Olympic TV programming would require a lot of software testing, in terms of checking that all the inputs are sync with the programs times and the outputs are showing the correct sport.
Horatio T Burns
[Unsuitable/Broken URL removed by Moderator]