Search Engine Optimisation: Rebuilding Food
The 'apples' page on BBC Food
Hi, I'm Oli Bartlett and I was the product manager for BBC Food during the rebuild in 2009-10. This post is a follow-on to Duncan's SEO post to provide a little more context and detail on how we tried to maintain our audience reach during the re-launch.
At the BBC we often see temporary drops in audience reach after a major reworking of a website. When a website is overhauled so thoroughly that its structure and page URLs change, one major factor in this drop in traffic is the removal of the old URLs from the site.
Put simply, if you remove the pages and those pages were getting views, then you no longer get the views.
However, links to those pages continue to exist all over the web and, most importantly for us, in search engine indexes.
Once search engines discover that their indexed URLs are no longer valid (i.e. they receive a 404), they will remove those pages from their indexes. To maintain the traffic from search engines it's important to put in place a good HTTP response strategy for those URLs. For example, where content has been moved rather than deleted, use a 301 to redirect to the new location.
bbc.co.uk/food
Part of the problem with the old BBC Food website was that there was too much content duplicated in different forms across the website - for example we often had two or more pages displaying the same recipe - which is really bad for users and SEO.
Additionally, a lot of the content was due a refresh in the context of the new product goals - finding recipes and food from your favourite BBC programmes. This led to the decision to cull around 2000 pages from the old website - these included recipes whose rights had expired, duplicate recipes, and articles and other content which simply didn't fit with the objectives for the new product.
Three Kinds of Deleted Page
Each deleted page, or group of deleted pages, required a different approach to HTTP responses:
- Expired recipes: 410. We present a message explaining the situation regarding rights to BBC recipes, and giving links to similar recipes (where recipe rights have expired we still know the detail of the original recipe so can link to similar recipes - i.e. for the same dish, by the same chef, using the same ingredients etc.).
- Duplicate recipes: 301. One of the duplicate recipes was kept; the other was deleted from the system and a 301 redirect put in place from the deleted recipe to the remaining canonical one.
- Consolidated articles: We created 'food' pages (e.g. bbc.co.uk/food/apple) which acted as canonical resources containing the typical editorial content found in our old food articles (i.e. how to prepare, choose, store etc.). Each deleted article was 301 redirected to the most relevant food page, and in the case of articles about diets, occasions, cuisines etc. we had appropriate canonical pages for each.
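The three cases above amount to a lookup from an old URL to an HTTP response. Here is a minimal Python sketch of that idea, assuming expired recipes answer with 410 Gone and everything not covered by the audit falls through to 404. The paths, the two tables and the `respond` helper are hypothetical illustrations, not the code that ran on the site.

```python
from http import HTTPStatus

# Hypothetical audit results: old paths mapped to their new canonical location.
REDIRECTS = {
    "/food/recipes/old/apple_pie_2": "/food/recipes/apple_pie",  # duplicate recipe
    "/food/articles/apples_guide": "/food/apple",                # consolidated article
}

# Hypothetical set of old recipe paths whose rights have expired.
EXPIRED = {"/food/recipes/old/expired_cake"}


def respond(path):
    """Return (status, redirect_target) for a request to an old URL."""
    if path in REDIRECTS:
        # Content moved: permanent redirect to the canonical page.
        return HTTPStatus.MOVED_PERMANENTLY, REDIRECTS[path]
    if path in EXPIRED:
        # Content consciously removed: say so explicitly.
        return HTTPStatus.GONE, None
    # Anything not covered by the audit: default "not found".
    return HTTPStatus.NOT_FOUND, None
```

In a real application the 410 branch would also render the explanatory message and the links to similar recipes; the sketch only shows the status decision.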
Sometimes, 404 is the right answer
We tried to minimise the number of URLs that returned a 404, but inevitably there were some which were removed and had no suitable alternative.
In this case it was considered to be better to return a 404 than to redirect to the food homepage.
Simply redirecting all removed pages to the homepage breaks the web. For example, if someone has posted a link to a page that subsequently gets removed, by putting a redirect to the homepage you give the impression (to users and search bots) that the post was about the BBC Food homepage.
Additionally, if a recipe search result links you to the BBC Food homepage, that's not helpful and you're less likely to click on a BBC link next time. We'd prefer those links to be removed from search engine indexes so people don't have that experience.
For the few weeks following the relaunch of BBC Food we were getting significant numbers of 404/410s reported on the site, but these were expected.
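One way to hold a redirect table to this rule is to scan it for entries that point at nothing more specific than the homepage. The following is a rough sketch only: the `HOMEPAGE_PATHS` set and the `homepage_redirects` helper are assumptions made for illustration, not part of the actual BBC Food tooling.

```python
from urllib.parse import urlsplit

# Assumed paths that count as "the homepage" for this check.
HOMEPAGE_PATHS = {"/food", "/food/"}


def homepage_redirects(redirects):
    """Return the old URLs whose redirect target is just the homepage.

    `redirects` maps old URL -> new URL (absolute or path-only).
    """
    return sorted(
        old for old, new in redirects.items()
        if urlsplit(new).path in HOMEPAGE_PATHS
    )
```

Any URL this flags is a candidate for a 404 or 410 instead of a redirect.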
As the invalid page links were removed from search indexes, these errors tailed off very quickly.
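A tail-off like this can be watched by counting 404 and 410 responses per day. The sketch below assumes the server logs have already been parsed into (day, status) pairs; the `missing_per_day` helper and that input shape are hypothetical.

```python
from collections import Counter


def missing_per_day(records):
    """Count 404 and 410 responses per day.

    `records` is an iterable of (day, status) pairs, where `day` is any
    hashable label and `status` is an integer HTTP status code.
    """
    counts = Counter()
    for day, status in records:
        if status in (404, 410):
            counts[day] += 1
    return dict(counts)
```

Plotting these counts over the weeks after a launch would show whether the errors are falling away as expected.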
The new pages were soon indexed and after a brief dip, our audience figures were back and rising healthily. We didn't completely avoid the post-launch dip, but it was predictable and reversible and so much easier to stomach.
Oliver Bartlett is Product Manager, Olympic Data, 2012
Comment number 1.
At 28th Aug 2012, Frankie Roberto wrote: What reason is there for returning a 404 (Not Found) rather than a 410 (Gone) response for deleted pages? Surely the 410 communicates more (there was a page here but it's gone) than the generic 404 (there might never have been a page here).
Comment number 2.
At 28th Aug 2012, Oliver Bartlett wrote: Hi Frankie, good question! An audit of the most visited areas of the site allowed us to configure the new dynamic application to return a 410 or 301 for pages being replaced or consciously removed. However, we turned off the entire old site after the re-launch, and inevitably there were many pages removed as part of this that hadn't been covered in the audit. These now return 404 (the default for non-existent URLs). You're right, a 410 is a more appropriate response for all removed pages, but in practical terms we couldn't explicitly capture all of the removed URLs.