6blog Part of 6bot

October Woe

We've been unable to add new track information for a few days because the BBC have changed the now-playing part of the 6 Music page that we scrape the data from. Normally we can deal with this pretty quickly, but they've changed it to a two-stage process:

  1. The standard web page loads as normal, but the now-playing data it contains can be up to 30 minutes out of date.
  2. The up-to-date now-playing data is then injected (probably by JavaScript) into the web page afterwards.

The scraping mechanism that we use ignores JavaScript, so we were left with out-of-date data and no way of accurately working out when each track was actually played.

So, we've now moved to a different mechanism entirely - we're scraping the now-playing data from the BBC6MusicBot Twitter feed, and then blending in any missing data by scraping the web page as before. Hopefully, this will be a more stable way of getting the data.
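
As a rough illustration of the blending idea (not 6bot's actual code), the step could look something like the Python sketch below. The `Play` record, the `blend` function and the 10-minute matching window are all assumptions made for the example: plays from the primary source are kept as they are, and a play from the secondary source is only added if the primary source has no matching artist and title at around the same time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Play:
    """One played track (hypothetical record type for this sketch)."""
    artist: str
    title: str
    played_at: datetime


def _key(play):
    # Compare artist and title case-insensitively, ignoring outer whitespace.
    return (play.artist.strip().casefold(), play.title.strip().casefold())


def blend(primary, secondary, window=timedelta(minutes=10)):
    """Return the primary plays plus any secondary plays they do not cover."""
    merged = list(primary)
    for candidate in secondary:
        already_present = any(
            _key(candidate) == _key(existing)
            and abs(candidate.played_at - existing.played_at) <= window
            for existing in merged
        )
        if not already_present:
            merged.append(candidate)
    # Present the combined list in playing order.
    return sorted(merged, key=lambda play: play.played_at)
```

The matching window is a judgment call: too small and the same play is stored twice, too large and a genuine repeat play could be dropped.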

The change may also have broken how we backfill any missing data; we're just looking into that now. EDIT: we've updated one of the backfill routines, so that should kick in and start working properly now.

Easter woe

The BBC appear to have retired the API that 6bot uses to populate its track information. The alternative page that we were using as a backup seems to have been retired too. They might just be down for Easter, but please bear with us whilst we work on a workaround.

Fixes, fixes, fixes

We've made a few changes over the last few days.

  • We've changed the search page to filter tracks by artist too. If you're looking for "Woman" by Karen O & Danger Mouse, then it'll no longer return John Lennon's track of the same name. If you still want to see all artists who've sung a track called "Woman", you can use the bespoke search feature and type it into the search box directly.

  • We've made the same change to the clickable track names on the search page, and home page. We've also made the same change to the clickable links on the top 10 charts on the home page, as well as filtering the chart track data by artist too. The 10 most popular tracks shown will no longer include other tracks of the same name by a different artist. As a result, you may have noticed that most of the top 10 track entries have recently changed.

  • An error meant that one of the top 10 track entries was blank. This has also been fixed.

  • The backfill process was adding tracks that were already in place, because the artist or track data was being imported incorrectly. For example, it was importing "Woman" by Karen O & Danger Mouse as just "Woman" by Danger Mouse, and was then failing to dedupe the tracks because it assumed that they were by different artists. This import issue has been fixed, and the erroneous duplicate tracks have been removed. As a result, you may have noticed a reduction in overall track volume in the last couple of days, as duplicate tracks have been deleted.

  • We've cached the summary data at the top of the home page, which has improved loading time for the home page.

  • We've retired the BETA version of the search page.

  • We've made a behind-the-scenes change to how we log the number of searches performed. This should be less resource intensive and more accurate.

  • We've made a change to ensure that we're consistent in how we store artist names and song titles that contain an ampersand (&).
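
To give a flavour of the ampersand and dedupe fixes above, here is a minimal Python sketch. It is not the code 6bot runs: `normalise_name` and `dedupe` are hypothetical names, and the guess that stray HTML entities such as `&amp;` and uneven spacing are what make names differ is an assumption for the example.

```python
import html
import re


def normalise_name(raw):
    """Return an artist or title string in one consistent form."""
    text = html.unescape(raw)               # "&amp;" becomes "&"
    text = re.sub(r"\s*&\s*", " & ", text)  # exactly one space either side of "&"
    return re.sub(r"\s+", " ", text).strip()


def dedupe(tracks):
    """Drop (artist, title) pairs that match an earlier pair once normalised."""
    seen = set()
    unique = []
    for artist, title in tracks:
        artist, title = normalise_name(artist), normalise_name(title)
        key = (artist.casefold(), title.casefold())
        if key not in seen:
            seen.add(key)
            unique.append((artist, title))
    return unique
```

Because the key includes the artist as well as the title, "Woman" by John Lennon and "Woman" by Karen O & Danger Mouse stay separate, while two spellings of the same artist collapse into one entry.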

Update:

End of the line for BBC find-a-track

It's been a few months now since the BBC website removed its find-a-track service. This has been a double-edged sword for 6bot. Whilst it increases the need to use 6bot, it has also removed the main source of the data 6bot used.

We've switched to an alternative source for the data: https://polling.bbc.co.uk/radio/nhppolling/bbc_6music?&callback=nhprealtimepolling
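
The `callback=nhprealtimepolling` parameter suggests that the response is JSONP, that is, JSON wrapped in a call to a function of that name. Assuming that is the case, a small Python sketch for unwrapping it might look like this. `parse_jsonp` is a hypothetical helper, and the shape of the JSON inside the wrapper is not covered here, so the sketch only removes the wrapper and decodes whatever is inside.

```python
import json
import re


def parse_jsonp(body, callback="nhprealtimepolling"):
    """Strip a JSONP wrapper such as callback({...}); and decode the JSON."""
    pattern = rf"\s*{re.escape(callback)}\s*\((.*)\)\s*;?\s*"
    match = re.fullmatch(pattern, body, flags=re.DOTALL)
    if match is None:
        raise ValueError("response is not wrapped in the expected callback")
    return json.loads(match.group(1))
```

Raising an error when the wrapper is missing makes a change in the response format show up straight away, rather than as quietly missing tracks.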

However, we've noticed that it can be a bit flaky: it sometimes can't make up its mind whether to send the play time as GMT or BST, and although it's meant to contain the past hour's worth of tracks, songs are sometimes missing.
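
One way to cope with the GMT/BST ambiguity, sketched in Python below, is to lean on the fact that the feed should only contain the past hour of tracks: try the timestamp as GMT first, then as BST, and keep whichever reading lands inside that hour. This is a heuristic for illustration only. The `to_utc` name, the two-minute `slack` for clock differences and the choice to prefer GMT when both readings fit are all assumptions.

```python
from datetime import datetime, timedelta, timezone

BST_OFFSET = timedelta(hours=1)  # British Summer Time is one hour ahead of GMT


def to_utc(reported, fetched_at, window=timedelta(hours=1),
           slack=timedelta(minutes=2)):
    """Guess whether a naive play time is GMT or BST and return it as UTC.

    reported   -- naive datetime taken from the feed
    fetched_at -- timezone-aware UTC datetime at which the feed was read
    Returns None when neither reading falls within the expected window.
    """
    as_gmt = reported.replace(tzinfo=timezone.utc)
    as_bst = as_gmt - BST_OFFSET
    earliest = fetched_at - window - slack
    latest = fetched_at + slack
    for candidate in (as_gmt, as_bst):
        if earliest <= candidate <= latest:
            return candidate
    return None
```

A play time that fits neither reading is returned as `None` rather than guessed, so it can be left for a backfill pass to sort out.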

As a result, the number of songs captured each day has reduced (for some days) after we switched to using this data.

Because of this, we've added more backfill routines to plug some of these gaps. The backfill process is automated, and will take a few days or weeks to insert the missing tracks.
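
As a simple illustration of how gaps might be spotted before backfilling, the Python sketch below lists every stretch between two consecutive captured plays that is longer than a chosen threshold. The `find_gaps` name and the 15-minute threshold are assumptions for the example, not a description of the real backfill routines.

```python
from datetime import datetime, timedelta


def find_gaps(play_times, max_gap=timedelta(minutes=15)):
    """Return (start, end) pairs where consecutive plays are too far apart."""
    ordered = sorted(play_times)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_gap
    ]
```

Each pair returned marks a period where tracks were probably missed and that a backfill routine could then try to fill from another source.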
