<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet title="XSL_formatting" type="text/xsl" href="/blogs/shared/nolsol.xsl"?>

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>

<title>
BBC Internet Blog
 - 
Geoffrey Goodwin
</title>
<link>https://bbcstreaming.pages.dev/blogs/bbcinternet/</link>
<description>Staff from the BBC&apos;s online and technology teams talk about BBC Online, BBC iPlayer, and the BBC&apos;s digital and mobile services. The blog is reactively moderated. Posts are normally closed for comment after three months. Your host is Eliza Kessler. </description>
<language>en</language>
<copyright>Copyright 2012</copyright>
<lastBuildDate>Fri, 06 Jun 2008 18:16:02 +0000</lastBuildDate>
<generator>http://www.sixapart.com/movabletype/?v=4.33-en</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs> 


<item>
	<title>Sound Index Algorithm</title>
	<description><![CDATA[<p>Thanks for your <a href="https://bbcstreaming.pages.dev/blogs/bbcinternet/2008/07/sound_index_data.html#comment1">comments</a> on <a href="https://bbcstreaming.pages.dev/blogs/bbcinternet/2008/06/sound_index_data.html">Beth's previous post</a> about the <a href="https://bbcstreaming.pages.dev/soundindex/">Sound Index</a>. </p>

<p>The Sound Index is not meant to be a definitive chart (like a <a href="https://bbcstreaming.pages.dev/radio1/chart/top40.shtml">sales chart</a>). Rather, it's a gauge of who and what is currently driving conversation and interaction about music online. It's a great tool for music discovery, and to find out <a href="https://bbcstreaming.pages.dev/soundindex/soundindex/?type=artist">who's currently hot</a> in the music world of teenagers. However, we have taken steps to ensure that our data collection is as accurate as possible, and have implemented an <a href="http://en.wikipedia.org/wiki/Algorithm">algorithm</a> to help us create the most editorially relevant and robust Index. </p>

<p><a href="https://bbcstreaming.pages.dev/soundindex/"><img alt="sound_index_about.png" src="https://bbcstreaming.pages.dev/blogs/bbcinternet/img/sound_index_about.png" width="430" height="177"></a></p>

<p>The Sound Index is currently in a four month <a href="http://en.wikipedia.org/wiki/Software_release_life_cycle#Beta">beta</a> stage, so that (among other things) the technical, editorial and cost implications of various algorithm options can be assessed. </p>

<p>After viewing an Index based on the raw data, we felt an algorithm was needed to allow all the sources to contribute to the Index, and for all forms of activity on the internet - plays, comments and downloads - to affect the rankings. Without an algorithm, the large volumes of the more dominant forms of interaction - mainly plays and downloads - drowned out the smaller numbers of comments, which we felt were important to reflect in the Index. </p>

<p>Therefore, my team has developed the following algorithm, which I feel gives an editorially relevant and justified chart, without any bespoke manipulation or input, meaning that the Sound Index can be viewed as an accurate gauge of online buzz. </p>

<p>For each type of interaction (play, comment, download), all the data for each artist on each individual site is added up. Each artist is then given a score depending on how popular they are on that site. This score is directly related to how many artists are on that site. For example, if there were 200 artists from <a href="http://www.myspace.com/">MySpace</a>, the Number One artist (with the most counts) would have a score of 200, whereas the artist with the fewest counts would have a score of 1. </p>
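
<p>As a rough illustration of the per-site scoring described above, here is a minimal Python sketch. The function name <code>rank_scores</code>, the artist names and the counts are all made up for the example, and breaking ties alphabetically is an assumption, since tie handling is not covered here. It is a sketch of the idea, not the production code.</p>

```python
def rank_scores(counts):
    """Map each artist to a rank score: N for the highest count, 1 for the lowest."""
    # Sort artists from fewest to most interactions. Ties are broken by
    # artist name, which is an assumption made only for this example.
    ordered = sorted(counts, key=lambda artist: (counts[artist], artist))
    # The position in the sorted list (starting at 1) is the score, so the
    # top artist on a site with N artists scores N.
    return {artist: position for position, artist in enumerate(ordered, start=1)}


# Made-up play counts for three artists on a single site.
site_plays = {"Artist A": 12000, "Artist B": 300, "Artist C": 4500}
scores = rank_scores(site_plays)
print(scores)
```

<p>With 200 artists on a site, the same function would give the most-played artist a score of 200 and the least-played a score of 1.</p>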

<p><a href="https://bbcstreaming.pages.dev/soundindex/"><img alt="sound_index_fresh.png" src="https://bbcstreaming.pages.dev/blogs/bbcinternet/img/sound_index_fresh.png" width="175" height="51"></a>We didn't want sites with massive amounts of only one type of data to totally dominate the Sound Index. So each type of data - play, download, comment - is limited to make up a set proportion of each artist's popularity. This is determined by how many different sources there are for each type of data. So, if there were ten sources in total, made up of five for plays, three for downloads and two for comments, we would multiply the ranks from each source in the following way: 5/10 for plays, 3/10 for downloads and 2/10 for comments. </p>
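
<p>The proportions in that example can be worked out mechanically. The sketch below is only an illustration: <code>type_weights</code> and the source names are hypothetical, and exact fractions are used simply so the arithmetic is easy to check.</p>

```python
from collections import Counter
from fractions import Fraction


def type_weights(source_types):
    """Return, for each interaction type, the share of sources reporting it."""
    totals = Counter(source_types.values())
    n_sources = len(source_types)
    return {kind: Fraction(count, n_sources) for kind, count in totals.items()}


# Ten hypothetical sources: five report plays, three downloads, two comments.
sources = {f"play_site_{i}": "play" for i in range(5)}
sources.update({f"download_site_{i}": "download" for i in range(3)})
sources.update({f"comment_site_{i}": "comment" for i in range(2)})

weights = type_weights(sources)
print(weights)
```

<p>Here the weights come out as 5/10 for plays, 3/10 for downloads and 2/10 for comments (printed in lowest terms), matching the example in the paragraph above.</p>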

<p>For each artist, the score from each site is multiplied by the fixed proportion for that type of activity, and the results are added together to give a total buzz score, which is used to create the Sound Index. The same method is applied separately for individual tracks. We have also built processes into our data collection to reduce the impact of gaming. Our partner <a href="http://www.almaden.ibm.com/cs/projects/iis/sound/">IBM</a> has implemented spam filtering, porn filtering, multiple post detection and <a href="http://musicbrainz.org/">MusicBrainz</a> verification to keep the data as clean as possible.</p>
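
<p>Putting the pieces together, the total buzz score might be computed along the following lines. This is one reading of the description above rather than the actual implementation: the function name <code>buzz_scores</code>, the site names and the rank values are invented for the example, and it assumes each source's rank is multiplied by the proportion for that source's type of activity.</p>

```python
from fractions import Fraction


def buzz_scores(ranks_by_source, source_types):
    """Sum each artist's per-source rank scores, weighted by interaction type."""
    n_sources = len(source_types)
    # Count how many sources report each type of activity.
    type_counts = {}
    for kind in source_types.values():
        type_counts[kind] = type_counts.get(kind, 0) + 1

    totals = {}
    for source, ranks in ranks_by_source.items():
        # Weight for a source = sources of its type / all sources.
        weight = Fraction(type_counts[source_types[source]], n_sources)
        for artist, rank in ranks.items():
            totals[artist] = totals.get(artist, 0) + weight * rank
    return totals


# Three hypothetical sources: two report plays, one reports comments.
source_types = {"site_1": "play", "site_2": "play", "site_3": "comment"}
# Made-up rank scores per source (2 = most popular of two artists).
ranks_by_source = {
    "site_1": {"Artist A": 2, "Artist B": 1},
    "site_2": {"Artist A": 1, "Artist B": 2},
    "site_3": {"Artist A": 1, "Artist B": 2},
}

totals = buzz_scores(ranks_by_source, source_types)
# Highest total first gives the chart order.
chart = sorted(totals, key=totals.get, reverse=True)
print(chart)
```

<p>In this toy example Artist B comes out ahead of Artist A, because B leads on one play source and on the comment source.</p>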

<p>The Sound Index is a project based on trialling new technology. I think that in its current form it has succeeded in offering an exciting way of discovering which bands and artists are creating the most buzz. We are not using it to define any charts or create any definitive lists. Anything editorial around the Sound Index should not use it as an absolute measure: it's a gauge of what is hot. It's a great example of innovation and collaboration with major music sites. We're still learning about what the Sound Index can do.</p>

<p>How would you like the Sound Index to develop? Please do leave a comment.</p>

<p><em>Geoffrey Goodwin is Head of <a href="https://bbcstreaming.pages.dev/switch/">BBC Switch</a>.</em></p>]]></description>
         <dc:creator>Geoffrey Goodwin</dc:creator>
	<link>https://bbcstreaming.pages.dev/blogs/bbcinternet/2008/06/sound_index_algorithm.html</link>
	<guid>https://bbcstreaming.pages.dev/blogs/bbcinternet/2008/06/sound_index_algorithm.html</guid>
	<category>Music</category>
	<pubDate>Fri, 06 Jun 2008 18:16:02 +0000</pubDate>
</item>


</channel>
</rss>

 
