Matthew Wirtala

ECE/CS/ME 539 Fall 2005

Project Proposal

Sophomore Slumpware: Predicting Album Sales with an ANN

With the advent of file sharing, iPods, and other forms of digital media, the music industry has been in a state of flux for the past several years. Record sales have declined while CD prices have risen. Desperate, some labels have resorted to filing lawsuits against those caught sharing files. Furthermore, scandals involving pay-for-play on radio stations and the sale of CDs containing rootkits that compromise the security of customers’ computers have shown that the industry is highly resistant to changing its business model. Instead, it seems to be doing all it can to force consumers to play by its rules.

In spite of all this, many independent labels have embraced the technological advances of our time as a new way to reach potential listeners. With the low cost of web hosting, the surge in online social networking tools where fans can discuss their favorite music, and the free press available via blogs and online review sites, it is becoming more viable for a smaller label to carve out a niche for itself.

For my project, I aim to analyze the correlation between online coverage of musical artists and their record sales, and to develop a neural network that predicts approximately how many records a musician can expect to sell based on several factors. While major-label acts will also be considered, much of the focus will be on independent musicians, as I feel that the market is slowly moving to a “long tail” model in which the majority of music bought and sold is the sum of many smaller artists and labels reaching varied pockets of fans globally, as opposed to the massive superstar-driven hits that dominate today’s mainstream culture.

For my analysis, I will gather review data and coverage statistics from online sources as well as ‘traditional’ media outlets such as Rolling Stone, MTV, and Spin. Using the critical ratings, press coverage (“hype”), and “establishment statistics” (the number of albums an artist has previously released and the sales of those albums), I hope to train a multi-layer neural network with back-propagation to classify the record sales that can be expected from these inputs. Albums for which sales figures are readily available will be used to determine the accuracy of the predictive model.
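
As a rough illustration of the kind of model described above, the following is a minimal sketch in Python with NumPy of a one-hidden-layer network trained by back-propagation. Everything specific in it is an assumption made for illustration: the four input features, the three sales classes (low, mid, high), the network size, the learning rate, and above all the data, which is randomly generated by an arbitrary rule and does not represent real ratings or sales. The function names (make_synthetic_albums, standardize, train_mlp, predict) are hypothetical.

```python
import numpy as np

def make_synthetic_albums(n, rng):
    """Generate made-up albums. NOT real data: features and labels are invented."""
    rating = rng.uniform(0, 10, n)         # critical rating on an assumed 0-10 scale
    hype = rng.uniform(0, 1, n)            # press coverage, assumed normalized to 0-1
    prev_albums = rng.integers(0, 6, n)    # number of previous albums (0 to 5)
    prev_sales = rng.uniform(0, 1, n)      # previous album sales, assumed normalized
    X = np.column_stack([rating / 10, hype, prev_albums / 5, prev_sales])
    # Arbitrary labeling rule, used only so the sketch has something to learn.
    score = X @ np.array([0.3, 0.3, 0.1, 0.3])
    y = np.digitize(score, [0.4, 0.6])     # 0 = low, 1 = mid, 2 = high sales
    return X, y

def standardize(X_train, X_test):
    """Scale features to zero mean and unit variance using training statistics."""
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    return (X_train - mu) / sigma, (X_test - mu) / sigma

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_hidden=8, n_classes=3, lr=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent on cross-entropy loss via back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, (n_hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]               # one-hot targets
    for _ in range(epochs):
        # Forward pass
        H = np.tanh(X @ W1 + b1)
        P = softmax(H @ W2 + b2)
        # Backward pass: gradient of mean cross-entropy w.r.t. each parameter
        dZ2 = (P - Y) / len(X)
        dW2, db2 = H.T @ dZ2, dZ2.sum(axis=0)
        dZ1 = (dZ2 @ W2.T) * (1 - H ** 2)  # derivative of tanh is 1 - tanh^2
        dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0)
        # Gradient descent update
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2

def predict(params, X):
    """Return the most probable sales class for each row of X."""
    W1, b1, W2, b2 = params
    return softmax(np.tanh(X @ W1 + b1) @ W2 + b2).argmax(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = make_synthetic_albums(600, rng)
    X_train, X_test = standardize(X[:450], X[450:])
    params = train_mlp(X_train, y[:450])
    accuracy = (predict(params, X_test) == y[450:]).mean()
    print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The held-out split mirrors the idea of checking the model against albums whose sales figures are already known. With real data, the features would have to be collected and normalized from the sources listed above, and the number and boundaries of the sales classes would need to be chosen from the actual distribution of sales.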