Audio-based cold-start in music recommendation systems

Borges, Rodrigo Carvalho

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística 2022-07-20

Online access. The library also holds print copies.

  • Title:
    Audio-based cold-start in music recommendation systems
  • Author: Borges, Rodrigo Carvalho
  • Advisor: Queiroz, Marcelo Gomes de
  • Subjects: Audio Content; Audio-Based Music Recommendation; Audio-Based Music Recommendation Systems; Cold-Start; Music Recommendation Systems
  • Notes: Thesis (Doctorate)
  • Description: Music streaming platforms have become popular in recent decades owing to the growing number of tracks available online. The catalogues offered by these platforms are usually too large to search manually, so automatic recommendation algorithms are used to help users navigate them. More specifically, Music Recommendation Systems (MRS) analyze user listening behaviour and predict the songs that a specific user will play in the near future, or that will be played next within a listening session. When new tracks are added to a platform, however, no listening data is available for them. This is known as the cold-start problem, and the system must somehow incorporate these tracks into its recommendation algorithms. In this work, we propose methods that leverage the audio of recently added tracks to compensate for the lack of interaction data. Our proposals cover collaborative filtering (CF), sequence-aware (SA), and stream-based (SB) recommendation systems, with audio represented as codeword histograms, Mel-spectrograms, and raw waveforms. In the first experiment, we propose a method that applies Convolutional Neural Networks (CNN) to map audio content to profiles containing the users who listened to a track. In the second experiment, Recurrent Neural Networks (RNN) are trained to reproduce the audio features of the upcoming tracks in a listening session, given the audio features of the current track; an inverted index is then used to efficiently retrieve tracks from their estimated audio features. In the third experiment, we propose a model that maps track-to-track transitions to an audio domain in the manner of a multi-level Markov chain. The method supports dynamic updates, which makes it applicable to data-stream scenarios.
    The experiments were conducted using the LFM-1b music consumption dataset and audio previews downloaded from Spotify. In cold-start situations, our methods achieved competitive prediction results for CF and SA recommendation systems. The novel stream-based method, which relies exclusively on audio content, recommends tracks with an accuracy comparable to that of conventional rating-based methods.
  • DOI: 10.11606/T.45.2022.tde-14102022-124655
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística
  • Creation/publication date: 2022-07-20
  • Format: Adobe PDF
  • Language: English
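
Illustrative note: the description mentions a stream-based model that maps track-to-track transitions to an audio domain and supports dynamic updates. The sketch below is not the thesis implementation. It is a minimal, single-level simplification that assumes each track has already been assigned an audio cluster id (for example, by quantizing its audio features). The class name `AudioTransitionModel`, its methods, and the toy track ids are all hypothetical.

```python
from collections import Counter, defaultdict


class AudioTransitionModel:
    """First-order Markov chain over audio cluster ids (hypothetical sketch)."""

    def __init__(self):
        # cluster id -> Counter of clusters observed immediately afterwards
        self.transitions = defaultdict(Counter)
        # cluster id -> set of track ids assigned to that cluster
        self.tracks_by_cluster = defaultdict(set)
        # track id -> cluster id
        self.cluster_of = {}

    def add_track(self, track_id, cluster_id):
        """Register a track using only its audio-derived cluster id."""
        self.cluster_of[track_id] = cluster_id
        self.tracks_by_cluster[cluster_id].add(track_id)

    def update(self, prev_track, next_track):
        """Incrementally count one observed transition from the stream."""
        src = self.cluster_of[prev_track]
        dst = self.cluster_of[next_track]
        self.transitions[src][dst] += 1

    def recommend(self, current_track, k=3):
        """Return up to k tracks from the most frequent next clusters."""
        src = self.cluster_of[current_track]
        ranked = []
        for cluster, _count in self.transitions[src].most_common():
            # Sort track ids so the output order is deterministic.
            for track in sorted(self.tracks_by_cluster[cluster]):
                if track != current_track and track not in ranked:
                    ranked.append(track)
        return ranked[:k]


# Toy usage: cluster ids here are made up for illustration.
model = AudioTransitionModel()
for track_id, cluster_id in [("a", 0), ("b", 1), ("c", 1), ("d", 2)]:
    model.add_track(track_id, cluster_id)

model.update("a", "b")
model.update("a", "b")
model.update("a", "d")

# A cold-start track: it has no listening data, only an audio cluster.
model.add_track("new", 1)
recommendations = model.recommend("a", k=4)
```

Because transitions are counted between audio clusters rather than between individual tracks, the track `"new"` becomes recommendable as soon as it is registered, even though it never appears in any observed transition.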
