portnovo.blogg.se

Spotify podcasts ranking






Until recently, Search at Spotify relied mostly on term matching. For example, if you type the query “electric cars climate impact”, Elasticsearch will return everything that has each of those query words in its indexed metadata (such as the title of a podcast episode).

However, we know users don’t always type the exact words for what they want to listen to, and we have to use fuzzy matching, normalization, and even manual aliases to make up for it. While these techniques are very helpful for the user, they have limitations: they cannot capture every way of expressing yourself in natural language, especially when the query is a full natural language sentence.

Going back to the query “electric cars climate impact”, our Elasticsearch cluster did not actually retrieve anything for it… but does this mean that we don’t have any relevant content to show to the user for this query?

Enter Natural Language Search.

To enable users to find more relevant content with less effort, we started investigating a technique called Natural Language Search, also known as Semantic Search in the literature. In a nutshell, Natural Language Search matches a query and a textual document that are semantically correlated instead of needing exact word matches. It matches synonyms, paraphrases, and any other variation of natural language that expresses the same meaning.

As a first step, we decided to apply Natural Language Search to podcast episode retrieval, as we thought that semantic matching would be most useful when searching for podcasts. Our solution is now deployed for most Spotify users. Thanks to this technique, one can retrieve relevant content for our example query:

Figure: Example of Natural Language Search for podcast episodes.

It is noticeable that none of the retrieved episodes contains all of the query words in its title (that’s why Elasticsearch was not picking them up). However, those episodes seem quite relevant to the user’s query.

To achieve this result, we leveraged recent advances in Deep Learning / Natural Language Processing (NLP) like Self-supervised learning and Transformer neural networks. We also took advantage of vector search techniques like Approximate Nearest Neighbor (ANN) for fast online serving. The rest of this post will go into further detail about our architecture.

We are using a machine learning technique called Dense Retrieval, which consists of training a model that produces query and episode vectors in a shared embedding space.
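
To make the contrast between term matching and Dense Retrieval concrete, here is a minimal Python sketch. It is not Spotify's implementation: the episode titles, the three-dimensional vectors, and the helper names `term_match` and `dense_retrieve` are all invented for illustration, and the vectors are written by hand rather than produced by a trained model. The brute-force scan over every episode stands in for the Approximate Nearest Neighbor index that a production system would use.

```python
import math

# Hypothetical episode titles, invented for this sketch.
EPISODES = [
    "Are EVs really better for the environment?",
    "How to bake sourdough bread at home",
    "The carbon footprint of battery-powered vehicles",
]

# Toy embeddings written by hand (assumption: a real system would obtain
# these from trained query and episode encoders sharing one vector space).
# Informal meaning of the dimensions: [vehicles, environment, cooking].
EPISODE_VECTORS = [
    [0.8, 0.6, 0.0],
    [0.0, 0.1, 0.9],
    [0.7, 0.7, 0.1],
]

QUERY = "electric cars climate impact"
QUERY_VECTOR = [0.7, 0.7, 0.0]  # hand-written stand-in for the encoded query


def tokenize(text):
    """Split on whitespace, lowercase, and strip surrounding punctuation."""
    return {word.strip("?.,!").lower() for word in text.split()}


def term_match(query, docs):
    """Return the docs that contain every query word (AND semantics)."""
    query_terms = tokenize(query)
    return [doc for doc in docs if query_terms <= tokenize(doc)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def dense_retrieve(query_vec, doc_vecs, docs, k=2):
    """Return the k docs whose vectors are closest to the query vector.

    Exhaustive scoring is used here for clarity; it is a placeholder for
    an approximate nearest neighbor index.
    """
    ranked = sorted(
        zip(docs, doc_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]


if __name__ == "__main__":
    print("term matching:  ", term_match(QUERY, EPISODES))
    print("dense retrieval:", dense_retrieve(QUERY_VECTOR, EPISODE_VECTORS, EPISODES))
```

With these toy values, `term_match` returns nothing for the example query because no title contains all four words, while `dense_retrieve` ranks the two episodes about vehicles and the environment ahead of the baking episode.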





