3 Ways to Build a Recommendation Engine for Songs with LangChain

    FairyTaleDJ: Disney Song Recommendations with LangChain. We Used 3 Ways - Direct or Emotions Embeddings, & ChatGPT as a Retrieval System. Learn Which One Works.
    • Davit Buniatyan
    10 min read · May 23, 2023
  • TL;DR We used LangChain, OpenAI ChatGPT, Deep Lake, and Streamlit to create a web app that recommends Disney songs based on user input. There are three main approaches you could take to the problem, but not all of them work (we learned that the hard way). Looking to build a similar app? Learn more below.

    [Image: song recommendation engine built with LangChain]

    A demo is on Hugging Face 🤗

    Hey there! Today we will see how to leverage Deep Lake Vector Database to create a document retrieval system. This will be different from your usual Question Answering demo app, where we just directly apply the user’s query to embedded documents using LangChain. We will showcase how we can leverage Large Language Models (LLMs) to encode our data to make our matching easier, better, and faster.

    Step by step, we’ll unpack the behind-the-scenes of FairytaleDJ, a web app to recommend Disney songs based on user input. The goal is simple: We ask how the user feels, and we want to retrieve Disney songs that go “well” with that input. For example, if the user is sad, a song like Reflection from Mulan would probably be appropriate. Spotify, we’re coming for you.

    Just joking…

    Or maybe not…

    In any case, such ‘document’ retrieval is a perfect example of where vanilla Question Answering over docs fails. You won’t get good results if you try to find similarities between a user’s feelings (like, “Today I am great”) and song lyrics. That’s because song embeddings capture everything in the lyrics, making them "more open". Instead, we want to encode inputs, users, and lyrics into a similar representation and then run the search. We won’t spoil too much here, so shopping-list time: we mainly need three things — data, a way to encode it, and a way to match it with user input.

    Getting the Data for the Song Recommendation Engine

    To get our songs, we scraped https://www.disneyclips.com/lyrics/, a website containing the lyrics of every Disney song ever made. The code is here, and it relies on asyncio to speed things up. We won’t focus on it too much, since it’s not central to our story (plays Encanto music: we don’t talk about asyncio, no, no, no…).

    Then, we used the Spotify Python API to get the embed URL for each song in the “Disney Hits” playlist. We removed all the scraped songs that were not in this playlist, leaving us with 85 songs.
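    The matching step can be sketched as a small pure function. This is our illustration (the name-based matching and all names here are assumptions), not the original scraper code:

```python
# Illustrative sketch (not the original script): given song lyrics scraped
# from disneyclips.com and the track IDs found in the "Disney Hits" playlist,
# keep only the overlap and build the Spotify embed URL for each track.

def embed_url(track_id: str) -> str:
    """Build the open.spotify.com embed URL for a Spotify track ID."""
    return f"https://open.spotify.com/embed/track/{track_id}?utm_source=generator"

def filter_to_playlist(
    scraped_lyrics: dict[str, str],      # song name -> lyrics text
    playlist_track_ids: dict[str, str],  # song name -> Spotify track ID
) -> list[dict]:
    """Keep only scraped songs that also appear in the playlist."""
    return [
        {"name": name, "text": text, "embed_url": embed_url(playlist_track_ids[name])}
        for name, text in scraped_lyrics.items()
        if name in playlist_track_ids
    ]
```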

    We end up with a JSON file that looks like this:

```json
{
  "Aladdin": [
    {
      "name": "Arabian Nights",
      "text": "Oh, I come from a land, from a faraway place. Where the caravan camels roam... ",
      "embed_url": "https://open.spotify.com/embed/track/0CKmN3Wwk8W4zjU0pqq2cv?utm_source=generator"
    },
    ...
  ]
}
```

    Data Encoding for the Recommendation Engine

    We looked for the best way to retrieve the songs and evaluated different approaches. Throughout, we used the Activeloop Deep Lake vector database — more specifically, its LangChain integration.

    Creating the dataset is pretty straightforward. Given the previous JSON file, we embed the text field using langchain.embeddings.openai.OpenAIEmbeddings and add all the remaining keys/values as metadata.

```python
import json

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import DeepLake


def create_db(dataset_path: str, json_filepath: str) -> DeepLake:
    with open(json_filepath, "r") as f:
        data = json.load(f)

    texts = []
    metadatas = []

    for movie, lyrics in data.items():
        for lyric in lyrics:
            texts.append(lyric["text"])
            metadatas.append(
                {
                    "movie": movie,
                    "name": lyric["name"],
                    "embed_url": lyric["embed_url"],
                }
            )

    embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

    db = DeepLake.from_texts(
        texts, embeddings, metadatas=metadatas, dataset_path=dataset_path
    )

    return db
```

    To load it, we can simply:

```python
def load_db(dataset_path: str, *args, **kwargs) -> DeepLake:
    db = DeepLake(dataset_path, *args, **kwargs)
    return db
```

    Our dataset_path is hub://<ACTIVELOOP_ORGANIZATION_ID>/<DATASET_NAME>, but you can also store the dataset locally; see the Deep Lake docs for how.

    3 Approaches to Matching Moods to Songs

    The next step was to find a way to match our songs with a given user input. In this tutorial, we tried 3 approaches so you don’t have to! Ultimately, we found a cheap way that worked qualitatively well. So let’s start with the failures 😅

    What Didn’t Work

    Similarity Search of Direct Embeddings

    This approach was straightforward: we create embeddings for both the lyrics and the user input with OpenAI’s embedding model and run a similarity search. Unfortunately, the suggestions were terrible, because we want to match the user’s emotions to the songs rather than exactly what the lyrics say.

    For example, if we search for similar songs using "I am happy", we see very similar scores across all documents:

```python
db.similarity_search_with_score("I am happy", distance_metric="cos", k=100)
```

    If we plot the scores using a box plot, we see they mostly sit around 0.74,

    [Image: box plot of similarity scores for direct embeddings]

    while the top ten songs do not match the input well:

```
The World Es Mi Familia 0.7777353525161743
Go the Distance 0.7724394202232361
Waiting on a Miracle 0.7692896127700806
Happy Working Song 0.7679054141044617
In Summer 0.7620900273323059
So Close 0.7601353526115417
When I Am Older 0.7582702040672302
How Far I'll Go 0.7560539245605469
You're Welcome 0.7539903521537781
What Else Can I Do? 0.7535801529884338
```
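    To make the clustering concrete, we can compute basic spread statistics over the ten scores above:

```python
import statistics

# Top-10 similarity scores from the direct-embedding search (rounded)
scores = [0.7777, 0.7724, 0.7693, 0.7679, 0.7621,
          0.7601, 0.7583, 0.7561, 0.7540, 0.7536]

print(round(statistics.mean(scores), 3))     # mean  -> 0.763
print(round(max(scores) - min(scores), 3))   # range -> 0.024
```

    A range of roughly 0.024 across the ten best matches means the search barely discriminates between songs.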

    Using ChatGPT as a Retrieval System

    We also tried to nuke the whole set of lyrics into ChatGPT and ask it to return songs matching the user input. We first had to create a one-sentence summary of each lyric to fit within the 4,096-token context limit. This resulted in around 3k tokens per request ($0.006). The prompt template is very simple but very long; the {songs} variable holds the JSON with all the songs.

```
You act like a song retrieval system. We want to propose three songs based on the user input. We provide you a list of songs with their themes in the format <MOVIE_NAME>;<SONG_TITLE>:<SONG_THEMES>. To match the user input to the song, try to find themes/emotions from it and imagine what emotions the user may have and what song may be lovely to listen to. Add a bit of randomness to your decision.
If you don't find a match, provide your best guess. Try to look at each song's themes to offer more variations in the match. Please only output songs contained in the following list.

{songs}

Given an input, output three songs as a list that goes well with the input. The list of songs will be used to retrieve them from our database. The type of reply is List[str, str, str]. Please follow the following example formats.

Examples:
Input: "Today I am not feeling great."
["<MOVIE_NAME>;<SONG_TITLE>", "<MOVIE_NAME>;<SONG_TITLE>", "<MOVIE_NAME>;<SONG_TITLE>"]
Input: "I am great today"
["<MOVIE_NAME>;<SONG_TITLE>", "<MOVIE_NAME>;<SONG_TITLE>", "<MOVIE_NAME>;<SONG_TITLE>"]

The user input is {user_input}
```

    That worked okay-ish, but it was overkill. Later on, we also tried the emotion encoding we discuss in the next section, which performed comparably.
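    Because the prompt asks for a Python-style list of "<MOVIE_NAME>;<SONG_TITLE>" strings, the reply still needs defensive parsing. A minimal sketch (the fallback-to-empty behavior is our assumption, not the original code):

```python
import ast

def parse_song_list(reply: str) -> list[tuple[str, str]]:
    """Parse a reply like '["Mulan;Reflection", ...]' into (movie, title) pairs.
    Returns an empty list if the reply is not a valid Python list literal."""
    try:
        items = ast.literal_eval(reply.strip())
    except (ValueError, SyntaxError):
        return []
    pairs = []
    for item in items:
        movie, _, title = item.partition(";")
        pairs.append((movie, title))
    return pairs
```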

    What Did Work: Similarity Search of Emotions Embeddings

    Finally, we arrived at an approach that is inexpensive to run and gives good results. We convert each lyric to a list of 8 emotions using ChatGPT. The prompt is the following:

```
I am building a retrieval system. Given the following song lyric

{song}

You are tasked to produce a list of 8 emotions that I will later use to retrieve the song.

Please provide only a list of comma-separated emotions.
```

    For example, using the “Arabian Nights” from Aladdin (shown in the previous section), we obtained "nostalgic, adventurous, exotic, intense, romantic, mysterious, whimsical, passionate".

    We then embedded each song’s list of emotions with the same OpenAI embedding model and stored the vectors in Deep Lake.

    The entire script is here

    Then, we need to convert the user input to a list of emotions. We used ChatGPT again with a custom prompt.

```
We have a simple song retrieval system. It accepts eight emotions. You are tasked to suggest between 1 and 4 emotions to match the users' feelings. Suggest more emotions for longer sentences and just one or two for small ones, trying to condense the central theme of the input.

Examples:

Input: "I had a great day!"
"Joy"
Input: "I am exhausted today and not feeling well."
"Exhaustion, Discomfort, and Fatigue"
Input: "I am in Love"
"Love"

Please, suggest emotions for input = "{user_input}", and reply ONLY with a list of emotions/feelings/vibes.
```

    Here we tasked the model with providing between one and four emotions; empirically, this worked best, given that most inputs are short.

    Let’s see some examples:

```
"I'm happy and sad today" -> "Happiness, Sadness"
"hey, rock you" -> "Energy, excitement, enthusiasm."
"I need to cry" -> "Sadness, Grief, Sorrow, Despair."
```
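    As the examples show, the model replies with free-form text like "Exhaustion, Discomfort, and Fatigue.", so it helps to normalize the reply before feeding it to the vector search. A hypothetical cleanup helper (not the original code):

```python
import re

def clean_emotions(reply: str) -> str:
    """Turn a reply like '"Exhaustion, Discomfort, and Fatigue."' into
    a plain comma-separated string for the similarity search."""
    text = reply.strip().strip('"').rstrip(".")
    # Drop a leading "and" on the last item, split on commas, tidy spacing.
    parts = [re.sub(r"^and\s+", "", p.strip()) for p in text.split(",")]
    return ", ".join(p for p in parts if p)
```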

    [Image: recommendation workflow with LangChain and Deep Lake]

    Then we used these emotions to perform the similarity search on the vector database.

```python
user_input = "I am happy"
# We use ChatGPT to extract emotions from the user's input
emotions = chain.run(user_input=user_input)
# We find the k most similar songs
matches = db.similarity_search_with_score(emotions, distance_metric="cos", k=k)
```

    These are the scores obtained from that search (k=100). They are more spread apart.

    [Image: box plot of similarity scores for emotion embeddings]

    And the songs make more sense.

```
Down in New Orleans (Finale) 0.9068354368209839
Happy Working Song 0.9066014885902405
Love is an Open Door 0.8957026600837708
Circle of Life 0.8907418251037598
Where You Are 0.8890194892883301
In Summer 0.8889626264572144
Dig a Little Deeper 0.8887585401535034
When We're Human 0.8860496282577515
Hakuna Matata 0.8856213688850403
The World Es Mi Familia 0.884093165397644
```

    We also implemented some postprocessing. First, we filter out the low-scoring matches.

```python
# Matches is a list of (document, score) tuples
def filter_scores(matches: Matches, th: float = 0.8) -> Matches:
    return [(doc, score) for (doc, score) in matches if score > th]

matches = filter_scores(matches, 0.8)
```

    To add more variety — i.e., to avoid always recommending the top match — we sample from the list of candidate matches. To do so, we first normalize the scores so they sum to one by dividing each by their total.

```python
def normalize_scores_by_sum(matches: Matches) -> Matches:
    scores = [score for _, score in matches]
    tot = sum(scores)
    return [(doc, score / tot) for doc, score in matches]
```

    Then we sample n songs using a modified version of np.random.choice(..., p=scores): every time we sample an element, we remove it from the pool, so the same song is never drawn twice.
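    The weighted_random_sample helper isn’t shown in the post; here is a minimal reconstruction, assuming it draws one index at a time with np.random.choice and removes each sampled element from the pool:

```python
import numpy as np

def weighted_random_sample(items: np.ndarray, weights: np.ndarray, n: int) -> np.ndarray:
    """Sample n distinct items, each draw proportional to the remaining weights."""
    w = weights.astype(float).copy()
    chosen = []
    for _ in range(n):
        idx = np.random.choice(len(w), p=w / w.sum())
        chosen.append(idx)
        w[idx] = 0.0  # remove the sampled element so it can't be drawn again
    return items[chosen]
```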

```python
docs, scores = zip(*matches)
docs = weighted_random_sample(
    np.array(docs), np.array(scores), n=number_of_displayed_songs
).tolist()
for doc in docs:
    print(doc.metadata["name"])
```

    And finally, we have our songs. We then built a web app using Streamlit and hosted it on a Hugging Face Space. Go give it a try! :)

    [Image: the resulting music recommendation app built with LangChain and Deep Lake]

    Conclusion: Technology Choice Matters When Building a Recommendation Engine with LangChain and Deep Lake

    While we explained how to combine these technologies to create a song recommendation system, you can apply the same principles to many more use cases. With Deep Lake’s multi-modality, you can store multiple embeddings for the same set of lyrics, or even incorporate additional signals such as embeddings based on song tempo, instruments used, and more!

    The main takeaway is understanding how to leverage LLMs to make the data work for you by transforming it to fit your task better. This was crucial for us: only after we converted both the users’ inputs and the songs’ lyrics into lists of emotions were we able to get suitable matches.

    That’s all, folks 🎉

    Thanks for reading, and see you in the next one 💜
    Francesco
