---
title: TherapyTone
emoji: 🦀
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 5.5.0
app_file: app.py
pinned: false
license: mit
---

# Therapeutic Music Generator 🎵 🧘‍♀️

This Hugging Face Space hosts an interactive application that generates personalized therapeutic music based on your current emotional state and desired mood. By combining mood assessment with AI-powered music generation, it creates unique musical pieces designed to support emotional well-being and mood transformation.

## Features

- **Comprehensive Mood Assessment**: Complete a quick questionnaire about your current emotional state, including:
  - Energy levels
  - Stress levels
  - Happiness levels
  - Current emotions
  - Desired mood state
- **Customizable Music Preferences**: Optional settings to tailor the generated music to your taste:
  - Genre preferences
  - Preferred instruments
  - Tempo preferences
  - Musical mood preferences
- **AI-Powered Music Generation**: Uses Facebook's MusicGen model to create unique, therapeutic music pieces based on your inputs

## How It Works

1. **Mood Assessment**: Users complete a short questionnaire rating their current emotional state and desired mood
2. **Music Preferences**: Users optionally specify musical preferences to personalize the generated content
3. **Prompt Generation**: The system uses GPT-4o-mini through LangChain to create a specialized music generation prompt
4. **Music Creation**: The prompt is processed by Facebook's MusicGen model to create a unique piece of music
5. **Delivery**: Listen to your personalized therapeutic music directly in the browser
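Concretely, steps 1-3 boil down to turning structured questionnaire answers into a text prompt. The field names and wording below are illustrative assumptions, not the app's actual template:

```python
def build_prompt(assessment, preferences=None):
    """Turn mood-assessment answers into a music-generation prompt (illustrative)."""
    prompt = (
        f"Therapeutic music for a listener with energy {assessment['energy']}/5, "
        f"stress {assessment['stress']}/5, happiness {assessment['happiness']}/5, "
        f"currently feeling {assessment['emotions']}, "
        f"who wants to feel {assessment['desired_mood']}."
    )
    if preferences:
        # Optional taste settings are appended as plain key/value hints
        prompt += " Preferences: " + ", ".join(f"{k}: {v}" for k, v in preferences.items())
    return prompt

assessment = {"energy": 2, "stress": 4, "happiness": 2,
              "emotions": "anxious", "desired_mood": "relaxed"}
prompt = build_prompt(assessment, {"genre": "ambient", "tempo": "slow"})
print(prompt)
```

In the real app this text assembly is delegated to the LLM via LangChain, which lets the wording adapt to the user's free-text emotions rather than following a fixed sentence.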

## Code Structure and Flow

### LangChain Integration

LangChain plays a crucial role in this project by orchestrating the prompt generation pipeline and ensuring consistent, high-quality prompts for music generation. Here's how LangChain is implemented:

#### LangChain Components Used

1. **ChatOpenAI Integration**

   ```python
   from langchain_openai import ChatOpenAI
   llm = ChatOpenAI(model="gpt-4o-mini")
   ```

   - Provides the language model interface
   - Handles token management and API communication
   - Ensures consistent response formatting
2. **PromptTemplate**

   ```python
   from langchain_core.prompts import PromptTemplate
   music_prompt_template = """
   Based on the user's mood assessment:
   - Energy level: {energy}
   ...
   """
   ```

   - Structures the input data consistently
   - Maintains prompt engineering best practices
   - Allows for easy template modifications
3. **RunnablePassthrough Chain**

   ```python
   from langchain_core.runnables import RunnablePassthrough
   music_chain = RunnablePassthrough() | prompt | llm
   ```

   - Creates a sequential processing pipeline
   - Pipes the input through the prompt template and into the model
   - Can be extended with retry behavior via `.with_retry()` for transient failures
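The `|` composition used above can be illustrated without LangChain installed. This toy `Step` class mimics the pipe behavior and is purely illustrative; none of these names come from the app:

```python
class Step:
    """Toy runnable: wraps a function and supports `|` chaining, LangChain-style."""
    def __init__(self, fn=lambda x: x):
        self.fn = fn

    def __or__(self, other):
        # Composing two steps yields a new step that runs them in sequence
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

passthrough = Step()                                        # like RunnablePassthrough()
template = Step(lambda d: f"Energy level: {d['energy']}")   # stands in for the prompt template
fake_llm = Step(lambda p: f"PROMPT<{p}>")                   # stands in for ChatOpenAI

chain = passthrough | template | fake_llm
print(chain.invoke({"energy": 3}))  # PROMPT<Energy level: 3>
```

The design point is that each stage only needs to accept the previous stage's output, which is what makes individual components easy to swap or test in isolation.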

#### LangChain Flow

1. **Data Processing**

   ```mermaid
   graph LR
   A[User Input] --> B[RunnablePassthrough]
   B --> C[PromptTemplate]
   C --> D[ChatOpenAI]
   D --> E[Generated Prompt]
   ```
2. **Chain Execution**

   - Input validation and preprocessing
   - Template variable injection
   - LLM prompt generation
   - Response formatting and validation

3. **Benefits of LangChain**

   - Modular and maintainable code structure
   - Consistent prompt engineering
   - Easy model switching and testing
   - Built-in error handling
   - Streamlined API integration
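The "template variable injection" step above amounts to named substitution into the prompt text. Python's `str.format` previews it without any model call; the extra template fields here are assumptions for illustration:

```python
# Illustrative stand-in for the app's template (field names assumed)
music_prompt_template = (
    "Based on the user's mood assessment:\n"
    "- Energy level: {energy}\n"
    "- Stress level: {stress}\n"
    "- Desired mood: {desired_mood}\n"
    "Write a one-sentence music generation prompt."
)

# PromptTemplate performs the same kind of substitution internally
filled = music_prompt_template.format(energy=2, stress=4, desired_mood="calm")
print(filled)
```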

## Core Components

1. **Constants and Configurations**

   ```python
   MOOD_QUESTIONS = [
       "On a scale of 1-5, how would you rate your current energy level?",
       # ... other questions
   ]
   ```

   - Defines the core assessment questions used in the interface
   - Maintains consistency in the mood evaluation process
2. **LangChain Setup**

   ```python
   llm = ChatOpenAI(model="gpt-4o-mini")
   music_prompt_template = """
   Based on the user's mood assessment:
   - Energy level: {energy}
   # ... template structure
   """
   ```

   - Initializes the language model for prompt generation
   - Defines the structured template for converting mood data into music generation prompts
3. **Core Functions**

`analyze_mood_and_generate_prompt(responses, preferences)`

```python
def analyze_mood_and_generate_prompt(responses, preferences):
    """Convert questionnaire responses and preferences into a music generation prompt"""
    try:
        prompt_result = music_chain.invoke({
            "energy": responses[0],
            # ... parameter mapping
        })
        return prompt_result.content
    except Exception as e:
        return f"Error generating prompt: {str(e)}"
```

- Processes user inputs into a structured format
- Invokes the LangChain chain for prompt generation
- Handles error cases gracefully
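Because `responses` is positional, it has to line up with the template's named variables. A defensive mapping could look like the sketch below; the field list is an assumption for illustration, not the app's actual mapping:

```python
# Assumed ordering of the questionnaire answers (illustrative)
FIELDS = ["energy", "stress", "happiness", "emotions", "desired_mood"]

def responses_to_variables(responses):
    """Map the ordered questionnaire answers onto named template variables."""
    if len(responses) != len(FIELDS):
        raise ValueError(f"Expected {len(FIELDS)} answers, got {len(responses)}")
    return dict(zip(FIELDS, responses))

variables = responses_to_variables([2, 4, 3, "anxious", "calm"])
print(variables)
```

Checking the length up front turns a silent off-by-one (answers shifted onto the wrong fields) into an immediate, readable error.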

`generate_music(prompt, duration=10)`

```python
def generate_music(prompt, duration=10):
    """Generate music using the MusicGen API"""
    API_URL = "https://api-inference.huggingface.co/models/facebook/musicgen-small"
    # ... implementation
```

- Interfaces with the MusicGen API
- Handles temporary file creation for audio storage
- Manages API communication and error handling
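A sketch of what the request assembly and temporary-file handling could look like. The `Authorization: Bearer` header follows the Hugging Face Inference API convention, but the payload fields and file handling are assumptions, and the network call itself is left out:

```python
import os
import tempfile

API_URL = "https://api-inference.huggingface.co/models/facebook/musicgen-small"

def build_request(prompt, api_key):
    """Assemble the headers and JSON payload for the Inference API call."""
    headers = {"Authorization": f"Bearer {api_key}"}
    payload = {"inputs": prompt}
    return headers, payload

def save_audio(audio_bytes):
    """Write raw audio bytes to a temporary .wav file and return its path."""
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(audio_bytes)
        return f.name

headers, payload = build_request("calm ambient piano", api_key="hf_xxx")  # dummy key
path = save_audio(b"RIFF....WAVE")  # would be the API's binary response
print(payload["inputs"], path)
```

Writing to a named temporary file with `delete=False` gives Gradio a stable path to serve, at the cost of leaving cleanup to the app (one of the "temporary file storage" limitations noted below).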

## Application Flow

1. **Input Processing**

   ```mermaid
   graph LR
   A[User Input] --> B[Mood Assessment]
   B --> C[Musical Preferences]
   C --> D[Prompt Generation]
   D --> E[Music Generation]
   E --> F[Audio Output]
   ```
2. **Data Flow**

   - User inputs → Gradio interface
   - Interface → LangChain prompt generation
   - Prompt → MusicGen API
   - API response → Audio file
   - Audio file → User interface
3. **Error Handling Flow**

   - Input validation at the Gradio interface level
   - LangChain error handling:
     - Template validation errors
     - Model API failures
     - Response format validation
     - Token limit management
   - MusicGen API error management
   - User feedback through the interface
4. **LangChain Error Recovery**

   - Automatic retry mechanism for transient errors
   - Fallback templates for prompt generation
   - Graceful degradation when the model is unavailable
   - Detailed error reporting for debugging
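The recovery behavior described above can be sketched generically: classify the API response, then retry transient failures with exponential backoff. The `estimated_time` field follows the common Inference API cold-start pattern, but all names and delays here should be treated as assumptions:

```python
import time

def classify_response(status_code, body):
    """Decide what to do with an Inference API response (illustrative)."""
    if status_code == 200:
        return ("ok", body)                               # raw audio bytes
    if status_code == 503 and isinstance(body, dict):
        return ("retry", body.get("estimated_time", 30))  # model still loading
    if status_code == 429:
        return ("rate_limited", None)                     # back off and try later
    return ("error", body)

def retry(fn, attempts=3, base_delay=0.01):
    """Call fn, retrying on exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)

# Simulate a transient failure that succeeds on the third try
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(classify_response(503, {"error": "loading", "estimated_time": 20.0}))
print(retry(flaky), "after", calls["n"], "attempts")
```

In a LangChain pipeline the same effect is usually achieved declaratively with `.with_retry()` on the chain rather than a hand-rolled loop.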

## Gradio Interface Structure

```python
with gr.Blocks() as demo:
    # Interface layout
    with gr.Row():
        with gr.Column():
            ...  # input components (sliders, textboxes, generate button)
        with gr.Column():
            ...  # output components (audio player, prompt display)
```

- Organized in a two-column layout
- Left column: user inputs and controls
- Right column: generated outputs and status

## Usage

  1. Move the sliders to rate your current energy, stress, and happiness levels (1-5 scale)
  2. Type in your current emotions and desired mood state
  3. (Optional) Fill in your musical preferences
  4. Click "Generate Therapeutic Music"
  5. Wait for the system to generate your personalized music
  6. Listen to the generated audio and read the prompt that created it
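Step 1's sliders produce ratings on a 1-5 scale; a small validator of the kind the interface could apply before generation starts (illustrative, not the app's code):

```python
def validate_rating(value, lo=1, hi=5):
    """Coerce a questionnaire answer to an int and enforce the 1-5 range."""
    try:
        rating = int(value)
    except (TypeError, ValueError):
        raise ValueError(f"Rating must be a number, got {value!r}")
    if not lo <= rating <= hi:
        raise ValueError(f"Rating must be between {lo} and {hi}, got {rating}")
    return rating

print(validate_rating("4"))  # 4
print(validate_rating(5.0))  # 5
```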

## Technical Requirements

### Dependencies

```
gradio
requests
langchain_openai
langchain_core
```

### Environment Variables

- `HF_API_KEY`: Your Hugging Face API key with access to the MusicGen model

### API Specifications

- MusicGen API endpoint: `https://api-inference.huggingface.co/models/facebook/musicgen-small`
- Maximum request timeout: 300 seconds
- Output format: WAV audio file

## Limitations

- Music generation may take a few minutes
- Generated audio clips are limited in duration
- The system works best with clear, specific emotional descriptions
- API rate limits may apply
- Audio is written to temporary files and is not persisted between sessions

## Future Improvements

- Extended music duration options
- More detailed musical customization
- Batch generation capabilities
- History tracking of generated music
- Mood improvement tracking
- Integration with additional music generation models
- Enhanced error handling and retry mechanisms
- User session management
- Feedback collection system

## Credits

- MusicGen by Facebook Research
- GPT-4o-mini (OpenAI) for prompt generation
- Hugging Face for model hosting
- Gradio for the user interface