# Vynl - Recommendation Architecture

## Data Sources

### Spotify Audio Features API (already integrated)

Pre-computed by Spotify for every track:

- **Tempo** (BPM)
- **Energy** (0.0–1.0, intensity/activity)
- **Danceability** (0.0–1.0)
- **Valence** (0.0–1.0, musical positivity)
- **Acousticness** (0.0–1.0)
- **Instrumentalness** (0.0–1.0)
- **Key** and **Mode** (major/minor)
- **Loudness** (dB)
- **Speechiness** (0.0–1.0)

### Metadata (from Spotify + supplementary APIs)

- Artist name, album, release date
- Genres and tags
- Popularity score
- Related artists

### Supplementary APIs (to add)

- **MusicBrainz**: artist relationships, detailed genre/tag taxonomy, release info
- **Last.fm**: similar artists, user-generated tags, listener overlap stats

## Recommendation Pipeline

```
User imports playlist
        │
        ▼
Spotify API ──→ Track metadata + audio features
        │
        ▼
Build taste profile:
  - Genre distribution
  - Average energy/danceability/valence/tempo
  - Mood tendencies
  - Sample artists and tracks
        │
        ▼
LLM (cheap model) receives:
  - Structured taste profile
  - User's specific request/query
  - List of tracks already in library (to exclude)
        │
        ▼
Returns recommendations with "why you'll like this" explanations
```

## Model Choice

The LLM reasons over structured audio feature data + metadata. It needs broad music knowledge but not heavy reasoning.
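As a rough illustration of the LLM step in the pipeline above, the three inputs (taste profile, user request, exclusion list) could be combined into a single prompt. This is a minimal sketch rather than Vynl's actual implementation: `build_prompt`, its parameters, and the prompt wording are all hypothetical, and the call to a model provider is deliberately left out.

```python
import json


def build_prompt(profile: dict, request: str, library: list[str]) -> str:
    """Combine the three pipeline inputs into one plain-text prompt.

    All names and the prompt wording here are illustrative assumptions.
    """
    # One bullet per track the user already owns, so the model can skip them.
    excluded = "\n".join(f"- {track}" for track in library)
    return (
        "You are a music recommendation assistant.\n\n"
        "Taste profile (JSON):\n"
        f"{json.dumps(profile, indent=2)}\n\n"
        f"Request: {request}\n\n"
        "Already in the library, do not recommend:\n"
        f"{excluded}\n\n"
        "For each recommendation, add one sentence explaining "
        "why the listener will like it."
    )
```

Serializing the profile as JSON keeps the numeric features unambiguous for the model, while the exclusion list stays as plain bullets so it is cheap in tokens.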
Cheapest model wins:

| Model | Cost (USD per 1M tokens) | Notes |
|-------|--------------------------|-------|
| Claude Haiku 4.5 | $1 in / $5 out | Best value, great music knowledge |
| GPT-4o-mini | $0.15 in / $0.60 out | Cheapest option |
| Gemini 2.5 Flash | $0.30 in / $2.50 out | Also cheap, good quality |
| Claude Sonnet | $3 in / $15 out | Overkill for this task |

## Taste Profile Structure

Built from a user's imported tracks:

```json
{
  "top_genres": [{"name": "indie rock", "count": 12}, ...],
  "avg_energy": 0.65,
  "avg_danceability": 0.55,
  "avg_valence": 0.42,
  "avg_tempo": 118.5,
  "track_count": 47,
  "sample_artists": ["Radiohead", "Tame Impala", ...],
  "sample_tracks": ["Radiohead - Everything In Its Right Place", ...]
}
```

The LLM uses this profile to understand what the user gravitates toward sonically (high energy? melancholy? upbeat?) and to find new music that matches or intentionally contrasts with those patterns.

## Platform Support

### Currently Implemented

- Spotify (OAuth + playlist import + audio features)

### Planned

- YouTube Music (via `ytmusicapi`, unofficial Python library)
- Apple Music (MusicKit API, requires Apple Developer account)
- Last.fm (scrobble history import + similar artist data)
- Tidal (official API)
- Manual entry / CSV upload (fallback for any platform)
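
The taste profile described under "Taste Profile Structure" could be computed along these lines. This is a sketch under stated assumptions: the input track shape (dicts with `artist`, `title`, `genres`, `energy`, `danceability`, `valence`, and `tempo` keys), the function name `build_taste_profile`, and the `top_n` / `sample_n` cutoffs are all hypothetical and would need to match however imported tracks are actually stored.

```python
from collections import Counter
from statistics import fmean


def build_taste_profile(tracks: list[dict], top_n: int = 5, sample_n: int = 5) -> dict:
    """Aggregate per-track features into the taste profile shape.

    The track dict keys used here are assumed, not taken from Vynl's code.
    """
    if not tracks:
        raise ValueError("need at least one track to build a profile")

    # Count every genre tag across all tracks; a track may carry several.
    genre_counts = Counter(g for t in tracks for g in t.get("genres", []))

    # dict.fromkeys de-duplicates artists while keeping first-seen order.
    artists = list(dict.fromkeys(t["artist"] for t in tracks))

    def avg(feature: str, digits: int) -> float:
        return round(fmean(t[feature] for t in tracks), digits)

    return {
        "top_genres": [
            {"name": name, "count": count}
            for name, count in genre_counts.most_common(top_n)
        ],
        "avg_energy": avg("energy", 2),
        "avg_danceability": avg("danceability", 2),
        "avg_valence": avg("valence", 2),
        "avg_tempo": avg("tempo", 1),
        "track_count": len(tracks),
        "sample_artists": artists[:sample_n],
        "sample_tracks": [f"{t['artist']} - {t['title']}" for t in tracks[:sample_n]],
    }
```

Plain means are the simplest summary, but they can hide a bimodal library (for example, half ambient and half punk); if that matters, a spread measure per feature could be added alongside each average.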