Vynl - Recommendation Architecture

Data Sources

Spotify Audio Features API (already integrated)

Pre-computed by Spotify for every track:

Tempo (BPM)
Energy (0.0–1.0, intensity/activity)
Danceability (0.0–1.0)
Valence (0.0–1.0, musical positivity)
Acousticness (0.0–1.0)
Instrumentalness (0.0–1.0)
Key and Mode (major/minor)
Loudness (dB)
Speechiness (0.0–1.0)
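A minimal sketch of how these features could be held in a typed record with the documented 0.0–1.0 ranges checked on load. The class name `AudioFeatures`, the helper `parse_audio_features`, and the integer encodings noted for key and mode are assumptions for illustration, not part of the existing integration.

```python
from dataclasses import dataclass, fields

# Features the list above documents as 0.0-1.0 scores.
UNIT_RANGE_FIELDS = (
    "energy", "danceability", "valence",
    "acousticness", "instrumentalness", "speechiness",
)


@dataclass(frozen=True)
class AudioFeatures:
    """Hypothetical container for one track's audio features."""
    tempo: float             # BPM
    energy: float
    danceability: float
    valence: float
    acousticness: float
    instrumentalness: float
    key: int                 # assumed: pitch class 0-11, -1 when unknown
    mode: int                # assumed: 1 = major, 0 = minor
    loudness: float          # dB
    speechiness: float

    def __post_init__(self):
        # Reject values outside the documented 0.0-1.0 range.
        for name in UNIT_RANGE_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")


def parse_audio_features(payload: dict) -> AudioFeatures:
    """Build AudioFeatures from a dict, ignoring keys the record does not use."""
    wanted = {f.name for f in fields(AudioFeatures)}
    return AudioFeatures(**{k: v for k, v in payload.items() if k in wanted})
```

Validating at the boundary means later steps can average these values without re-checking them.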

Metadata (from Spotify + supplementary APIs)

Artist name, album, release date
Genres and tags
Popularity score
Related artists

Supplementary APIs (to add)

MusicBrainz — artist relationships, detailed genre/tag taxonomy, release info
Last.fm — similar artists, user-generated tags, listener overlap stats

Recommendation Pipeline

User imports playlist
       │
       ▼
Spotify API ──→ Track metadata + audio features
       │
       ▼
Build taste profile:
  - Genre distribution
  - Average energy/danceability/valence/tempo
  - Mood tendencies
  - Sample artists and tracks
       │
       ▼
LLM (cheap model) receives:
  - Structured taste profile
  - User's specific request/query
  - List of tracks already in library (to exclude)
       │
       ▼
Returns recommendations with
"why you'll like this" explanations
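
The LLM step of the pipeline above could be sketched as follows. This only assembles the prompt text and filters the model's output; the function names `build_prompt` and `drop_known`, the prompt wording, and the "Artist - Title" string format are assumptions, and no specific LLM client is implied.

```python
import json


def build_prompt(profile: dict, query: str, library: list[str]) -> str:
    """Combine the three pipeline inputs into a single prompt string."""
    lines = [
        "You are a music recommendation assistant.",
        "Taste profile (JSON):",
        json.dumps(profile, indent=2, sort_keys=True),
        f"User request: {query}",
        "Do not recommend any of these tracks (already in the library):",
    ]
    lines += [f"- {track}" for track in library]
    lines.append(
        'For each recommendation, add a short "why you\'ll like this" explanation.'
    )
    return "\n".join(lines)


def drop_known(recommendations: list[str], library: list[str]) -> list[str]:
    """Safety net: remove anything the model returned that is already owned.

    Matching is a case-insensitive exact comparison on "Artist - Title"
    strings, which is an assumed format.
    """
    known = {track.casefold() for track in library}
    return [r for r in recommendations if r.casefold() not in known]
```

Filtering after the call as well as instructing the model keeps the exclusion rule from depending on the model following the prompt.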

Model Choice

The LLM reasons over structured audio-feature data and metadata. It needs broad music knowledge but not heavy reasoning, so the cheapest model with adequate music knowledge wins:

Model	Cost (per 1M tokens)	Notes
Claude Haiku 4.5	$1 in / $5 out	Best value, great music knowledge
GPT-4o-mini	$0.15 in / $0.60 out	Cheapest option
Gemini 2.5 Flash	$0.15 in / $0.60 out	Also cheap, good quality
Claude Sonnet	$3 in / $15 out	Overkill for this task
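
To make the per-token prices concrete, here is a small sketch of the per-request arithmetic. The function name `request_cost` and the token counts in the usage note are illustrative assumptions; only the rates come from the table.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost in USD of one request, given rates in USD per 1M tokens."""
    per_token_in = in_rate / 1_000_000
    per_token_out = out_rate / 1_000_000
    return input_tokens * per_token_in + output_tokens * per_token_out
```

For example, assuming a request of 2,000 input tokens and 500 output tokens at the GPT-4o-mini rates above, `request_cost(2000, 500, 0.15, 0.60)` comes to about $0.0006, so roughly 1,600 such requests cost one dollar.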

Taste Profile Structure

Built from a user's imported tracks:

{
  "top_genres": [{"name": "indie rock", "count": 12}, ...],
  "avg_energy": 0.65,
  "avg_danceability": 0.55,
  "avg_valence": 0.42,
  "avg_tempo": 118.5,
  "track_count": 47,
  "sample_artists": ["Radiohead", "Tame Impala", ...],
  "sample_tracks": ["Radiohead - Everything In Its Right Place", ...]
}

The LLM uses this profile to understand what the user gravitates toward sonically (high energy? melancholy? upbeat?) and find new music that matches or intentionally contrasts those patterns.
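
A minimal sketch of how a profile with this shape could be computed from imported tracks. The input format (a list of dicts with `artist`, `title`, `genres`, `energy`, `danceability`, `valence`, and `tempo` keys) and the function name `build_taste_profile` are assumptions; the output keys mirror the structure shown above.

```python
from collections import Counter
from statistics import fmean


def build_taste_profile(tracks: list[dict], top_n: int = 5, sample_n: int = 5) -> dict:
    """Aggregate per-track features into a taste profile dict."""
    if not tracks:
        raise ValueError("need at least one track to build a profile")

    # Count every genre tag across all tracks.
    genre_counts = Counter(g for t in tracks for g in t["genres"])
    # Unique artists in first-seen order.
    artists = list(dict.fromkeys(t["artist"] for t in tracks))

    def avg(key: str, digits: int) -> float:
        return round(fmean([t[key] for t in tracks]), digits)

    return {
        "top_genres": [
            {"name": name, "count": count}
            for name, count in genre_counts.most_common(top_n)
        ],
        "avg_energy": avg("energy", 2),
        "avg_danceability": avg("danceability", 2),
        "avg_valence": avg("valence", 2),
        "avg_tempo": avg("tempo", 1),
        "track_count": len(tracks),
        "sample_artists": artists[:sample_n],
        "sample_tracks": [f'{t["artist"]} - {t["title"]}' for t in tracks[:sample_n]],
    }
```

Plain averages are the simplest choice and match the `avg_*` fields; they hide spread, so a library split between very calm and very energetic tracks would look mid-energy.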

Platform Support

Currently Implemented

Spotify (OAuth + playlist import + audio features)

Planned

YouTube Music (via ytmusicapi, unofficial Python library)
Apple Music (MusicKit API, requires Apple Developer account)
Last.fm (scrobble history import + similar artist data)
Tidal (official API)
Manual entry / CSV upload (fallback for any platform)
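
The CSV fallback could be sketched as below. The required column names `artist` and `title` and the function name `parse_track_csv` are assumptions; the actual upload format has not been defined here.

```python
import csv
import io


def parse_track_csv(text: str) -> list[dict]:
    """Parse uploaded CSV text into a list of {"artist", "title"} dicts.

    Assumes a header row containing at least "artist" and "title" columns.
    Rows with either field blank are skipped.
    """
    reader = csv.DictReader(io.StringIO(text))
    missing = {"artist", "title"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing required columns: {sorted(missing)}")

    tracks = []
    for row in reader:
        artist = (row["artist"] or "").strip()
        title = (row["title"] or "").strip()
        if artist and title:
            tracks.append({"artist": artist, "title": title})
    return tracks
```

Tracks parsed this way carry no audio features, so they would still need to be matched against a catalogue before they can contribute to the numeric parts of a taste profile.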
