Vynl - Recommendation Architecture
Data Sources
Spotify Audio Features API (already integrated)
Pre-computed by Spotify for every track:
- Tempo (BPM)
- Energy (0.0–1.0, intensity/activity)
- Danceability (0.0–1.0)
- Valence (0.0–1.0, musical positivity)
- Acousticness (0.0–1.0)
- Instrumentalness (0.0–1.0)
- Key and Mode (major/minor)
- Loudness (dB)
- Speechiness (0.0–1.0)
Metadata (from Spotify + supplementary APIs)
- Artist name, album, release date
- Genres and tags
- Popularity score
- Related artists
Supplementary APIs (to add)
- MusicBrainz — artist relationships, detailed genre/tag taxonomy, release info
- Last.fm — similar artists, user-generated tags, listener overlap stats
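Once several sources supply genres and tags, their spellings will differ ("Indie Rock", "indie-rock"). The sketch below shows one possible way to reconcile them: normalize each tag, then rank tags by how many sources agree. The function name `merge_tags`, the input shape, and the normalization rules are illustrative assumptions, not part of any of these APIs.

```python
from collections import Counter


def merge_tags(tags_by_source):
    """Rank normalized tags by how many sources report them.

    tags_by_source is a hypothetical mapping such as
    {"spotify": [...], "musicbrainz": [...], "lastfm": [...]}.
    """
    votes = Counter()
    for tags in tags_by_source.values():
        # Lowercase, treat hyphens as spaces, collapse whitespace.
        # Using a set means one source cannot vote twice for the same tag.
        votes.update({" ".join(t.lower().replace("-", " ").split()) for t in tags})
    return [tag for tag, _ in votes.most_common()]


merged = merge_tags({
    "spotify": ["Indie Rock", "art rock"],
    "lastfm": ["indie-rock", "seen live"],
    "musicbrainz": ["indie rock"],
})
```

Here `merged[0]` is `"indie rock"`, since all three sources agree on it after normalization.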
Recommendation Pipeline
User imports playlist
│
▼
Spotify API ──→ Track metadata + audio features
│
▼
Build taste profile:
- Genre distribution
- Average energy/danceability/valence/tempo
- Mood tendencies
- Sample artists and tracks
│
▼
LLM (cheap model) receives:
- Structured taste profile
- User's specific request/query
- List of tracks already in library (to exclude)
│
▼
Returns recommendations with
"why you'll like this" explanations
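The LLM step of the pipeline above could be sketched roughly as follows. The prompt wording, the requested output keys (`artist`, `title`, `why`), and both function names are assumptions made for illustration. The post-call filter is also an addition: it guards against the model ignoring the exclusion list, which the pipeline itself does not specify.

```python
import json


def build_prompt(profile, query, library, max_results=10):
    """Assemble the LLM input: taste profile, user request, and exclusions."""
    excluded = "\n".join(f"- {t}" for t in sorted(library))
    return (
        "You are a music recommender.\n\n"
        f"Taste profile (JSON):\n{json.dumps(profile, indent=2)}\n\n"
        f"User request: {query}\n\n"
        "Do not recommend any of these tracks, which the user already has:\n"
        f"{excluded}\n\n"
        f"Return up to {max_results} tracks as a JSON list of objects with keys "
        '"artist", "title", and "why".'
    )


def drop_known_tracks(recommendations, library):
    """Safety net after the LLM call: remove tracks already in the library."""
    known = {t.casefold() for t in library}
    return [
        r for r in recommendations
        if f'{r["artist"]} - {r["title"]}'.casefold() not in known
    ]


library = ["Radiohead - Everything In Its Right Place"]
prompt = build_prompt({"avg_energy": 0.65}, "something mellow for late nights", library)
kept = drop_known_tracks(
    [
        {"artist": "radiohead", "title": "everything in its right place", "why": "..."},
        {"artist": "Grizzly Bear", "title": "Two Weeks", "why": "..."},
    ],
    library,
)
```

Matching on a case-folded `"Artist - Title"` string is a simplification; a real implementation would more likely compare platform track IDs.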
Model Choice
The LLM reasons over structured audio-feature data and metadata. The task needs broad music knowledge but little multi-step reasoning, so the cheapest model with adequate music knowledge wins:
| Model | Cost (per 1M tokens) | Notes |
|---|---|---|
| Claude Haiku 4.5 | $1 in / $5 out | Best value, great music knowledge |
| GPT-4o-mini | $0.15 in / $0.60 out | Cheapest option |
| Gemini 2.5 Flash | $0.15 in / $0.60 out | Also cheap, good quality |
| Claude Sonnet | $3 in / $15 out | Overkill for this task |
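To put the per-token prices in per-request terms, the arithmetic can be sketched as below. The token counts (2,000 in, 500 out) are assumed for illustration and are not measurements from Vynl. The prices are the GPT-4o-mini and Claude Sonnet rows of the table.

```python
def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Cost in USD of one call; prices are USD per 1M tokens."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed request size: 2,000 prompt tokens and 500 completion tokens.
mini = request_cost(2_000, 500, price_in=0.15, price_out=0.60)
sonnet = request_cost(2_000, 500, price_in=3.00, price_out=15.00)

print(f"GPT-4o-mini:   ${mini:.4f} per request")
print(f"Claude Sonnet: ${sonnet:.4f} per request")
```

Under these assumptions a request costs about $0.0006 on GPT-4o-mini and about $0.0135 on Claude Sonnet, roughly a 22x difference.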
Taste Profile Structure
Built from a user's imported tracks:
{
"top_genres": [{"name": "indie rock", "count": 12}, ...],
"avg_energy": 0.65,
"avg_danceability": 0.55,
"avg_valence": 0.42,
"avg_tempo": 118.5,
"track_count": 47,
"sample_artists": ["Radiohead", "Tame Impala", ...],
"sample_tracks": ["Radiohead - Everything In Its Right Place", ...]
}
The LLM uses this profile to understand what the user gravitates toward sonically (high energy? melancholy? upbeat?) and find new music that matches or intentionally contrasts those patterns.
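A minimal sketch of how such a profile might be computed from imported tracks. The per-track dict shape is an assumption: in particular, Spotify attaches genres to artists rather than tracks, so the `genres` list is assumed to have been copied onto each track beforehand. `build_taste_profile`, `top_n`, and `sample_size` are hypothetical names.

```python
from collections import Counter
from statistics import mean


def build_taste_profile(tracks, top_n=5, sample_size=5):
    """Aggregate per-track dicts into a taste profile.

    Each track is assumed to look like:
    {"artist": str, "title": str, "genres": [str, ...],
     "energy": float, "danceability": float, "valence": float, "tempo": float}
    """
    if not tracks:
        raise ValueError("cannot build a profile from an empty track list")

    genre_counts = Counter(g for t in tracks for g in t.get("genres", []))
    # dict.fromkeys removes duplicate artists while keeping first-seen order.
    artists = list(dict.fromkeys(t["artist"] for t in tracks))

    return {
        "top_genres": [
            {"name": name, "count": count}
            for name, count in genre_counts.most_common(top_n)
        ],
        "avg_energy": round(mean(t["energy"] for t in tracks), 2),
        "avg_danceability": round(mean(t["danceability"] for t in tracks), 2),
        "avg_valence": round(mean(t["valence"] for t in tracks), 2),
        "avg_tempo": round(mean(t["tempo"] for t in tracks), 1),
        "track_count": len(tracks),
        "sample_artists": artists[:sample_size],
        "sample_tracks": [
            f'{t["artist"]} - {t["title"]}' for t in tracks[:sample_size]
        ],
    }


profile = build_taste_profile([
    {"artist": "Radiohead", "title": "Everything In Its Right Place",
     "genres": ["art rock", "indie rock"],
     "energy": 0.5, "danceability": 0.4, "valence": 0.2, "tempo": 120.0},
    {"artist": "Tame Impala", "title": "Let It Happen",
     "genres": ["indie rock"],
     "energy": 0.7, "danceability": 0.6, "valence": 0.4, "tempo": 100.0},
])
```

Taking the first `sample_size` tracks as samples is a simplification; a random or genre-stratified sample would represent a large library better.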
Platform Support
Currently Implemented
- Spotify (OAuth + playlist import + audio features)
Planned
- YouTube Music (via ytmusicapi, unofficial Python library)
- Apple Music (MusicKit API, requires Apple Developer account)
- Last.fm (scrobble history import + similar artist data)
- Tidal (official API)
- Manual entry / CSV upload (fallback for any platform)
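The CSV fallback could be as small as the sketch below. The required column names (`artist`, `title`) and the function name `parse_track_csv` are assumptions, since the upload format is not yet defined. Tracks imported this way carry no audio features, so they would need a lookup against one of the other sources before contributing to the averages in a taste profile.

```python
import csv
import io


def parse_track_csv(text):
    """Parse CSV text with 'artist' and 'title' columns into track dicts."""
    reader = csv.DictReader(io.StringIO(text))
    headers = {h.strip().lower() for h in (reader.fieldnames or [])}
    missing = {"artist", "title"} - headers
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")

    tracks = []
    for row in reader:
        # Normalize header case and whitespace; skip overflow cells (key None).
        clean = {k.strip().lower(): (v or "").strip() for k, v in row.items() if k}
        if clean["artist"] and clean["title"]:  # ignore incomplete rows
            tracks.append({"artist": clean["artist"], "title": clean["title"]})
    return tracks


uploaded = parse_track_csv(
    "Artist,Title\n"
    "Radiohead,Everything In Its Right Place\n"
    ",Untitled\n"
)
```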