Agentic Movie Recommender
Agentic AI/RAG/Ollama Cloud/DSPy/GEPA/TMDB/Python — preference-first recommendations with strict latency and output contracts
Course project: RAG + LLM + DSPy/GEPA
Author: Regan Yin
Course: BAMS 521 — Agentic AI (UBC Sauder)
A preference-first movie recommender over a TMDB top-1000 CSV: deterministic retrieval and ranking in Python, Ollama Cloud (gemma4:31b-cloud) for final selection and copy, plus corrective retry and a deterministic fallback so every call returns a valid tmdb_id and a persuasive description within course limits.
Contents
- Executive summary
- Grader requirements
- Tech stack
- Pipeline
- Design choices
- Evaluation & tuning
- How to run

Executive summary
The system behaves as a retrieval-augmented re-ranker: lexical and keyword-heavy RAG pulls a strong shortlist, hybrid scoring encodes preferences over watch history (including explicit conflict handling), and the LLM only chooses one ID and writes the pitch. Invalid IDs, history collisions, or timeouts fall back to a hash-stable deterministic description so outputs stay spec-compliant.
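The preference-over-history behaviour can be pictured as a weighted candidate score in which explicit preference matches outweigh history similarity and conflicting genres subtract points. The sketch below is a minimal illustration of that idea; the weights, field names, and score_candidate helper are assumptions for exposition, not the repository's actual scoring code.

```python
# Illustrative sketch of preference-first hybrid scoring.
# Weights and field names are assumptions, not the project's real constants.
PREF_WEIGHT = 3.0       # explicit preference matches dominate
HISTORY_WEIGHT = 1.0    # history similarity is a weaker signal
CONFLICT_PENALTY = 2.5  # genres that contradict the stated preference

def score_candidate(movie, preferred_genres, history_genres, conflict_genres):
    """Return a hybrid score where preferences beat history."""
    genres = set(movie.get("genres", []))
    score = 0.0
    score += PREF_WEIGHT * len(genres & preferred_genres)
    score += HISTORY_WEIGHT * len(genres & history_genres)
    score -= CONFLICT_PENALTY * len(genres & conflict_genres)
    score += movie.get("vote_average", 0.0) / 10.0  # small quality prior
    return score

if __name__ == "__main__":
    candidate = {"title": "Tense Heist", "genres": ["Thriller", "Crime"], "vote_average": 7.8}
    print(score_candidate(
        candidate,
        preferred_genres={"Thriller"},
        history_genres={"Romance"},
        conflict_genres={"Romance"},
    ))
```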
Enforced contracts (course / grader)
get_recommendation(preferences, history, history_ids) must return a dict with the keys tmdb_id (int) and description (str). The ID must come from the offline pool and must never duplicate watch history, descriptions must respect the character cap, and API keys load from the environment rather than the source tree.

| Requirement | Where it lives |
|---|---|
| Valid tmdb_id in pool | Post-LLM validation, retry, then fallback |
| No history repeats | Retrieval de-dup, prompt rules, final check |
| 20 s wall clock | Timeouts + instant fallback path |
| 500-char descriptions | Constants + sanitizer + smart truncation |
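A compressed sketch of the post-LLM guard behind the first two rows is shown below; MAX_DESC_CHARS, the pool and fallback arguments, and enforce_contract itself are hypothetical names based on the contract above, not the exact code in the repository.

```python
# Sketch of the output-contract guard, assuming a 500-character cap and
# hypothetical helpers for the candidate pool and deterministic fallback.
MAX_DESC_CHARS = 500

def enforce_contract(tmdb_id, description, pool_ids, history_ids, fallback):
    """Return a spec-compliant {'tmdb_id': int, 'description': str} dict."""
    if tmdb_id not in pool_ids or tmdb_id in history_ids:
        tmdb_id, description = fallback()          # deterministic fallback path
    description = description.strip()[:MAX_DESC_CHARS]
    return {"tmdb_id": int(tmdb_id), "description": description}

if __name__ == "__main__":
    result = enforce_contract(
        tmdb_id=27205,
        description="A dream-heist thriller with a precision-engineered twist.",
        pool_ids={27205, 157336, 603},
        history_ids={603},
        fallback=lambda: (157336, "A cerebral space odyssey about love and time."),
    )
    print(result)
```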
Tech stack
Python · pandas · Ollama Cloud · DSPy · GEPA · Streamlit (app) · TMDB CSV (offline corpus)

Pipeline
- Normalize inputs — strip, dedupe, type-coerce preferences and history.
- Preference analysis — genre weights, blocked genres, tone/mood expansions.
- RAG retrieval (~100) — lexical + overview/tagline/keywords + quality, with conflict and block penalties.
- Hybrid re-rank (~14) — preference beats history; bonuses for tone/title fit.
- LLM pick + JSON — strict prompt, banned marketing phrases, pivot guidance when history conflicts with preferences.
- Corrective retry — one short retry if the model violates pool or history rules.
- Deterministic fallback — template × hook variety keyed by hash for stable, fluent copy without the model (see the sketch after this list).
- Sanitize — strip markdown, labels, banned phrases; truncate at sentence boundaries.
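Steps 7 and 8 can be illustrated together: a stable hash of the inputs picks a template and hook pair so repeat calls produce identical copy without the model, and truncation lands on a sentence boundary. The templates, hooks, and helper names below are hypothetical stand-ins rather than the project's real strings.

```python
import hashlib
import re

# Hash-stable fallback copy: templates, hooks, and helpers are illustrative
# assumptions, not the repository's actual strings.
TEMPLATES = [
    "{title} delivers {hook}, pairing tight pacing with a payoff you will not see coming.",
    "If you want {hook}, {title} is the rare pick that earns every minute of its runtime.",
]
HOOKS = ["a slow-burn mystery", "a razor-sharp twist", "a feel-good finale"]

def deterministic_description(title, preferences, max_chars=500):
    """Pick a template/hook pair keyed by a stable hash of the normalized inputs."""
    key = hashlib.sha256(f"{title}|{preferences}".encode("utf-8")).hexdigest()
    template = TEMPLATES[int(key[:8], 16) % len(TEMPLATES)]
    hook = HOOKS[int(key[8:16], 16) % len(HOOKS)]
    return truncate_at_sentence(template.format(title=title, hook=hook), max_chars)

def truncate_at_sentence(text, max_chars):
    """Cut at the last full sentence that fits inside the character cap."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    sentences = re.findall(r".+?[.!?](?:\s|$)", clipped)
    return "".join(sentences).strip() if sentences else clipped.rstrip()

if __name__ == "__main__":
    print(deterministic_description("Tense Heist", "sci-fi with a smart twist"))
```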
Design highlights
- Preference over history — explicit conflict genres (e.g. romance in history vs “pure thriller” ask) reduce scores and nudge the model to pivot in one clause.
- Tone vocabulary — alias maps lift intent (“slow-burn”, “twist”, “feel-good”) beyond raw genres.
- Negative constraints — “no horror” style phrases become hard scoring penalties plus prompt rules.
- No live TMDB at inference — keeps latency predictable; TMDB API optional for offline eval generation only.
- Process-local cache on normalized inputs for fast repeat calls.
- GEPA output — winning prompt fragments persist in dspy_gepa_best_config.json and load in llm.py (loading sketched below).
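How llm.py might consume that persisted configuration can be sketched as follows; the JSON keys ("instructions", "style") and the load_gepa_config helper are assumptions about shape, since the real schema lives in the repository.

```python
import json
from pathlib import Path

# Hypothetical loader for the GEPA-tuned prompt fragments; the key names
# are assumed, not taken from the real file.
def load_gepa_config(path="dspy_gepa_best_config.json"):
    """Return tuned prompt fragments, or empty defaults if the file is absent."""
    config_path = Path(path)
    if not config_path.exists():
        return {"instructions": "", "style": {}}
    with config_path.open(encoding="utf-8") as fh:
        return json.load(fh)

if __name__ == "__main__":
    config = load_gepa_config()
    prompt_prefix = config.get("instructions", "")
    print(f"loaded {len(prompt_prefix)} characters of tuned instructions")
```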
Evaluation and tuning
Quality is driven by a weighted automated metric (genre alignment, quality priors, description length bucket, specificity vs filler, history-acknowledgement bonus, penalties for banned fluff) plus hard gates on pool and history membership. dspy_gepa_benchmark.py runs a style sweep, then GEPA reflection on prompt instructions using structured metric feedback, and writes the best configuration for production inference.
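As a rough picture of how soft scores and hard gates can be combined, the sketch below uses made-up weights and component proxies; the actual metric and its weights are defined in dspy_gepa_benchmark.py.

```python
# Toy version of a weighted evaluation metric with hard gates.
# Weights, the length bucket, and the banned-phrase list are illustrative assumptions.
BANNED_PHRASES = ("must-watch", "masterpiece", "tour de force")

def recommendation_score(rec, pool_ids, history_ids, target_genres, history_titles):
    """Score one recommendation; hard gates zero out invalid outputs."""
    if rec["tmdb_id"] not in pool_ids or rec["tmdb_id"] in history_ids:
        return 0.0                                   # hard gates on pool and history
    desc = rec["description"]
    lowered = desc.lower()
    genre_hit = len(target_genres & set(rec.get("genres", []))) / max(len(target_genres), 1)
    score = 0.4 * genre_hit                          # genre alignment
    score += 0.2 * (1.0 if 150 <= len(desc) <= 500 else 0.5)                         # length bucket
    score += 0.2 * (1.0 if any(t.lower() in lowered for t in history_titles) else 0.0)  # history nod
    score -= 0.3 * sum(p in lowered for p in BANNED_PHRASES)                          # fluff penalty
    return max(score, 0.0)

if __name__ == "__main__":
    rec = {
        "tmdb_id": 27205,
        "genres": ["Thriller", "Science Fiction"],
        "description": "Like Inception, this heist bends time, trading dreams for a ticking vault and a colder twist.",
    }
    print(recommendation_score(rec, pool_ids={27205}, history_ids=set(),
                               target_genres={"Thriller"}, history_titles=["Inception"]))
```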
How to run
Clone the repo, create a virtual environment, install requirements, and set OLLAMA_API_KEY. Use python llm.py for the CLI, python test.py for the course tests, and optionally dspy_gepa_benchmark.py to regenerate tuned prompts.
```bash
git clone https://github.com/Regan-Yin/agentic-movie-recommender.git
cd agentic-movie-recommender
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export OLLAMA_API_KEY=your_key_here
python llm.py --preferences "Sci-fi with a smart twist" --history "Inception"
python test.py
```
Full flags, zip submission layout, and metric tables are documented in the repository README.