Agentic Movie Recommender

Agentic AI/RAG/Ollama Cloud/DSPy/GEPA/TMDB/Python — preference-first recommendations with strict latency and output contracts

Course project RAG + LLM + DSPy/GEPA

Author: Regan Yin
Course: BAMS 521 — Agentic AI (UBC Sauder)
A preference-first movie recommender over a TMDB top-1000 CSV. Retrieval and ranking are deterministic Python; Ollama Cloud (gemma4:31b-cloud) handles only the final selection and the pitch copy; a corrective retry plus a deterministic fallback guarantee that every call returns a valid tmdb_id and a persuasive description within course limits.

View on GitHub →

Agentic movie recommender interface preview
Project preview — UI and recommendation flow.

Executive summary

Latency budget
≤ 20 s (13 s primary + 4 s retry + fallback)
Description
≤ 500 chars, sentence-aware trim
Candidate pool
350 titles from CSV (top slice)
LLM
gemma4:31b-cloud via Ollama
Tuning
DSPy + GEPA on a weighted metric

The system behaves as a retrieval-augmented re-ranker: lexical and keyword-heavy RAG pulls a strong shortlist, hybrid scoring encodes preferences over watch history (including explicit conflict handling), and the LLM only chooses one ID and writes the pitch. Invalid IDs, history collisions, or timeouts fall back to a hash-stable deterministic description so outputs stay spec-compliant.
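That retry-then-fallback loop can be sketched as follows. All names, the toy pool, and the stub LLM are illustrative, not the project's actual API; the point is the validate → retry → deterministic-fallback shape:

```python
POOL = {27205, 157336, 603}          # offline candidate ids (toy pool)
TITLES = {157336: "Interstellar"}    # minimal title lookup for the fallback

def llm_pick(candidates, preferences):
    """Stand-in for the Ollama call; misbehaves once to exercise the retry."""
    llm_pick.calls = getattr(llm_pick, "calls", 0) + 1
    if llm_pick.calls == 1:
        return {"tmdb_id": 999, "description": "Bad pick."}  # id not in pool
    return {"tmdb_id": candidates[0], "description": "A taut, twist-driven epic."}

def recommend(preferences, history_ids, candidates):
    for _ in range(2):  # primary call plus one corrective retry
        pick = llm_pick(candidates, preferences)
        if pick["tmdb_id"] in POOL and pick["tmdb_id"] not in history_ids:
            return pick
    # deterministic fallback: best unseen candidate plus templated copy
    best = next(c for c in candidates if c not in history_ids)
    return {"tmdb_id": best, "description": f"{TITLES.get(best, best)}: a spec-safe pitch."}

result = recommend("smart sci-fi", {27205}, [157336, 603])
```

Because the fallback never consults the model, the worst case is still a valid, in-pool, non-history ID with fluent copy.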

Enforced contracts (course / grader)

get_recommendation(preferences, history, history_ids) must return a dict with keys tmdb_id (int) and description (str). The ID must come from the offline pool and must never duplicate watch history; descriptions must respect the character cap; API keys load from the environment, never from the source tree.
Where each requirement is enforced:

  • Valid tmdb_id in pool — post-LLM validation, retry, then fallback
  • No history repeats — retrieval de-dup, prompt rules, final check
  • 20 s wall clock — timeouts plus an instant fallback path
  • 500-char descriptions — constants, sanitizer, and smart truncation
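The sentence-aware cap could look like this. A minimal sketch: the constant and function names are assumed, not taken from the repository:

```python
MAX_DESC_CHARS = 500  # course cap; constant name is assumed

def smart_truncate(text: str, limit: int = MAX_DESC_CHARS) -> str:
    """Trim to the last sentence boundary that fits under the limit,
    falling back to a word boundary when no sentence fits."""
    text = " ".join(text.split())  # collapse stray whitespace first
    if len(text) <= limit:
        return text
    clipped = text[:limit]
    # prefer ending on a full sentence; rfind returns -1 when absent
    cut = max(clipped.rfind(m) for m in (". ", "! ", "? "))
    if cut != -1:
        return clipped[: cut + 1]
    return clipped.rsplit(" ", 1)[0]
```

Trimming at sentence boundaries keeps the pitch fluent instead of ending mid-word at exactly 500 characters.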

Tech stack

Python · pandas · Ollama Cloud · DSPy · GEPA · Streamlit (app) · TMDB CSV (offline corpus)

Pipeline

  1. Normalize inputs — strip, dedupe, type-coerce preferences and history.
  2. Preference analysis — genre weights, blocked genres, tone/mood expansions.
  3. RAG retrieval (~100) — lexical + overview/tagline/keywords + quality, with conflict and block penalties.
  4. Hybrid re-rank (~14) — preference beats history; bonuses for tone/title fit.
  5. LLM pick + JSON — strict prompt, banned marketing phrases, pivot guidance when history conflicts with preferences.
  6. Corrective retry — one short retry if the model violates pool or history rules.
  7. Deterministic fallback — template × hook variety keyed by hash for stable, fluent copy without the model.
  8. Sanitize — strip markdown, labels, banned phrases; truncate at sentence boundaries.
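Steps 2–4 above reduce to an additive score over preference signals. A sketch with a toy corpus; field names mirror typical TMDB CSV columns and the weights are illustrative, not the project's tuned values:

```python
# Toy corpus rows; real rows come from the TMDB CSV (field names assumed).
movies = [
    {"tmdb_id": 27205, "genres": ["Science Fiction", "Thriller"],
     "overview": "A mind-bending heist with a twist.", "tagline": "", "vote_average": 8.4},
    {"tmdb_id": 760161, "genres": ["Horror"],
     "overview": "A haunted house.", "tagline": "", "vote_average": 7.0},
]

def hybrid_score(movie, liked_genres, blocked_genres, tone_terms):
    """Additive preference score with hard penalties for blocked genres."""
    genres = set(movie["genres"])
    score = 2.0 * len(genres & liked_genres)            # preference alignment
    score -= 5.0 * len(genres & blocked_genres)         # "no horror" hard penalty
    text = (movie["overview"] + " " + movie["tagline"]).lower()
    score += sum(1.0 for t in tone_terms if t in text)  # tone vocabulary hits
    score += movie["vote_average"] / 10                 # quality prior
    return score

shortlist = sorted(
    movies,
    key=lambda m: hybrid_score(m, {"Science Fiction"}, {"Horror"}, ["twist"]),
    reverse=True,
)[:14]
```

Because the blocked-genre penalty dwarfs any single bonus, a "no horror" ask effectively removes horror titles from the ~14-title shortlist the LLM sees.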

Design highlights

  • Preference over history — explicit conflict genres (e.g. romance in history vs “pure thriller” ask) reduce scores and nudge the model to pivot in one clause.
  • Tone vocabulary — alias maps lift intent (“slow-burn”, “twist”, “feel-good”) beyond raw genres.
  • Negative constraints — “no horror” style phrases become hard scoring penalties plus prompt rules.
  • No live TMDB at inference — keeps latency predictable; TMDB API optional for offline eval generation only.
  • Process-local cache on normalized inputs for fast repeat calls.
  • GEPA output — winning prompt fragments persist in dspy_gepa_best_config.json and load in llm.py.
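The hash-keyed fallback in the list above can be sketched like this. The templates and hooks are invented for illustration; only the hash-stability idea comes from the project:

```python
import hashlib

# Invented templates and hooks; the real project's copy differs.
TEMPLATES = [
    "{title} delivers {hook}, and it earns every minute.",
    "If you want {hook}, {title} is the pick tonight.",
    "{title}: {hook} from the first scene to the last.",
]
HOOKS = ["a razor-sharp twist", "slow-burn tension", "a feel-good finale"]

def fallback_description(title: str, tmdb_id: int) -> str:
    """Same (tmdb_id, title) always yields the same copy across runs."""
    h = int(hashlib.md5(f"{tmdb_id}:{title}".encode()).hexdigest(), 16)
    template = TEMPLATES[h % len(TEMPLATES)]
    hook = HOOKS[(h // len(TEMPLATES)) % len(HOOKS)]
    return template.format(title=title, hook=hook)
```

Hashing the inputs rather than sampling randomly keeps the fallback deterministic, so repeat calls and graders see identical output.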

Evaluation and tuning

Quality is driven by a weighted automated metric (genre alignment, quality priors, description length bucket, specificity vs filler, history-acknowledgement bonus, penalties for banned fluff) plus hard gates on pool and history membership. dspy_gepa_benchmark.py runs a style sweep, then GEPA reflection on prompt instructions using structured metric feedback, and writes the best configuration for production inference.
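A hedged sketch of such a weighted metric, with made-up weights and a toy fluff list; the project's actual sub-scores and coefficients live in its benchmark script:

```python
BANNED = {"must-watch", "masterpiece", "tour de force"}  # example fluff list

def score_output(description: str, genre_match: float, quality: float,
                 mentions_history: bool) -> float:
    """Weighted blend: genre alignment + quality prior + length bucket
    + history-acknowledgement bonus - banned-fluff penalty."""
    length_ok = 1.0 if 200 <= len(description) <= 500 else 0.5
    fluff = sum(1 for b in BANNED if b in description.lower())
    bonus = 0.1 if mentions_history else 0.0
    return 0.4 * genre_match + 0.2 * quality + 0.2 * length_ok + bonus - 0.1 * fluff
```

Keeping the metric fully automated is what lets GEPA iterate on prompt instructions without human labels.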

How to run

Clone the repo, create a virtual environment, install requirements, and set OLLAMA_API_KEY. Use python llm.py for the CLI, python test.py for the course tests, and optionally dspy_gepa_benchmark.py to regenerate tuned prompts.

git clone https://github.com/Regan-Yin/agentic-movie-recommender.git
cd agentic-movie-recommender
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export OLLAMA_API_KEY=your_key_here
python llm.py --preferences "Sci-fi with a smart twist" --history "Inception"
python test.py

Full flags, zip submission layout, and metric tables are documented in the repository README.