Final Project

Build a private web application that:

takes a spoken description of a book,
transcribes the request,
decides whether the description is specific enough to identify the target,
asks a follow-up question when needed,
retrieves the text via an IRC channel, in a format that cannot be anticipated in advance,
dynamically decides how to safely unpack or normalize the retrieved file(s), and
reads the result aloud using the student’s own synthesized voice.
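
The disambiguation and follow-up steps above can be sketched as a small loop. This is a minimal illustration, not a required design: `Decision`, `resolve_book`, `decide`, and `ask_user` are hypothetical names, and `decide` stands in for whatever LLM or equivalent reasoning component the project uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Decision:
    """Hypothetical result of the reasoning component."""
    book: Optional[str]      # resolved target, or None if still ambiguous
    question: Optional[str]  # follow-up question to ask when book is None


def resolve_book(
    first_request: str,
    decide: Callable[[List[str]], Decision],
    ask_user: Callable[[str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask follow-up questions until the target is identified or rounds run out."""
    transcript = [first_request]
    for _ in range(max_rounds):
        decision = decide(transcript)
        if decision.book is not None:
            return decision.book
        # Fall back to a generic question if the component did not supply one.
        question = decision.question or "Can you tell me more about the book?"
        transcript.append(ask_user(question))
    return None  # still ambiguous; the caller decides how to report this
```

Capping the number of rounds keeps a vague request from trapping the user in an endless dialogue; the limit of three is an arbitrary choice for the sketch.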

Expected Components

Browser-based or web-app voice input
Speech-to-text
LLM or equivalent reasoning component for disambiguation
Clarification dialogue
Retrieval backend
IRC interaction
Safe extraction or normalization pipeline
Text-to-speech
Azure-backed deployment or infrastructure
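
For the safe extraction or normalization component, one possible starting point is to identify the retrieved file by its leading bytes rather than its name, and to refuse archive entries that would escape the destination directory or expand beyond a size budget. The sketch below covers only ZIP archives using the Python standard library; the function names and the 50 MB limit are assumptions for illustration, and other formats would need their own handlers.

```python
import io
import zipfile
from pathlib import Path


def sniff_format(data: bytes) -> str:
    """Guess the container format from magic bytes, not the file name."""
    if data.startswith(b"PK\x03\x04"):
        return "zip"  # also the container used by EPUB
    if data.startswith(b"Rar!\x1a\x07"):
        return "rar"
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x1f\x8b"):
        return "gzip"
    return "unknown"


def safe_extract_zip(data: bytes, dest: Path, max_total_bytes: int = 50_000_000) -> list:
    """Extract a ZIP archive, rejecting path traversal and oversized output."""
    dest = dest.resolve()
    written = []
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            target = (dest / info.filename).resolve()
            # Reject entries such as "../x" or absolute paths.
            if dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {info.filename}")
            remaining = max_total_bytes - total
            with zf.open(info) as src:
                # Read one byte past the budget so overruns are detectable
                # without trusting the size declared in the archive header.
                payload = src.read(remaining + 1)
            if len(payload) > remaining:
                raise ValueError("archive expands beyond the size limit")
            total += len(payload)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(payload)
            written.append(target)
    return written
```

Checking the bytes actually read, instead of the declared `file_size`, is a deliberate choice here: the declared size comes from the untrusted archive itself.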

Evaluation Notes

This project is judged substantially on integration and infrastructure, not only on model demos. Common failure points are:

no working fully agentic component
no working cloned voice end-to-end
local-only implementations without a real web front end
incomplete retrieval or file-handling logic

Sample Test Prompts

Sample prompts include:

“Wolf Story by William McCleery”
“The Dolls’ House by Rumer Godden”
“Twig by Elizabeth Orton Jones”
“The Twenty-One Balloons by William Pène du Bois”
“Emil is sedated with laced chocolate and robbed on a train…”
“15th century Poland, alchemy, the Philosopher’s Stone…”

The system must handle both exact titles and vague natural-language descriptions.
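
One way to separate the two kinds of input is a cheap pre-check that recognizes the "Title by Author" shape before handing anything else to the reasoning component. The sketch below is only a heuristic under that assumption: `parse_exact_request` is a hypothetical helper, it will misclassify some descriptions that happen to contain "by" followed by a capitalized word, and its output should still be verified downstream.

```python
import re
from typing import Optional, Tuple

# "<title> by <Author Name>", where the author is one to five words
# and the first of them starts with a capital letter.
_TITLE_BY_AUTHOR = re.compile(
    r"^(?P<title>.+?)\s+by\s+"
    r"(?P<author>[A-Z][\w'’.\-]*(?:\s+[\w'’.\-]+){0,4})$"
)


def parse_exact_request(text: str) -> Optional[Tuple[str, str]]:
    """Return (title, author) if the text looks like an exact request, else None."""
    cleaned = text.strip(" \t\n\"“”")
    match = _TITLE_BY_AUTHOR.match(cleaned)
    if match is None:
        return None  # treat as a vague description and route to disambiguation
    return match.group("title"), match.group("author")
```

With the sample prompts above, "Wolf Story by William McCleery" parses into a title and an author, while the two descriptive prompts return `None` and would go to the clarification path.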

Deployment

Students will deploy their pipelines using Azure.

Repository Copy

A markdown copy of this handout is also kept in the repository under _starter_code/final-project/README.md.