Vidistiller
Extracts timestamped YouTube transcripts and frame snapshots, ready to feed directly into a language model.
Vidistiller solves a specific problem: YouTube is full of information that language models cannot access directly. It extracts a video's transcript, aligns it with timestamps, and captures snapshots, producing a structured, model-ready artifact.
The name is a portmanteau of vidi (Latin: I saw) and distiller. It takes what was seen and reduces it to what is useful.
What it does
- Fetches YouTube transcripts with precise timestamp alignment
- Captures frame snapshots at key moments for visual context
- Outputs a clean, structured format ready to feed into any language model
- Turns passive video content into active, queryable knowledge
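
To make the idea of a structured, model-ready output concrete, here is a minimal sketch of how timestamped transcript segments and snapshots could be merged into one text block. It is illustrative only and is not Vidistiller's actual output format or code: the `Segment` and `Snapshot` types, the `render` function, and the frame paths are all hypothetical, and the sketch assumes the transcript and frames have already been fetched.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    text: str


@dataclass
class Snapshot:
    time: float  # seconds from the start of the video
    path: str    # hypothetical location of the captured frame


def format_timestamp(seconds: float) -> str:
    """Render a non-negative offset in seconds as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render(segments: list[Segment], snapshots: list[Snapshot]) -> str:
    """Interleave each snapshot under the segment that was being spoken."""
    if not segments:
        return ""
    ordered = sorted(segments, key=lambda s: s.start)
    starts = [s.start for s in ordered]
    attached: dict[int, list[Snapshot]] = {i: [] for i in range(len(ordered))}
    for snap in sorted(snapshots, key=lambda s: s.time):
        # Last segment starting at or before the snapshot; clamp to the first.
        index = max(bisect_right(starts, snap.time) - 1, 0)
        attached[index].append(snap)

    lines = []
    for i, seg in enumerate(ordered):
        lines.append(f"[{format_timestamp(seg.start)}] {seg.text}")
        for snap in attached[i]:
            lines.append(
                f"    (snapshot at {format_timestamp(snap.time)}: {snap.path})"
            )
    return "\n".join(lines)


if __name__ == "__main__":
    demo_segments = [Segment(0.0, "Intro"), Segment(65.5, "Main topic")]
    demo_snapshots = [Snapshot(70.0, "frames/0001.png")]
    print(render(demo_segments, demo_snapshots))
```

Keeping each snapshot next to the line being spoken at that moment lets a model relate visual context to the surrounding words without a separate lookup step.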
Status
Active. Used in the workshop for research, reference, and model context building.