Extract transcript highlights

Create highlight cuts based on transcript signal — information density, signal words, and meaningful punctuation.

Select the best parts of spoken content using transcript analysis. This tool scores transcript segments for information density and signal, then keeps the winners — perfect for podcasts, interviews, and presentations.

What it does

Uses caption transcript timing to score each segment
Scores for density (information per word), signal words, and punctuation
Matches against your stated goal (e.g., "strongest arguments")
Keeps selected transcript ranges
Removes low-signal segments
Ripples main track, overlays, and audio to preserve sync
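
The keep/remove step above can be pictured as a budgeted selection. The following is only a minimal sketch under stated assumptions, not the tool's actual implementation: `select_highlights` is a hypothetical name, segments are plain `(start, end)` tuples in seconds, and a greedy highest-score-first pass stands in for whatever strategy the agent really uses when you give it a duration target.

```python
def select_highlights(segments, scores, target_seconds):
    """Pick high-scoring segments that fit a duration budget.

    segments: list of (start, end) tuples in seconds, in timeline order.
    scores:   one score per segment (higher is better).
    Returns the kept segments in their original timeline order.
    """
    # Visit segments from highest to lowest score (assumed greedy strategy).
    ranked = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    kept, total = [], 0.0
    for i in ranked:
        start, end = segments[i]
        length = end - start
        # Keep a segment only if it still fits in the remaining budget.
        if total + length <= target_seconds:
            kept.append(i)
            total += length
    # Restore original order so the cut plays chronologically.
    return [segments[i] for i in sorted(kept)]


segments = [(0.0, 5.0), (5.0, 12.0), (12.0, 15.0), (15.0, 25.0)]
scores = [0.2, 0.9, 0.7, 0.8]
highlights = select_highlights(segments, scores, target_seconds=12.0)
```

Greedy selection is simple but can skip a long, high-scoring segment in favor of shorter ones that fit; a real implementation may weigh that trade-off differently.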

When to use it

| Source | Use case |
|--------|----------|
| Podcast | Best insights, key takeaways |
| Interview | Strongest answers, quotable moments |
| Presentation | Dense explanations, killer stats |
| Tutorial | Core tips, fluff skipped |

How to use

With duration target:

"Extract the best 3 minutes based on the transcript"

With content focus:

"Keep the segments with the strongest arguments"

"Extract highlights about pricing and objections"

Combining:

"Extract highlights about marketing, 2 minutes"

Scoring criteria

The agent scores each transcript segment:

| Signal | How it scores |
|--------|---------------|
| Word density | Concise, information-rich sentences score higher |
| Signal words | "Because," "reason," "result," "difference" boost score |
| Punctuation | Exclamation points, meaningful pauses |
| Match to goal | Segments containing your goal keywords |
| Grammar | Statements > fragments > filler |
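
To make the table concrete, here is a rough sketch of how such a score could be combined. Everything in it is an assumption for illustration: the `Segment` type, the `score_segment` function, the filler list, and the weights are hypothetical, and density is approximated as the share of non-filler words. Pauses and the grammar signal are left out for brevity.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical word lists and weights, chosen only for illustration.
SIGNAL_WORDS = {"because", "reason", "result", "difference"}
FILLER_WORDS = {"um", "uh", "like", "basically"}


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def score_segment(seg: Segment, goal_keywords: set[str]) -> float:
    """Score one transcript segment; goal_keywords must be lowercase."""
    words = re.findall(r"[a-z']+", seg.text.lower())
    if not words:
        return 0.0
    # Density proxy: fraction of words that are not filler.
    density = sum(w not in FILLER_WORDS for w in words) / len(words)
    # Each signal word and each goal keyword hit adds to the score.
    signal = sum(w in SIGNAL_WORDS for w in words)
    goal = sum(w in goal_keywords for w in words)
    # Small bonus for emphatic punctuation.
    emphasis = 0.5 if "!" in seg.text else 0.0
    return density + signal + 2.0 * goal + emphasis


strong = Segment(0.0, 4.0, "The reason pricing matters is because churn doubles!")
weak = Segment(4.0, 8.0, "um, like, you know, uh")
strong_score = score_segment(strong, {"pricing"})
weak_score = score_segment(weak, {"pricing"})
```

With these assumed weights the first segment scores well above the second, which is the ordering the table describes.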

What you get

Kept segments stitched together in original order
Removed segments deleted
Timeline compacted (ripples)
Caption timing auto-adjusts if present
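
The compaction and caption retiming above amount to simple offset arithmetic. This is a sketch under assumptions, not the editor's actual ripple logic: `ripple` and `retime` are hypothetical helper names, and kept ranges are `(start, end)` tuples in seconds, sorted by start and non-overlapping.

```python
def ripple(kept):
    """Map kept source ranges onto a gap-free output timeline."""
    out, cursor = [], 0.0
    for start, end in kept:
        length = end - start
        out.append((cursor, cursor + length))
        cursor += length
    return out


def retime(t, kept):
    """Return the output time for source time t, or None if t was removed."""
    offset = 0.0
    for start, end in kept:
        if start <= t < end:
            return offset + (t - start)
        offset += end - start  # total kept duration before this range
    return None


kept = [(5.0, 12.0), (15.0, 25.0)]
new_ranges = ripple(kept)
caption_time = retime(16.0, kept)
removed_time = retime(13.0, kept)
```

A caption that started at 16 seconds in the source lands at 8 seconds in the cut, because 7 seconds of kept material precede its range; a caption inside a removed range has no output time.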

Prerequisites

The timeline must already have captions or a transcript.

If you don't have captions yet:

"Generate captions first, then extract highlights about [topic]"

Combines well with

| Tool | Result |
|------|--------|
| Caption generation | Creates the transcript this tool uses |
| Cut silences | Tightens gaps before highlight extraction |
| Smart reframe | Reframes after extraction for vertical |
| Caption skin | Restyles captions on the cut edit |

Tips

Export transcript first: ask "show me the transcript" if you want to review before cutting
Target slightly longer: an 80% density target gives better flow than the tightest possible cut
Combine with filler removal: "remove fillers, then extract highlights" yields cleaner results
Undo and iterate: try different goals or durations; every extraction is undoable

Limitations

Uses transcript segments, not word-level precision (segments are typically 2-10 seconds)
Works best on clear speech; crosstalk may score unpredictably
Goal matching is keyword-based, not semantic (it matches the word "reason"; it does not understand reasoning)