Add cutter report and auto-regen on each match

- New CUTTER_REPORT.md: per-beat hand-off table for the video editor doing the manual recut. Per beat: trailer SMPTE in/out, source SMPTE in/out, scene id, score, status (OK / ? / MAN.), and a one-line phase description from the cached vision text.
- New scripts/generate_cutter_report.py: pure renderer that reads the current cache (match_results.json + trailer_beats.json + optional vision_descriptions.json) and writes CUTTER_REPORT.md. No side effects on the cache.
- cli.py: after every successful match the cutter report is regenerated automatically (best-effort; failures are logged and do not abort).
- README.md: new top section "Fuer den Cutter" ("For the editor") describing exactly what the editor needs (which two files to look at, how the status flag works, the recommended NLE workflow). The technical algorithm description follows below.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 13:09:16 +02:00
parent 06a2326bf1
commit 97a8f9e305
4 changed files with 390 additions and 25 deletions
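The status flag described in the commit message can be sketched as follows. The mapping mirrors the `status_for` helper in `scripts/generate_cutter_report.py` and relies on the `is_confirmed` key used there; the sample records are invented for illustration and are not taken from the real cache.

```python
from typing import Optional


def status_for(rec: Optional[dict]) -> str:
    """Map a cached match record to the report's status flag."""
    if rec is None:
        # No record at all: the matcher found nothing for this beat.
        return "MAN."
    # A record exists; it is either confirmed or only provisional.
    return "OK" if rec.get("is_confirmed") else "?"


# Illustrative records only; real ones live in .cache/match_results.json.
samples = {
    8: {"match_score": 0.733, "is_confirmed": True},
    4: {"match_score": 0.647, "is_confirmed": False},
    0: None,
}
for beat_id, rec in samples.items():
    print(beat_id, status_for(rec))
```

Note that the flag follows `is_confirmed` rather than the raw score, so the 0.65 threshold mentioned in the report is applied wherever `is_confirmed` is set, not in the renderer.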
@@ -0,0 +1,100 @@
 # Cutter Report: Manual Recut
 As of 2026-05-04. Frame rate: 23.976 fps. Source: BehindTheRedDoor_FTR_1080P_2398_Fixed.mp4. Trailer: BehindTheRedDoor_Trailer_REFERENCE.mp4.
 This file is generated automatically from the match cache. `python cli.py match` regenerates it after every successful match; to refresh it by hand, run `python scripts/generate_cutter_report.py`.
 ## How to read this table
 - **Beat**: number of the beat in the reference trailer.
 - **Trailer In/Out**: SMPTE position of the beat in the trailer (hh:mm:ss:ff).
 - **Source In/Out**: proposed position in the source film. For `MAN.`, pick it yourself.
 - **Scene**: ID of the source scene from PySceneDetect (for debugging only).
 - **Score**: 0 to 1, higher is better. A score of 0.65 or above counts as confirmed.
 - **Status**:
    - `OK`: confirmed by CV plus the vision phase check; can be used as is, apart from a spot check.
    - `?`: provisional; a scene is matched but the score is below 0.65. Check the motion phase in the preview clip and shift by a few frames if needed.
    - `MAN.`: no automatic match; either search manually or cut it as a fade to black or title.
 - **Phase**: what is visible in the trailer beat (from the vision description). It helps you find the right spot in the source.
 ## Status overview
 - **Total beats**: 25
 - **Found automatically**: 20 (5 of them confirmed)
 - **To be set manually**: 5
 ## Beat table
 | Beat | Trailer In / Out | Source In / Out | Scene | Score | Status | What is visible in the frame |
 |-----:|------------------|------------------|------:|------:|:------:|---------------------------|
 |    0 | 00:00:00:00-00:00:03:00 | —-— | — | 0.000 | MAN. | logo animation assembling from distorted shapes with motion blur |
 |    1 | 00:00:03:00-00:00:08:10 | 00:00:04:09-00:00:06:03 | 1 | 0.380 | ? |  |
 |    2 | 00:00:08:10-00:00:16:23 | —-— | — | 0.000 | MAN. |  |
 |    3 | 00:00:16:23-00:00:19:03 | 01:02:17:22-01:02:19:14 | 436 | 0.469 | ? |  |
 |    4 | 00:00:19:03-00:00:20:15 | 01:02:21:01-01:02:22:10 | 437 | 0.647 | ? |  |
 |    5 | 00:00:20:15-00:00:26:09 | 00:01:33:04-00:01:37:10 | 10 | 0.501 | ? |  |
 |    6 | 00:00:26:09-00:00:29:06 | 00:01:03:06-00:01:05:21 | 5 | 0.548 | ? |  |
 |    7 | 00:00:29:06-00:00:31:16 | 01:20:10:10-01:20:12:16 | 553 | 0.463 | ? | man appears to be engaged in conversation |
 |    8 | 00:00:31:16-00:00:33:15 | 00:00:51:07-00:00:53:01 | 5 | 0.733 | OK | static or slow drifting |
 |    9 | 00:00:33:15-00:00:36:18 | 01:20:28:20-01:20:31:17 | 557 | 0.529 | ? | speaking, transitioning from closed eyes to open mouth and focused gaze |
 |   10 | 00:00:36:18-00:00:40:02 | 01:20:35:16-01:20:39:00 | 558 | 0.635 | ? | conversation |
 |   11 | 00:00:40:02-00:00:42:03 | 01:20:40:18-01:20:42:18 | 559 | 0.502 | ? | static talking head with slight facial expression changes |
 |   12 | 00:00:42:03-00:00:50:06 | 01:14:26:01-01:14:29:10 | 519 | 0.558 | ? | static profile shot transitioning to black/darkness |
 |   13 | 00:00:50:06-00:00:53:20 | 00:43:20:02-00:43:23:10 | 308 | 0.468 | ? | static conversation; woman on right is standing and holding a cup |
 |   14 | 00:00:53:20-00:00:57:02 | 00:43:24:09-00:43:27:04 | 309 | 0.444 | ? | static conversation, subject holding a white cup |
 |   15 | 00:00:57:02-00:01:01:12 | 00:02:10:11-00:02:12:16 | 0 | 0.467 | ? | static conversation |
 |   16 | 00:01:01:12-00:01:04:12 | 01:05:12:16-01:05:15:06 | 451 | 0.613 | ? | man reaches out and touches the red door with a small object |
 |   17 | 00:01:04:12-00:01:09:03 | 01:31:22:10-01:31:24:09 | 623 | 0.684 | OK | Static intimacy transitioning to a spatial arrangement of figures |
 |   18 | 00:01:09:03-00:01:10:18 | 00:09:13:12-00:09:14:19 | 75 | 0.668 | OK | Woman in foreground turns her head from profile to face the camera while speaking |
 |   19 | 00:01:10:18-00:01:12:12 | 00:16:48:14-00:16:49:15 | 126 | 0.717 | OK | static conversation, subtle facial expression change |
 |   20 | 00:01:12:12-00:01:15:13 | 01:28:04:17-01:28:05:14 | 613 | 0.663 | OK | man kisses woman's forehead, then they pull back slightly to face each other |
 |   21 | 00:01:15:13-00:01:17:12 | —-— | — | 0.000 | MAN. | hand raised to mouth, slight facial movement |
 |   22 | 00:01:17:12-00:01:19:22 | 01:03:05:16-01:03:07:10 | 442 | 0.545 | ? |  |
 |   23 | 00:01:19:22-00:01:25:13 | —-— | — | 0.000 | MAN. |  |
 |   24 | 00:01:25:13-00:01:32:07 | —-— | — | 0.000 | MAN. |  |
 ## Beats that need manual attention
 ### Set manually (status `MAN.`)
 - **Beat 0** 00:00:00:00-00:00:03:00: logo animation assembling from distorted shapes with motion blur
 - **Beat 2** 00:00:08:10-00:00:16:23: no vision description; probably a title card, fade, or logo
 - **Beat 21** 00:01:15:13-00:01:17:12: hand raised to mouth, slight facial movement
 - **Beat 23** 00:01:19:22-00:01:25:13: no vision description; probably a title card, fade, or logo
 - **Beat 24** 00:01:25:13-00:01:32:07: no vision description; probably a title card, fade, or logo
 ### Provisional (status `?`): please review
 | Beat | Score | Source In | Phase according to vision |
 |-----:|------:|-----------|--------------------|
 |    1 | 0.380 | 00:00:04:09 |  |
 |    3 | 0.469 | 01:02:17:22 |  |
 |    4 | 0.647 | 01:02:21:01 |  |
 |    5 | 0.501 | 00:01:33:04 |  |
 |    6 | 0.548 | 00:01:03:06 |  |
 |    7 | 0.463 | 01:20:10:10 | man appears to be engaged in conversation |
 |    9 | 0.529 | 01:20:28:20 | speaking, transitioning from closed eyes to open mouth and focused gaze |
 |   10 | 0.635 | 01:20:35:16 | conversation |
 |   11 | 0.502 | 01:20:40:18 | static talking head with slight facial expression changes |
 |   12 | 0.558 | 01:14:26:01 | static profile shot transitioning to black/darkness |
 |   13 | 0.468 | 00:43:20:02 | static conversation; woman on right is standing and holding a cup |
 |   14 | 0.444 | 00:43:24:09 | static conversation, subject holding a white cup |
 |   15 | 0.467 | 00:02:10:11 | static conversation |
 |   16 | 0.613 | 01:05:12:16 | man reaches out and touches the red door with a small object |
 |   22 | 0.545 | 01:03:05:16 |  |
 ### Confirmed (status `OK`): can be used as is
 | Beat | Score | Source In | Phase according to vision |
 |-----:|------:|-----------|--------------------|
 |    8 | 0.733 | 00:00:51:07 | static or slow drifting |
 |   17 | 0.684 | 01:31:22:10 | Static intimacy transitioning to a spatial arrangement of figures |
 |   18 | 0.668 | 00:09:13:12 | Woman in foreground turns her head from profile to face the camera while speaking |
 |   19 | 0.717 | 00:16:48:14 | static conversation, subtle facial expression change |
 |   20 | 0.663 | 01:28:04:17 | man kisses woman's forehead, then they pull back slightly to face each other |
 ## Review notes
 1. Source times should match the motion phase of the corresponding trailer beat. If they do not, shift Source In a few frames earlier or later within the same source scene.
 2. If the source clip is shorter than the trailer beat (Source Out minus Source In is less than Trailer Out minus Trailer In), the trailer beat contains a fade or title card; fill the gap in the edit with a fade to black or the source tail.
 3. `OK` beats are double-verified by CV plus the vision phase check; spot-check them anyway.
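The second review note can be checked mechanically. The sketch below converts the report's timecodes to frame counts and computes how many frames a source clip falls short of its trailer beat. It assumes non-drop-frame timecode counted at 24 frames per timecode second (the rounded form of 23.976 fps); `smpte_to_frames` and `shortfall_frames` are hypothetical helper names, not part of the tool.

```python
def smpte_to_frames(tc: str, fps: int = 24) -> int:
    """Convert hh:mm:ss:ff (non-drop-frame) to a frame count."""
    h, m, s, f = (int(part) for part in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * fps + f


def shortfall_frames(trailer_in: str, trailer_out: str,
                     source_in: str, source_out: str, fps: int = 24) -> int:
    """Frames by which the source clip is shorter than the trailer beat."""
    trailer_len = smpte_to_frames(trailer_out, fps) - smpte_to_frames(trailer_in, fps)
    source_len = smpte_to_frames(source_out, fps) - smpte_to_frames(source_in, fps)
    # Never report a negative shortfall when the source clip is longer.
    return max(0, trailer_len - source_len)


# Beat 17 from the beat table: the trailer beat runs 111 frames, the source clip 47.
print(shortfall_frames("00:01:04:12", "00:01:09:03", "01:31:22:10", "01:31:24:09"))
```

For beat 17 this reports a gap of 64 frames, which is the stretch that would have to be covered with a fade to black or the source tail.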
@@ -1,27 +1,63 @@
 # AI Trailer Generator v2
-**Frame-accurate trailer reconstruction via pure Computer Vision**
-> Feed in a reference trailer and its source film, and get back a finished FCPXML/EDL that rebuilds the trailer frame-accurately from the source film.
+**Frame-accurate rebuild of a trailer from the source film.**
+You feed in two videos, a reference trailer and the matching
 feature film, and get back a finished FCPXML/EDL for your edit bay that
 rebuilds the trailer beat by beat from the source film.
 ---
-## The Core Principle
-By default no LLM is used for visual matching. Optionally, a vision layer can
-supply cached 3-frame descriptions as additional search anchors; the final
-match still stays CV-verified.
-| Phase | What happens | Technology |
-|-------|-------------|-------------|
-| **0: Prep** | Analyze the reference trailer and extract beats | PySceneDetect + OpenCV |
-| **1: Global Scan** | Scan the entire source film against all beats via an FFmpeg stream (2 FPS) | FFmpeg pipe + luma histogram |
-| **1b: Optional Vision Seeds** | Cache 3-frame descriptions for uncertain top-K scenes | OpenAI-compatible vision LLM |
-| **2: Refine** | Refine the best hits to frame level | OpenCV `matchTemplate` |
-| **3: Dramaturgy** | Narrative BeatType classification from dialogue text | OpenRouter LLM |
-| **4: Export** | Timeline → FCPXML 1.10 or CMX 3600 EDL | xml.etree + custom timecode layer |
-**Text-Safe Crop:** The top 15% and bottom 30% of the frame are masked before every comparison to ignore title cards, logos, and letterboxing.
+## For the Editor: What You Actually Need
+You do **not have to operate this tool yourself**, and you do **not need to know Python**.
+What you get are two files to work with:
+1. **`CUTTER_REPORT.md`**: the table for manual review and
+   recutting. For each beat it lists:
+   - the trailer timecode (hh:mm:ss:ff),
+   - the proposed source timecode in the feature film,
+   - a status: `OK` (can be used as is), `?` (please review), or
+     `MAN.` (no match, set manually),
+   - a short description of what is visible in the trailer beat (so that you
+     find the right spot in the source faster).
 2. **`output/*.fcpxml`** and **`output/*.edl`**: the finished timeline for
    FCP / Premiere / Avid / Resolve. Beats with status `OK` are already
    placed correctly there; `?` and `MAN.` beats you must check or set yourself in the NLE.
+**Recommended workflow:**
 1. Open `CUTTER_REPORT.md` and work through the table from top to bottom.
 2. Import the FCPXML/EDL into the NLE and load the trailer and the feature film alongside it.
 3. For `OK` beats, spot-check only.
 4. For `?` beats, check the preview clip from the report HTML (see below)
    and shift Source In a few frames earlier or later in the NLE until the
    motion phase matches the trailer exactly.
 5. For `MAN.` beats, find the matching spot in the feature film yourself; the
    description in the report tells you what to look for.
 Everything below is background for whoever maintains the tool.
 ---
 ## How the tool finds the matches (short version)
 | Phase | What happens |
 |-------|--------------|
 | **0** | Split the trailer into beats (PySceneDetect). |
 | **1** | Fast vibe check: for each beat, preselect the top-K most similar scenes from the feature film (histogram + pHash). |
 | **2** | Optional: a vision LLM describes uncertain scenes from 3-frame samples; the descriptions are cached. |
 | **3** | Frame-accurate refinement per beat (OpenCV template matching, motion-phase comparison). |
 | **4** | Phase repair: for segmented beats, the motion phase in the source is aligned with the visible trailer phase. |
 | **5** | Recovery: beats without a match are retried via a vision phase search in the top-K scenes. |
 | **6** | Export as FCPXML 1.10 or CMX 3600 EDL, plus `CUTTER_REPORT.md`. |
 **Text-Safe Crop:** The top 15% and bottom 30% of every frame are masked
 before the comparison so that title cards, logos, and letterboxing do not
 skew the matches.
 **Important:** Even with vision enabled, the final match stays
 CV-verified. The LLM only supplies additional search anchors.
 ---
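The Text-Safe Crop rule can be illustrated with a small sketch. The README only states the 15% / 30% margins; how the tool rounds the row boundaries is not shown in this commit, so the rounding here is an assumption, and `text_safe_rows` and `text_safe_crop` are hypothetical names.

```python
from typing import Sequence, Tuple


def text_safe_rows(height: int, top: float = 0.15, bottom: float = 0.30) -> Tuple[int, int]:
    """Return the (first, last) row indices of the region that is kept."""
    first = round(height * top)            # rows above this are ignored
    last = height - round(height * bottom)  # rows from here down are ignored
    return first, last


def text_safe_crop(frame: Sequence, top: float = 0.15, bottom: float = 0.30):
    """Keep only the middle band of a frame given as a sequence of rows.

    Slicing works the same way on a list of rows and on a NumPy image array.
    """
    first, last = text_safe_rows(len(frame), top, bottom)
    return frame[first:last]


# A 1080-row frame keeps rows 162 up to (not including) 756.
print(text_safe_rows(1080))
```

With the default margins, 55% of the frame height takes part in the comparison.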
@@ -92,6 +92,22 @@ def _save_results(results: list, cfg: "AppConfig") -> None:  # type: ignore[name
    logging.getLogger(__name__).info("Match results cached → %s", p)
 def _regenerate_cutter_report(cfg: "AppConfig") -> None:  # type: ignore[name-defined]
    """Re-render CUTTER_REPORT.md after each cache write so it stays in sync."""
    try:
        from scripts.generate_cutter_report import render_report
    except Exception as exc:
        logging.getLogger(__name__).warning("Cutter report regen skipped: %s", exc)
        return
    try:
        project_root = cfg.paths.cache_dir.parent
        out = project_root / "CUTTER_REPORT.md"
        out.write_text(render_report(project_root), encoding="utf-8")
        logging.getLogger(__name__).info("Cutter report regenerated → %s", out)
    except Exception as exc:
        logging.getLogger(__name__).warning("Cutter report regen failed: %s", exc)
 def _load_results(cfg: "AppConfig") -> list:  # type: ignore[name-defined]
    from src.core.models import MatchResult, MatchSegment
    p = _results_cache_path(cfg)
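The best-effort behaviour of `_regenerate_cutter_report` can be shown in isolation. The sketch below is not the project's code: `write_report_best_effort` is a hypothetical helper that takes the renderer as a callable, so that a failing renderer can be simulated without the real cache.

```python
import logging
import tempfile
from pathlib import Path
from typing import Callable

log = logging.getLogger(__name__)


def write_report_best_effort(out: Path, render: Callable[[], str]) -> bool:
    """Write render() to out; log and swallow any failure."""
    try:
        out.write_text(render(), encoding="utf-8")
    except Exception as exc:  # deliberately broad: the report must never abort a match
        log.warning("Cutter report regen failed: %s", exc)
        return False
    log.info("Cutter report regenerated -> %s", out)
    return True


def broken_render() -> str:
    # Simulates a missing cache file during rendering.
    raise FileNotFoundError("match_results.json is missing")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "CUTTER_REPORT.md"
    ok = write_report_best_effort(target, lambda: "# Cutter Report\n")
    failed = write_report_best_effort(target, broken_render)
    content = target.read_text(encoding="utf-8")
```

Because `render()` is evaluated before the file is opened, a renderer failure leaves the previously written report untouched.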
@@ -676,18 +692,23 @@ def _recover_unmatched_beats_via_vision(results: list, beats: list, cfg) -> list
            islands = _reference_scoreable_segments(beat, cfg)
        except Exception:
            islands = []
        if not islands:
            # Pure fade/title material — no recovery possible by design.
            continue
-        # Use the longest visible island as the target for the recovery search.
-        anchor_start_s, anchor_end_s = max(islands, key=lambda iv: iv[1] - iv[0])
+        # Anchor selection: prefer the longest visible island; if none exists,
+        # fall back to the full beat. The latter handles dark / low-contrast
        # close-ups that drop below the scoreable luma/contrast thresholds but
        # are still semantically describable. The strict vision phase
        # validation later in this pass keeps us from accepting pure title-card
        # or logo material.
        from dataclasses import replace as _replace
        if islands:
            anchor_start_s, anchor_end_s = max(islands, key=lambda iv: iv[1] - iv[0])
            anchor_beat = _replace(
                beat,
                start_s=beat.start_s + anchor_start_s,
                end_s=beat.start_s + anchor_end_s,
            )
        else:
            anchor_beat = beat
        try:
            hits = run_vibe_check(
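The anchor selection described in the comment above can be sketched on its own. `Beat` is a hypothetical stand-in for the project's beat model, reduced to the fields the hunk touches, and `select_anchor` is an illustrative name. As the arithmetic in the hunk implies, islands are treated as offsets relative to the beat start.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Beat:
    """Hypothetical stand-in for the project's beat model."""
    beat_id: int
    start_s: float
    end_s: float


def select_anchor(beat: Beat, islands: List[Tuple[float, float]]) -> Beat:
    """Prefer the longest visible island; otherwise use the whole beat."""
    if not islands:
        return beat
    rel_start, rel_end = max(islands, key=lambda iv: iv[1] - iv[0])
    # Island offsets are relative to the beat start, so shift both ends.
    return replace(beat, start_s=beat.start_s + rel_start, end_s=beat.start_s + rel_end)


beat = Beat(beat_id=21, start_s=75.5, end_s=77.5)
print(select_anchor(beat, [(0.0, 0.4), (0.5, 1.75)]))
print(select_anchor(beat, []))
```

Using `dataclasses.replace` keeps all other beat fields intact, so the narrowed anchor can be passed to the same search functions as a full beat.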
@@ -1469,6 +1490,7 @@ def cmd_match(args: argparse.Namespace, cfg) -> list:
        results_to_save = results
    _save_results(results_to_save, cfg)
    _regenerate_cutter_report(cfg)
    print(f"\n✅  {len(results)} / {len(beats)} beats matched.")
    for r in results:
@@ -0,0 +1,207 @@
 """
 scripts/generate_cutter_report.py — generate CUTTER_REPORT.md from current cache
 Regenerates CUTTER_REPORT.md from .cache/match_results.json,
 .cache/trailer_beats.json and .cache/vision_descriptions.json. The report is a
 hand-off document for a video editor (Cutter) doing the manual recut: it lists,
 per beat, the trailer position, the proposed source position in SMPTE
 timecodes, the match score, and what the vision model saw in the trailer beat.
 Usage (from project root):
    python scripts/generate_cutter_report.py
 Run this any time after `python cli.py match` to keep CUTTER_REPORT.md in sync
 with the latest cache.
 """
 from __future__ import annotations
 import json
 import re
 import sys
 from datetime import date
 from pathlib import Path
 def smpte(t: float | None, fps: int) -> str:
    if t is None:
        return "--:--:--:--"
    total = int(round(t * fps))
    h = total // (3600 * fps)
    m = (total // (60 * fps)) % 60
    s = (total // fps) % 60
    f = total % fps
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"
 def best_beat_description(items: dict, beat_id: int, start_s: float, end_s: float) -> str | None:
    best, best_diff = None, 1e9
    for key, value in items.items():
        if not key.startswith(f"beat:{beat_id}:") or not isinstance(value, dict):
            continue
        try:
            parts = key.split(":")
            ks, ke = float(parts[2]), float(parts[3])
        except (IndexError, ValueError):
            continue
        diff = abs(ks - start_s) + abs(ke - end_s)
        if diff < best_diff:
            best_diff = diff
            best = value
    return best.get("description", "") if best else None
 def parse_field(desc: str | None, key: str) -> str:
    if not desc:
        return ""
    match = re.search(rf'"{key}"\s*:\s*"([^"]+)"', desc)
    return match.group(1) if match else ""
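A short usage sketch for the two helpers above. The functions are repeated so the block runs on its own; the cache key layout `beat:<id>:<start>:<end>` is read off the parsing code, while the sample entries and their description strings are invented for illustration.

```python
import re
from typing import Optional


def best_beat_description(items: dict, beat_id: int, start_s: float, end_s: float) -> Optional[str]:
    """Pick the cached description whose time span is closest to the beat."""
    best, best_diff = None, 1e9
    for key, value in items.items():
        if not key.startswith(f"beat:{beat_id}:") or not isinstance(value, dict):
            continue
        try:
            parts = key.split(":")
            ks, ke = float(parts[2]), float(parts[3])
        except (IndexError, ValueError):
            continue
        diff = abs(ks - start_s) + abs(ke - end_s)
        if diff < best_diff:
            best_diff, best = diff, value
    return best.get("description", "") if best else None


def parse_field(desc: Optional[str], key: str) -> str:
    """Extract one quoted string field from a JSON-like description."""
    if not desc:
        return ""
    match = re.search(rf'"{key}"\s*:\s*"([^"]+)"', desc)
    return match.group(1) if match else ""


# Invented cache entries; the trailing colon in the prefix keeps beat 1 and beat 10 apart.
items = {
    "beat:1:3.0:8.4": {"description": '{"subject": "door", "action_phase": "slow push-in"}'},
    "beat:10:36.7:40.1": {"description": '{"subject": "two people", "action_phase": "conversation"}'},
    "beat:10:36.7:38.0": {"description": '{"action_phase": "partial segment"}'},
}
desc = best_beat_description(items, 10, 36.75, 40.08)
print(parse_field(desc, "action_phase"))
print(repr(parse_field(best_beat_description(items, 99, 0.0, 1.0), "action_phase")))
```

The regex approach tolerates descriptions that are not strictly valid JSON, at the cost of not handling escaped quotes inside a value.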
 def render_report(project_root: Path) -> str:
    sys.path.insert(0, str(project_root))
    from src.core.config import load_config
    cfg = load_config(project_root / "config.toml")
    fps = int(round(cfg.export.edl_frame_rate))
    cache = project_root / ".cache"
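     # Read the cache as UTF-8 explicitly so the result does not depend on the platform default encoding.
     results = {r["beat_id"]: r for r in json.loads((cache / "match_results.json").read_text(encoding="utf-8"))}
     beats = json.loads((cache / "trailer_beats.json").read_text(encoding="utf-8"))
     vis_path = cache / "vision_descriptions.json"
     vis_items = json.loads(vis_path.read_text(encoding="utf-8"))["items"] if vis_path.exists() else {}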
    lines: list[str] = []
     lines.append("# Cutter Report: Manual Recut")
     lines.append("")
     lines.append(
         f"As of {date.today().isoformat()}. Frame rate: {cfg.export.edl_frame_rate} fps. "
         f"Source: {Path(cfg.paths.source_movie).name}. Trailer: {Path(cfg.paths.reference_trailer).name}."
     )
     lines.append("")
     lines.append(
         "This file is generated automatically from the match cache. "
         "`python cli.py match` regenerates it after every successful match; "
         "to refresh it by hand, run `python scripts/generate_cutter_report.py`."
     )
     lines.append("")
     lines.append("## How to read this table")
     lines.append("")
     lines.append("- **Beat**: number of the beat in the reference trailer.")
     lines.append("- **Trailer In/Out**: SMPTE position of the beat in the trailer (hh:mm:ss:ff).")
     lines.append("- **Source In/Out**: proposed position in the source film. For `MAN.`, pick it yourself.")
     lines.append("- **Scene**: ID of the source scene from PySceneDetect (for debugging only).")
     lines.append("- **Score**: 0 to 1, higher is better. A score of 0.65 or above counts as confirmed.")
     lines.append("- **Status**:")
     lines.append("    - `OK`: confirmed by CV plus the vision phase check; can be used as is, apart from a spot check.")
     lines.append("    - `?`: provisional; a scene is matched but the score is below 0.65. Check the motion phase in the preview clip and shift by a few frames if needed.")
     lines.append("    - `MAN.`: no automatic match; either search manually or cut it as a fade to black or title.")
     lines.append("- **Phase**: what is visible in the trailer beat (from the vision description). It helps you find the right spot in the source.")
     lines.append("")
     matched = sum(1 for b in beats if b["beat_id"] in results)
     confirmed = sum(1 for b in beats if b["beat_id"] in results and results[b["beat_id"]]["is_confirmed"])
     lines.append("## Status overview")
     lines.append("")
     lines.append(f"- **Total beats**: {len(beats)}")
     lines.append(f"- **Found automatically**: {matched} ({confirmed} of them confirmed)")
     lines.append(f"- **To be set manually**: {len(beats) - matched}")
     lines.append("")
     lines.append("## Beat table")
     lines.append("")
     lines.append("| Beat | Trailer In / Out | Source In / Out | Scene | Score | Status | What is visible in the frame |")
     lines.append("|-----:|------------------|------------------|------:|------:|:------:|---------------------------|")
    def status_for(rec: dict | None) -> str:
        if rec is None:
            return "MAN."
        return "OK" if rec.get("is_confirmed") else "?"
    for beat in beats:
        bid = beat["beat_id"]
        rec = results.get(bid)
        ti, to = smpte(beat["start_s"], fps), smpte(beat["end_s"], fps)
        if rec is not None:
            si, so = smpte(rec["in_point_s"], fps), smpte(rec["out_point_s"], fps)
            scn = rec["scene_id"]
            sc = rec["match_score"]
        else:
            si = so = "—"
            scn = "—"
            sc = 0.0
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = (parse_field(desc, "action_phase") or parse_field(desc, "subject"))[:90]
        lines.append(f"| {bid:>4} | {ti}-{to} | {si}-{so} | {scn} | {sc:.3f} | {status_for(rec)} | {phase} |")
    lines.append("")
     lines.append("## Beats that need manual attention")
     lines.append("")
     lines.append("### Set manually (status `MAN.`)")
    lines.append("")
    for beat in beats:
        bid = beat["beat_id"]
        if bid in results:
            continue
        ti, to = smpte(beat["start_s"], fps), smpte(beat["end_s"], fps)
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = parse_field(desc, "action_phase")
         note = phase or "no vision description; probably a title card, fade, or logo"
        lines.append(f"- **Beat {bid}** {ti}-{to}: {note}")
    lines.append("")
     lines.append("### Provisional (status `?`): please review")
     lines.append("")
     lines.append("| Beat | Score | Source In | Phase according to vision |")
    lines.append("|-----:|------:|-----------|--------------------|")
    for beat in beats:
        bid = beat["beat_id"]
        rec = results.get(bid)
        if rec is None or rec.get("is_confirmed"):
            continue
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = parse_field(desc, "action_phase")
        lines.append(f"| {bid:>4} | {rec['match_score']:.3f} | {smpte(rec['in_point_s'], fps)} | {phase[:90]} |")
    lines.append("")
     lines.append("### Confirmed (status `OK`): can be used as is")
     lines.append("")
     lines.append("| Beat | Score | Source In | Phase according to vision |")
    lines.append("|-----:|------:|-----------|--------------------|")
    for beat in beats:
        bid = beat["beat_id"]
        rec = results.get(bid)
        if rec is None or not rec.get("is_confirmed"):
            continue
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = parse_field(desc, "action_phase")
        lines.append(f"| {bid:>4} | {rec['match_score']:.3f} | {smpte(rec['in_point_s'], fps)} | {phase[:90]} |")
    lines.append("")
     lines.append("## Review notes")
     lines.append("")
     lines.append(
         "1. Source times should match the motion phase of the corresponding trailer beat. "
         "If they do not, shift Source In a few frames earlier or later within the same source scene."
     )
     lines.append(
         "2. If the source clip is shorter than the trailer beat (Source Out minus Source In is less than Trailer Out minus Trailer In), "
         "the trailer beat contains a fade or title card; fill the gap in the edit with a fade to black or the source tail."
     )
     lines.append(
         "3. `OK` beats are double-verified by CV plus the vision phase check; spot-check them anyway."
     )
    lines.append("")
    return "\n".join(lines)
 def main() -> int:
    here = Path(__file__).resolve().parent
    project_root = here.parent
    out = project_root / "CUTTER_REPORT.md"
    out.write_text(render_report(project_root), encoding="utf-8")
    print(f"Wrote {out}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
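One point worth checking in `smpte` is the frame rate. The script rounds 23.976 to 24 and multiplies seconds by that integer. If the cached in and out points are wall-clock seconds (an assumption; the cache format is not shown in this commit), the resulting timecode runs about 3.6 seconds ahead per hour of material. The sketch below is a hypothetical variant, `smpte_from_seconds`, that counts real frames with the fractional rate and only uses the rounded rate as the timecode base. It produces non-drop-frame timecode.

```python
from typing import Optional


def smpte_from_seconds(t: Optional[float], frame_rate: float) -> str:
    """Format wall-clock seconds as non-drop-frame hh:mm:ss:ff."""
    if t is None:
        return "--:--:--:--"
    timebase = int(round(frame_rate))    # frames per timecode second, 24 for 23.976
    total = int(round(t * frame_rate))   # real frame count at the true rate
    h = total // (3600 * timebase)
    m = (total // (60 * timebase)) % 60
    s = (total // timebase) % 60
    f = total % timebase
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"


# One hour of wall-clock time at 23.976 fps is 86314 frames after rounding.
print(smpte_from_seconds(3600.0, 23.976))
print(smpte_from_seconds(3.0, 24.0))
```

Whether this variant or the original matches the NLE depends on how the matcher stores its times, so it is worth comparing one late beat against the source clip before switching.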