Compare commits
2 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 97a8f9e305 | |
| | 06a2326bf1 | |
@@ -0,0 +1,100 @@
# Cutter Report: Manual Recut

As of: 2026-05-04. Frame rate: 23.976 fps. Source: BehindTheRedDoor_FTR_1080P_2398_Fixed.mp4. Trailer: BehindTheRedDoor_Trailer_REFERENCE.mp4.

This file is generated automatically from the match cache. Regenerate it after every `python cli.py match` by running `python scripts/generate_cutter_report.py`.

## How to read this table

- **Beat**: number of the beat in the reference trailer.
- **Trailer In/Out**: SMPTE position of the beat in the trailer (h:mm:ss:ff).
- **Source In/Out**: suggested position in the source film. For `MAN.`, pick the position yourself.
- **Scene**: ID of the source scene from PySceneDetect (for debugging only).
- **Score**: 0 to 1, higher is better. A score of 0.65 or above is classified as confirmed.
- **Status**:
  - `OK`: confirmed by CV plus the vision phase check; can be used without further review.
  - `?`: provisional; the scene is correct but the score is below 0.65. Check the motion phase in the preview clip and shift by a few frames if needed.
  - `MAN.`: no automatic match; either search manually or use a fade to black or a title card.
- **Phase**: what is visible in the trailer beat (from the vision description). It helps you find the right spot in the source.

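The status rule from the legend can be sketched in a few lines. This is only an illustration, assuming the status depends on nothing but the score and on whether a match exists; the real pipeline additionally requires the vision phase check for `OK`, and the names `beat_status` and `CONFIRM_THRESHOLD` are hypothetical.

```python
from typing import Optional

# Threshold taken from the legend: scores of 0.65 or above count as confirmed.
CONFIRM_THRESHOLD = 0.65


def beat_status(score: Optional[float], has_match: bool,
                threshold: float = CONFIRM_THRESHOLD) -> str:
    """Map a match score to the report status (illustrative only)."""
    if not has_match or score is None:
        return "MAN."  # no automatic match, set manually
    return "OK" if score >= threshold else "?"


# Values taken from the beat table: beat 4, beat 20, and beat 0.
print(beat_status(0.647, True))
print(beat_status(0.663, True))
print(beat_status(0.0, False))
```

Beat 4 (0.647) stays provisional even though it is close to the threshold, which matches its `?` status in the table.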
## Status overview

- **Total beats**: 25
- **Found automatically**: 20 (5 of them confirmed)
- **To be set manually**: 5

## Beat table

| Beat | Trailer In / Out | Source In / Out | Scene | Score | Status | What is visible in the frame |
|-----:|------------------|------------------|------:|------:|:------:|---------------------------|
| 0 | 00:00:00:00-00:00:03:00 | n/a | n/a | 0.000 | MAN. | logo animation assembling from distorted shapes with motion blur |
| 1 | 00:00:03:00-00:00:08:10 | 00:00:04:09-00:00:06:03 | 1 | 0.380 | ? | |
| 2 | 00:00:08:10-00:00:16:23 | n/a | n/a | 0.000 | MAN. | |
| 3 | 00:00:16:23-00:00:19:03 | 01:02:17:22-01:02:19:14 | 436 | 0.469 | ? | |
| 4 | 00:00:19:03-00:00:20:15 | 01:02:21:01-01:02:22:10 | 437 | 0.647 | ? | |
| 5 | 00:00:20:15-00:00:26:09 | 00:01:33:04-00:01:37:10 | 10 | 0.501 | ? | |
| 6 | 00:00:26:09-00:00:29:06 | 00:01:03:06-00:01:05:21 | 5 | 0.548 | ? | |
| 7 | 00:00:29:06-00:00:31:16 | 01:20:10:10-01:20:12:16 | 553 | 0.463 | ? | man appears to be engaged in conversation |
| 8 | 00:00:31:16-00:00:33:15 | 00:00:51:07-00:00:53:01 | 5 | 0.733 | OK | static or slow drifting |
| 9 | 00:00:33:15-00:00:36:18 | 01:20:28:20-01:20:31:17 | 557 | 0.529 | ? | speaking, transitioning from closed eyes to open mouth and focused gaze |
| 10 | 00:00:36:18-00:00:40:02 | 01:20:35:16-01:20:39:00 | 558 | 0.635 | ? | conversation |
| 11 | 00:00:40:02-00:00:42:03 | 01:20:40:18-01:20:42:18 | 559 | 0.502 | ? | static talking head with slight facial expression changes |
| 12 | 00:00:42:03-00:00:50:06 | 01:14:26:01-01:14:29:10 | 519 | 0.558 | ? | static profile shot transitioning to black/darkness |
| 13 | 00:00:50:06-00:00:53:20 | 00:43:20:02-00:43:23:10 | 308 | 0.468 | ? | static conversation; woman on right is standing and holding a cup |
| 14 | 00:00:53:20-00:00:57:02 | 00:43:24:09-00:43:27:04 | 309 | 0.444 | ? | static conversation, subject holding a white cup |
| 15 | 00:00:57:02-00:01:01:12 | 00:02:10:11-00:02:12:16 | 0 | 0.467 | ? | static conversation |
| 16 | 00:01:01:12-00:01:04:12 | 01:05:12:16-01:05:15:06 | 451 | 0.613 | ? | man reaches out and touches the red door with a small object |
| 17 | 00:01:04:12-00:01:09:03 | 01:31:22:10-01:31:24:09 | 623 | 0.684 | OK | Static intimacy transitioning to a spatial arrangement of figures |
| 18 | 00:01:09:03-00:01:10:18 | 00:09:13:12-00:09:14:19 | 75 | 0.668 | OK | Woman in foreground turns her head from profile to face the camera while speaking |
| 19 | 00:01:10:18-00:01:12:12 | 00:16:48:14-00:16:49:15 | 126 | 0.717 | OK | static conversation, subtle facial expression change |
| 20 | 00:01:12:12-00:01:15:13 | 01:28:04:17-01:28:05:14 | 613 | 0.663 | OK | man kisses woman's forehead, then they pull back slightly to face each other |
| 21 | 00:01:15:13-00:01:17:12 | n/a | n/a | 0.000 | MAN. | hand raised to mouth, slight facial movement |
| 22 | 00:01:17:12-00:01:19:22 | 01:03:05:16-01:03:07:10 | 442 | 0.545 | ? | |
| 23 | 00:01:19:22-00:01:25:13 | n/a | n/a | 0.000 | MAN. | |
| 24 | 00:01:25:13-00:01:32:07 | n/a | n/a | 0.000 | MAN. | |

## Beats that need manual attention

### Set manually (status `MAN.`)

- **Beat 0** 00:00:00:00-00:00:03:00: logo animation assembling from distorted shapes with motion blur
- **Beat 2** 00:00:08:10-00:00:16:23: no vision description; presumably a title card, fade, or logo
- **Beat 21** 00:01:15:13-00:01:17:12: hand raised to mouth, slight facial movement
- **Beat 23** 00:01:19:22-00:01:25:13: no vision description; presumably a title card, fade, or logo
- **Beat 24** 00:01:25:13-00:01:32:07: no vision description; presumably a title card, fade, or logo

### Provisional (status `?`): please review

| Beat | Score | Source In | Phase according to vision |
|-----:|------:|-----------|--------------------|
| 1 | 0.380 | 00:00:04:09 | |
| 3 | 0.469 | 01:02:17:22 | |
| 4 | 0.647 | 01:02:21:01 | |
| 5 | 0.501 | 00:01:33:04 | |
| 6 | 0.548 | 00:01:03:06 | |
| 7 | 0.463 | 01:20:10:10 | man appears to be engaged in conversation |
| 9 | 0.529 | 01:20:28:20 | speaking, transitioning from closed eyes to open mouth and focused gaze |
| 10 | 0.635 | 01:20:35:16 | conversation |
| 11 | 0.502 | 01:20:40:18 | static talking head with slight facial expression changes |
| 12 | 0.558 | 01:14:26:01 | static profile shot transitioning to black/darkness |
| 13 | 0.468 | 00:43:20:02 | static conversation; woman on right is standing and holding a cup |
| 14 | 0.444 | 00:43:24:09 | static conversation, subject holding a white cup |
| 15 | 0.467 | 00:02:10:11 | static conversation |
| 16 | 0.613 | 01:05:12:16 | man reaches out and touches the red door with a small object |
| 22 | 0.545 | 01:03:05:16 | |

### Confirmed (status `OK`): can be used as is

| Beat | Score | Source In | Phase according to vision |
|-----:|------:|-----------|--------------------|
| 8 | 0.733 | 00:00:51:07 | static or slow drifting |
| 17 | 0.684 | 01:31:22:10 | Static intimacy transitioning to a spatial arrangement of figures |
| 18 | 0.668 | 00:09:13:12 | Woman in foreground turns her head from profile to face the camera while speaking |
| 19 | 0.717 | 00:16:48:14 | static conversation, subtle facial expression change |
| 20 | 0.663 | 01:28:04:17 | man kisses woman's forehead, then they pull back slightly to face each other |

## Review notes

1. Source times should match the motion phase of the corresponding trailer beat. If they do not, shift the Source In a few frames forward or back within the same source scene.
2. If the source clip is shorter than the trailer beat (Source Out minus Source In is less than Trailer Out minus Trailer In), the trailer beat contains a fade or title card; fill the gap in the edit with a fade to black or the source tail.
3. `OK` beats are double-verified by CV plus the vision phase check; spot-check them anyway.
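
The check in note 2 can be done by converting the timecodes to frame counts. The sketch below assumes non-drop-frame timecode counted at a nominal 24 frames per second (the usual labelling for 23.976 fps material); the helper names `tc_to_frames` and `fill_frames` are hypothetical and not part of the tool.

```python
def tc_to_frames(tc: str, fps: int = 24) -> int:
    """Convert 'hh:mm:ss:ff' (non-drop-frame) to an absolute frame count."""
    h, m, s, f = (int(part) for part in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * fps + f


def fill_frames(trailer_in: str, trailer_out: str,
                source_in: str, source_out: str, fps: int = 24) -> int:
    """Frames the source clip falls short of the trailer beat (0 if none)."""
    trailer_len = tc_to_frames(trailer_out, fps) - tc_to_frames(trailer_in, fps)
    source_len = tc_to_frames(source_out, fps) - tc_to_frames(source_in, fps)
    return max(0, trailer_len - source_len)


# Beat 12 from the table: the trailer beat is much longer than the source clip.
gap = fill_frames("00:00:42:03", "00:00:50:06", "01:14:26:01", "01:14:29:10")
print(gap)
```

For beat 12 the gap is consistent with its description ("transitioning to black/darkness"): the remaining frames would be covered by the fade.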
@@ -88,6 +88,20 @@ If that fails:
 exist; otherwise Vision is queried live (costs credits; needs network access).
 3. Restore `match_results.json.bak` if the cache is corrupted.
 
+## Current coverage (before the latest run)
+
+```
+total beats: 25
+matched: 20 (5 confirmed, 15 provisional)
+unmatched: beats 0, 2, 21, 23, 24
+```
+
+Beat 0 is the SHO logo (no source match possible, correct).
+Beats 23/24 have no visible islands (end credits/title), so they are also
+correctly unmatched.
+Beat 2 and beat 21 are the real recovery candidates; the new
+recovery stage tries to pick them up on the next `match` run.
+
 ## Open risks / known weaknesses
 
 - The threshold `0.06` for "beat context wins" in `realign_window` is
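
The coverage block above can be recomputed from the per-beat results. The following is a minimal sketch, assuming the matches are available as a mapping from beat ID to a confirmed flag; `summarize_coverage` is a hypothetical helper, and the IDs are the ones listed in the Cutter report.

```python
def summarize_coverage(total_beats: int, matches: dict) -> dict:
    """matches maps beat_id -> is_confirmed for every matched beat."""
    confirmed = sum(1 for is_confirmed in matches.values() if is_confirmed)
    unmatched = [b for b in range(total_beats) if b not in matches]
    return {
        "matched": len(matches),
        "confirmed": confirmed,
        "provisional": len(matches) - confirmed,
        "unmatched": unmatched,
    }


# IDs taken from the report: five confirmed beats, five without a match.
confirmed_ids = {8, 17, 18, 19, 20}
unmatched_ids = {0, 2, 21, 23, 24}
matches = {b: (b in confirmed_ids) for b in range(25) if b not in unmatched_ids}
summary = summarize_coverage(25, matches)
print(summary)
```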
@@ -1,27 +1,63 @@
 # AI Trailer Generator v2
 
-**Frame-accurate trailer reconstruction via pure Computer Vision**
+**Frame-accurate rebuild of a trailer from the source film.**
 
-> Feed in a reference trailer and the matching source film and get back a finished FCPXML/EDL that rebuilds the trailer frame-accurately from the source film.
+You feed in two videos, a reference trailer and the matching
+feature film, and get back a finished FCPXML/EDL for your edit suite that
+rebuilds the trailer beat by beat from the source film.
 
 ---
 
-## The core principle
+## For the cutter: what you really need
 
-By default, no LLM is used for visual matching. Optionally, a vision layer can
-supply cached 3-frame descriptions as additional search anchors; the final
-match still stays CV-verified.
+You do **not have to operate this tool yourself** and you do **not need to know Python**.
+What you get are two files to work with:
 
-| Phase | What happens | Technology |
-|-------|-------------|-------------|
-| **0 - Prep** | Analyze the reference trailer and extract beats | PySceneDetect + OpenCV |
-| **1 - Global Scan**| Scan the entire source film against all beats via an FFmpeg stream (2 FPS) | FFmpeg pipe + luma histogram |
-| **1b - Optional Vision Seeds** | Cache 3-frame descriptions for uncertain top-K scenes | OpenAI-compatible vision LLM |
-| **2 - Refine** | Refine the best hits at frame level | OpenCV `matchTemplate` |
-| **3 - Dramaturgy** | Narrative BeatType classification from dialogue text | OpenRouter LLM |
-| **4 - Export** | Timeline → FCPXML 1.10 or CMX 3600 EDL | xml.etree + custom timecode layer |
+1. **`CUTTER_REPORT.md`**: the table for manual review and
+   recutting. For each beat it lists:
+   - the trailer timecode (h:mm:ss:ff),
+   - the suggested source timecode from the feature film,
+   - a status: `OK` (can be used as is), `?` (please review) or
+     `MAN.` (no match, set manually),
+   - a short description of what is visible in the trailer beat (so that you
+     find the right spot in the source faster).
+2. **`output/*.fcpxml`** and **`output/*.edl`**: the finished timeline for
+   FCP / Premiere / Avid / Resolve. Beats with status `OK` are already
+   placed correctly there; `?` and `MAN.` you have to check or set yourself in the NLE.
 
-**Text-Safe Crop:** The top 15% and bottom 30% of the frame are masked before every comparison to ignore title cards, logos, and letterboxing.
+**Recommended workflow:**
 
+1. Open `CUTTER_REPORT.md` and work through the table from top to bottom.
+2. Import the FCPXML/EDL into the NLE and load the trailer and the feature film alongside it.
+3. For `OK` beats, only spot-check.
+4. For `?` beats, check the preview clip from the report HTML (see below)
+   and shift the Source In a few frames forward or back in the NLE until the
+   motion phase matches the trailer exactly.
+5. For `MAN.` beats, find the matching spot in the feature film yourself; the
+   description in the report tells you what to look for.
+
+Everything else below is background for whoever maintains the tool.
+
+---
+
+## How the tool finds the matches (short version)
+
+| Phase | What happens |
+|-------|--------------|
+| **0** | Split the trailer into beats (PySceneDetect). |
+| **1** | Fast vibe check: for each beat, preselect the top-K most similar scenes from the feature film (histogram + pHash). |
+| **2** | Optional: a vision LLM describes uncertain scenes using 3-frame samples; the descriptions are cached. |
+| **3** | Frame-accurate refinement per beat (OpenCV template matching, motion-phase comparison). |
+| **4** | Phase repair: for segmented beats, the motion phase in the source is aligned with the visible trailer phase. |
+| **5** | Recovery: beats without a match are retried via a vision phase search in the top-K scenes. |
+| **6** | Export as FCPXML 1.10 or CMX 3600 EDL plus `CUTTER_REPORT.md`. |
+
+**Text-Safe Crop:** The top 15% and bottom 30% of every frame are masked before the
+comparison so that title cards, logos, and letterboxing do not
+distort the matches.
+
+**Important:** Even when vision is enabled, the final match stays
+CV-verified. The LLM only supplies additional search anchors.
+
 ---
 
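
The Text-Safe Crop described in this README hunk amounts to keeping only the middle band of rows. A minimal sketch, assuming integer percentages and a frame that supports `len()` and row slicing (a list of rows here; a NumPy image array would slice the same way); the function names are hypothetical and not taken from the project.

```python
from typing import Sequence, Tuple


def text_safe_rows(height: int, top_pct: int = 15, bottom_pct: int = 30) -> Tuple[int, int]:
    """Return (first_row, end_row) of the band kept for comparison."""
    first_row = height * top_pct // 100
    end_row = height - height * bottom_pct // 100
    return first_row, end_row


def crop_text_safe(frame: Sequence) -> Sequence:
    """Drop the top 15% and bottom 30% of the rows."""
    first_row, end_row = text_safe_rows(len(frame))
    return frame[first_row:end_row]


# For a 1080-row frame, the kept band is rows 162 up to (not including) 756.
print(text_safe_rows(1080))
```

Integer arithmetic is used deliberately so the row bounds do not depend on floating-point rounding.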
@@ -310,6 +346,21 @@ during connection setup. If the vision verification during the final
 filter/repair stage nevertheless fails permanently, the previously cached
 match for this beat is kept instead of being discarded: a network problem must not
 delete an already correctly found match from the cache.
+Phase repair on found matches no longer runs only in "long"
+source scenes, but wherever the scene carries more than just the
+segment window. A corrected position is accepted as soon as it
+passes the image-content validation AND does not score noticeably worse than the
+original (loss of at most 0.02). Already confirmed matches in tightly cut
+scenes are deliberately left untouched, so that a good match is not swapped for a
+nominally equivalent alternative.
+Beats that end up as neither a full match nor a segment match after the CV run
+then go through a recovery stage: the vibe check (histogram/pHash)
+supplies top-K candidate scenes, the semantic action-window search checks
+the phase of the visible part of the trailer beat within them, and the CV aligner sets
+the in-point frame-accurately. Only a candidate that passes the same
+vision phase validation as the main path is accepted. Beats without visible
+image material (logos, title cards, continuous fades) are not searched
+at all; they are deliberately not a match.
 Long trailer beats are no longer automatically validated over their entire beat length
 against a single source clip. As soon as a sustained fade to black or a title/credit island begins
 after a visible source section,
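
The acceptance rule for a phase repair described above can be written as a small predicate. This sketch assumes the two conditions visible in this change set (the in-point moved by more than one frame, and the score dropped by at most 0.02); the image-content validation is left out, and `accept_realign` is a hypothetical name.

```python
def accept_realign(old_in_s: float, new_in_s: float,
                   old_score: float, new_score: float,
                   fps: float = 23.976, max_loss: float = 0.02) -> bool:
    """Accept a repaired in-point only if it moved and did not regress."""
    moved = abs(new_in_s - old_in_s) > 1.0 / fps      # more than one frame
    not_worse = new_score >= old_score - max_loss      # tolerate a small loss
    return moved and not_worse


print(accept_realign(10.0, 10.5, 0.60, 0.59))   # moved, small loss
print(accept_realign(10.0, 10.5, 0.60, 0.57))   # moved, but clearly worse
print(accept_realign(10.0, 10.01, 0.60, 0.90))  # better score, but under one frame
```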
@@ -92,6 +92,22 @@ def _save_results(results: list, cfg: "AppConfig") -> None: # type: ignore[name
     logging.getLogger(__name__).info("Match results cached → %s", p)
 
 
+def _regenerate_cutter_report(cfg: "AppConfig") -> None:  # type: ignore[name-defined]
+    """Re-render CUTTER_REPORT.md after each cache write so it stays in sync."""
+    try:
+        from scripts.generate_cutter_report import render_report
+    except Exception as exc:
+        logging.getLogger(__name__).warning("Cutter report regen skipped: %s", exc)
+        return
+    try:
+        project_root = cfg.paths.cache_dir.parent
+        out = project_root / "CUTTER_REPORT.md"
+        out.write_text(render_report(project_root), encoding="utf-8")
+        logging.getLogger(__name__).info("Cutter report regenerated → %s", out)
+    except Exception as exc:
+        logging.getLogger(__name__).warning("Cutter report regen failed: %s", exc)
+
+
 def _load_results(cfg: "AppConfig") -> list:  # type: ignore[name-defined]
     from src.core.models import MatchResult, MatchSegment
     p = _results_cache_path(cfg)
@@ -632,6 +648,171 @@ def _merge_best_results(existing: list, candidates: list, cfg) -> list:
     return sorted(by_id.values(), key=lambda r: r.beat_id)
 
 
+def _recover_unmatched_beats_via_vision(results: list, beats: list, cfg) -> list:
+    """Try a vision-led search for beats that ended up without a match.
+
+    For each unmatched beat that has scoreable visual content (i.e. not pure
+    fade/title-card material), this pass:
+      1. Asks the vibe-check (CV histogram + pHash) for the top-K candidate
+         scenes.
+      2. For each candidate, runs the semantic action-window search with the
+         beat's own description, preferring windows whose phase matches the
+         visible part of the beat.
+      3. Refines the in-point with the regular CV content/motion aligner.
+      4. Validates the resulting window with the vision phase check, exactly
+         like the main filter.
+      5. Adds the best validated candidate as a provisional MatchResult.
+
+    Confirmed and provisional matches both stay subject to the same thresholds
+    used elsewhere; this only adds matches that pass the same quality gates.
+    """
+    if not cfg.vision.enabled or not beats:
+        return results
+
+    from dataclasses import replace
+    from src.cv.global_scan import align_in_point_by_content_and_motion, estimate_usable_source_duration
+    from src.cv.scene_indexer import build_scene_index
+    from src.cv.vibe_check import run_vibe_check
+    from src.core.models import MatchResult
+    from src.llm.vision_cache import find_action_window_in_scene, validate_match_window_with_vision
+
+    logger = logging.getLogger(__name__)
+    matched_ids = {r.beat_id for r in results}
+    unmatched = [b for b in beats if b.beat_id not in matched_ids]
+    if not unmatched:
+        return results
+
+    scenes = build_scene_index(cfg)
+    if not scenes:
+        return results
+
+    new_results = list(results)
+    for beat in unmatched:
+        try:
+            islands = _reference_scoreable_segments(beat, cfg)
+        except Exception:
+            islands = []
+
+        # Anchor selection: prefer the longest visible island; if none exists,
+        # fall back to the full beat. The latter handles dark / low-contrast
+        # close-ups that drop below the scoreable luma/contrast thresholds but
+        # are still semantically describable. The strict vision phase
+        # validation later in this pass keeps us from accepting pure title-card
+        # or logo material.
+        from dataclasses import replace as _replace
+        if islands:
+            anchor_start_s, anchor_end_s = max(islands, key=lambda iv: iv[1] - iv[0])
+            anchor_beat = _replace(
+                beat,
+                start_s=beat.start_s + anchor_start_s,
+                end_s=beat.start_s + anchor_end_s,
+            )
+        else:
+            anchor_beat = beat
+
+        try:
+            hits = run_vibe_check(
+                beat,
+                scenes,
+                top_k=max(cfg.cv.deep_scan.scene_seed_top_k, cfg.cv.vibe_check.top_k_candidates),
+                hist_method=cfg.cv.vibe_check.hist_compare_method,
+                phash_max_distance=64,
+            )
+        except Exception as exc:
+            logger.warning("Beat %d: recovery vibe-check failed (%s)", beat.beat_id, exc)
+            continue
+
+        scenes_by_id = {s.scene_id: s for s in scenes}
+        best = None  # (score, scene, in_s, dur_s, reason)
+        seen = set()
+        for hit in hits[: cfg.cv.deep_scan.scene_seed_top_k]:
+            scene = scenes_by_id.get(hit.scene_id)
+            if scene is None or scene.scene_id in seen:
+                continue
+            seen.add(scene.scene_id)
+
+            try:
+                found = find_action_window_in_scene(anchor_beat, scene, cfg)
+            except Exception as exc:
+                logger.debug("Beat %d: action window failed for scene %d (%s)", beat.beat_id, scene.scene_id, exc)
+                continue
+            if found is None:
+                continue
+            start_s, end_s, semantic_score, reason = found
+
+            window_s = max(3.0, min(8.0, (end_s - start_s) * 4.0))
+            try:
+                aligned_in_s, combined_score, content_score, motion_score = align_in_point_by_content_and_motion(
+                    anchor_beat,
+                    start_s,
+                    cfg,
+                    search_window_s=window_s,
+                )
+            except Exception as exc:
+                logger.debug("Beat %d: align failed for scene %d (%s)", beat.beat_id, scene.scene_id, exc)
+                continue
+            aligned_in_s = max(scene.start_s, min(aligned_in_s, max(scene.start_s, scene.end_s - anchor_beat.duration_s)))
+
+            try:
+                usable_duration_s, usable_score = estimate_usable_source_duration(anchor_beat, aligned_in_s, cfg)
+            except Exception:
+                usable_duration_s, usable_score = anchor_beat.duration_s, 0.0
+            usable_duration_s = max(0.0, min(anchor_beat.duration_s, usable_duration_s))
+            if usable_duration_s < max(0.32, anchor_beat.duration_s * 0.45):
+                usable_duration_s = anchor_beat.duration_s
+
+            try:
+                ok, verify_reason = validate_match_window_with_vision(
+                    anchor_beat,
+                    source_path=scene.source_path,
+                    scene_id=scene.scene_id,
+                    in_point_s=aligned_in_s,
+                    out_point_s=aligned_in_s + usable_duration_s,
+                    cfg=cfg,
+                )
+            except Exception as exc:
+                logger.debug("Beat %d: validate failed scene=%d (%s)", beat.beat_id, scene.scene_id, exc)
+                continue
+            if not ok:
+                continue
+
+            final_score = max(
+                combined_score,
+                min(0.99, semantic_score * 0.65 + motion_score * 0.18 + content_score * 0.09 + usable_score * 0.08),
+            )
+            if final_score < cfg.cv.deep_scan.provisional_match_threshold:
+                continue
+            candidate = (final_score, scene, aligned_in_s, usable_duration_s, f"recovery; {reason}; {verify_reason}")
+            if best is None or candidate[0] > best[0]:
+                best = candidate
+
+        if best is None:
+            continue
+        score, scene, aligned_in_s, usable_duration_s, repair_reason = best
+        logger.info(
+            "Beat %d: recovered via vision action search scene=%d in=%.3fs score=%.3f (%s)",
+            beat.beat_id,
+            scene.scene_id,
+            aligned_in_s,
+            score,
+            repair_reason,
+        )
+        new_results.append(MatchResult(
+            beat_id=beat.beat_id,
+            scene_id=scene.scene_id,
+            source_path=scene.source_path,
+            in_point_s=aligned_in_s,
+            out_point_s=aligned_in_s + usable_duration_s,
+            in_point_frame=int(aligned_in_s * cfg.export.edl_frame_rate),
+            match_score=score,
+            match_location=(0, 0),
+            is_confirmed=score >= cfg.cv.deep_scan.match_threshold,
+            segments=tuple(),
+        ))
+
+    return sorted(new_results, key=lambda r: r.beat_id)
+
+
 def _filter_semantically_invalid_vision_matches(results: list, beats: list, cfg) -> list:
     """Drop vision-enabled matches whose final action phase contradicts the beat."""
     if not cfg.vision.enabled or not results:
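
Two pieces of arithmetic in the recovery pass are easy to misread inline: the clamped search window and the blended final score. The sketch below isolates them with the constants from the hunk above; the standalone function names `search_window_s` and `recovery_score` are hypothetical and exist only for illustration.

```python
def search_window_s(start_s: float, end_s: float) -> float:
    """Four times the action window, clamped to the range 3..8 seconds."""
    return max(3.0, min(8.0, (end_s - start_s) * 4.0))


def recovery_score(combined: float, semantic: float, motion: float,
                   content: float, usable: float) -> float:
    """Weighted blend capped at 0.99, but never below the CV combined score."""
    blended = semantic * 0.65 + motion * 0.18 + content * 0.09 + usable * 0.08
    return max(combined, min(0.99, blended))


print(search_window_s(0.0, 0.5))   # short window is raised to the minimum
print(search_window_s(0.0, 1.5))   # inside the clamp range
print(search_window_s(0.0, 5.0))   # long window is capped
print(recovery_score(0.70, 0.5, 0.5, 0.5, 0.5))
```

Because the weights sum to 1.0 and the blend is capped at 0.99, a recovered beat can only reach a higher score through the CV `combined` term.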
@@ -785,7 +966,16 @@ def _filter_repair_one(result, beat, beats_by_id, scenes_by_id, kept, cfg, reali
         changed = False
         for segment in result.segments:
             scene = scenes_by_id.get(segment.scene_id)
-            if scene is None or scene.duration_s <= max(segment.duration_s * 1.6, 6.0):
+            # Allow phase-realign whenever the scene has any meaningful
+            # slack beyond the segment, not only for "long" scenes.
+            # Short scenes don't need realigning because the segment
+            # essentially is the scene.
+            if scene is None or scene.duration_s <= segment.duration_s + 0.5:
+                new_segments.append(segment)
+                continue
+            # For already-confirmed segments, skip the realign to avoid
+            # destabilizing a strong original match.
+            if segment.is_confirmed and scene.duration_s <= max(segment.duration_s * 1.6, 6.0):
                 new_segments.append(segment)
                 continue
             segment_beat = replace(
@@ -801,6 +991,11 @@ def _filter_repair_one(result, beat, beats_by_id, scenes_by_id, kept, cfg, reali
             if abs(aligned_in_s - segment.in_point_s) <= 1.0 / cfg.export.edl_frame_rate:
                 new_segments.append(segment)
                 continue
+            # Don't commit a repair that scores meaningfully worse than
+            # the original; phase realign should improve, not regress.
+            if score < segment.match_score - 0.02:
+                new_segments.append(segment)
+                continue
             changed = True
             repair_reasons.append(repair_reason)
             new_segments.append(replace(
@@ -833,11 +1028,22 @@ def _filter_repair_one(result, beat, beats_by_id, scenes_by_id, kept, cfg, reali
             repaired = True
     else:
         scene = scenes_by_id.get(result.scene_id)
-        if scene is not None and scene.duration_s > max(result.duration_s * 1.6, 6.0):
+        wide_scene = (
+            scene is not None
+            and scene.duration_s > result.duration_s + 0.5
+        )
+        already_confirmed_in_tight_scene = (
+            result.is_confirmed
+            and scene is not None
+            and scene.duration_s <= max(result.duration_s * 1.6, 6.0)
+        )
+        if wide_scene and not already_confirmed_in_tight_scene:
             repair = realign_window(beat, result.scene_id)
             if repair is not None:
                 repair_scene, aligned_in_s, usable_duration_s, score, repair_reason = repair
-                if abs(aligned_in_s - result.in_point_s) > 1.0 / cfg.export.edl_frame_rate:
+                moved = abs(aligned_in_s - result.in_point_s) > 1.0 / cfg.export.edl_frame_rate
+                improved = score >= result.match_score - 0.02
+                if moved and improved:
                     logger.info(
                         "Beat %d: realigned semantically valid long scene by motion/action window (%s)",
                         result.beat_id,
@@ -1271,6 +1477,7 @@ def cmd_match(args: argparse.Namespace, cfg) -> list:
     )
     results = _attach_visual_segments(results, beats, cfg)
     results = _filter_semantically_invalid_vision_matches(results, beats, cfg)
+    results = _recover_unmatched_beats_via_vision(results, beats, cfg)
 
     # A targeted one-beat match should improve the cache without deleting
     # automatic matches for other beats.
@@ -1283,6 +1490,7 @@ def cmd_match(args: argparse.Namespace, cfg) -> list:
         results_to_save = results
 
     _save_results(results_to_save, cfg)
+    _regenerate_cutter_report(cfg)
 
     print(f"\n✅ {len(results)} / {len(beats)} beats matched.")
     for r in results:
@@ -0,0 +1,207 @@
|
|||||||
|
"""
|
||||||
|
scripts/generate_cutter_report.py — generate CUTTER_REPORT.md from current cache
|
||||||
|
|
||||||
|
Regenerates CUTTER_REPORT.md from .cache/match_results.json,
|
||||||
|
.cache/trailer_beats.json and .cache/vision_descriptions.json. The report is a
|
||||||
|
hand-off document for a video editor (Cutter) doing the manual recut: it lists,
|
||||||
|
per beat, the trailer position, the proposed source position in SMPTE
|
||||||
|
timecodes, the match score, and what the vision model saw in the trailer beat.
|
||||||
|
|
||||||
|
Usage (from project root):
|
||||||
|
python scripts/generate_cutter_report.py
|
||||||
|
|
||||||
|
Run this any time after `python cli.py match` to keep CUTTER_REPORT.md in sync
|
||||||
|
with the latest cache.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
import re
|
||||||
|
import sys
|
||||||
|
from datetime import date
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
def smpte(t: float | None, fps: int) -> str:
    """Format a time in seconds as hh:mm:ss:ff at an integer frame rate."""
    if t is None:
        return "--:--:--:--"
    total = int(round(t * fps))  # total number of frames since 00:00:00:00
    h = total // (3600 * fps)
    m = (total // (60 * fps)) % 60
    s = (total // fps) % 60
    f = total % fps
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"


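Review note on `smpte`: the caller rounds `cfg.export.edl_frame_rate` (23.976 in the generated report) to 24 before multiplying. If `t` is a real-time offset in seconds, the frame index at 23.976 fps is `t * 23.976`, not `t * 24`, so the two conventions drift apart by about one second every 1000 seconds. The following is a minimal sketch of a non-drop-frame variant under that assumption. `smpte_ndf` and its `frame_rate` parameter are hypothetical names, not part of this commit.

```python
from __future__ import annotations


def smpte_ndf(t: float | None, frame_rate: float) -> str:
    """Non-drop-frame timecode for a real-time offset (hypothetical helper)."""
    if t is None:
        return "--:--:--:--"
    base = int(round(frame_rate))       # counting base of the timecode, e.g. 24
    total = int(round(t * frame_rate))  # frame index at the true frame rate
    h = total // (3600 * base)
    m = (total // (60 * base)) % 60
    s = (total // base) % 60
    f = total % base
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"
```

For example, `smpte_ndf(1001.0, 23.976)` yields `00:16:40:00`, whereas counting at a flat 24 frames per second yields `00:16:41:00`. Whether the cached timestamps are real-time seconds is an assumption that should be checked against the EDL export before adopting this.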
def best_beat_description(items: dict, beat_id: int, start_s: float, end_s: float) -> str | None:
    """Return the cached description whose time range is closest to the beat.

    Only keys starting with "beat:<beat_id>:" are considered; the next two
    colon-separated fields are read as start and end time in seconds.
    """
    best, best_diff = None, 1e9
    for key, value in items.items():
        if not key.startswith(f"beat:{beat_id}:") or not isinstance(value, dict):
            continue
        try:
            parts = key.split(":")
            ks, ke = float(parts[2]), float(parts[3])
        except (IndexError, ValueError):
            continue
        # Smaller summed distance of start and end means a closer time range.
        diff = abs(ks - start_s) + abs(ke - end_s)
        if diff < best_diff:
            best_diff = diff
            best = value
    return best.get("description", "") if best else None


def parse_field(desc: str | None, key: str) -> str:
    """Extract the string value of `key` from a JSON-like description text."""
    if not desc:
        return ""
    # Escape the key so regex metacharacters in it are matched literally.
    match = re.search(rf'"{re.escape(key)}"\s*:\s*"([^"]+)"', desc)
    return match.group(1) if match else ""


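Review note on `parse_field`: the regex stops at the first `"` inside a value, so a description containing an escaped quote would be cut short. A minimal sketch of a more tolerant variant follows, assuming the vision descriptions are usually valid JSON objects (the diff does not confirm this). `parse_field_safe` is a hypothetical name. It tries `json.loads` first and falls back to the same regex approach.

```python
import json
import re


def parse_field_safe(desc, key):
    """Read a string field from a description (hypothetical helper)."""
    if not desc:
        return ""
    try:
        data = json.loads(desc)
    except (TypeError, ValueError):
        data = None  # not valid JSON; fall through to the regex
    if isinstance(data, dict) and isinstance(data.get(key), str):
        return data[key]
    match = re.search(rf'"{re.escape(key)}"\s*:\s*"([^"]+)"', desc)
    return match.group(1) if match else ""
```

The regex fallback keeps the current behaviour for descriptions that wrap the JSON in other text.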
def render_report(project_root: Path) -> str:
    sys.path.insert(0, str(project_root))
    from src.core.config import load_config

    cfg = load_config(project_root / "config.toml")
    fps = int(round(cfg.export.edl_frame_rate))

    cache = project_root / ".cache"
    results = {r["beat_id"]: r for r in json.loads((cache / "match_results.json").read_text())}
    beats = json.loads((cache / "trailer_beats.json").read_text())
    vis_path = cache / "vision_descriptions.json"
    vis_items = json.loads(vis_path.read_text())["items"] if vis_path.exists() else {}

    lines: list[str] = []
    lines.append("# Cutter-Report — manuelles Nachschneiden")
    lines.append("")
    lines.append(
        f"Stand: {date.today().isoformat()}. Frame-Rate: {cfg.export.edl_frame_rate} fps. "
        f"Source: {Path(cfg.paths.source_movie).name} — Trailer: {Path(cfg.paths.reference_trailer).name}."
    )
    lines.append("")
    lines.append(
        "Diese Datei wird automatisch aus dem Match-Cache erzeugt. "
        "Nach jedem `python cli.py match` mit `python scripts/generate_cutter_report.py` neu generieren."
    )
    lines.append("")
    lines.append("## Wie diese Tabelle zu lesen ist")
    lines.append("")
    lines.append("- **Beat**: Nummer im Referenz-Trailer.")
    lines.append("- **Trailer In/Out**: SMPTE-Position des Beats im Trailer (h:mm:ss:ff).")
    lines.append("- **Source In/Out**: vorgeschlagene Position im Quellfilm. Bei `MAN.` selbst aussuchen.")
    lines.append("- **Scene**: ID der Source-Szene aus PySceneDetect (nur fuer Debug-Zwecke).")
    lines.append("- **Score**: 0..1, je hoeher desto besser. >=0.65 ist als bestaetigt eingestuft.")
    lines.append("- **Status**:")
    lines.append(" - `OK` — bestaetigt durch CV + Vision-Phasenpruefung, kann ohne weitere Pruefung uebernommen werden.")
    lines.append(" - `?` — vorlaeufig, korrekte Szene aber Score unter 0.65; Bewegungsphase im Vorschauclip pruefen und ggf. um wenige Frames verschieben.")
    lines.append(" - `MAN.` — kein automatischer Treffer; entweder manuell suchen oder als Schwarzfade/Titel uebernehmen.")
    lines.append("- **Phase**: was im Trailerbeat zu sehen ist (aus Vision-Beschreibung). Hilft dir, die richtige Stelle im Source zu finden.")
    lines.append("")

    matched = sum(1 for b in beats if b["beat_id"] in results)
    confirmed = sum(1 for b in beats if b["beat_id"] in results and results[b["beat_id"]]["is_confirmed"])
    lines.append("## Status-Uebersicht")
    lines.append("")
    lines.append(f"- **Beats gesamt**: {len(beats)}")
    lines.append(f"- **Automatisch gefunden**: {matched} ({confirmed} davon bestaetigt)")
    lines.append(f"- **Manuell zu setzen**: {len(beats) - matched}")
    lines.append("")
    lines.append("## Beat-Tabelle")
    lines.append("")
    lines.append("| Beat | Trailer In / Out | Source In / Out | Scene | Score | Status | Was im Bild zu sehen ist |")
    lines.append("|-----:|------------------|------------------|------:|------:|:------:|---------------------------|")

    def status_for(rec: dict | None) -> str:
        if rec is None:
            return "MAN."
        return "OK" if rec.get("is_confirmed") else "?"

    for beat in beats:
        bid = beat["beat_id"]
        rec = results.get(bid)
        ti, to = smpte(beat["start_s"], fps), smpte(beat["end_s"], fps)
        if rec is not None:
            si, so = smpte(rec["in_point_s"], fps), smpte(rec["out_point_s"], fps)
            scn = rec["scene_id"]
            sc = rec["match_score"]
        else:
            si = so = "—"
            scn = "—"
            sc = 0.0
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = (parse_field(desc, "action_phase") or parse_field(desc, "subject"))[:90]
        lines.append(f"| {bid:>4} | {ti}-{to} | {si}-{so} | {scn} | {sc:.3f} | {status_for(rec)} | {phase} |")

    lines.append("")
    lines.append("## Beats die manuelle Aufmerksamkeit brauchen")
    lines.append("")
    lines.append("### Manuell setzen (Status `MAN.`)")
    lines.append("")
    for beat in beats:
        bid = beat["beat_id"]
        if bid in results:
            continue
        ti, to = smpte(beat["start_s"], fps), smpte(beat["end_s"], fps)
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = parse_field(desc, "action_phase")
        note = phase or "keine Vision-Beschreibung — vermutlich Title-Card / Fade / Logo"
        lines.append(f"- **Beat {bid}** {ti}-{to}: {note}")
    lines.append("")

    lines.append("### Vorlaeufig (Status `?`) — bitte sichten")
    lines.append("")
    lines.append("| Beat | Score | Source In | Phase laut Vision |")
    lines.append("|-----:|------:|-----------|--------------------|")
    for beat in beats:
        bid = beat["beat_id"]
        rec = results.get(bid)
        if rec is None or rec.get("is_confirmed"):
            continue
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = parse_field(desc, "action_phase")
        lines.append(f"| {bid:>4} | {rec['match_score']:.3f} | {smpte(rec['in_point_s'], fps)} | {phase[:90]} |")
    lines.append("")

    lines.append("### Bestaetigt (Status `OK`) — kann uebernommen werden")
    lines.append("")
    lines.append("| Beat | Score | Source In | Phase laut Vision |")
    lines.append("|-----:|------:|-----------|--------------------|")
    for beat in beats:
        bid = beat["beat_id"]
        rec = results.get(bid)
        if rec is None or not rec.get("is_confirmed"):
            continue
        desc = best_beat_description(vis_items, bid, beat["start_s"], beat["end_s"]) or ""
        phase = parse_field(desc, "action_phase")
        lines.append(f"| {bid:>4} | {rec['match_score']:.3f} | {smpte(rec['in_point_s'], fps)} | {phase[:90]} |")
    lines.append("")

    lines.append("## Hinweise zur Pruefung")
    lines.append("")
    lines.append(
        "1. Source-Times sollten zur jeweiligen Trailer-Bewegungsphase passen. "
        "Wenn nicht: Source-In innerhalb derselben Source-Szene wenige Frames vor/zurueck verschieben."
    )
    lines.append(
        "2. Wenn der Source-Clip kuerzer ist als der Trailerbeat (Source-Out < Trailer-Out gerechnet ab Source-In), "
        "enthaelt der Trailerbeat eine Blende/Titelkarte; im Schnitt mit Schwarzfade oder Source-Tail auffuellen."
    )
    lines.append(
        "3. `OK`-Beats sind durch CV + Vision-Phasenpruefung doppelt verifiziert; trotzdem stichprobenartig sichten."
    )
    lines.append("")

    return "\n".join(lines)


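Review note on `render_report`: the `phase` text comes from a vision model and is written straight into Markdown table cells. A `|` or a line break in that text would split or break the row. A minimal sketch of a cell sanitizer follows. `md_cell` is a hypothetical helper, not part of this commit, and the 90-character limit mirrors the `[:90]` slices used in the table loops.

```python
def md_cell(text: str, limit: int = 90) -> str:
    """Make free text safe for a single Markdown table cell (hypothetical helper)."""
    flat = " ".join(text.split())  # collapse newlines and runs of whitespace
    # Truncate first so an escape sequence is never cut in half.
    return flat[:limit].replace("|", "\\|")
```

It could be applied wherever `phase` is interpolated into a table row, for example `md_cell(phase)` in place of `phase[:90]`.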
def main() -> int:
    here = Path(__file__).resolve().parent
    project_root = here.parent
    out = project_root / "CUTTER_REPORT.md"
    out.write_text(render_report(project_root), encoding="utf-8")
    print(f"Wrote {out}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())