Fix forehead_touch action group + always-fresh cutter assets

1. Action-group classifier conflated object-touches and person-touches.
   "man touches the red door with a small object" was being tagged as
   forehead_touch because "touch" was in the forehead_touch needles set.
   That made the realign pass move Beat 16 from scene 451 (correct: man
   painting red door, IV stand) to scene 623 (woman/man in bed), a
   wrong shot at score 0.344.

   Fix: removed generic "touch*" verbs from forehead_touch's needle set.
   forehead_touch is now added in _semantic_action_groups() only when a
   touch verb is paired with an explicit body-part target (forehead,
   face, cheek, head, hand, ...) and not paired with an object target
   (door, handle, brush, tool, lock, ...).

   Effect on Beat 16 after `match --beat 16 --vision`:
   scene 623 in=5476.28 score=0.344 -> scene 451 in=3912.48 score=0.626.

2. Cutter-report stills/clips were keyed by source-video mtime, so a
   match-position change without a video change served stale frames from
   the previous match. Dropped the mtime cache; both extractors now
   render fresh every time. Slower (about a minute per full regen) but
   correct.
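
The person-vs-object rule from item 1 can be sketched on its own as
below. This is a minimal illustration, not the committed code: the
function name and the trimmed-down word sets are hypothetical. It
matches whole tokens rather than substrings. That matters if the phase
text is a plain str, because `w in phase` is then a substring test, so
"hand" would also match "handle" and "stand" would match "standing".

```python
import re

# Hypothetical, shortened word sets for illustration only; the real
# sets live in _PERSON_TOUCH_TARGETS / _OBJECT_TOUCH_TARGETS.
TOUCH_VERBS = {"touch", "touches", "touching", "touched"}
PERSON_TARGETS = {"forehead", "face", "cheek", "head", "hand", "hands"}
OBJECT_TARGETS = {"door", "handle", "brush", "tool", "lock", "stand"}


def is_person_touch(phrase: str) -> bool:
    """True when a touch verb has a body-part target and no object target."""
    # Split into whole lowercase words so that "handle" is never read
    # as "hand" and "standing" is never read as "stand".
    tokens = set(re.findall(r"[a-z]+", phrase.lower()))
    if not tokens & TOUCH_VERBS:
        return False
    has_person = bool(tokens & PERSON_TARGETS)
    has_object = bool(tokens & OBJECT_TARGETS)
    return has_person and not has_object


if __name__ == "__main__":
    for phrase in (
        "man touches the red door with a small object",
        "she touches her forehead",
        "he touches her face while standing",
    ):
        print(phrase, "->", is_person_touch(phrase))
```

With whole-word matching, the Beat 16 description is rejected because
"door" is an object target, while "touches her face while standing" is
still accepted.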

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Melbar
2026-05-05 05:23:24 +02:00
parent dbadc3fc26
commit 8aa6fe8323
52 changed files with 73 additions and 46 deletions
+31 -1
@@ -53,7 +53,11 @@ _CREDIT_ERROR_PATTERNS = (
 _ACTION_GROUPS = {
     "kiss": {"kiss", "kisses", "kissing", "kissed"},
-    "forehead_touch": {"forehead", "foreheads", "touch", "touches", "touching", "touched"},
+    # "touch" is intentionally NOT in forehead_touch — a generic "touch" can
+    # mean "touches the door / handle / brush" which is unrelated to person
+    # contact. forehead_touch is added in _semantic_action_groups() only
+    # when an explicit body-part target is present.
+    "forehead_touch": {"forehead", "foreheads"},
     "approach": {"approach", "approaches", "approaching", "closer", "lean", "leans", "leaning"},
     "talk": {"talk", "talking", "speak", "speaking", "conversation", "conversing"},
     "hand": {"hand", "hands", "holding", "holds", "raise", "raises", "raising", "lift", "lifting"},
@@ -61,6 +65,21 @@ _ACTION_GROUPS = {
     "look_down": {"down", "lowering", "lowers"},
     "turn": {"turn", "turns", "turning"},
 }
+# Words that, when paired with "touch"-family verbs, signal an object touch
+# (door, handle, brush, tool, ...) rather than a person-on-person touch.
+_OBJECT_TOUCH_TARGETS = {
+    "door", "doors", "handle", "knob", "lock", "mechanism", "brush", "tool",
+    "pole", "stand", "rail", "button", "switch", "wall", "surface", "object",
+    "knife", "blade", "weapon", "phone", "glass", "bottle", "cup",
+}
+# Words that, when paired with "touch", signal a person-on-person touch
+# (forehead/face/skin/...). These keep forehead_touch as a strong action.
+_PERSON_TOUCH_TARGETS = {
+    "forehead", "foreheads", "face", "faces", "cheek", "cheeks",
+    "head", "skin", "lips", "lip", "neck", "shoulder", "shoulders",
+    "arm", "arms", "chest", "hand", "hands", "hair", "body",
+}
 _STRONG_ACTION_GROUPS = {"kiss", "forehead_touch", "approach", "hand", "cutting"}
@@ -285,6 +304,17 @@ def _semantic_action_groups(text: str) -> set[str]:
         for name, needles in _ACTION_GROUPS.items()
         if terms & needles
     }
+    # Distinguish person-on-person touches from object touches. "touches the
+    # red door" must NOT count as forehead_touch; "touches her forehead"
+    # must. We look at the action_phase first (most specific), fall back to
+    # the full description.
+    phase = _action_phase_text(text)
+    touch_present = any(w in phase for w in ("touch", "touches", "touching", "touched"))
+    if touch_present:
+        person_target = any(w in phase for w in _PERSON_TOUCH_TARGETS)
+        object_target = any(w in phase for w in _OBJECT_TOUCH_TARGETS)
+        if person_target and not object_target:
+            groups.add("forehead_touch")
     if "moving closer" in lowered or "move closer" in lowered:
         groups.add("approach")
     if "face-to-face" in lowered or "faces facing" in lowered: