Lab · Tap Lines v3

Right words. Right line.

Each line is now matched to its section (verse/chorus/outro) before any vocab is pulled, so words from other sections no longer sneak in through substring matches. The offset control is tucked into Tools.

What's different in v3

Section-aware matching. Each LRCLIB line is first matched to its section: an exact text match against context_lines, then a substring match, then a fallback word-overlap score with the longest match breaking ties. Only that section's vocab feeds the dropdown and highlight tokens, with a global fallback so that shared words (like スキキライスキ, repeated in both choruses) still show up.

Why it was broken: v2 filtered the global vocab by substring, so a one-character particle like に (defined in Verse 3) was pulled into every line containing に anywhere: inside には, inside になりました, everywhere.
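A minimal sketch of that failure mode, assuming the vocab is a simple map from surface form to the section that defines it. The names VOCAB_SECTIONS, v2_filter and section_filter are hypothetical, not the lab's actual code, and the global fallback for shared words is left out to keep the contrast visible.

```python
# Hypothetical vocab table: surface form -> section that defines it.
# Only に / Verse 3 comes from the notes above; the layout is an assumption.
VOCAB_SECTIONS = {
    "に": "Verse 3",
}


def v2_filter(line: str, vocab: dict[str, str]) -> list[str]:
    """v2-style filter: keep every vocab word that occurs anywhere in the line."""
    return [word for word in vocab if word in line]


def section_filter(line: str, line_section: str, vocab: dict[str, str]) -> list[str]:
    """Section-aware filter: only consider words defined in the line's own section."""
    return [
        word
        for word, section in vocab.items()
        if section == line_section and word in line
    ]


# に is pulled in even though it is only a fragment of a longer word.
polluted = v2_filter("になりました", VOCAB_SECTIONS)

# The same text in a different section no longer picks it up.
clean = section_filter("になりました", "Verse 1", VOCAB_SECTIONS)
```

Here polluted contains に while clean is empty, which is the behaviour change v3 is after.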

Tools hidden. Offset lives behind a small Tools button. Pencil/lab-notes removed from this version.

🌸

イノチミジカシコイセヨオトメ

クリープハイプ · Life is Short, Fall in Love, Maiden

src: lrclib.net #12182810 · 19 lines · 3:12

Matching pipeline: 1) normalize the line text (strip repeat markers such as (×2)), 2) split context_lines on / dividers, 3) try an exact match, 4) try a substring match, 5) fall back to word-overlap scoring (the longest single match wins ties).

Tokenization: section vocab is tried first (longest-first), then global vocab for shared words, then single-character tokens for anything still unmatched (with a smaller weight on punctuation).
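One way to read the tokenization rule is as a greedy left-to-right scan. The sketch below assumes that at each position the section vocab is tried before the global vocab, and that tokens carry a numeric weight; the function name, the (text, source, weight) tuple shape, and the weight values 1.0 / 0.5 / 0.1 are hypothetical.

```python
import unicodedata


def tokenize(
    line: str,
    section_vocab: list[str],
    global_vocab: list[str],
    char_weight: float = 0.5,
    punct_weight: float = 0.1,
) -> list[tuple[str, str, float]]:
    """Greedy tokenizer returning (text, source, weight) tuples."""
    # Longest-first, so a long word wins over any shorter word it contains.
    tiers = (
        ("section", sorted(section_vocab, key=len, reverse=True)),
        ("global", sorted(global_vocab, key=len, reverse=True)),
    )
    tokens: list[tuple[str, str, float]] = []
    i = 0
    while i < len(line):
        for source, words in tiers:
            word = next((w for w in words if w and line.startswith(w, i)), None)
            if word:
                tokens.append((word, source, 1.0))
                i += len(word)
                break
        else:
            # No vocab word starts here: emit a single-character token,
            # weighting whitespace and punctuation lower than other characters.
            ch = line[i]
            is_punct = ch.isspace() or unicodedata.category(ch).startswith("P")
            tokens.append((ch, "char", punct_weight if is_punct else char_weight))
            i += 1
    return tokens


tokens = tokenize(
    "スキキライスキ、に",
    section_vocab=["に"],
    global_vocab=["スキキライスキ", "スキ"],
)
```

With these inputs the shared word スキキライスキ is taken whole from the global tier instead of being split at スキ, the comma becomes a low-weight single-character token, and に comes from the section tier.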