Each line is now matched to its section (verse/chorus/outro) before any vocab is pulled, so words from other sections no longer sneak in as substring pollution. The Offset control is tucked into Tools.
Section-aware matching. Each LRCLIB line is matched to its section first: exact text match against context_lines, then substring match, then a fallback word-overlap score with a longest-match tie-breaker. Only that section's vocab is used for the dropdown and highlight tokens, with a global fallback so shared words (like スキキライスキ, repeated in both choruses) still show up.
Why it was broken: v2 filtered global vocab by substring, so a 1-char particle like に (defined in Verse 3) was being pulled into every line that contained に anywhere — inside には, inside になりました, everywhere.
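To make the v2 failure concrete, here is a minimal sketch of substring filtering next to section-scoped filtering. The vocab table, section names, and function names are hypothetical and only illustrate the behaviour described above; they are not the app's actual data or code.

```python
# Hypothetical vocab table: word -> section where it was defined.
GLOBAL_VOCAB = {"に": "Verse 3", "スキ": "Chorus 1", "なりました": "Verse 1"}

def v2_filter(line, vocab):
    """v2-style behaviour: keep every vocab word that appears anywhere in the line."""
    return [word for word in vocab if word in line]

def section_filter(line, vocab, section):
    """Section-aware behaviour: only words defined in the line's own section."""
    return [word for word, home in vocab.items() if home == section and word in line]

line = "大人になりました"  # assumed to belong to Verse 1
print(v2_filter(line, GLOBAL_VOCAB))                  # に leaks in from Verse 3
print(section_filter(line, GLOBAL_VOCAB, "Verse 1"))  # only なりました
```

The one-character particle matches inside なりました, so the substring filter cannot tell it apart from a real occurrence; scoping by section removes it without needing any extra tokenization.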
Tools hidden. Offset lives behind a small Tools button. Pencil/lab-notes removed from this version.
Matching pipeline: 1) normalize line text (strip (×2) etc.), 2) split context_lines on / dividers, 3) try exact match, 4) try substring, 5) fallback to word-overlap scoring (longest single match wins ties).
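The five steps above can be sketched roughly as follows. This is an illustration under assumptions, not the shipped code: the function names, the shape of the sections and vocab dicts, the repeat-marker regex, and the reading of "word overlap" as "number of section vocab words found in the line" are all guesses made for the example.

```python
import re

def normalize(text):
    # Step 1: strip repeat markers such as "(×2)" and collapse whitespace.
    text = re.sub(r"[\((]\s*[×xX]\s*\d+\s*[\))]", "", text)
    return " ".join(text.split())

def split_context(context_lines):
    # Step 2: split on "/" dividers, normalize each part, drop empty parts.
    parts = (normalize(part) for part in context_lines.split("/"))
    return [part for part in parts if part]

def match_section(line, sections, vocab):
    """sections: {name: context_lines string}; vocab: {name: [words]} (assumed shapes)."""
    target = normalize(line)
    if not target:
        return None
    parts = {name: split_context(ctx) for name, ctx in sections.items()}
    # Step 3: exact match against any context line.
    for name, lines in parts.items():
        if target in lines:
            return name
    # Step 4: substring match in either direction.
    for name, lines in parts.items():
        if any(target in p or p in target for p in lines):
            return name
    # Step 5: word-overlap score; the longest single match breaks ties.
    best, best_key = None, (0, 0)
    for name, words in vocab.items():
        hits = [w for w in words if w and w in target]
        key = (len(hits), max(map(len, hits), default=0))
        if key > best_key:
            best, best_key = name, key
    return best
```

Comparing (hit count, longest hit) as a tuple gives the tie-break for free: the second element only matters when two sections have the same number of hits. A line with no hits in any section returns None, which is where a caller would fall back to global vocab.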
Tokenization: section vocab tried first (longest-first), falls back to global vocab for shared words, falls back to single-char tokens for unmatched (with smaller weight on punctuation).
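A rough sketch of that tokenization order, assuming a greedy left-to-right scan. The tokenize name, the (text, source, weight) tuple shape, and the specific weights 1.0 / 0.5 / 0.25 are placeholders chosen for illustration; the notes above only say that punctuation gets a smaller weight.

```python
import unicodedata

def tokenize(line, section_vocab, global_vocab):
    """Return (text, source, weight) tuples; weights are illustrative only."""
    section = sorted(section_vocab, key=len, reverse=True)  # longest-first
    shared = sorted(global_vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(line):
        # Section vocab is tried first, then global vocab for shared words.
        for source, words in (("section", section), ("global", shared)):
            word = next((w for w in words if w and line.startswith(w, i)), None)
            if word:
                tokens.append((word, source, 1.0))
                i += len(word)
                break
        else:
            # Nothing matched here: emit a single-char token, with a
            # smaller weight for punctuation and separators.
            ch = line[i]
            is_punct = unicodedata.category(ch).startswith(("P", "Z"))
            tokens.append((ch, "char", 0.25 if is_punct else 0.5))
            i += 1
    return tokens

print(tokenize("スキに、", ["スキ"], ["に", "スキキライ"]))
```

Trying longer entries first keeps a short particle from splitting a longer word that starts at the same position, which is the same pollution problem the section matching addresses one level up.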