textSimilarity
Compute a similarity score (0–1) between two transcript strings.
function textSimilarity(a, b): number;
Defined in: src/utils/textSimilarity.ts:42
Parameters
| Parameter | Type | Description |
|---|---|---|
| `a` | `string` | First text (typically the preflight/eager transcript). |
| `b` | `string` | Second text (typically the confirmed/final transcript). |
Returns
number
A similarity score from 0 (completely different) to 1 (identical).
Remarks
Uses an order-aware word overlap approach: normalizes both strings to lowercase, splits into words, then checks how many words from the shorter string appear — in order — in the longer string. This handles the most common eager-pipeline scenario where EndOfTurn text is a superset of EagerEndOfTurn text (e.g., “hello world” → “hello world how are you”).
- Returns `1.0` when the texts are identical (after normalization).
- Returns `1.0` when the eager text is a perfect prefix of the final text.
- Returns a value between 0 and 1 based on the proportion of matching words when the texts diverge.
- Returns `0` when the texts share no words.
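The behavior described above can be sketched as follows. This is an illustrative reconstruction, not the code in `src/utils/textSimilarity.ts`: the names `normalizeWords` and `textSimilaritySketch` are hypothetical, and splitting on whitespace, greedy left-to-right matching, and the handling of empty inputs are all assumptions.

```typescript
// Hypothetical helper: lowercase the text and split it on whitespace.
// Assumption: punctuation is left untouched; the real normalization may differ.
function normalizeWords(text: string): string[] {
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

// Hypothetical sketch of an order-aware word overlap score in [0, 1].
function textSimilaritySketch(a: string, b: string): number {
  const wordsA = normalizeWords(a);
  const wordsB = normalizeWords(b);

  // Assumption: two empty inputs count as identical, one empty input as no overlap.
  if (wordsA.length === 0 && wordsB.length === 0) return 1;
  if (wordsA.length === 0 || wordsB.length === 0) return 0;

  const shorter = wordsA.length <= wordsB.length ? wordsA : wordsB;
  const longer = shorter === wordsA ? wordsB : wordsA;

  // Walk the shorter list and look for each word in the longer list,
  // searching only past the previous match so that word order is respected.
  let matched = 0;
  let cursor = 0;
  for (const word of shorter) {
    const index = longer.indexOf(word, cursor);
    if (index !== -1) {
      matched += 1;
      cursor = index + 1;
    }
  }

  // Proportion of the shorter text's words that were found in order.
  return matched / shorter.length;
}
```

Dividing by the length of the shorter text is what makes a perfect prefix score `1.0`: every word of the eager transcript is found, in order, in the longer final transcript. The greedy scan is a simplification; it does not always find the longest possible in-order match when words repeat.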
Example
textSimilarity('hello world', 'hello world'); // 1.0
textSimilarity('hello world', 'hello world how are you'); // 1.0 (prefix match)
textSimilarity('hello world', 'goodbye world'); // 0.5
textSimilarity('hello', 'goodbye'); // 0.0