Prioritising full phrase matches over individual tokens in JavaScript regex search
12:10 25 Feb 2026

I'm building a DOM search highlighter that supports multi-word queries.

If the user searches:

power shell

I want:

  1. The full phrase "power shell" to take priority over the individual tokens

  2. If the full phrase doesn’t exist, fall back to matching "power" OR "shell"

Here is the builder I'm using:

function buildSearchRegex(q) {
  q = (q || "").trim().replace(/\s+/g, " ");
  if (!q) return null;

  const tokens = q.split(" ").filter((t) => t.length >= 1);
  if (!tokens.length) return null;

  const escapedTokens = tokens.map(t =>
    t.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
  );

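  // Phrase: all tokens in order, separated by one or more whitespace characters.
  // (\s already covers U+00A0 in JavaScript, so the explicit \u00A0 is redundant but harmless.)
  const phrasePattern =
    escapedTokens.length >= 2
      ? escapedTokens.join("[\\s\\u00A0]+")
      : null;

  // Fallback: any single token.
  const tokenPattern = escapedTokens.join("|");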

  const pattern = phrasePattern
    ? `${phrasePattern}|${tokenPattern}`
    : tokenPattern;

  return new RegExp(pattern, "gi");
}

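However, because the phrase and the tokens share a single alternation:

phrase|token1|token2

the engine scans the text left to right and takes the first alternative that matches at each position. A lone token that appears earlier in the text is therefore matched before a phrase that appears later, so token matches still get highlighted even when the full phrase exists.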
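
A minimal reproduction of the behaviour described above. The regex literal has the same shape that buildSearchRegex("power shell") produces, and the sample text is made up for illustration:

```javascript
// Same pattern shape as buildSearchRegex("power shell"): phrase|token1|token2
const combined = /power[\s\u00A0]+shell|power|shell/gi;

// Illustrative sample text (not from a real page).
const text = "Open a shell, then start Power Shell.";

// Collect every match with its position in the text.
const hits = [...text.matchAll(combined)].map((m) => ({
  index: m.index,
  match: m[0],
}));

console.log(hits);
// The lone "shell" at index 7 is reported before "Power Shell" at index 25,
// even though the full phrase is present in the text.
```

The alternation order only decides which alternative wins at a given position. It does not stop a token from matching at an earlier position.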

What is the correct way to:

  • Prefer full phrase matches

  • But still fall back to token matches

  • Without double-matching parts of the phrase?

Is this best solved via regex alone, or should I perform two passes (phrase first, then tokens)?
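
For reference, this is roughly what I imagine the two-pass option would look like. It is only a sketch working on a plain string rather than on DOM nodes, and findMatches, escapeToken and toHit are hypothetical helper names, not part of my current code:

```javascript
// Escape regex metacharacters so a token is matched literally.
function escapeToken(t) {
  return t.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Convert a matchAll() result into a plain { index, match } record.
function toHit(m) {
  return { index: m.index, match: m[0] };
}

// Hypothetical helper: phrase pass first, token pass only as a fallback.
function findMatches(text, q) {
  const tokens = (q || "").trim().split(/\s+/).filter(Boolean).map(escapeToken);
  if (!tokens.length) return { mode: "none", matches: [] };

  // Pass 1: the full phrase (only meaningful with two or more tokens).
  if (tokens.length >= 2) {
    const phraseRe = new RegExp(tokens.join("[\\s\\u00A0]+"), "gi");
    const phraseHits = [...text.matchAll(phraseRe)].map(toHit);
    if (phraseHits.length) return { mode: "phrase", matches: phraseHits };
  }

  // Pass 2: individual tokens, reached only when the phrase pass found nothing.
  // Longer tokens go first so a short token cannot shadow a longer one
  // that starts with it.
  const byLength = [...tokens].sort((a, b) => b.length - a.length);
  const tokenRe = new RegExp(byLength.join("|"), "gi");
  return { mode: "token", matches: [...text.matchAll(tokenRe)].map(toHit) };
}

console.log(findMatches("Open a shell, then start Power Shell.", "power shell"));
console.log(findMatches("Open a shell with full power.", "power shell"));
```

Because the token pass never runs when the phrase pass succeeds, parts of the phrase cannot be double-matched. I assume the phrase check would have to run once over the whole search scope rather than per text node, otherwise each node would make its own phrase-or-token decision.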

javascript dom search