Matching multiple pairs of values between two tables; looking for an efficient maintainable method
07:07 10 Jun 2026

I have the following two tables:

Table article_description:

article_code locale article_description
0001 EN SAMPLE DESCRIPTION THAT MAY CONTAIN A KEYWORD ASC
0001 NL DUMMY OMSCHRIJVING MET KEYWOORD ASC
1234 EN SAMPLE DESCRIPTION ASC THAT MAY CONTAIN A KEYWORD
1234 NL DUMMY OMSCHRIJVING ASC MET KEYWOORD
4567 EN SAMPLE DESCRIPTION WITHOUT A KEYWORD

Table article_labels:

article_code locale label_code
0001 EN QM0029
0001 NL QM0029

I would like to identify article_codes that have potentially missing quality label codes by comparing keywords that occur in the article_description, with the quality label(s) present in the other table. (Sadly we lack other more robust information to do a better check.)
One example keyword is "ASC"; each article that has this keyword in the description should have the corresponding label "QM0029". I have about 30 pairs of labels and keywords.
I'm struggling to make a nice maintainable query that can use this logic.

I started by making a query to handle 1 pair using a regex to match the isolated pattern (see code below).
I'd like to get a scaled up query to check for all the pairs of keywords. But I'm not sure how to do this in an effective and efficient manner. The only way I know to do it would be copy pasting the query below multiple times and use UNION ALL, which is not very maintainable. Is there a better way?

Some notes that might help:

  • I would like to return each article_code that has at least 1 locale with a missing label

  • Article descriptions are always capitalised; so I don't need case sensitive to search. They should only contain (capitalised) letters and/or numbers

  • In principle, each article_code has either 0 or 1 quality labels, but robustness to handle multiple labels per article would be a bonus

  • Theres about 100.000 different article_codes, and 5 different locales. Some degree of efficiency would be nice

  • I consider a keyword to be present when it is separated by at least one space from other words (can be in the middle, start, or end of the string)

  • Given the example tables, I would expect the article_code 1234 to be returned, since it has the keyword "ASC" in 2 descriptions, but the corresponding label qm0029 is missing

Please let me know if you have any questions, or have suggestions to improve this post. Additionally, I am by no means very good with SQL yet, so any other improvements/advice would also be greatly appreciated.

Cheers!


WITH article_labels AS (
  SELECT article_code, locale, label_code
  FROM article_label
  WHERE label_code = 'QM0029'
),
article_descriptions AS (
  SELECT article_code, locale
  FROM article_description
  WHERE article_description ~ '(^| )ASC( |$)'
  AND article_description IS NOT NULL
)
SELECT DISTINCT t1.article_code, 'QM0029 missing' AS check_value
FROM article_descriptions AS t1
LEFT JOIN article_labels AS t2
ON 
  t1.article_code = t2.article_code
  AND
  t1.locale = t2.locale
WHERE t2.label_code IS NULL

postgresql