Let's assume that I have two tables with columns (product_id, user_id, event_id, event_payload) AS T1 and (product_id, user_id, event_id, event_metadata) AS T2.
Then let's assume that T1 is ordered by product_id and event_id and T2 is ordered by user_id and event_id.
I have two options to join these two tables:
-- First option, join on one field because it's enough.
... from T1 join T2 on (T1.event_id = T2.event_id);
-- Second option, join on more fields because it seems like such query is more
-- optimized due to T1 and T2 being sorted (i.e. indexed) by some columns.
... from T1 join T2 on (
T1.product_id = T2.product_id
and T1.user_id = T2.user_id
and T1.event_id = T2.event_id
);
What is the preferred way? Is the second option optimal or does ClickHouse only optimize filters such as ... on (T1.date = today()), i.e. filters depending on just one table?