dplyr join on column A OR column B
09:39 16 Aug 2019

How can I join 2 tables with an OR statement in R using dplyrs join functions?

E.g. Join df1 on df2 with the key: df1_ColumnA == df2_ColumnA OR df1_ColumnA == df2_ColumnB?

library(dplyr)

df1 = data.frame(V1 = c('A', 'B', 'C','D'), v2 = c(1,2,3,4))
df2 = data.frame(VA = c('A', 'B', 'C'), VB = c('D', 'E', 'F'), vC = c(4,5,6))

## Join on column 1
df3 = df1 %>% left_join(df2, by = c('V1' = 'VA'))

## join on column 1 AND column 2
df4 = df1 %>% left_join(df2, by = c('V1' = 'VA', 'V1' = 'VB'))

This is not working:

## join on column 1 OR column 2
df4 = df1 %>% left_join(df2, by = c('V1' = 'VA' | 'V1' = 'VB'))

Edit: expected output

   V1 v2 VA VB vC
1  A  1  A  D  4
2  B  2  B  E  5
3  C  3  C  F  6
4  D  4  A  D  4
r dplyr