This demo will use two custom datasets; df1 and df2:
df1 %>%
kable() %>%
kable_styling(full_width = F)
id | a |
---|---|
1 | name 1 |
2 | name 2 |
3 | name 3 |
4 | name 4 |
5 | name 5 |
6 | name 6 |
7 | name 7 |
8 | name 8 |
df2 %>%
kable() %>%
kable_styling(full_width = F)
id | a |
---|---|
1 | cat |
2 | dog |
3 | piano |
4 | bear |
5 | yakult |
6 | air fryer |
Unfortunately, not all joins are available when using
sqldf, those which are will be shown here. The 4 main
joins are:
1. INNER JOIN
2. OUTER JOIN
3. LEFT JOIN
4. RIGHT JOIN
The left join takes all the data in the first (left) dataset and any matching values in the right (second) table
sqldf("
SELECT df1.a, df2.a
FROM df1
LEFT JOIN df2 ON df1.id=df2.id;
")
## a a
## 1 name 1 cat
## 2 name 2 dog
## 3 name 3 piano
## 4 name 4 bear
## 5 name 5 yakult
## 6 name 6 air fryer
## 7 name 7 <NA>
## 8 name 8 <NA>
The RIGHT JOIN takes all the data from the second (right) dataset and any matching entries from the left (first). Not currently supported by sqldf.
sqldf("
SELECT df1.name, df2.name
FROM df1
RIGHT JOIN df2.id=df1.id
")
The INNER JOIN keeps any data which is in both tables.
sqldf("
SELECT df1.a, df2.a
FROM df1
INNER JOIN df2 ON df1.id=df2.id
")
## a a
## 1 name 1 cat
## 2 name 2 dog
## 3 name 3 piano
## 4 name 4 bear
## 5 name 5 yakult
## 6 name 6 air fryer
FULL OUTER JOIN keeps all data, in both tables, whether they match or not. Not currently supported by sqldf.
sqldf("
SELECT df1.name, df2.name
FROM df1
OUTER JOIN df2.id=df1.id
")