I don't understand the difference between inner_join and semi_join. Could you provide some examples?
According to R
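The rows from x matched by semi_join() and inner_join() are the same. The difference is that inner_join() will add the columns that are present in y but not in x, whereas semi_join() will not add any columns from y. (If a row of x matches several rows of y, inner_join() also returns that row once per match, while semi_join() returns it only once.)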
library(dplyr)

x = data.frame(a = 1:3)
y = data.frame(a = 2:4, b = 10:12)
## with an inner join, the `b` column is part of the result
inner_join(x, y)
# Joining, by = "a"
#   a  b
# 1 2 10
# 2 3 11
## with a semi join, the `b` column is not part of the result
## because it is not part of `x`
semi_join(x, y)
# Joining, by = "a"
#   a
# 1 2
# 2 3
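The two joins can also differ in how many rows they return. The sketch below is an illustration rather than part of the original example: y_dup is a hypothetical data frame in which a = 2 appears twice, and the result names inner_dup and semi_dup are made up for this snippet. With it, inner_join returns one row per match, while semi_join never duplicates rows of x.

```r
library(dplyr)

x <- data.frame(a = 1:3)
# Hypothetical y in which the key a = 2 appears twice
y_dup <- data.frame(a = c(2L, 2L, 3L), b = c(10, 20, 30))

# inner_join returns one row for every matching pair of rows
inner_dup <- inner_join(x, y_dup, by = "a")
# semi_join only keeps or drops rows of x; it never repeats them
semi_dup <- semi_join(x, y_dup, by = "a")

nrow(inner_dup)  # 3
nrow(semi_dup)   # 2
```

Passing by = "a" explicitly also avoids the "Joining, by" message shown in the outputs above.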
inner_join is one of the joins documented together as "mutating joins", which are described at ?inner_join as follows:
"Mutating joins add columns from y to x, matching rows based on the key."
Compare this to the "filtering joins", documented together at ?semi_join:
"Filtering joins filter rows from x based on the presence or absence of matches in y."
Filtering joins only filter x; they do not add columns from y. The other filtering join is anti_join, which does the opposite of semi_join: it returns only the rows of x without a match in y.
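As a minimal sketch of the two filtering joins side by side, here they are applied to the same x and y as in the example above. The names semi_rows and anti_rows are only illustrative.

```r
library(dplyr)

x <- data.frame(a = 1:3)
y <- data.frame(a = 2:4, b = 10:12)

# Rows of x that have a match in y
semi_rows <- semi_join(x, y, by = "a")
# Rows of x that have no match in y
anti_rows <- anti_join(x, y, by = "a")

semi_rows$a  # 2 3
anti_rows$a  # 1
```

Neither result contains the b column, and together the two results account for every row of x exactly once.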