A.16 Joining two data frames with left_join
Now that dm is two columns, we can work on joining the distance information with our data frame.
# A tibble: 82 × 11
# Groups: team [1]
date team score opp opp.score gid ha days.rest loc prev.loc
<date> <chr> <dbl> <chr> <dbl> <chr> <chr> <dbl> <chr> <chr>
1 2021-10-20 PHI 117 NOP 97 221000… away NA NOP <NA>
2 2021-10-22 PHI 109 BKN 114 221000… home 2 PHI NOP
3 2021-10-24 PHI 115 OKC 103 221000… away 2 OKC PHI
4 2021-10-26 PHI 99 NYK 112 221000… away 2 NYK OKC
5 2021-10-28 PHI 110 DET 102 221000… home 2 PHI NYK
6 2021-10-30 PHI 122 ATL 94 221000… home 2 PHI PHI
7 2021-11-01 PHI 113 POR 103 221000… home 2 PHI PHI
8 2021-11-03 PHI 103 CHI 98 221001… home 2 PHI PHI
9 2021-11-04 PHI 109 DET 98 221001… away 1 DET PHI
10 2021-11-06 PHI 114 CHI 105 221001… away 2 CHI DET
# ℹ 72 more rows
# ℹ 1 more variable: miles <int>
There is now a column miles that gives the distance between the current game’s location and the location of that team’s previous game.
The left in left_join means that we want to keep all rows of the first data frame (in this case dd), even if there is no match in the second data frame (in this case dm). We do not necessarily want to keep all the rows of the second data frame.
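A minimal sketch of this behavior on toy data follows. The tibbles games and dist are hypothetical stand-ins for dd and dm, the distances are made up, and the join keys loc and prev.loc are an assumption about which columns the two tables share.

```r
library(dplyr)

# Hypothetical stand-in for dd: one row per game
games <- tibble(
  team     = c("PHI", "PHI", "PHI"),
  loc      = c("NOP", "PHI", "OKC"),
  prev.loc = c(NA, "NOP", "PHI")
)

# Hypothetical stand-in for dm: one row per pair of locations
# (the miles values are made up for illustration)
dist <- tibble(
  loc      = c("PHI", "OKC", "BOS"),
  prev.loc = c("NOP", "PHI", "NYK"),
  miles    = c(1100L, 1250L, 190L)
)

# Keep every row of games; rows with no match in dist get NA for miles
joined <- left_join(games, dist, by = c("loc", "prev.loc"))
joined
```

All three rows of games survive. The first game has no previous location, so it finds no match and its miles is NA. The BOS/NYK row of dist matches nothing in games and is dropped.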
right_join - keep all rows of the second data frame, regardless of whether there is a match.
inner_join - keep only rows in the first data frame where there is a match. (Rows in 1st AND 2nd)
full_join - keep all rows from both data frames, regardless of whether there is a match. (Rows in 1st OR 2nd)
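To compare the four joins side by side, here is a small sketch on two toy tibbles. The names x, y, key, x_val, and y_val are hypothetical and chosen only for illustration. The two tables share the keys "b" and "c".

```r
library(dplyr)

x <- tibble(key = c("a", "b", "c"), x_val = 1:3)  # first data frame
y <- tibble(key = c("b", "c", "d"), y_val = 4:6)  # second data frame

left  <- left_join(x, y, by = "key")   # all keys of x: a, b, c
right <- right_join(x, y, by = "key")  # all keys of y: b, c, d
inner <- inner_join(x, y, by = "key")  # keys in both: b, c
full  <- full_join(x, y, by = "key")   # keys in either: a, b, c, d

# Number of rows kept by each join
c(left = nrow(left), right = nrow(right), inner = nrow(inner), full = nrow(full))
```

In each result, a row that has no match in the other table gets NA in the columns that come from that table.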