A.10 Gluing data.frames using bind_rows
Suppose we want to find the average score by team. We'll have to rearrange the data a little, since a team can appear in either the home or the away column. Ideally, we would have one column with the team name and one column with the score. Each game would then have two rows, one for the away team and one for the home team. We'll create two data.frames, one for away teams and one for home teams, and then bind them together using bind_rows.
da = d %>% select(date, away, ascore, home, hscore, gid) %>% mutate(ha = 'away')
dh = d %>% select(date, home, hscore, away, ascore, gid) %>% mutate(ha = 'home')
colnames(da) = c('date', 'team', 'score', 'opp', 'opp.score', 'gid', 'ha')
colnames(dh) = c('date', 'team', 'score', 'opp', 'opp.score', 'gid', 'ha')
# stack the two data.frames, then sort so each game's two rows sit together
dd = bind_rows(da, dh) %>%
  arrange(date, gid)
head(dd)
        date team score opp opp.score      gid   ha
1 2021-10-19  BKN   104 MIL       127 22100001 away
2 2021-10-19  MIL   127 BKN       104 22100001 home
3 2021-10-19  GSW   121 LAL       114 22100002 away
4 2021-10-19  LAL   114 GSW       121 22100002 home
5 2021-10-20  IND   122 CHA       123 22100003 away
6 2021-10-20  CHA   123 IND       122 22100003 home
We see the same games, teams, and scores at the top, but now we have two rows for each game. In the code, I kept the column names in each select and colnames call on a single line so that it is easier to compare the first and second lines, and the third and fourth lines.
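As a rough sketch of where this reshaping leads, the snippet below rebuilds the three games printed above as a small toy data.frame and then computes the average score by team. The names d_toy, dd_toy, and avg are made up for illustration, and d_toy is only a stand-in for d, not the real data. It also shows one possible alternative to the colnames step: select can rename columns as it picks them (new_name = old_name). The group_by and summarize step is an assumption about how the averages might be computed, not necessarily the approach used elsewhere in this book.

```r
library(dplyr)

# Toy stand-in for `d`, typed in from the three games shown above (hypothetical object)
d_toy = data.frame(
  date   = as.Date(c('2021-10-19', '2021-10-19', '2021-10-20')),
  away   = c('BKN', 'GSW', 'IND'),
  ascore = c(104, 121, 122),
  home   = c('MIL', 'LAL', 'CHA'),
  hscore = c(127, 114, 123),
  gid    = c(22100001, 22100002, 22100003)
)

# select() can rename while selecting, so the colnames() step is not needed here
da = d_toy %>% select(date, team = away, score = ascore, opp = home, opp.score = hscore, gid) %>% mutate(ha = 'away')
dh = d_toy %>% select(date, team = home, score = hscore, opp = away, opp.score = ascore, gid) %>% mutate(ha = 'home')

# Stack the away and home views: one row per team per game
dd_toy = bind_rows(da, dh) %>%
  arrange(date, gid)

# Average score by team, plus the number of games behind each average
avg = dd_toy %>%
  group_by(team) %>%
  summarize(avg.score = mean(score), games = n())
```

With only one game per team in the toy data, each average is just that team's single score. On the full data.frame the same two steps would average over every game a team played, whether at home or away.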