A.4 Choosing columns with select
Let’s get rid of some columns we won’t use. In base R, we could subset columns like this:
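As a minimal sketch, assuming temp is a plain data frame holding the six games shown in this section plus two hypothetical extra columns (season and season_type, names made up for illustration), the base R version might look like this:

```r
# Toy stand-in for temp: the six games shown in this section.
# season and season_type are hypothetical extra columns we want to drop.
temp <- data.frame(
  date   = as.Date(c("2021-10-19", "2021-10-19", "2021-10-20",
                     "2021-10-20", "2021-10-20", "2021-10-20")),
  away   = c("BKN", "GSW", "OKC", "SAC", "DEN", "ORL"),
  home   = c("MIL", "LAL", "UTA", "POR", "PHX", "SAS"),
  ascore = c(104, 121, 86, 124, 110, 97),
  hscore = c(127, 114, 107, 121, 98, 123),
  season = 2021,
  season_type = "reg",
  gid    = c(22100001, 22100002, 22100011, 22100013, 22100012, 22100010)
)

# Base R: index columns by a character vector, so every name is quoted
base_sub <- temp[, c("date", "gid", "away", "home", "ascore", "hscore")]
head(base_sub)
```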
        date      gid away home ascore hscore
1 2021-10-19 22100001  BKN  MIL    104    127
2 2021-10-19 22100002  GSW  LAL    121    114
3 2021-10-20 22100011  OKC  UTA     86    107
4 2021-10-20 22100013  SAC  POR    124    121
5 2021-10-20 22100012  DEN  PHX    110     98
6 2021-10-20 22100010  ORL  SAS     97    123
In the tidyverse, we can use select and avoid typing quotes repeatedly.
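A sketch of the select version, under the same assumptions as before: temp here is a toy data frame with the six games from this section, and season and season_type are hypothetical extra columns. The column names are laid out over three lines: game information, then teams, then scores.

```r
library(dplyr)

# Toy stand-in for temp: the six games shown in this section.
# season and season_type are hypothetical extra columns we want to drop.
temp <- data.frame(
  date   = as.Date(c("2021-10-19", "2021-10-19", "2021-10-20",
                     "2021-10-20", "2021-10-20", "2021-10-20")),
  away   = c("BKN", "GSW", "OKC", "SAC", "DEN", "ORL"),
  home   = c("MIL", "LAL", "UTA", "POR", "PHX", "SAS"),
  ascore = c(104, 121, 86, 124, 110, 97),
  hscore = c(127, 114, 107, 121, 98, 123),
  season = 2021,
  season_type = "reg",
  gid    = c(22100001, 22100002, 22100011, 22100013, 22100012, 22100010)
)

# select() takes bare column names, so no quotes are needed
sel <- temp %>%
  select(date, gid,
         away, home,
         ascore, hscore)
head(sel)
```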
        date      gid away home ascore hscore
1 2021-10-19 22100001  BKN  MIL    104    127
2 2021-10-19 22100002  GSW  LAL    121    114
3 2021-10-20 22100011  OKC  UTA     86    107
4 2021-10-20 22100013  SAC  POR    124    121
5 2021-10-20 22100012  DEN  PHX    110     98
6 2021-10-20 22100010  ORL  SAS     97    123
We put new lines in to split up the columns given to select in a reasonable way. The first row, date and gid, is game information (date and id), the second row has the teams, and the third row has the scores. Note that ascore, the away team’s score, is under away, the away team, and likewise with hscore and home.
Note that we can also use - to specify which columns we don’t want. This keeps the same columns as above, although they stay in their original order, so gid now comes last:
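Which columns get dropped depends on what temp actually contains. As a sketch, assuming the only unwanted columns are the two hypothetical ones used for illustration here (season and season_type), the negative form might look like this:

```r
library(dplyr)

# Toy stand-in for temp: the six games shown in this section.
# season and season_type are hypothetical extra columns we want to drop.
temp <- data.frame(
  date   = as.Date(c("2021-10-19", "2021-10-19", "2021-10-20",
                     "2021-10-20", "2021-10-20", "2021-10-20")),
  away   = c("BKN", "GSW", "OKC", "SAC", "DEN", "ORL"),
  home   = c("MIL", "LAL", "UTA", "POR", "PHX", "SAS"),
  ascore = c(104, 121, 86, 124, 110, 97),
  hscore = c(127, 114, 107, 121, 98, 123),
  season = 2021,
  season_type = "reg",
  gid    = c(22100001, 22100002, 22100011, 22100013, 22100012, 22100010)
)

# A leading - drops a column; the remaining columns keep their original order
dropped <- temp %>%
  select(-season, -season_type)
head(dropped)
```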
        date away home ascore hscore      gid
1 2021-10-19  BKN  MIL    104    127 22100001
2 2021-10-19  GSW  LAL    121    114 22100002
3 2021-10-20  OKC  UTA     86    107 22100011
4 2021-10-20  SAC  POR    124    121 22100013
5 2021-10-20  DEN  PHX    110     98 22100012
6 2021-10-20  ORL  SAS     97    123 22100010
If we want to choose rows and columns in the same step, we can use filter and select together in the same block of code using the pipe. Let’s finalize this data by saving it as the object d instead of temp, and also specify that we want only regular season data.
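A sketch of the combined step follows. The starting data frame raw_games, the columns season and season_type, and the value "reg" are all hypothetical stand-ins; the real names depend on the data being used. The first row of the toy data is a made-up preseason game, included only so that filter has something to remove.

```r
library(dplyr)

# Toy stand-in for the unfiltered data (raw_games is a hypothetical name).
# Row 1 is a made-up preseason game; the rest are the games shown in this section.
raw_games <- data.frame(
  date   = as.Date(c("2021-10-05", "2021-10-19", "2021-10-19", "2021-10-20",
                     "2021-10-20", "2021-10-20", "2021-10-20")),
  away   = c("XXX", "BKN", "GSW", "OKC", "SAC", "DEN", "ORL"),
  home   = c("YYY", "MIL", "LAL", "UTA", "POR", "PHX", "SAS"),
  ascore = c(100, 104, 121, 86, 124, 110, 97),
  hscore = c(90, 127, 114, 107, 121, 98, 123),
  season = 2021,
  season_type = c("pre", rep("reg", 6)),
  gid    = c(12100001, 22100001, 22100002, 22100011,
             22100013, 22100012, 22100010)
)

# Rows first (one logical expression per line), then columns
d <- raw_games %>%
  filter(season == 2021,
         season_type == "reg") %>%
  select(date, gid,
         away, home,
         ascore, hscore)
head(d)
```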
        date      gid away home ascore hscore
1 2021-10-19 22100001  BKN  MIL    104    127
2 2021-10-19 22100002  GSW  LAL    121    114
3 2021-10-20 22100011  OKC  UTA     86    107
4 2021-10-20 22100013  SAC  POR    124    121
5 2021-10-20 22100012  DEN  PHX    110     98
6 2021-10-20 22100010  ORL  SAS     97    123
We put a new line after every %>%, put each logical expression on its own line, and split up the column names given to select in a reasonable way, all for improved readability.