A.8 Summaries using summarise

Now that we have our data set, we can start answering some of the questions we had. Let’s start by trying to determine what impacts the outcomes of games. We expect that the teams involved, which team is playing at home, and the travel miles leading up to the game are all important. Let’s see if we find evidence of that in the data.

First we’ll look at home advantage. Let’s find the mean of the columns ascore, hscore and diff.

d %>% 
  summarise(mean.ascore = mean(ascore), 
            mean.hscore = mean(hscore), 
            mean.diff   = mean(diff  ))
  mean.ascore mean.hscore mean.diff
1    109.7545    111.4772  1.722764

This suggests that in 2021-22, the home team scored about 1.72 points more than the away team on average.
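The three means above are linked. Assuming diff was computed as hscore minus ascore (the output is consistent with that, since 111.4772 - 109.7545 is about 1.7227), the mean of the differences equals the difference of the means. Here is a minimal sketch of that check; the data frame `toy` and its scores are made up for illustration and are not part of the game data.

```r
library(dplyr)

# Hypothetical toy data standing in for `d`; the scores are invented.
toy <- tibble(
  ascore = c(100, 110, 105, 98),
  hscore = c(104, 108, 112, 101)
) %>%
  mutate(diff = hscore - ascore)  # assumed definition: home minus away

# Compute the mean difference two ways inside one summarise call.
check <- toy %>%
  summarise(mean.diff     = mean(diff),
            diff.of.means = mean(hscore) - mean(ascore))

check

# The two columns should agree up to floating-point error.
all.equal(check$mean.diff, check$diff.of.means)
```

If the two values disagreed on the real data, that would usually point to missing values or to diff being defined differently than assumed here.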

Note that summarise works with functions other than mean (e.g. sum and sd), and we’ll use some of them later.
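As a quick illustration of using other functions, here is a small sketch with sum, sd, min, max and dplyr’s n() in a single summarise call. It runs on a made-up data frame called `toy` (a stand-in for `d`, with invented scores), so the numbers it produces say nothing about the real games.

```r
library(dplyr)

# Hypothetical toy data standing in for `d`; the scores are invented.
toy <- tibble(
  ascore = c(100, 110, 105, 98),
  hscore = c(104, 108, 112, 101)
) %>%
  mutate(diff = hscore - ascore)  # assumed definition: home minus away

# Any function that returns a single value per column can be used.
res <- toy %>%
  summarise(n.games      = n(),          # number of rows (games)
            total.hscore = sum(hscore),  # total home points
            sd.diff      = sd(diff),     # spread of the score difference
            min.diff     = min(diff),
            max.diff     = max(diff))

res
```

Each argument to summarise becomes one column in the one-row result, so the names on the left (n.games, sd.diff, and so on) are free to choose.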

You can use summarise or summarize (s or z). I usually use s because it is on the home row (middle row of the keyboard; left hand: a, s, d, f; right hand: j, k, l, ;), so I find it easier to type.