4.3 Referencing variables of a data frame
A data frame (or tibble) is a two-dimensional structure in which each variable forms a column and each observation forms a row. Each element is the value that a given observation takes on for a particular variable.
How do you reference or identify the variables in a data frame (e.g. to calculate the mean number of students tested, NumTested from the dcps schools data)? The $ and [[ extraction operators both pull variables from a dataframe or items from a list. Note that $ requires the variable name, whereas [[ allows you to use either the variable’s name (in quotes) or column number in the data frame. Do what you want. I prefer [[ for anyone in more advanced programming, but I’ll typically use $ in this course.
## [1] 72 147 67 180 213 224 158 112 157 375 222 149 167 67 83
## [16] 87 96 121 289 109 495 62 1423 146 117 156 109 189 212 159
## [31] 112 77 137 112 362 310 110 160 94 91 178 334 290 235 354
## [46] 114 156 137 115 181 306 111 96 193 246 132 12 129 93 153
## [61] 143 140 212 143 105 148 258 129 66 144 399 133 117 79 144
## [76] 213 120 288 96 170 20 61 121 261 135 102 113 134 121 73
## [91] 217 211 175 409 239 126 102 374 199 180 173 190 23 253 172
## [106] 154 152 421
## [1] 180.1019
## [1] 180.1019
## [1] 180.1019