heights <- c(2, 4, 4, NA, 6)
mean(heights)
[1] NA
max(heights)
[1] NA
mean(heights, na.rm = TRUE)
[1] 4
max(heights, na.rm = TRUE)
[1] 6
Because R was designed to analyze datasets, it includes the concept of missing data, which is uncommon in other programming languages. Missing data are represented in vectors as NA.
When doing operations on numbers, most functions will return NA if the data you are working with include missing values. This behaviour makes it harder to overlook cases where you are dealing with missing data. You can add the argument na.rm = TRUE to calculate the result while ignoring the missing values.
If your data include missing values, you may want to become familiar with the functions is.na(), na.omit(), and complete.cases(). See below for examples.
## Extract those elements which are not missing values.
heights[!is.na(heights)]
[1] 2 4 4 6
## Returns the object with incomplete cases removed.
## The returned object is an atomic vector of type `"numeric"`
## (or `"double"`).
na.omit(heights)
[1] 2 4 4 6
attr(,"na.action")
[1] 4
attr(,"class")
[1] "omit"
## Extract those elements which are complete cases.
## The returned object is an atomic vector of type `"numeric"`
## (or `"double"`).
heights[complete.cases(heights)]
[1] 2 4 4 6
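The examples above remove missing values; it can also help to check how many there are and where they sit before dropping them. The following is a minimal sketch using is.na() on the same heights vector. The variable names is_missing, n_missing, and missing_positions are illustrative and not part of the lesson.

```r
# Same example vector as used above
heights <- c(2, 4, 4, NA, 6)

# is.na() returns a logical vector: TRUE where a value is missing
is_missing <- is.na(heights)

# Count the missing values (TRUE counts as 1 when summed)
n_missing <- sum(is_missing)

# Positions of the missing values in the vector
missing_positions <- which(is_missing)

print(n_missing)          # one missing value
print(missing_positions)  # it is the fourth element
```

Checking the count first makes it easier to judge whether na.rm = TRUE silently discards a few values or a large share of the data.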
Use median() to calculate the median of the heights vector.

There are functions to generate vectors of different types. To generate a vector of numerics, use the numeric() constructor, providing the length of the output vector as an argument. The values will be initialised to 0.
Note that if we ask for a vector of numerics of length 0, we obtain exactly that:
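Both cases can be sketched as follows. The lengths 3 and 0 and the variable names x and empty are chosen here only for illustration.

```r
# Numeric vector of length 3; every element is initialised to 0
x <- numeric(3)
print(x)
#> [1] 0 0 0

# Asking for length 0 gives an empty numeric vector
empty <- numeric(0)
print(empty)
#> numeric(0)

# The empty vector still has a type, but no elements
length(empty)
#> [1] 0
```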
There are similar constructors for characters and logicals, named character() and logical(), respectively.
What are the defaults for character and logical vectors?
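One way to explore this question is to call each constructor with a small length and inspect the result. The sketch below assumes a length of 2, which is arbitrary, and the names chars and flags are illustrative.

```r
# Character vector of length 2: elements default to the empty string
chars <- character(2)
print(chars)
#> [1] "" ""

# Logical vector of length 2: elements default to FALSE
flags <- logical(2)
print(flags)
#> [1] FALSE FALSE
```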
The materials in this lesson have been adapted from work created by the HBC and Data Carpentry, as well as materials created by Laurent Gatto, Charlotte Soneson, Jenny Drnevich, Robert Castelo, and Kevin Rue-Albert. These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.