• NaN (“Not a Number”) is the result of an undefined numeric operation, such as 0/0.
  • NA (“Not Available”) is generally interpreted as a missing value and has various forms – NA_integer_, NA_real_, etc. 
  • Therefore NaN ≠ NA, and R needs both: one marks an undefined numeric result, the other a missing value.
  • is.na() returns TRUE for both NA and NaN, whereas is.nan() returns TRUE only for NaN (e.g. 0/0) and FALSE for NA.
  • NULL represents that the value in question simply does not exist, rather than being existent but unknown.
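The distinctions above can be checked directly at the R console. The following is a minimal sketch using only base R; the variable names nan_val and na_val are illustrative and not part of the original notes.

```r
# NaN arises from undefined numeric operations; NA marks a missing value.
nan_val <- 0 / 0
na_val  <- NA

is.na(nan_val)   # TRUE: is.na() treats NaN as missing
is.na(na_val)    # TRUE
is.nan(nan_val)  # TRUE
is.nan(na_val)   # FALSE: NA is not NaN

# NA comes in typed forms
typeof(NA)             # "logical"
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

# NULL is the absence of a value: it has length 0,
# whereas NA is a placeholder of length 1.
length(NULL)   # 0
length(NA)     # 1
is.null(NULL)  # TRUE
```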


is.na(x) # returns TRUE if x is missing

y <- c(1,2,3,NA)
is.na(y) # returns the logical vector FALSE FALSE FALSE TRUE
				
 
x <- c(1, 2, NA, 3)

mean(x)                # returns NA

mean(x, na.rm = TRUE)  # returns 2
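
Building on is.na() and na.rm, common follow-ups are counting, locating, and dropping missing values. A short sketch in base R, with x redefined so the snippet is self-contained; the names n_missing, where, and x_complete are illustrative.

```r
x <- c(1, 2, NA, 3)

n_missing  <- sum(is.na(x))    # number of missing values: 1
where      <- which(is.na(x))  # position(s) of missing values: 3
x_complete <- x[!is.na(x)]     # vector without missing values: 1 2 3

sum(x)                # NA, for the same reason mean(x) is NA
sum(x, na.rm = TRUE)  # 6
```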
                  

 

The function na.omit() returns the object with listwise deletion of missing values: every row (case) that contains at least one NA is dropped.

# create new dataset without missing data

newdata <- na.omit(mydata)
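
Since mydata is not defined in these notes, here is a minimal sketch with a made-up data frame (the columns id, score, and group are assumptions for illustration only). It also shows complete.cases(), which yields the row mask that listwise deletion corresponds to.

```r
# Hypothetical data frame; "mydata" and its columns are invented for illustration.
mydata <- data.frame(id    = 1:4,
                     score = c(10, NA, 30, 40),
                     group = c("a", "b", NA, "b"))

newdata <- na.omit(mydata)  # drops rows 2 and 3, each of which contains an NA
nrow(newdata)               # 2

# complete.cases() marks the rows that contain no NA at all
keep <- complete.cases(mydata)  # TRUE FALSE FALSE TRUE
same_rows <- mydata[keep, ]     # same rows that na.omit() keeps

# na.omit() records the removed row indices in the "na.action" attribute
removed <- as.integer(attr(newdata, "na.action"))  # 2 3
```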

				  

 

colMeans(x, na.rm=TRUE) and colMeans(na.omit(x)) are not supposed to give the same result. Consider this example:

exdf<-data.frame(a=c(1,NA,5),b=c(3,2,2))
#   a b
#1  1 3
#2 NA 2
#3  5 2
colMeans(exdf,na.rm=TRUE) ## remove only "NA"
#       a        b 
#3.000000 2.333333
colMeans(na.omit(exdf)) ## remove "NA 2"
#  a   b 
#3.0 2.5

Why is this? In the first case, na.rm=TRUE drops only the NA itself, so the mean of column b is still computed from all three values: (3+2+2)/3. In the second case, na.omit removes the second row in its entirety, including the non-missing value b = 2, so the mean of b is just (3+2)/2.
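
The difference can also be seen by counting how many values enter each mean. The sketch below uses only base R; per_column and listwise are illustrative names. Applying na.omit() to each column separately reproduces the na.rm=TRUE result, while applying it to the whole data frame does not.

```r
exdf <- data.frame(a = c(1, NA, 5), b = c(3, 2, 2))

# Column-wise removal: each column drops only its own NAs.
per_column <- sapply(exdf, function(col) mean(na.omit(col)))
# a = 3, b = 2.333..., matching colMeans(exdf, na.rm = TRUE)

# Listwise removal: any row containing an NA is dropped before averaging.
listwise <- colMeans(na.omit(exdf))
# a = 3, b = 2.5

# Number of values that contribute to each mean
colSums(!is.na(exdf))           # a = 2, b = 3
colSums(!is.na(na.omit(exdf)))  # a = 2, b = 2
```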