R: NULL, NA, and NaN

emanlee 2023-11-06 15:12:41

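© Copyright belongs to the author: this is an original work by emanlee on the 51CTO blog. Contact the author for permission to repost; unauthorized reposting may be subject to legal action.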

NaN (“Not a Number”) is the result of an undefined numerical operation such as 0/0.
NA (“Not Available”) is generally interpreted as a missing value and has typed forms: NA_integer_, NA_real_, NA_character_, etc.
NaN and NA are therefore different things, and R needs both.
is.na() returns TRUE for both NA and NaN, whereas is.nan() returns TRUE for NaN (0/0) and FALSE for NA.
NULL represents that the value in question simply does not exist, rather than being existent but unknown.
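
The distinctions above can be checked directly at the R console. The following is a minimal sketch using only base R; the variable names nan_val and na_val are made up for illustration.

```r
# NaN arises from undefined arithmetic; NA marks a missing value
nan_val <- 0 / 0
na_val  <- NA

is.na(nan_val)    # TRUE
is.nan(nan_val)   # TRUE
is.na(na_val)     # TRUE
is.nan(na_val)    # FALSE

# NA is typed: each atomic type has its own missing value
typeof(NA)            # "logical"
typeof(NA_integer_)   # "integer"
typeof(NA_real_)      # "double"
typeof(NA_character_) # "character"

# NULL has length zero and vanishes inside c(); NA occupies a slot
is.null(NULL)         # TRUE
length(NULL)          # 0
length(NA)            # 1
length(c(1, NULL, 3)) # 2
length(c(1, NA, 3))   # 3
```

The last four lines show the practical consequence of "does not exist" versus "exists but unknown": NULL contributes no element to a vector, while NA keeps its position.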

is.na(x) # returns TRUE if x is missing

y <- c(1,2,3,NA)
is.na(y) # returns a vector (F F F T)

x <- c(1,2,NA,3)
mean(x) # returns NA
mean(x, na.rm=TRUE) # returns 2

The function na.omit() returns the object with listwise deletion of missing values.

# create new dataset without missing data
newdata <- na.omit(mydata)
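
To see what listwise deletion does in practice, here is a small sketch. The contents of mydata are invented for illustration (the original does not define it); the functions used are all base R.

```r
# Hypothetical stand-in for mydata: two rows contain a missing value
mydata <- data.frame(id    = 1:4,
                     score = c(10, NA, 30, 40),
                     group = c("a", "b", NA, "b"))

newdata <- na.omit(mydata)
nrow(mydata)   # 4
nrow(newdata)  # 2

# complete.cases() marks the rows that have no missing value in any column
complete.cases(mydata)  # TRUE FALSE FALSE TRUE

# The indices of the dropped rows are recorded in the "na.action" attribute
as.integer(attr(newdata, "na.action"))  # 2 3
```

Note that a row is dropped if any of its columns is missing, so rows 2 and 3 go even though each is missing only one value.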

na.rm=TRUE and na.omit() are not supposed to give the same result. Consider this example:

exdf<-data.frame(a=c(1,NA,5),b=c(3,2,2))
#   a b
#1  1 3
#2 NA 2
#3  5 2
colMeans(exdf,na.rm=TRUE) ## remove only "NA"
#       a        b 
#3.000000 2.333333
colMeans(na.omit(exdf)) ## remove "NA 2"
#  a   b 
#3.0 2.5

Why is this? In the first case, the mean of column b is (3+2+2)/3, because na.rm=TRUE drops only the NA in column a. In the second case, na.omit removes the second row in its entirety, including its non-NA value of b (which the first case counted), so the mean of b is just (3+2)/2.
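
One way to make the difference visible is to count how many values enter each mean. This is a rough sketch that rebuilds exdf as above and uses only base R functions.

```r
exdf <- data.frame(a = c(1, NA, 5), b = c(3, 2, 2))

# Values contributing to each column mean under na.rm = TRUE
colSums(!is.na(exdf))   # a: 2, b: 3

# Rows surviving na.omit(): every column is averaged over these rows only
nrow(na.omit(exdf))     # 2

# Column b, computed both ways
mean(exdf$b)                        # (3 + 2 + 2) / 3
mean(exdf$b[complete.cases(exdf)])  # (3 + 2) / 2 = 2.5
```

With na.rm=TRUE each column keeps its own non-missing values, so columns may be averaged over different numbers of rows; with na.omit() all columns share the same reduced set of rows.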