Thursday, October 24, 2019

r - Clean bad data automatically

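To convert a factor to numeric, convert it to character first. Calling as.numeric() directly on a factor returns the internal integer level codes, not the values shown in the labels. Any label that cannot be parsed as a number becomes NA, which makes the bad entries easy to drop: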




no2 <- factor(c(5, 4, "c1", 54, "c5", 1:49)) # two bad entries: "c1" and "c5"
no2_num <- as.numeric(as.character(no2)) # labels -> numbers; unparseable labels become NA
#Warning message:
# NAs introduced by coercion
no2_clean <- na.omit(no2_num) # remove the NAs resulting from the bad data

no2_clean
# [1] 5 4 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
# [40] 37 38 39 40 41 42 43 44 45 46 47 48 49
# attr(,"na.action")
# [1] 3 5
# attr(,"class")
# [1] "omit"

# percentage of the original values that were dropped as bad data
length(attr(no2_clean, "na.action")) / length(no2) * 100
#[1] 3.703704
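
The steps above can be wrapped in a small helper. This is only a sketch: the function name `clean_numeric_factor` and the shape of its return value are hypothetical, not part of base R. It first shows why the detour through character is needed, then repeats the cleaning and the percentage calculation in one call. Note that it also treats values that were already NA in the input as bad data.

```r
# Why the detour through character matters: a factor stores integer codes.
f <- factor(c("20", "5", "10"))  # levels sort as strings: "10" "20" "5"
as.numeric(f)                    # 2 3 1  (level codes, not the values)
as.numeric(as.character(f))      # 20 5 10

# Hypothetical helper (name and return structure are assumptions):
# convert a factor to numeric via its labels, drop entries that fail
# to parse, and report which positions were dropped and their share.
clean_numeric_factor <- function(f) {
  # suppressWarnings() hides the expected "NAs introduced by coercion"
  num <- suppressWarnings(as.numeric(as.character(f)))
  bad <- is.na(num)
  list(
    values      = num[!bad],       # cleaned numeric vector
    bad_index   = which(bad),      # positions of the rejected entries
    pct_dropped = mean(bad) * 100  # percentage of entries rejected
  )
}

res <- clean_numeric_factor(factor(c(5, 4, "c1", 54, "c5", 1:49)))
res$bad_index    # 3 5
res$pct_dropped  # 3.703704
```

Returning the rejected positions alongside the cleaned values makes it possible to inspect what was thrown away, rather than silently losing it.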
