Data frame challenge [🇷 for BE/BA]
Dear ElMaestro.
I think there should be a warning message when you have apples and bananas in your data. x)
The function as.numeric() produces a warning message because strings such as "Banana" have no numeric value, so they are coerced to NA.
If you expect numeric values plus a few text values that are known in advance, you can use:
A <- read.table("SomeFile.csv", colClasses = "numeric", na.strings = c("Banana", "Apple"), header = TRUE)
If colClasses is set to "numeric" and the file contains a string that is not listed in na.strings (e.g. "Orange"), read.table() will stop with an error.
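A minimal sketch of both cases, assuming a single-column file. The inline lines below are hypothetical stand-ins for SomeFile.csv and are passed through the text argument of read.table() so that no file is needed:

```r
# Hypothetical inline data standing in for SomeFile.csv (one column named x)
lines_ok  <- c("x", "1.5", "Banana", "Apple", "2")
lines_bad <- c("x", "1.5", "Banana", "Orange", "2")

# Strings listed in na.strings become NA; the column is read as numeric
ok <- read.table(text = lines_ok, header = TRUE, colClasses = "numeric",
                 na.strings = c("Banana", "Apple"))
ok$x   # 1.5 NA NA 2.0

# A string that is not listed makes read.table() stop with an error;
# tryCatch() is used here only to keep the message instead of aborting
bad <- tryCatch(
  read.table(text = lines_bad, header = TRUE, colClasses = "numeric",
             na.strings = c("Banana", "Apple")),
  error = function(e) conditionMessage(e)
)
bad    # the error message as a character string
```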
If Bananas and Apples should be handled differently, I would read the data as character (without specifying colClasses), then replace e.g. "Apple" with "0" and "Banana" with NA, and finally convert the type to numeric using as.numeric().
text <- c("Apple", "Banana")
# text <- c("Apple", "Banana", "Orange")
text[text == "Apple"] <- "0"   # known string that should count as zero
text[text == "Banana"] <- NA   # known string that should count as missing
as.numeric(text)
# [1]  0 NA   # no error and no warning
If "Orange" is also present in the text/data, it will be converted to NA by coercion, with a warning message.
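To see this, here is a small sketch that collects the warning instead of printing it. The variable msgs and the handler are only illustrative; the message wording shown in the comment is what an English locale gives.

```r
text <- c("Apple", "Banana", "Orange")
text[text == "Apple"] <- "0"
text[text == "Banana"] <- NA

# Collect warning messages raised during coercion instead of printing them
msgs <- character(0)
x <- withCallingHandlers(
  as.numeric(text),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
x      # 0 NA NA ("Orange" was coerced to NA)
msgs   # one message, "NAs introduced by coercion" in an English locale
```

The explicitly assigned NA for "Banana" causes no warning; only the unexpected "Orange" does.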
When a single variable mixes numeric and character values, I would store it as character only (and not compute any statistics on it).
The principle of changing all string values (among many numeric values) to zero without warnings seems strange to me (a simple example is a wrong value 1,25 in the data instead of the correct 1.25). It seems OK to get a warning that a value was changed to NA. But of course, when all data are correct and there are many string values which should be zero, the warning is unwanted.
To avoid suppressing warnings with suppressWarnings(as.numeric(text)), it has to be done as mittyri wrote, i.e. replace all character values with "0" before calling as.numeric(). (The result is similar, though: values such as "1,25" or "1.25*" will be changed to zero without any warning.)
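A rough sketch of that trade-off follows. It is not mittyri's code: the regular expression for "looks like a plain number" and the known_zero list are my own assumptions for illustration.

```r
text <- c("1.25", "Banana", "1,25", "1.25*", "3")

# Assumed pattern: optional sign, digits, optional decimal point (no exponent)
is_num <- grepl("^[+-]?([0-9]+\\.?[0-9]*|\\.[0-9]+)$", text)

# Blanket replacement: every non-numeric string becomes "0", no warning follows
blanket <- text
blanket[!is_num] <- "0"
zeros <- as.numeric(blanket)
zeros        # values 1.25, 0, 0, 0, 3: the typos "1,25" and "1.25*" are now 0

# Stricter variant: only strings from a known list may become zero,
# everything else is reported for manual checking
known_zero <- c("Banana")   # hypothetical list of accepted strings
unexpected <- setdiff(text[!is_num], known_zero)
unexpected   # "1,25" "1.25*"
```

Checking unexpected before converting keeps the conversion free of warnings while still catching typing errors.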
Best regards,
zizou
No one likes warnings even though they can help!
Complete thread:
- Data frame challenge ElMaestro 2016-12-17 19:40 [🇷 for BE/BA]
- Data frame challenge mittyri 2016-12-18 01:41
- Data frame challenge ElMaestro 2016-12-18 11:47
- Data frame challenge wligtenberg 2016-12-19 14:29
- Data frame challenge zizou 2016-12-19 18:41
- Data frame challenge ElMaestro 2016-12-19 21:06