33 LEC: Special Values
You can follow along with the slides here if you would like to open them full-screen.
R has a few special values you’ll bump into when your calculations don’t work as expected.
Try dividing a nonzero number by zero in R, and you’ll get Inf (infinity) - unlike other languages that might crash or throw an error. R is pretty chill about it (to an annoying degree):
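A minimal sketch of what this looks like at the console; the numbers are arbitrary choices for illustration, and is.infinite() is simply one base R way to check the result:

```r
# Dividing a nonzero number by zero returns an infinite value, not an error
pos <- 1 / 0
pos
#> [1] Inf

# The sign of the numerator carries over
neg <- -1 / 0
neg
#> [1] -Inf

# is.infinite() tests for either infinity
is.infinite(pos)
#> [1] TRUE
```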
But if you try something truly undefined like dividing zero by zero, you’ll get NaN (Not a Number):
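For example (a small sketch; is.nan() is the base R helper for detecting this value):

```r
# Zero divided by zero has no defined answer, so R returns NaN
undefined <- 0 / 0
undefined
#> [1] NaN

# is.nan() confirms the result is "Not a Number"
is.nan(undefined)
#> [1] TRUE
```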
These special values appear when dealing with unusual or undefined operations in R:
- When dividing by zero: pi / 0 gives Inf
- When performing undefined math: 0 / 0 gives NaN
- With contradictory operations: 1/0 - 1/0 gives NaN
- With consistent operations: 1/0 + 1/0 gives Inf
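The last two cases in the list can be sketched as follows. The variable name inf is just a convenience for this example, and is.finite() is shown as one possible way to screen out both kinds of special value at once:

```r
inf <- 1 / 0

# Infinity minus infinity is contradictory, so the result is undefined
inf - inf
#> [1] NaN

# Infinity plus infinity is consistent, so the result stays infinite
inf + inf
#> [1] Inf

# is.finite() is FALSE for both Inf and NaN
is.finite(c(1, inf, inf - inf))
#> [1]  TRUE FALSE FALSE
```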
The most common special value you’ll encounter is NA - missing data. NAs are sneaky because they’re “contagious” - almost any calculation involving an NA will give you NA as the result:
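A short sketch with a made-up vector x. The na.rm = TRUE argument is not mentioned above; it is shown here as the usual way to ask functions like sum() to skip missing values:

```r
x <- c(1, NA, 3)

# Arithmetic with NA gives NA in that position
x + 1
#> [1]  2 NA  4

# A summary over the whole vector becomes NA too
sum(x)
#> [1] NA

# Many summary functions can drop NAs on request
sum(x, na.rm = TRUE)
#> [1] 4

# is.na() finds where the missing values are
is.na(x)
#> [1] FALSE  TRUE FALSE
```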
The cool thing about NAs is that they’re logically consistent. When you work with them in logical operations:
- TRUE | NA is TRUE (because “true or anything” is always true)
- FALSE | NA is NA (because we need to know what NA is to determine the result)
It’s like NA is saying “I don’t know what I am, but I’ll follow the rules of logic!”
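The two "or" cases can be checked directly. The "and" cases are an addition for comparison and follow the same reasoning in mirror image:

```r
# OR: one TRUE is enough, so the unknown value does not matter
TRUE | NA
#> [1] TRUE

# OR: with FALSE, the answer depends entirely on the unknown value
FALSE | NA
#> [1] NA

# AND: one FALSE is enough to settle the answer
FALSE & NA
#> [1] FALSE

# AND: with TRUE, the answer again depends on the unknown value
TRUE & NA
#> [1] NA
```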
33.1 Data classes
You can follow along with the slides here if you would like to open them full-screen.
Think of R’s data classes as Lego sets built from basic building blocks. The basic types (logical, character, numeric) are the individual Lego pieces, but classes are the cool structures you build with them.
Take factors - they look like character strings when you print them, but under the hood they’re actually integers with labels:
x <- factor(c("BS", "MS", "PhD", "MS"))
x # Looks like text
#> [1] BS MS PhD MS
#> Levels: BS MS PhD
typeof(x) # But it's stored as integers!
#> [1] "integer"
as.integer(x) # See the numbers behind the scenes
#> [1] 1 2 3 2
Or dates - they look like calendar dates when you print them:
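A minimal sketch, using an arbitrary example date stored in a variable called d:

```r
d <- as.Date("2024-03-15")

# Printed, it looks like an ordinary calendar date
d
#> [1] "2024-03-15"

# Its class tells R to treat it as a date
class(d)
#> [1] "Date"
```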
But they’re actually just counting days since January 1, 1970:
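Converting a few example dates to plain numbers shows the day count (the dates are chosen only for illustration):

```r
# The reference date itself is day 0
as.numeric(as.Date("1970-01-01"))
#> [1] 0

# Thirty days later is day 30
as.numeric(as.Date("1970-01-31"))
#> [1] 30

# A later date is just a bigger count
as.numeric(as.Date("2000-01-01"))
#> [1] 10957

# Under the hood, the value is stored as a number
typeof(as.Date("2000-01-01"))
#> [1] "double"
```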
A date is just a count of time since a fixed starting point. This explains why you can do math with dates, like finding out what date is 30 days from now:
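A small sketch of date arithmetic. Sys.Date() returns today's date, so its output depends on when you run it; the fixed dates are arbitrary examples with a predictable result:

```r
# 30 days from today (result varies with the run date)
today <- Sys.Date()
today + 30

# The same arithmetic from a fixed starting date
start <- as.Date("2024-01-01")
start + 30
#> [1] "2024-01-31"

# Subtracting two dates gives the number of days between them
gap <- as.numeric(as.Date("2024-03-01") - as.Date("2024-02-01"))
gap
#> [1] 29
```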
Even data frames are secretly just lists where all the elements have the same length:
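This can be checked on a small, made-up data frame (the name df and its columns are hypothetical):

```r
df <- data.frame(id = 1:3, degree = c("BS", "MS", "PhD"))

# The underlying type is a list
typeof(df)
#> [1] "list"

# The class is what makes it behave like a table
class(df)
#> [1] "data.frame"

# Each column is one list element, and all have the same length
unname(lengths(df))
#> [1] 3 3

# List-style access works on columns
df[["degree"]][2]
#> [1] "MS"
```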
Understanding these “secret identities” helps you avoid common pitfalls and work more effectively with your data.