33 LEC: Special Values

You can follow along with the slides here if you would like to open them full-screen.

R has a few special values you’ll bump into when your calculations don’t work as expected.

Try dividing a nonzero number by zero in R, and you’ll get Inf (infinity), or -Inf if the number is negative. Some other languages would crash or throw an error here. R is pretty chill about it (to an annoying degree):

pi / 0 # Returns Inf
#> [1] Inf

But if you try something truly undefined like dividing zero by zero, you’ll get NaN (Not a Number):

0 / 0 # Returns NaN
#> [1] NaN

These special values appear when dealing with unusual or undefined operations in R:

  • When dividing by zero: pi / 0 gives Inf
  • When performing undefined math: 0 / 0 gives NaN
  • When subtracting infinities: 1/0 - 1/0 is Inf - Inf, which is undefined, so it gives NaN
  • When adding infinities: 1/0 + 1/0 is Inf + Inf, which is still Inf
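
To check whether a result contains one of these values, base R has is.finite(), is.infinite(), and is.nan(). Here is a minimal sketch that reuses the calculations above; the vector name results is only for illustration:

```r
# One value of each kind: Inf, -Inf, NaN, and an ordinary number
results <- c(pi / 0, -pi / 0, 0 / 0, 1)

is.finite(results)   # TRUE only for ordinary numbers
#> [1] FALSE FALSE FALSE  TRUE
is.infinite(results) # TRUE for Inf and -Inf
#> [1]  TRUE  TRUE FALSE FALSE
is.nan(results)      # TRUE only for NaN
#> [1] FALSE FALSE  TRUE FALSE
```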

The most common special value you’ll encounter is NA (“Not Available”), R’s marker for missing data. NAs are sneaky because they’re “contagious”: almost any calculation involving an NA will give you NA as the result:

mean(c(1, 2, NA, 4)) # Returns NA
#> [1] NA
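
If you want the answer for the values you do have, you need to find or drop the NAs explicitly. A small sketch using the same numbers; the name scores is made up for this example:

```r
scores <- c(1, 2, NA, 4)

is.na(scores)               # locate the missing entries
#> [1] FALSE FALSE  TRUE FALSE
sum(is.na(scores))          # count them
#> [1] 1
mean(scores, na.rm = TRUE)  # drop NAs first: (1 + 2 + 4) / 3
#> [1] 2.333333
NA == NA                    # even comparing NA to itself is unknown, so use is.na()
#> [1] NA
```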

The cool thing about NAs is that they’re logically consistent. When you work with them in logical operations:

  • TRUE | NA is TRUE (because “true or anything” is always true)
  • FALSE | NA is NA (because we need to know what NA is to determine the result)

It’s like NA is saying “I don’t know what I am, but I’ll follow the rules of logic!”
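
The same reasoning applies to & (and). As a quick sketch, you can combine TRUE, FALSE, and NA with NA and see which results R can still pin down; the variable names are only for illustration:

```r
vals <- c(TRUE, FALSE, NA)

or_with_na  <- vals | NA   # only TRUE | NA is known for sure
and_with_na <- vals & NA   # only FALSE & NA is known for sure

or_with_na
#> [1] TRUE   NA   NA
and_with_na
#> [1]    NA FALSE    NA
```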

33.1 Data classes

You can follow along with the slides here if you would like to open them full-screen.

Think of R’s data classes as Lego sets built from basic building blocks. The basic types (such as logical, character, integer, and double) are the individual Lego pieces, but classes are the cool structures you build with them.

Take factors - they look like character strings when you print them, but under the hood they’re actually integers with labels:

x <- factor(c("BS", "MS", "PhD", "MS"))
x # Looks like text
#> [1] BS  MS  PhD MS 
#> Levels: BS MS PhD
typeof(x) # But it's stored as integers!
#> [1] "integer"
as.integer(x) # See the numbers behind the scenes
#> [1] 1 2 3 2
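
This matters most when the labels themselves look like numbers. In the sketch below, years is a hypothetical factor invented for illustration: converting it straight to integer returns the hidden codes, not the years.

```r
x <- factor(c("BS", "MS", "PhD", "MS"))
levels(x)                        # the labels, sorted alphabetically by default
#> [1] "BS"  "MS"  "PhD"

# Pitfall: a factor whose labels look like numbers
years <- factor(c("2020", "2018", "2020"))
as.integer(years)                # the integer codes, not the years
#> [1] 2 1 2
as.integer(as.character(years))  # go through the labels to recover the years
#> [1] 2020 2018 2020
```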

Or dates - they look like calendar dates when you print them:

Y2kday <- as.Date("2000-01-01")
Y2kday # Shows as "2000-01-01"
#> [1] "2000-01-01"

But they’re actually just counting days since January 1, 1970:

as.integer(Y2kday) # Days since 1970-01-01
#> [1] 10957

Because a date is just a count of days since a fixed starting point, you can do math with dates, like finding out what date is 30 days after Y2K:

Y2kday + 30 # Adds 30 days
#> [1] "2000-01-31"
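
You can also go the other way. The sketch below strips the class to expose the raw number, subtracts two dates, and turns a day count back into a date; the name epoch is introduced here only for illustration:

```r
Y2kday <- as.Date("2000-01-01")
epoch  <- as.Date("1970-01-01")

unclass(Y2kday)    # remove the Date class to see the stored number
#> [1] 10957
as.integer(epoch)  # the starting point itself is day 0
#> [1] 0
Y2kday - epoch     # subtracting two dates gives a time difference
#> Time difference of 10957 days
as.Date(10957, origin = "1970-01-01")  # day count back to a Date
#> [1] "2000-01-01"
```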

Even data frames are secretly just lists where all the elements have the same length:

df <- data.frame(x = 1:2, y = 3:4)
typeof(df) # "list"
#> [1] "list"
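
Because a data frame is a list of columns, ordinary list tools work on it. A brief sketch with the same df; col_sums is just an illustrative name:

```r
df <- data.frame(x = 1:2, y = 3:4)

is.list(df)  # a data frame passes as a list
#> [1] TRUE
class(df)    # the class layered on top of the list
#> [1] "data.frame"
length(df)   # list length is the number of columns, not rows
#> [1] 2
df[["y"]]    # list-style extraction pulls out one column
#> [1] 3 4

# List functions such as lapply() work column by column
col_sums <- lapply(df, sum)
col_sums$y
#> [1] 7
```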

Understanding these “secret identities” helps you avoid common pitfalls and work more effectively with your data.