class: center, middle, inverse, title-slide .title[ # Visualizing data with ggplot2
👩🎨 ] .author[ ### S. Mason Garrison ] --- layout: true <div class="my-footer"> <span> <a href="https://DataScience4Psych.github.io/DataScience4Psych/" target="_blank">Data Science for Psychologists</a> </span> </div> --- class: middle # ggplot2 ❤️ 🐧 --- ## ggplot2 `\(\in\)` tidyverse .pull-left-narrow[ <img src="img/ggplot2-part-of-tidyverse.png" width="60%" style="display: block; margin: auto;" /> ] <!-- markdownlint-disable error --> .pull-right-wide[ - **ggplot2** is tidyverse's data visualization package - Structure of the code for plots can be summarized as ``` r ggplot(data = [[dataset]], mapping = aes(x = [[x-variable]], y = [[y-variable]])) + geom_xxx() + other options ``` ] <!-- markdownlint-enable --> --- ## Data: Palmer Penguins Measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex. .pull-left[ <img src="img/penguins.png" width="80%" style="display: block; margin: auto;" /> ] --- ## Data: Palmer Penguins Measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex. 
``` r library(palmerpenguins) glimpse(penguins) ``` ``` ## Rows: 344 ## Columns: 8 ## $ species <fct> Adelie, Adelie, Adelie, Adelie, Adeli… ## $ island <fct> Torgersen, Torgersen, Torgersen, Torg… ## $ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.… ## $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.… ## $ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195… ## $ body_mass_g <int> 3750, 3800, 3250, NA, 3450, 3650, 362… ## $ sex <fct> male, female, female, NA, female, mal… ## $ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2… ``` --- # Plot <img src="d04_ggplot2_files/figure-html/unnamed-chunk-6-1.png" width="70%" style="display: block; margin: auto;" /> --- # Code ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", color = "Species") ``` ``` ## Warning: Removed 2 rows containing missing values or values outside the ## scale range (`geom_point()`). ``` --- class: middle # Wrapping Up... 
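---

## Aside: which rows were removed?

The warning under the first plot reports 2 removed rows. Here is a minimal sketch of one way to track them down, assuming the missing values are in the two plotted variables. The names `missing_bill` and `n_missing` are illustrative and not part of the original slides.

``` r
library(palmerpenguins)

# Flag rows where either plotted variable is missing
missing_bill <- is.na(penguins$bill_depth_mm) | is.na(penguins$bill_length_mm)

# Number of rows that geom_point() cannot draw
n_missing <- sum(missing_bill)
n_missing

# Look at the affected rows
penguins[missing_bill, ]
```

The count should match the number in the warning. If it does not, the extra rows were probably dropped for falling outside the scale range rather than for being missing.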
--- class: middle # Coding out loud --- .midi[ > **Start with the `penguins` data frame** ] .pull-left[ ``` r *ggplot(data = penguins) ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-7-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > **map bill depth to the x-axis** ] .pull-left[ ``` r ggplot(data = penguins, * mapping = aes(x = bill_depth_mm)) ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-8-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > **and map bill length to the y-axis.** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, * y = bill_length_mm)) ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-9-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > **Represent each observation with a point** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm)) + * geom_point() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-10-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > **and map species to the color of each point.** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, * color = species)) + geom_point() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-11-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. 
> Represent each observation with a point > and map species to the color of each point. > **Title the plot "Bill depth and length"** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + * labs(title = "Bill depth and length") ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-12-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the color of each point. > Title the plot "Bill depth and length", > **add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins"** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", * subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins") ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-13-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the color of each point. 
> Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > **label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", * x = "Bill depth (mm)", y = "Bill length (mm)") ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-14-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the color of each point. > Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, > **label the legend "Species"** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", * color = "Species") ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-15-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the color of each point. 
> Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, > label the legend "Species", > **and add a caption for the data source.** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", color = "Species", * caption = "Source: Palmer Station LTER / palmerpenguins package") ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-16-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the color of each point. > Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, > label the legend "Species", > and add a caption for the data source. 
> **Finally, use a discrete color scale that is designed to be perceived by viewers with common forms of color blindness.** ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", color = "Species", caption = "Source: Palmer Station LTER / palmerpenguins package") + * scale_color_viridis_d() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-17-1.png" width="100%" style="display: block; margin: auto;" /> ] --- # Plot <img src="d04_ggplot2_files/figure-html/unnamed-chunk-18-1.png" width="70%" style="display: block; margin: auto;" /> --- # Code ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", color = "Species", caption = "Source: Palmer Station LTER / palmerpenguins package") + scale_color_viridis_d() ``` ``` ## Warning: Removed 2 rows containing missing values or values outside the ## scale range (`geom_point()`). ``` --- # Narrative .midi[ + Start with the `penguins` data frame, map bill depth to the x-axis and map bill length to the y-axis. + Represent each observation with a point and map species to the color of each point. + Title the plot "Bill depth and length", add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, label the legend "Species", and add a caption for the data source. + Finally, use a discrete color scale that is designed to be perceived by viewers with common forms of color blindness. 
] --- ## Argument names .tip[ You can omit the names of the first two arguments when building plots with `ggplot()`. ] .pull-left[ ``` r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + scale_color_viridis_d() ``` ] .pull-right[ ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + scale_color_viridis_d() ``` ] --- class: middle # Wrapping Up... --- class: middle # Aesthetics --- ## Aesthetics options Commonly used characteristics of plotting characters that can be **mapped to a specific variable** in the data are - `color` - `shape` - `size` - `alpha` (transparency) --- ## Color .pull-left[ ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, * color = species)) + geom_point() + scale_color_viridis_d() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-19-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Shape Mapped to a different variable than `color` .pull-left[ ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species, * shape = island)) + geom_point() + scale_color_viridis_d() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Shape Mapped to the same variable as `color` .pull-left[ ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species, * shape = species)) + geom_point() + scale_color_viridis_d() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-21-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Size .pull-left[ ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species, shape = species, * size = body_mass_g)) + geom_point() + scale_color_viridis_d() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## 
Alpha .pull-left[ ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species, shape = species, size = body_mass_g, * alpha = flipper_length_mm)) + geom_point() + scale_color_viridis_d() ``` ] .pull-right[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-23-1.png" width="100%" style="display: block; margin: auto;" /> ] --- .pull-left[ **Mapping** ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, * size = body_mass_g, * alpha = flipper_length_mm)) + geom_point() ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-24-1.png" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ **Setting** ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + * geom_point(size = 2, alpha = 0.5) ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-25-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Mapping vs. setting - **Mapping:** Determine the size, alpha, etc. of points based on the values of a variable in the data - goes into `aes()` - **Setting:** Determine the size, alpha, etc. of points **not** based on the values of a variable in the data - goes into `geom_*()` (in the previous example, we used `geom_point()`, but we'll learn about other geoms soon!) --- class: middle # Faceting --- ## Faceting - Smaller plots that display different subsets of the data - Useful for exploring conditional relationships and large data --- ### Plot <img src="d04_ggplot2_files/figure-html/unnamed-chunk-26-1.png" width="70%" style="display: block; margin: auto;" /> --- ### Code ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(species ~ island) ``` ``` ## Warning: Removed 2 rows containing missing values or values outside the ## scale range (`geom_point()`). ``` --- ## Various ways to facet In the next few slides, describe what each plot displays. Think about how the code relates to the output. 
.question[ **Note:** The plots in the next few slides do not have proper titles, axis labels, etc. because we want you to figure out what's happening in the plots. But you should always label your plots! ] --- ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(species ~ sex) ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-27-1.png" width="60%" style="display: block; margin: auto;" /> --- ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(sex ~ species) ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-28-1.png" width="60%" style="display: block; margin: auto;" /> --- ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_wrap(~ species) ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-29-1.png" width="60%" style="display: block; margin: auto;" /> --- ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(. 
~ species) ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-30-1.png" width="60%" style="display: block; margin: auto;" /> --- ``` r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_wrap(~ species, ncol = 2) ``` <img src="d04_ggplot2_files/figure-html/unnamed-chunk-31-1.png" width="60%" style="display: block; margin: auto;" /> --- ## Faceting summary - `facet_grid()`: - 2d grid - `rows ~ cols` - use `.` for no split - `facet_wrap()`: 1d ribbon wrapped according to the number of rows and columns specified or the available plotting area --- ## Facet and color .pull-left-narrow[ ``` r ggplot( penguins, aes(x = bill_depth_mm, y = bill_length_mm, * color = species)) + geom_point() + facet_grid(species ~ sex) + * scale_color_viridis_d() ``` ] .pull-right-wide[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-32-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Facet and color, no legend .pull-left-narrow[ ``` r ggplot( penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + geom_point() + facet_grid(species ~ sex) + scale_color_viridis_d() + * guides(color = "none") ``` ] .pull-right-wide[ <img src="d04_ggplot2_files/figure-html/unnamed-chunk-33-1.png" width="100%" style="display: block; margin: auto;" /> ] --- class: middle # Wrapping Up...
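---

## Aside: checking a facet layout

The faceting summary says `facet_grid()` uses `rows ~ cols`, with `.` for no split, and that `facet_wrap()` wraps a 1d ribbon. Here is a rough sketch of how you might confirm that `facet_grid(. ~ species)` and `facet_wrap(~ species, nrow = 1)` arrange panels the same way. It assumes the object returned by `ggplot_build()` exposes a `layout$layout` data frame with `PANEL`, `ROW`, and `COL` columns, which may differ across ggplot2 versions. The object names are illustrative.

``` r
library(ggplot2)
library(palmerpenguins)

# Base scatterplot shared by both faceting variants
p <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point()

# One row of panels, one column per species
p_grid <- p + facet_grid(. ~ species)
p_wrap <- p + facet_wrap(~ species, nrow = 1)

# Build each plot and pull out the panel layout table
# (assumption: the built object stores it in $layout$layout)
layout_grid <- ggplot_build(p_grid)$layout$layout
layout_wrap <- ggplot_build(p_wrap)$layout$layout

# Compare panel positions side by side
layout_grid[, c("PANEL", "ROW", "COL")]
layout_wrap[, c("PANEL", "ROW", "COL")]
```

If the two tables agree, the choice between the two functions for a single faceting variable is mostly about whether you want wrapping onto several rows later.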