class: center, middle, inverse, title-slide

.title[
# Visualizing data with ggplot2 <br> 👩‍🎨
]
.author[
### S. Mason Garrison
]

---
layout: true

<div class="my-footer">
<span>
<a href="https://DataScience4Psych.github.io/DataScience4Psych/" target="_blank">Data Science for Psychologists</a>
</span>
</div>

---
class: middle

# ggplot2 ❤️ 🐧

---
class: middle

# Learning Goals

---

## Learning Goals

By the end of this session, you will be able to...

- Construct layered plots using ggplot2's grammar (data + aesthetics + geoms)
- Map variables to aesthetics and distinguish mapping from setting
- Use faceting to explore conditional relationships in data
- Create publication-ready plots with proper labels and scales

---

## ggplot2 `\(\in\)` tidyverse

.pull-left-narrow[
<img src="img/ggplot2-part-of-tidyverse.png" alt="" width="60%" style="display: block; margin: auto;" />
]

<!-- markdownlint-disable error -->
.pull-right-wide[
- **ggplot2** is tidyverse's data visualization package
- Structure of the code for plots can be summarized as

``` r
ggplot(data = [[dataset]],
       mapping = aes(x = [[x-variable]], y = [[y-variable]])) +
   geom_xxx() +
   other options
```
]
<!-- markdownlint-enable -->

---

## Data: Palmer Penguins

Measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex.

.pull-left[
<img src="img/penguins.png" alt="" width="80%" style="display: block; margin: auto;" />
]

---

## Data: Palmer Penguins

Measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex.

``` r
library(tidyverse)       # provides glimpse() and ggplot2
library(palmerpenguins)  # provides the penguins data frame
glimpse(penguins)
```

```
## Rows: 344
## Columns: 8
## $ species           <fct> Adelie, Adelie, Adelie, Adelie, Adeli~
## $ island            <fct> Torgersen, Torgersen, Torgersen, Torg~
## $ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.~
## $ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.~
## $ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195~
## $ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 362~
## $ sex               <fct> male, female, female, NA, female, mal~
## $ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2~
```

---

# Plot

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-6-1.png" alt="" width="70%" style="display: block; margin: auto;" />

---

# Code

``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm, y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
       subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
       x = "Bill depth (mm)", y = "Bill length (mm)",
       color = "Species")
```

---
class: middle

# Wrapping Up...

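---

## Filling in the template

The `ggplot(data = ..., mapping = aes(...)) + geom_xxx()` template shown earlier works for any pair of columns. Here is a minimal sketch, assuming the ggplot2 and palmerpenguins packages are installed. The object name `p_size` and the choice of flipper length against body mass are illustrative and do not come from the original slides.

```r
library(ggplot2)
library(palmerpenguins)

# Fill in the template: a dataset, an x-variable, a y-variable, then a geom.
# p_size is a hypothetical name chosen for this sketch.
p_size <- ggplot(data = penguins,
                 mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(na.rm = TRUE)  # na.rm = TRUE drops rows with missing values without a warning

p_size  # printing the object draws the plot
```

Storing the plot in an object is optional, but it makes it easy to add more layers later with `+`.
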
---
class: middle

# Coding out loud

---

.midi[
> **Start with the `penguins` data frame**
]

.pull-left[
``` r
*ggplot(data = penguins)
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-7-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> **map bill depth to the x-axis**
]

.pull-left[
``` r
ggplot(data = penguins,
*      mapping = aes(x = bill_depth_mm))
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-8-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> **and map bill length to the y-axis.**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
*                    y = bill_length_mm))
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-9-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> **Represent each observation with a point**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm)) +
* geom_point()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-10-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> **and map species to the color of each point.**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
*                    color = species)) +
  geom_point()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-11-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> and map species to the color of each point.
> **Title the plot "Bill depth and length"**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
* labs(title = "Bill depth and length")
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-12-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> and map species to the color of each point.
> Title the plot "Bill depth and length",
> **add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins"**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
*      subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins")
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-13-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> and map species to the color of each point.
> Title the plot "Bill depth and length",
> add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
> **label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
       subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
*      x = "Bill depth (mm)", y = "Bill length (mm)")
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-14-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> and map species to the color of each point.
> Title the plot "Bill depth and length",
> add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
> label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively,
> **label the legend "Species"**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
       subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
       x = "Bill depth (mm)", y = "Bill length (mm)",
*      color = "Species")
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-15-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> and map species to the color of each point.
> Title the plot "Bill depth and length",
> add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
> label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively,
> label the legend "Species",
> **and add a caption for the data source.**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
       subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
       x = "Bill depth (mm)", y = "Bill length (mm)",
       color = "Species",
*      caption = "Source: Palmer Station LTER / palmerpenguins package")
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-16-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.midi[
> Start with the `penguins` data frame,
> map bill depth to the x-axis
> and map bill length to the y-axis.
> Represent each observation with a point
> and map species to the color of each point.
> Title the plot "Bill depth and length",
> add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
> label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively,
> label the legend "Species",
> and add a caption for the data source.
> **Finally, use a discrete color scale that is designed to be perceived by viewers with common forms of color blindness.**
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
       subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
       x = "Bill depth (mm)", y = "Bill length (mm)",
       color = "Species",
       caption = "Source: Palmer Station LTER / palmerpenguins package") +
* scale_color_viridis_d()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-17-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

# Plot

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-18-1.png" alt="" width="70%" style="display: block; margin: auto;" />

---

# Code

``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  labs(title = "Bill depth and length",
       subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
       x = "Bill depth (mm)", y = "Bill length (mm)",
       color = "Species",
       caption = "Source: Palmer Station LTER / palmerpenguins package") +
  scale_color_viridis_d()
```

---

# Narrative

.midi[
+ Start with the `penguins` data frame, map bill depth to the x-axis and map bill length to the y-axis.

+ Represent each observation with a point and map species to the color of each point.

+ Title the plot "Bill depth and length", add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, label the legend "Species", and add a caption for the data source.

+ Finally, use a discrete color scale that is designed to be perceived by viewers with common forms of color blindness.
]

---

## Argument names

.tip[
You can omit the names of the first two arguments when building plots with `ggplot()`.
]

.pull-left[
``` r
ggplot(data = penguins,
       mapping = aes(x = bill_depth_mm,
                     y = bill_length_mm,
                     color = species)) +
  geom_point() +
  scale_color_viridis_d()
```
]
.pull-right[
``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  scale_color_viridis_d()
```
]

---
class: middle

# Wrapping Up...

---
class: middle

# Aesthetics

---

## Aesthetics options

Commonly used characteristics of plotting characters that can be **mapped to a specific variable** in the data are

- `color`
- `shape`
- `size`
- `alpha` (transparency)

---

## Color

.pull-left[
``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
*          color = species)) +
  geom_point() +
  scale_color_viridis_d()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-19-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

## Shape

Mapped to a different variable than `color`

.pull-left[
``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species,
*          shape = island)) +
  geom_point() +
  scale_color_viridis_d()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-20-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

## Shape

Mapped to same variable as `color`

.pull-left[
``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species,
*          shape = species)) +
  geom_point() +
  scale_color_viridis_d()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-21-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

## Size

.pull-left[
``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species,
           shape = species,
*          size = body_mass_g)) +
  geom_point() +
  scale_color_viridis_d()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-22-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

## Alpha

.pull-left[
``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
           color = species,
           shape = species,
           size = body_mass_g,
*          alpha = flipper_length_mm)) +
  geom_point() +
  scale_color_viridis_d()
```
]
.pull-right[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-23-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

.pull-left[
**Mapping**

``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm,
*          size = body_mass_g,
*          alpha = flipper_length_mm)) +
  geom_point()
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-24-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]
.pull-right[
**Setting**

``` r
ggplot(penguins,
       aes(x = bill_depth_mm,
           y = bill_length_mm)) +
* geom_point(size = 2, alpha = 0.5)
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-25-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

## Mapping vs. setting

- **Mapping:** Determine the size, alpha, etc. of points based on the values of a variable in the data
  - goes into `aes()`
- **Setting:** Determine the size, alpha, etc. of points **not** based on the values of a variable in the data
  - goes into `geom_*()`
  - (in the previous example, we used `geom_point()`, but we'll learn about other geoms soon!)

---
class: middle

# Faceting

---

## Faceting

- Smaller plots that display different subsets of the data
- Useful for exploring conditional relationships and large data

---

### Plot

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-26-1.png" alt="" width="70%" style="display: block; margin: auto;" />

---

### Code

``` r
ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
* facet_grid(species ~ island)
```

---

## Various ways to facet

In the next few slides, describe what each plot displays. Think about how the code relates to the output.

.question[
**Note:** The plots in the next few slides do not have proper titles, axis labels, etc. because we want you to figure out what's happening in the plots. But you should always label your plots!
]

---

``` r
ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
* facet_grid(species ~ sex)
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-27-1.png" alt="" width="60%" style="display: block; margin: auto;" />

---

``` r
ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
* facet_grid(sex ~ species)
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-28-1.png" alt="" width="60%" style="display: block; margin: auto;" />

---

``` r
ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
* facet_wrap(~ species)
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-29-1.png" alt="" width="60%" style="display: block; margin: auto;" />

---

``` r
ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
* facet_grid(. ~ species)
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-30-1.png" alt="" width="60%" style="display: block; margin: auto;" />

---

``` r
ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
* facet_wrap(~ species, ncol = 2)
```

<img src="d04_ggplot2_files/figure-html/unnamed-chunk-31-1.png" alt="" width="60%" style="display: block; margin: auto;" />

---

## Faceting summary

- `facet_grid()`:
    - 2d grid
    - `rows ~ cols`
    - use `.` for no split
- `facet_wrap()`: 1d ribbon wrapped according to number of rows and columns specified or available plotting area

---

## Facet and color

.pull-left-narrow[
``` r
ggplot(
  penguins,
  aes(x = bill_depth_mm,
      y = bill_length_mm,
*     color = species)) +
  geom_point() +
  facet_grid(species ~ sex) +
* scale_color_viridis_d()
```
]
.pull-right-wide[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-32-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---

## Facet and color, no legend

.pull-left-narrow[
``` r
ggplot(
  penguins,
  aes(x = bill_depth_mm,
      y = bill_length_mm,
      color = species)) +
  geom_point() +
  facet_grid(species ~ sex) +
  scale_color_viridis_d() +
* guides(color = "none")
```
]
.pull-right-wide[
<img src="d04_ggplot2_files/figure-html/unnamed-chunk-33-1.png" alt="" width="100%" style="display: block; margin: auto;" />
]

---
class: middle

# Summary: Learning Goals Achieved

---

## What We've Learned

Today, you should now be able to...

.pull-left[
### Understanding

- ✅ ggplot2's layered grammar
- ✅ Mapping vs. setting aesthetics
]
.pull-right[
### Skills

- ✅ Build complex plots with layers
- ✅ Use faceting effectively
- ✅ Create publication-ready visualizations
]

---
class: middle

# Wrapping Up...
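
---

## Recap in code

The mapping versus setting distinction and the faceting functions covered in this deck can be compared side by side. The sketch below assumes the ggplot2 and palmerpenguins packages are installed. The object names (`p_mapped`, `p_set`, `p_facets`) and the color `"steelblue"` are illustrative choices and do not come from the original slides.

```r
library(ggplot2)
library(palmerpenguins)

# Mapping: color depends on a variable in the data, so it goes inside aes().
p_mapped <- ggplot(penguins,
                   aes(x = bill_depth_mm, y = bill_length_mm, color = species)) +
  geom_point(na.rm = TRUE)

# Setting: color is one fixed value for every point, so it goes in the geom.
p_set <- ggplot(penguins,
                aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point(color = "steelblue", na.rm = TRUE)

# Faceting: split the fixed-color plot into one panel per species.
p_facets <- p_set + facet_wrap(~ species)

p_facets  # printing the object draws the plot
```

Only `p_mapped` gets a legend, because a legend explains a mapping from data values to colors. A fixed color carries no information about the data, so there is nothing to explain.
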