84 Lab: So what if you smoke when pregnant?

Non-parametric-based inference

In 2004, North Carolina released a comprehensive birth record dataset that holds valuable insights for researchers examining the connection between expectant mothers’ habits and practices and their child’s birth. In this lab, we’ll be exploring a randomly selected subset of that data to evaluate whether maternal smoking influences infant birthweight, and how other factors (like maternal age) relate to birth outcomes.

You’ll practice non-parametric inference techniques such as bootstrapping and permutation testing. These methods allow you to test hypotheses and estimate uncertainty without assuming normally distributed populations. You’ll also practice key data science skills: cleaning, filtering, visualizing, and interpreting real-world health data. You can find the lab here

Getting started

Packages

In this lab, we will work with the tidyverse, infer, and openintro packages. We can install and load them with the following code:

library(tidyverse)
library(infer)
library(openintro)

Housekeeping

Git configuration / password caching

Configure your Git user name and email. If you cannot remember the instructions, refer to an earlier lab. Also remember that you can cache your password for a limited amount of time.

Project name

Update the name of your project to match the lab’s title.

Warm up

Before diving into the dataset, let’s warm up with some simple exercises.

YAML

Open the R Markdown (Rmd) file in your project, change the author name to your name, and knit the document. Doing so will allow you to personalize your Rmd file.

Commiting and pushing changes

Go to the Git pane in your RStudio.
View the Diff and confirm that you are happy with the changes.
Add a commit message like “Update team name” in the Commit message box and hit Commit.
Click on Push. This will prompt a dialogue box where you first need to enter your user name, and then your password.

Set a seed!

In this lab, we’ll be generating random samples. To make sure our results stay consistent, make sure to set a seed before you dive into the sampling process. Otherwise, the data will change each time you knit your lab. Set your seed in the chunk labeled “setup” so that your sampling and bootstrapping results stay consistent across runs and when knitting.

84.1 The data

Load the ncbirths data from the openintro package:

data(ncbirths)

We have observations on 13 different variables, some categorical and some numerical. The meaning of each variable is provided in the following table.

variable	description
`fage`	father’s age in years.
`mage`	mother’s age in years.
`mature`	maturity status of mother.
`weeks`	length of pregnancy in weeks.
`premie`	whether the birth was classified as premature (premie) or full-term.
`visits`	number of hospital visits during pregnancy.
`marital`	whether mother is `married` or `not married` at birth.
`gained`	weight gained by mother during pregnancy in pounds.
`weight`	weight of the baby at birth in pounds.
`lowbirthweight`	whether baby was classified as low birthweight (`low`) or not (`not low`).
`gender`	gender of the baby, `female` or `male`.
`habit`	status of the mother as a `nonsmoker` or a `smoker`.
`whitemom`	whether mom is `White` or `not White`.

Before analyzing any new dataset, it’s important to get to know your data…

84.2 Exercises

Which variables in this dataset are numeric, and what do their distributions look like? Are there any outliers or extreme values that might affect your analysis?

Support your answers with both summary statistics and visualizations. You can use tools such as summary(), map_dfr(), or skim() for numerical summaries, and faceted boxplots or histograms to visualize distributions. If you’re not sure how to check efficiently, consider transforming the data into long format and using faceted boxplots.

84.2.1 Baby weights

Wen, Shi Wu, Michael S. Kramer, and Robert H. Usher. “Comparison of birth weight distributions between Chinese and Caucasian infants.” American Journal of Epidemiology 141.12 (1995): 1177-1187.

A 1995 study suggests that average weight of White babies born in the US is 3,369 grams (7.43 pounds). In this dataset, we have pretty limited information on race, we only know whether the mother is White. We will make the simplifying assumption that babies of White mothers are also White, i.e. whitemom = "white". (Yes, I know that this assumption is a gross oversimplification).

We want to evaluate whether the average weight of White babies has changed since 1995.

Our null hypothesis should state “there is nothing going on”, i.e. no change since 1995: \(H_0: \mu = 7.43~pounds\).

Our alternative hypothesis should reflect the research question, i.e., some change since 1995. Since the research question doesn’t state a direction for the change, we use a two-sided alternative hypothesis: \(H_A: \mu \ne 7.43~pounds\).

To test whether the average birth weight of White babies has changed since 1995, we first need to isolate that group in our data. Create a filtered data frame called ncbirths_white that contains data only from White mothers. Then, calculate the mean of the weights of their babies.

This sample mean will serve as your observed statistic when we build a null distribution in the next exercise.

Are the criteria necessary for conducting simulation-based inference satisfied? Briefly consider whether simulation-based inference (e.g., bootstrapping) is appropriate in this context by addressing the following:

Are observations in the sample independent of each other?
Is the sample size reasonably large?
Does the shape of the distribution pose any concerns (e.g., extreme skew or clustering)?

Use graphical summaries (histogram, boxplot) and/or numerical summaries to support your answer. Refer back to your plots or output from Exercise 1.

Let’s discuss how this test would work. Our goal is to simulate what the distribution of sample means might look like if the true population mean were 7.43 pounds. To do that, we’ll use bootstrapping and then shift the resulting distribution to be centered under the null hypothesis.

You saw this process in the bootstrapping lecture (the “Rent in Edinburgh” example)—where we estimated a sampling distribution from a real dataset, then aligned it with a hypothetical population value in order to test whether the data were consistent with that value. The same logic applies here.

Here’s the general procedure: - Take a bootstrap sample (with replacement) from ncbirths_white. - Calculate the mean of this bootstrapped resample. - Repeat these two steps many times (e.g., 1,000 or more) to create a bootstrap distribution of sample means. - This distribution will be centered at the observed sample mean. - To simulate the null hypothesis, shift the entire distribution so that its mean equals 7.43. You can do this by subtracting the bootstrap distribution’s mean from each value, and then adding 7.43 back. - Once shifted, compare the observed mean to this distribution: calculate the proportion of simulated means that are at least as extreme as the observed mean (i.e., the p-value).

📌 If this logic feels abstract, go back and review the Lecture on Bootstrapping. The same shifting strategy was used there to simulate a null centered at a hypothesized rent value.

Run a hypothesis test using simulation-based inference. Your goal is to simulate a null distribution of sample means assuming a population mean of 7.43 pounds, and then assess how unusual the observed sample mean is within that distribution.

To structure your analysis, work through the following steps:

4a. Use bootstrapping to simulate a distribution of sample means from ncbirths_white.

4b. Shift the distribution so that its center aligns with the null hypothesis value (7.43 pounds).

4c. Create a histogram of your shifted null distribution. Overlay a dashed vertical line to show your observed sample mean.

4d. Calculate the two-tailed p-value: what proportion of simulated means are at least as extreme (in both directions) as your observed value?

4e. Interpret your results. Based on the p-value and your visualization, what do you conclude about whether birth weight has changed since 1995?

84.2.2 Baby weight vs. smoking

Consider the relationship between a mother’s smoking habit and the weight of her baby. Plotting the data is a useful first step because it helps us quickly visualize trends, identify strong associations, and develop research questions.

Make side-by-side box plots displaying the relationship between habit and weight. What do you notice about the shape, center, and spread of the two groups? Are there any clear differences?

Use this visualization to inform the question you’ll test in the next few exercises.

Before comparing the two groups, remove any rows with missing values in either habit or weight. Name this version ncbirths_clean. Why is it important to do this before calculating group summaries?
What is the observed difference in mean birth weight between babies born to smoking and non-smoking mothers?

Use the cleaned dataset to calculate the average birth weight for each group. Then compute the difference in means (non-smokers minus smokers).

ncbirths_clean %>%
  group_by(habit) %>%
  summarize(mean_weight = mean(weight))

This observed difference is the statistic you’ll use to anchor your inference in the next step.

We can see that there’s an observable difference, but is this difference meaningful? Is it statistically significant? We can answer this question by conducting a hypothesis test.

What are the hypotheses for testing whether smoking is associated with a difference in birth weight?

Formulate your hypotheses in both words and symbols.

What does the null hypothesis say about the difference in average weights?
What does the alternative hypothesis suggest?

Write your hypotheses clearly using:

A contextual statement (in words)
A symbolic version using \(\mu_1\) and \(\mu_2\), where each symbol refers to one of the two groups

Optional: Sketch or describe what the null distribution might look like if the null hypothesis were true.

How can we use simulation-based inference to test whether maternal smoking is associated with a difference in birth weight?

Run the appropriate hypothesis test using either a bootstrap or permutation method. Use your cleaned dataset (ncbirths_clean) and compare the average birth weight between smokers and non-smokers.

To organize your analysis, consider the following:

What kind of inference method is most appropriate here—bootstrap or permutation?
What is your observed test statistic?
How can you generate the sampling distribution under the null?
What is the resulting p-value?

You saw a similar structure when testing whether the average birth weight of White babies had changed. This time, you’re comparing two groups, so consider how the structure of the null distribution and test statistic changes.

How large could the difference in average birth weights plausibly be, based on your data?

Construct a 95% confidence interval for the difference in means between babies born to smoking and non-smoking mothers. Use a non-parametric method (e.g., bootstrapped difference in means) and report your results.

84.2.3 Mother’s age vs. baby weight

In this portion of the analysis, we focus on two variables. The first one is maturemom, which classifies mothers as either “younger” or “mature.”

The second variable is lowbirthweight, a binary indicator of whether the baby’s birth weight was classified as “low.”

In the following exercises, you’ll determine how maturemom is constructed and then test whether the rate of low birth weight differs between the two age groups.

First, a non-inference task: Determine the age cutoff for younger and mature mothers.

Use a method of your choice to determine where the cutoff lies between “younger” and “mature” mothers, based on the mature variable. How does this variable appear to be coded, and what is the minimum age that qualifies as a “mature” mother?

Is the proportion of low birth weight babies higher for mature mothers than for younger mothers?

Conduct a hypothesis test to evaluate whether the proportion of low birth weight babies differs between younger and mature mothers. Use a significance level of \(\alpha = 0.05\).

Work through the following steps:

Clearly state your null and alternative hypotheses (in words and in symbols).
Consider the conditions for simulation-based inference. Are they met here?
Choose an appropriate test (e.g., bootstrapped or permuted difference in proportions).
Calculate the p-value and interpret the result in the context of the research question.

What is a plausible range for the difference in proportions of low birth weight babies between mature and younger mothers?

Construct a 95% confidence interval for the difference in proportions of low birth weight babies between the two groups. Then interpret the interval: what do the bounds tell you?

84.3 Wrap up

In this lab, you practiced using non-parametric inference to analyze the impact of maternal smoking during pregnancy on the weight of the baby. Specifically, you used bootstrapping and permutation testing to analyze real-world health data from North Carolina birth records. You tested whether: - The average birth weight of White babies has changed since 1995 - Maternal smoking is associated with lower average birth weight - Mature mothers are more likely to have babies with low birth weight

Along the way, you also: - Cleaned and filtered messy real-world data

Explored variable distributions and checked model assumptions
Interpreted p-values and confidence intervals in context
Applied core concepts from earlier lectures, including simulation-based inference.

Remember to save your work, commit your changes, and push them to your Git repository!