Materials

BDSI Workshop
Published: April 8, 2024

🕙 Schedule

Time Content
10:00–10:20 Tidy data
10:20–10:40 Exercise 1
10:40–10:50 Discuss exercise 1
10:50–11:30 Welcome to Tidyverse
11:30–11:40 Break
11:40–12:10 Exercise 2
12:10–12:30 Pivoting data
12:30–12:50 Exercise 3
12:50–13:00 Wrap up

📑 Resources

  • Learn R Chapter 5: Data wrangling with R

🏋️‍♀️ Exercises

Exercise 1

Using the printed handouts given out in the workshop, discuss with your group whether the data in each spreadsheet are organised as tidy data. If a spreadsheet is not tidy, how would you reorganise it so that it is?

Download spreadsheet here.

Reflect on learning objectives
You should be able to:
  • Recognize the characteristics of tidy data
Exercise 2

  • Use the smartpill data from the medicaldata package to complete Table 1 by filling in the missing values.
Table 1: The mean (to 2 decimal places), sample standard deviation (to 3 decimal places), and number of observed (non-missing) samples of gastric emptying transit time for two groups (critically ill trauma patients and healthy volunteers).
Group Mean SD N
Critically ill trauma patients
Healthy volunteer
  • Using the strep_tb data in the medicaldata package, calculate the number of patients in each baseline condition by gender. Using this calculation, fill in the missing values in Table 2.
Table 2: The number of patients in each baseline condition by gender.
Baseline condition Gender N
Good Female
Good Male
Fair Female
Fair Male
Poor Female
Poor Male
  • Using the aastveit.barley.covs data from the agridat package, calculate the total rainfall for each year, then fill in the missing values in the table below.
Year Total Rainfall
1974
1975
1976
1977
1978
1979
1980
1981
1982
Reflect on learning objectives
You should be able to:
  • Differentiate between the Base and Tidyverse paradigms
  • Acquire the skills to add/modify columns, subset data by rows and columns, rename column names, and perform group operations using dplyr
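The dplyr operations named in the objectives above can be sketched on a small made-up table. This is an illustration only, not a solution to the exercises: the `patients` data and all of its column names are hypothetical, and the native pipe `|>` assumes R 4.1 or later (`%>%` works the same way here).

```r
library(dplyr)

# Hypothetical toy data, invented for illustration (not a workshop dataset)
patients <- tibble(
  id     = 1:6,
  group  = c("A", "A", "A", "B", "B", "B"),
  weight = c(60, 72, NA, 80, 65, 90),               # kg
  height = c(1.60, 1.80, 1.70, 1.75, 1.65, 1.85)    # m
)

# Base R: add a column, then subset rows and columns with `$` and `[`
base_result <- patients
base_result$bmi <- base_result$weight / base_result$height^2
keep <- !is.na(base_result$bmi) & base_result$bmi > 23
base_result <- base_result[keep, c("id", "group", "bmi")]

# Tidyverse: the same steps written as a pipeline of dplyr verbs
tidy_result <- patients |>
  mutate(bmi = weight / height^2) |>   # add a column
  filter(bmi > 23) |>                  # subset rows (rows with NA are dropped)
  select(id, group, bmi)               # subset columns

# Rename a column, then summarise within groups
summary_tbl <- patients |>
  rename(weight_kg = weight) |>
  group_by(group) |>
  summarise(
    mean_weight = round(mean(weight_kg, na.rm = TRUE), 2),
    sd_weight   = round(sd(weight_kg, na.rm = TRUE), 3),
    n           = sum(!is.na(weight_kg))   # count of non-missing values
  )
```

Both approaches keep the same rows; the pipeline reads top to bottom as a sequence of steps, whereas the base R version repeats the data frame name in each expression.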
Exercise 3

  • Transform the crampton.pig data from the agridat package into the format shown in Table 3.
Table 3: Weight gain in pigs for different treatments.
Figure 1: Table 4 from Aastveit and Martens (1986)
  • Figure 1 shows the data contained in aastveit.barley.covs from the agridat package. Transform these data into a longer format, as in Table 4.
Table 4: The data from Figure 1 transformed to a longer format.
Reflect on learning objectives
You should be able to:
  • Pivot data into longer or wider format using tidyr
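As a minimal sketch of pivoting with tidyr, the example below reshapes a small made-up table to a longer format and back again. The `rain_wide` data and its column names are hypothetical and are not taken from the agridat datasets used in the exercises.

```r
library(tibble)
library(tidyr)

# Hypothetical wide data: one row per year, one column per month
rain_wide <- tibble(
  year = c(2001, 2002),
  jan  = c(10, 12),
  feb  = c(20, 18),
  mar  = c(30, 25)
)

# Longer format: one row per year-month combination
rain_long <- rain_wide |>
  pivot_longer(
    cols      = jan:mar,       # columns to stack
    names_to  = "month",       # new column holding the old column names
    values_to = "rainfall"     # new column holding the cell values
  )

# Wider format: spread the months back out into columns
rain_back <- rain_long |>
  pivot_wider(names_from = month, values_from = rainfall)
```

Here `rain_long` has six rows (two years by three months), and `rain_back` has the same layout as `rain_wide`.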

This website is brought to you by the ANU Biological Data Science Institute.

References

Aastveit, Are Halvor, and Harald Martens. 1986. “ANOVA Interactions Interpreted by Partial Least Squares Regression.” Biometrics 42 (4): 829. https://doi.org/10.2307/2530697.