Welcome to the Tidyverse

BDSI R Training I

Emi Tanaka

Biological Data Science Institute

3rd November 2023

Today’s learning objectives

  • Be familiar with the tidyverse ecosystem
  • Understand the concept of tidy data
  • Tidy data with tidyr
  • Wrangle data with dplyr
  • Plot a layer with ggplot2

Tidyverse

  • Tidyverse refers to a collection of R-packages that share a common design philosophy, grammar and data structure.
  • This trains your mental model to do data science tasks in a manner which may make it easier, faster, and/or fun for you to do these tasks.
  • library(tidyverse) is a shorthand for loading the 9 core tidyverse packages.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Tidy data

Definition of a tidy data (Wickham, 2014)

  • Each variable must have its own column
  • Each observation must have its own row
  • Each value must have its own cell
  • dplyr, tidyr and ggplot2 are downstream packages to work with tidy data

Wickham (2014) Tidy Data. Journal of Statistical Software

Is this tidy data? Part 1

Estimated residential population in December
state 2019 2018 2017
NSW 8130159 80366651 7919815
VIC 6655284 6528601 6387081
ACT 427892 423169 415874
Estimated residential population in December
state year population
NSW 2019 8130159
NSW 2018 80366651
NSW 2017 7919815
VIC 2019 6655284
VIC 2018 6528601
VIC 2017 6387081
ACT 2019 427892
ACT 2018 423169
ACT 2017 415874

Values adapted from Australian Bureau of Statistics. (2020). Table 04. Estimated Resident Population, States and Territories [Time series spreadsheet]. National, state and territory population, Australia Mar 2020. Retrieved Nov 24, 2020. https://www.abs.gov.au/statistics/people/population/national-state-and-territory-population/mar-2020/310104.xls

Is this tidy data? Part 2

Total weekly personal income by age and sex from ABS 2016 Census

STE_CODE_2016 M_Neg_Nil_income_15_19_yrs M_Neg_Nil_income_20_24_yrs M_Neg_Nil_income_25_34_yrs M_Neg_Nil_income_35_44_yrs M_Neg_Nil_income_45_54_yrs M_Neg_Nil_income_55_64_yrs M_Neg_Nil_income_65_74_yrs M_Neg_Nil_income_75_84_yrs M_Negtve_Nil_incme_85_yrs_ovr M_Neg_Nil_income_Tot M_1_149_15_19_yrs M_1_149_20_24_yrs M_1_149_25_34_yrs M_1_149_35_44_yrs M_1_149_45_54_yrs M_1_149_55_64_yrs M_1_149_65_74_yrs M_1_149_75_84_yrs M_1_149_85ov M_1_149_Tot M_150_299_15_19_yrs M_150_299_20_24_yrs M_150_299_25_34_yrs M_150_299_35_44_yrs M_150_299_45_54_yrs M_150_299_55_64_yrs M_150_299_65_74_yrs M_150_299_75_84_yrs M_150_299_85ov M_150_299_Tot M_300_399_15_19_yrs M_300_399_20_24_yrs M_300_399_25_34_yrs M_300_399_35_44_yrs M_300_399_45_54_yrs M_300_399_55_64_yrs M_300_399_65_74_yrs M_300_399_75_84_yrs M_300_399_85ov M_300_399_Tot M_400_499_15_19_yrs M_400_499_20_24_yrs M_400_499_25_34_yrs M_400_499_35_44_yrs M_400_499_45_54_yrs M_400_499_55_64_yrs M_400_499_65_74_yrs M_400_499_75_84_yrs M_400_499_85ov M_400_499_Tot M_500_649_15_19_yrs M_500_649_20_24_yrs M_500_649_25_34_yrs M_500_649_35_44_yrs M_500_649_45_54_yrs M_500_649_55_64_yrs M_500_649_65_74_yrs M_500_649_75_84_yrs M_500_649_85ov M_500_649_Tot M_650_799_15_19_yrs M_650_799_20_24_yrs M_650_799_25_34_yrs M_650_799_35_44_yrs M_650_799_45_54_yrs M_650_799_55_64_yrs M_650_799_65_74_yrs M_650_799_75_84_yrs M_650_799_85ov M_650_799_Tot M_800_999_15_19_yrs M_800_999_20_24_yrs M_800_999_25_34_yrs M_800_999_35_44_yrs M_800_999_45_54_yrs M_800_999_55_64_yrs M_800_999_65_74_yrs M_800_999_75_84_yrs M_800_999_85ov M_800_999_Tot M_1000_1249_15_19_yrs M_1000_1249_20_24_yrs M_1000_1249_25_34_yrs M_1000_1249_35_44_yrs M_1000_1249_45_54_yrs M_1000_1249_55_64_yrs M_1000_1249_65_74_yrs M_1000_1249_75_84_yrs M_1000_1249_85ov M_1000_1249_Tot M_1250_1499_15_19_yrs M_1250_1499_20_24_yrs M_1250_1499_25_34_yrs M_1250_1499_35_44_yrs M_1250_1499_45_54_yrs M_1250_1499_55_64_yrs M_1250_1499_65_74_yrs M_1250_1499_75_84_yrs M_1250_1499_85ov M_1250_1499_Tot M_1500_1749_15_19_yrs M_1500_1749_20_24_yrs M_1500_1749_25_34_yrs M_1500_1749_35_44_yrs M_1500_1749_45_54_yrs M_1500_1749_55_64_yrs M_1500_1749_65_74_yrs M_1500_1749_75_84_yrs M_1500_1749_85ov M_1500_1749_Tot M_1750_1999_15_19_yrs M_1750_1999_20_24_yrs M_1750_1999_25_34_yrs M_1750_1999_35_44_yrs M_1750_1999_45_54_yrs M_1750_1999_55_64_yrs M_1750_1999_65_74_yrs M_1750_1999_75_84_yrs M_1750_1999_85ov M_1750_1999_Tot M_2000_2999_15_19_yrs M_2000_2999_20_24_yrs M_2000_2999_25_34_yrs M_2000_2999_35_44_yrs M_2000_2999_45_54_yrs M_2000_2999_55_64_yrs M_2000_2999_65_74_yrs M_2000_2999_75_84_yrs M_2000_2999_85ov M_2000_2999_Tot M_3000_more_15_19_yrs M_3000_more_20_24_yrs M_3000_more_25_34_yrs M_3000_more_35_44_yrs M_3000_more_45_54_yrs M_3000_more_55_64_yrs M_3000_more_65_74_yrs M_3000_more_75_84_yrs M_3000_more_85ov M_3000_more_Tot M_PI_NS_15_19_yrs M_PI_NS_ns_20_24_yrs M_PI_NS_ns_25_34_yrs M_PI_NS_ns_35_44_yrs M_PI_NS_ns_45_54_yrs M_PI_NS_ns_55_64_yrs M_PI_NS_ns_65_74_yrs M_PI_NS_ns_75_84_yrs M_PI_NS_ns_85_yrs_ovr M_PI_NS_ns_Tot M_Tot_15_19_yrs M_Tot_20_24_yrs M_Tot_25_34_yrs M_Tot_35_44_yrs M_Tot_45_54_yrs M_Tot_55_64_yrs M_Tot_65_74_yrs M_Tot_75_84_yrs M_Tot_85ov M_Tot_Tot F_Neg_Nil_income_15_19_yrs F_Neg_Nil_income_20_24_yrs F_Neg_Nil_income_25_34_yrs F_Neg_Nil_income_35_44_yrs F_Neg_Nil_income_45_54_yrs F_Neg_Nil_income_55_64_yrs F_Neg_Nil_income_65_74_yrs F_Neg_Nil_income_75_84_yrs F_Neg_Nil_incme_85_yrs_ovr F_Neg_Nil_income_Tot F_1_149_15_19_yrs F_1_149_20_24_yrs F_1_149_25_34_yrs F_1_149_35_44_yrs F_1_149_45_54_yrs F_1_149_55_64_yrs F_1_149_65_74_yrs F_1_149_75_84_yrs F_1_149_85ov F_1_149_Tot F_150_299_15_19_yrs F_150_299_20_24_yrs F_150_299_25_34_yrs F_150_299_35_44_yrs F_150_299_45_54_yrs F_150_299_55_64_yrs F_150_299_65_74_yrs F_150_299_75_84_yrs F_150_299_85ov F_150_299_Tot F_300_399_15_19_yrs F_300_399_20_24_yrs F_300_399_25_34_yrs F_300_399_35_44_yrs F_300_399_45_54_yrs F_300_399_55_64_yrs F_300_399_65_74_yrs F_300_399_75_84_yrs F_300_399_85ov F_300_399_Tot F_400_499_15_19_yrs F_400_499_20_24_yrs F_400_499_25_34_yrs F_400_499_35_44_yrs F_400_499_45_54_yrs F_400_499_55_64_yrs F_400_499_65_74_yrs F_400_499_75_84_yrs F_400_499_85ov F_400_499_Tot F_500_649_15_19_yrs F_500_649_20_24_yrs F_500_649_25_34_yrs F_500_649_35_44_yrs F_500_649_45_54_yrs F_500_649_55_64_yrs F_500_649_65_74_yrs F_500_649_75_84_yrs F_500_649_85ov F_500_649_Tot F_650_799_15_19_yrs F_650_799_20_24_yrs F_650_799_25_34_yrs F_650_799_35_44_yrs F_650_799_45_54_yrs F_650_799_55_64_yrs F_650_799_65_74_yrs F_650_799_75_84_yrs F_650_799_85ov F_650_799_Tot F_800_999_15_19_yrs F_800_999_20_24_yrs F_800_999_25_34_yrs F_800_999_35_44_yrs F_800_999_45_54_yrs F_800_999_55_64_yrs F_800_999_65_74_yrs F_800_999_75_84_yrs F_800_999_85ov F_800_999_Tot F_1000_1249_15_19_yrs F_1000_1249_20_24_yrs F_1000_1249_25_34_yrs F_1000_1249_35_44_yrs F_1000_1249_45_54_yrs F_1000_1249_55_64_yrs F_1000_1249_65_74_yrs F_1000_1249_75_84_yrs F_1000_1249_85ov F_1000_1249_Tot F_1250_1499_15_19_yrs F_1250_1499_20_24_yrs F_1250_1499_25_34_yrs F_1250_1499_35_44_yrs F_1250_1499_45_54_yrs F_1250_1499_55_64_yrs F_1250_1499_65_74_yrs F_1250_1499_75_84_yrs F_1250_1499_85ov F_1250_1499_Tot F_1500_1749_15_19_yrs F_1500_1749_20_24_yrs F_1500_1749_25_34_yrs F_1500_1749_35_44_yrs F_1500_1749_45_54_yrs F_1500_1749_55_64_yrs F_1500_1749_65_74_yrs F_1500_1749_75_84_yrs F_1500_1749_85ov F_1500_1749_Tot F_1750_1999_15_19_yrs F_1750_1999_20_24_yrs F_1750_1999_25_34_yrs F_1750_1999_35_44_yrs F_1750_1999_45_54_yrs F_1750_1999_55_64_yrs F_1750_1999_65_74_yrs F_1750_1999_75_84_yrs F_1750_1999_85ov F_1750_1999_Tot F_2000_2999_15_19_yrs F_2000_2999_20_24_yrs F_2000_2999_25_34_yrs F_2000_2999_35_44_yrs F_2000_2999_45_54_yrs F_2000_2999_55_64_yrs F_2000_2999_65_74_yrs F_2000_2999_75_84_yrs F_2000_2999_85ov F_2000_2999_Tot F_3000_more_15_19_yrs F_3000_more_20_24_yrs F_3000_more_25_34_yrs F_3000_more_35_44_yrs F_3000_more_45_54_yrs F_3000_more_55_64_yrs F_3000_more_65_74_yrs F_3000_more_75_84_yrs F_3000_more_85ov F_3000_more_Tot F_PI_NS_15_19_yrs F_PI_NS_ns_20_24_yrs F_PI_NS_ns_25_34_yrs F_PI_NS_ns_35_44_yrs F_PI_NS_ns_45_54_yrs F_PI_NS_ns_55_64_yrs F_PI_NS_ns_65_74_yrs F_PI_NS_ns_75_84_yrs F_PI_NS_ns_85_yrs_ovr F_PI_NS_ns_Tot F_Tot_15_19_yrs F_Tot_20_24_yrs F_Tot_25_34_yrs F_Tot_35_44_yrs F_Tot_45_54_yrs F_Tot_55_64_yrs F_Tot_65_74_yrs F_Tot_75_84_yrs F_Tot_85ov F_Tot_Tot P_Neg_Nil_income_15_19_yrs P_Neg_Nil_income_20_24_yrs P_Neg_Nil_income_25_34_yrs P_Neg_Nil_income_35_44_yrs P_Neg_Nil_income_45_54_yrs P_Neg_Nil_income_55_64_yrs P_Neg_Nil_income_65_74_yrs P_Neg_Nil_income_75_84_yrs P_Negtve_Nil_incme_85_yrs_ovr P_Neg_Nil_income_Tot P_1_149_15_19_yrs P_1_149_20_24_yrs P_1_149_25_34_yrs P_1_149_35_44_yrs P_1_149_45_54_yrs P_1_149_55_64_yrs P_1_149_65_74_yrs P_1_149_75_84_yrs P_1_149_85ov P_1_149_Tot P_150_299_15_19_yrs P_150_299_20_24_yrs P_150_299_25_34_yrs P_150_299_35_44_yrs P_150_299_45_54_yrs P_150_299_55_64_yrs P_150_299_65_74_yrs P_150_299_75_84_yrs P_150_299_85ov P_150_299_Tot P_300_399_15_19_yrs P_300_399_20_24_yrs P_300_399_25_34_yrs P_300_399_35_44_yrs P_300_399_45_54_yrs P_300_399_55_64_yrs P_300_399_65_74_yrs P_300_399_75_84_yrs P_300_399_85ov P_300_399_Tot P_400_499_15_19_yrs P_400_499_20_24_yrs P_400_499_25_34_yrs P_400_499_35_44_yrs P_400_499_45_54_yrs P_400_499_55_64_yrs P_400_499_65_74_yrs P_400_499_75_84_yrs P_400_499_85ov P_400_499_Tot P_500_649_15_19_yrs P_500_649_20_24_yrs P_500_649_25_34_yrs P_500_649_35_44_yrs P_500_649_45_54_yrs P_500_649_55_64_yrs P_500_649_65_74_yrs P_500_649_75_84_yrs P_500_649_85ov P_500_649_Tot P_650_799_15_19_yrs P_650_799_20_24_yrs P_650_799_25_34_yrs P_650_799_35_44_yrs P_650_799_45_54_yrs P_650_799_55_64_yrs P_650_799_65_74_yrs P_650_799_75_84_yrs P_650_799_85ov P_650_799_Tot P_800_999_15_19_yrs P_800_999_20_24_yrs P_800_999_25_34_yrs P_800_999_35_44_yrs P_800_999_45_54_yrs P_800_999_55_64_yrs P_800_999_65_74_yrs P_800_999_75_84_yrs P_800_999_85ov P_800_999_Tot P_1000_1249_15_19_yrs P_1000_1249_20_24_yrs P_1000_1249_25_34_yrs P_1000_1249_35_44_yrs P_1000_1249_45_54_yrs P_1000_1249_55_64_yrs P_1000_1249_65_74_yrs P_1000_1249_75_84_yrs P_1000_1249_85ov P_1000_1249_Tot P_1250_1499_15_19_yrs P_1250_1499_20_24_yrs P_1250_1499_25_34_yrs P_1250_1499_35_44_yrs P_1250_1499_45_54_yrs P_1250_1499_55_64_yrs P_1250_1499_65_74_yrs P_1250_1499_75_84_yrs P_1250_1499_85ov P_1250_1499_Tot P_1500_1749_15_19_yrs P_1500_1749_20_24_yrs P_1500_1749_25_34_yrs P_1500_1749_35_44_yrs P_1500_1749_45_54_yrs P_1500_1749_55_64_yrs P_1500_1749_65_74_yrs P_1500_1749_75_84_yrs P_1500_1749_85ov P_1500_1749_Tot P_1750_1999_15_19_yrs P_1750_1999_20_24_yrs P_1750_1999_25_34_yrs P_1750_1999_35_44_yrs P_1750_1999_45_54_yrs P_1750_1999_55_64_yrs P_1750_1999_65_74_yrs P_1750_1999_75_84_yrs P_1750_1999_85ov P_1750_1999_Tot P_2000_2999_15_19_yrs P_2000_2999_20_24_yrs P_2000_2999_25_34_yrs P_2000_2999_35_44_yrs P_2000_2999_45_54_yrs P_2000_2999_55_64_yrs P_2000_2999_65_74_yrs P_2000_2999_75_84_yrs P_2000_2999_85ov P_2000_2999_Tot P_3000_more_15_19_yrs P_3000_more_20_24_yrs P_3000_more_25_34_yrs P_3000_more_35_44_yrs P_3000_more_45_54_yrs P_3000_more_55_64_yrs P_3000_more_65_74_yrs P_3000_more_75_84_yrs P_3000_more_85ov P_3000_more_Tot P_PI_NS_15_19_yrs P_PI_NS_ns_20_24_yrs P_PI_NS_ns_25_34_yrs P_PI_NS_ns_35_44_yrs P_PI_NS_ns_45_54_yrs P_PI_NS_ns_55_64_yrs P_PI_NS_ns_65_74_yrs P_PI_NS_ns_75_84_yrs P_PI_NS_ns_85_yrs_ovr P_PI_NS_ns_Tot P_Tot_15_19_yrs P_Tot_20_24_yrs P_Tot_25_34_yrs P_Tot_35_44_yrs P_Tot_45_54_yrs P_Tot_55_64_yrs P_Tot_65_74_yrs P_Tot_75_84_yrs P_Tot_85ov P_Tot_Tot
1 109473 33797 25398 14472 15235 22736 15465 6576 2320 245480 41865 14976 5674 4202 5561 8627 8244 3449 1119 93723 18444 25465 19864 15614 18341 25927 27772 16496 4870 172799 9018 17374 17339 14480 20156 29997 60479 43533 13188 225558 10497 18935 20677 16252 20964 27706 43545 27376 11489 197429 9213 24033 26629 19382 21431 26066 31057 16175 5043 179031 5040 27515 41764 29222 30239 30707 23657 11167 3568 202871 2600 26711 61986 44391 44280 40397 22083 7154 2296 251897 1151 20070 73023 54800 51333 44385 21655 6139 1931 274488 385 9340 53495 43951 38954 30553 12202 2936 923 192736 184 4420 44394 43349 37858 26989 9013 2165 716 169085 90 1878 29445 36276 30618 20560 5783 1349 503 126501 87 1514 42982 68876 55456 34158 9018 2108 672 214871 241 779 15687 48586 49374 29279 9539 2774 1008 157264 21813 21685 48201 40406 38664 34973 30961 21518 12098 270327 230103 248490 526560 494264 478458 433059 330471 170912 61753 2974063 95424 32797 53643 46531 40486 51106 21012 8866 4068 353937 51928 16390 14381 16661 15217 18126 11252 4423 1562 149938 23031 28295 30674 30305 30488 41208 34625 17970 6087 242685 9541 20982 28659 29309 32941 44933 73308 45792 15552 301014 7222 21547 32701 31009 34903 38925 61759 48408 27794 304260 5884 26247 46536 43262 43802 38121 35855 18431 10845 268982 3355 26587 50935 45197 46748 36994 22959 10531 4470 247778 1606 23106 56094 45396 48155 37096 16239 6573 3196 237465 576 16429 60714 45794 45889 35036 12968 4623 2287 224322 169 6349 44811 33903 31641 21973 6725 1910 866 148357 90 2023 33527 30863 28217 19438 4998 1507 794 121458 59 602 18558 23047 21168 14263 3335 1030 505 82559 76 481 21559 33731 27799 16469 4389 1491 707 106694 224 411 6038 18714 17426 8948 3891 1747 982 58377 19126 18948 42136 34906 34646 34068 33252 28889 26038 272003 218321 241181 540962 508625 499528 456703 346555 202200 105755 3119834 204894 66595 79047 61006 55721 73844 36478 15448 6389 599413 93787 31363 20056 20868 20778 26750 19501 7870 2681 243659 41477 53758 50544 45919 48828 67136 62396 34466 10963 415486 18559 38357 45993 43786 53101 74931 133782 89322 28739 526570 17725 40481 53377 47257 55870 66626 105300 75784 39280 501698 15101 50283 73161 62645 65235 64188 66904 34607 15895 448014 8395 54097 92698 74420 76984 67698 46609 21701 8040 450649 4211 49813 118084 89786 92434 77495 38326 13730 5497 489370 1732 36494 133738 100591 97224 79423 34625 10760 4216 498807 554 15688 98307 77854 70595 52528 18922 4853 1786 341092 272 6442 77918 74217 66074 46427 14015 3673 1510 290547 148 2477 47997 59326 51780 34823 9116 2374 1011 209058 164 1998 64540 102608 83247 50630 13407 3602 1379 321562 465 1190 21722 67297 66796 38226 13424 4523 1990 215643 40942 40631 90340 75312 73314 69037 64211 50411 38132 542326 448425 489673 1067524 1002886 977984 889763 677020 373115 167506 6093895
2 88338 31685 21321 12176 12700 16883 11502 4864 1736 201199 38027 15443 5314 3872 4598 6578 6248 2831 947 83859 14404 24502 18377 13035 14432 19362 21286 12944 4000 142347 6041 16083 15153 11440 14479 20680 43541 32914 10052 170390 6633 16767 17420 12871 15611 19490 31744 20256 8549 149345 5249 20317 23775 15826 16990 19775 24533 12661 4180 143307 2890 21927 38051 25091 24766 24018 18372 8652 2893 166657 1600 20837 56308 38378 37087 31737 16527 5393 1990 209859 672 14079 63881 47236 43346 35021 15012 4330 1529 225102 214 5767 45712 38351 33054 24126 8626 2113 683 158640 138 2598 34901 36477 31611 21816 6398 1549 538 136031 63 1085 21647 29005 24774 16129 4212 960 380 98252 116 951 28713 49459 41738 25319 6466 1558 535 154852 201 671 9675 31944 34203 20247 6749 1903 724 106312 17255 17031 36907 30837 29984 26386 23609 16507 8828 207345 181849 209733 437167 395979 379374 327567 244826 129451 47567 2353499 77647 31317 47176 39001 32724 39129 16906 6789 3159 293852 46359 17240 14080 15564 13261 14321 8753 3584 1493 134653 18099 28026 28760 27562 25839 32391 26881 15068 4998 207614 5983 18708 24559 25164 26693 33671 54764 34779 12164 236491 4020 17474 26607 26466 29789 31568 47499 37154 21386 241956 3205 20235 37882 36319 37225 31226 28445 14558 8426 217522 1810 20111 42785 37279 38889 29372 17729 7881 3589 199446 867 17972 49988 37838 39970 28577 12049 4675 2420 194360 392 11564 54077 36905 37977 26801 9151 3324 1761 181940 96 3658 38495 27640 25747 16926 4799 1356 740 119460 71 1125 24977 23872 21781 13900 3405 1079 577 90793 40 395 11895 15600 14463 9176 2189 731 364 54847 63 328 13073 21020 17329 9786 2806 1014 525 65948 183 375 3511 11690 11238 5823 2728 1318 691 37568 15667 15527 34151 28009 28119 27218 26679 22253 18139 215766 174492 204065 452031 409936 401040 349886 264772 155554 80427 2492203 165978 63007 68500 51179 45422 56016 28406 11651 4892 495052 84384 32685 19394 19436 17858 20896 14999 6416 2440 218506 32505 52526 47141 40596 40273 51755 48162 28014 8996 349958 12023 34790 39718 36611 41179 54351 98309 67688 22219 406878 10653 34240 44035 39330 45405 51060 79246 57407 29935 391308 8450 40552 61664 52138 54212 51004 52981 27216 12610 360838 4702 42035 80835 62373 63652 53390 36103 16532 6478 366105 2466 38808 106298 76216 77055 60313 28575 10077 4407 404215 1061 25642 117956 84132 81324 61821 24164 7657 3287 407041 311 9424 84206 65993 58799 41049 13420 3469 1422 278098 210 3720 59880 60349 53396 35718 9803 2624 1115 226824 103 1480 33544 44600 39237 25306 6403 1687 741 153095 174 1279 41788 70485 59071 35105 9266 2575 1061 220801 382 1044 13185 43637 45438 26071 9480 3222 1417 143877 32924 32554 71062 58843 58102 53603 50289 38765 26966 423108 356340 413792 889190 805920 780420 677453 509599 285006 127993 4845710
3 63756 15815 11810 8048 9494 13657 7853 2670 780 133870 29566 8070 3259 2779 3511 5757 5123 1622 509 60202 15010 16798 14541 10942 11848 16449 17179 8294 2144 113204 6698 10229 10619 9115 11810 18296 40112 26359 7095 140326 6812 11299 12101 9754 12503 17173 30418 17336 6298 123693 6609 15468 16045 11384 12749 16101 21868 10461 2771 113459 3907 19719 26678 18811 18597 18857 15601 6841 1940 130948 2276 19153 39760 29901 29340 25525 13749 4071 1153 164930 985 14021 44692 37696 35814 28680 12765 3215 958 178826 316 6943 32519 29711 27380 20034 6786 1468 381 125537 160 3082 25925 29341 27317 18252 4825 1011 290 110200 77 1520 16626 22195 20367 12896 3074 611 192 77556 64 1445 23681 37675 35414 19870 4377 926 288 123750 143 541 8217 21792 24262 14366 4378 1215 361 75272 14806 15179 31391 28008 28141 24362 20482 12598 6621 181590 151196 159280 317852 307148 308537 270282 208597 98704 31767 1853368 52001 15493 27641 25285 24099 29817 10525 3376 1413 189657 36162 8622 9416 10685 9467 11680 6193 1989 690 94903 18823 18178 20716 19395 18746 25796 20765 8918 2762 154102 8044 13369 19363 18741 20237 27473 47741 25533 7930 188429 6252 14368 21552 21370 22914 25413 41879 28681 14633 197060 5634 19851 32742 31229 30589 25907 24457 11576 6190 188167 2942 21027 36188 32906 33235 24383 14147 5811 2205 172855 1243 16307 37115 32092 33713 23698 9313 3342 1463 158274 438 10333 36199 31202 32438 22229 6943 2365 1091 143241 141 4014 24611 22159 21619 13325 3482 877 369 90596 68 1486 17222 20125 19969 12110 2435 647 284 74354 29 521 8414 11709 11743 6784 1547 461 195 41412 38 390 8935 14505 13737 7466 1871 629 249 47817 143 274 2269 6641 6472 3691 1763 834 360 22446 13129 13344 26456 22932 24181 22835 20952 16060 13928 173819 145089 157576 328837 320977 323168 282601 214009 111109 53759 1937126 115755 31303 39452 33337 33591 43469 18379 6046 2192 323525 65733 16692 12675 13464 12979 17436 11315 3611 1194 155103 33836 34976 35254 30335 30600 42247 37944 17213 4905 267300 14747 23594 29983 27855 32042 45764 87855 51895 15025 328755 13063 25663 33649 31126 35416 42585 72300 46017 20928 320752 12242 35322 48784 42611 43339 42008 46323 22036 8961 301624 6846 40745 62869 51713 51833 43243 29751 12658 4148 303800 3520 35457 76876 61995 63049 49225 23061 7410 2614 323206 1421 24355 80894 68898 68252 50905 19714 5582 2048 322062 463 10950 57133 51873 48993 33361 10264 2350 753 216133 229 4571 43146 49466 47282 30361 7260 1662 572 184552 109 2043 25042 33903 32111 19681 4616 1075 384 118970 107 1836 32616 52180 49150 27337 6250 1555 538 171566 288 815 10480 28436 30739 18059 6138 2049 723 97722 27938 28530 57850 50934 52324 47197 41431 28659 20549 355411 296287 316860 646694 628124 631705 552886 422601 209818 85528 3790497
4 24566 7007 4731 3164 3450 4559 2448 851 416 51186 10753 4333 1593 1163 1670 2376 1807 674 218 24599 4789 7677 6957 5216 5997 7550 6495 3688 1107 49480 1851 4049 4483 3952 5520 7915 15945 11811 3795 59317 1903 4141 4783 4199 5683 7161 11922 7321 3294 50416 1644 5399 6353 4784 5375 6440 8750 4501 1489 44729 843 6212 10545 7433 7803 7407 6396 3347 1166 51150 413 5887 14729 11673 11762 10309 5912 2104 757 63542 183 3739 15869 13577 13673 11849 5552 1531 591 66560 60 1625 10382 9900 9807 7647 2915 605 218 43158 33 662 7922 9283 9013 6867 1882 421 151 36239 25 269 4788 7038 7022 4979 1297 257 111 25791 17 265 5467 10427 10885 7057 1708 374 145 36340 55 155 1581 4931 6332 4628 1744 460 199 20086 4392 3801 8092 7450 8124 7418 6525 4497 2687 52984 51516 55223 108274 104188 112117 104169 81316 42437 16340 675578 20327 6123 9210 7496 7072 10079 3577 1277 704 65862 13071 4265 3625 3761 3479 4654 2894 969 376 37093 6151 7971 8348 7794 8029 10753 8615 4541 1631 63838 2128 5084 7199 6954 8282 11232 20394 12509 4623 78394 1546 5106 7876 7790 9393 10561 17433 13436 8228 81372 1092 6393 11619 11040 11557 10328 10039 5160 3342 70571 538 5732 11950 11114 11717 9237 5993 2922 1315 60512 236 4459 12611 10876 12336 9391 4000 1473 859 56233 90 2752 11837 10240 11923 9077 2827 953 578 50279 32 1030 8085 6927 7488 5211 1427 360 202 30756 19 316 5580 6075 6285 4632 1032 304 188 24421 8 82 2500 3782 4189 3181 656 189 100 14690 13 84 2010 3907 4357 2921 752 261 125 14436 57 64 550 1588 1907 1393 706 371 233 6875 3858 3292 6849 5942 6774 7092 7181 6098 5646 52735 49169 52766 109837 105275 114770 109756 87533 50821 28143 708074 44891 13132 13939 10661 10517 14632 6025 2129 1122 117050 23825 8601 5218 4923 5148 7037 4703 1642 594 61699 10942 15641 15309 13015 14028 18310 15115 8228 2737 113323 3979 9139 11678 10902 13802 19144 36334 24314 8417 137711 3446 9247 12659 11992 15073 17728 29358 20756 11524 131786 2734 11793 17971 15822 16935 16764 18784 9662 4828 115297 1383 11945 22498 18543 19518 16642 12387 6263 2480 111663 653 10348 27338 22548 24096 19698 9907 3578 1618 119780 267 6498 27708 23821 25594 20929 8386 2482 1167 116846 92 2657 18461 16823 17294 12861 4343 966 417 73918 53 979 13501 15354 15295 11502 2914 722 338 60660 38 350 7289 10815 11208 8165 1959 448 211 40479 27 343 7469 14336 15248 9981 2469 638 268 50778 107 221 2127 6517 8245 6015 2455 831 440 26956 8248 7095 14945 13384 14899 14514 13711 10599 8329 105720 100686 107986 218107 209468 226891 213923 168852 93258 44479 1383650
5 36101 9806 7795 5029 5494 7347 4435 1474 408 77889 14767 4965 2007 1384 1645 2594 2447 890 265 30969 6332 8529 7554 5196 5501 7024 7687 4120 1088 53029 3033 5367 5144 3718 4633 6769 16282 12106 3321 60367 3028 5310 5478 3864 4524 6226 12244 7596 2649 50919 2899 6955 7363 4748 5038 6374 9950 5055 1437 49831 1699 8017 10768 6792 6567 7047 7365 3536 1064 52852 1005 8530 17896 11758 11438 10387 6700 2037 719 70472 511 7293 24146 17766 16490 13899 6667 1692 597 89062 169 3854 19066 15739 14182 10656 3920 834 264 68694 86 2362 17269 16076 15118 10636 2980 592 171 65283 40 1311 12843 14172 12936 8821 2104 381 134 52747 50 1723 22839 29537 27076 16268 3361 609 170 101625 83 626 9774 20480 21512 12308 3159 701 212 68846 7311 7913 20992 17274 15975 12588 9460 6237 3345 101097 77112 82563 190956 173520 168121 138950 98758 47859 15846 993685 30362 8684 22370 20562 15600 17709 6421 2093 772 124570 18177 5237 6753 7206 5242 5906 3449 1268 456 53708 8032 9247 11700 11010 9558 11684 9836 4899 1575 77546 3090 6644 10087 9800 9420 11634 20701 12695 3856 87937 2376 6536 10550 10286 10227 10791 17605 13272 6441 88080 2184 8166 14958 14315 13721 11701 11068 5418 2779 84312 1207 8542 15715 14246 14531 11022 6862 3097 1272 76497 567 7629 17704 14857 16130 11352 4613 1808 891 75548 233 5567 19935 15050 16728 12258 3712 1249 582 75320 62 2967 14698 11080 11769 7857 2005 462 169 51066 44 1067 11535 9669 10041 6578 1480 381 155 40953 26 428 6895 6927 7211 4694 959 252 114 27506 22 344 7564 9530 9842 5988 1282 378 158 35094 55 154 2012 4579 4523 2644 1038 476 190 15665 6430 6562 15763 12411 12289 11262 9867 8498 7165 90250 72882 77777 188224 171523 166831 143084 100911 56237 26579 1004041 66467 18491 30162 25589 21092 25059 10855 3571 1176 202460 32950 10203 8760 8585 6892 8505 5898 2160 723 84677 14364 17778 19254 16206 15060 18715 17525 9013 2663 130576 6123 12013 15229 13518 14049 18407 36982 24798 7181 148309 5407 11843 16029 14148 14750 17011 29846 20864 9084 139000 5088 15122 22321 19064 18756 18077 21026 10475 4212 134140 2901 16554 26486 21034 21100 18071 14228 6638 2345 129347 1572 16159 35599 26610 27572 21742 11308 3843 1610 146021 744 12860 44089 32816 33219 26159 10380 2941 1177 164378 228 6821 33767 26813 25947 18519 5926 1294 434 119761 129 3430 28804 25749 25160 17216 4463 972 321 106232 69 1742 19740 21103 20146 13515 3061 635 248 80255 73 2061 30401 39061 36916 22250 4644 980 328 136713 140 781 11782 25054 26032 14951 4196 1171 400 84514 13738 14471 36749 29691 28266 23851 19328 14731 10515 191347 149997 160332 379178 345038 334959 282042 199669 104095 42420 1997726
6 7040 1320 950 645 833 1436 736 246 66 13264 3213 986 426 346 453 837 594 181 30 7069 1545 2253 1919 1576 1893 2722 2496 1129 271 15809 602 1096 1277 1296 1892 3097 5996 3804 960 20024 712 1193 1382 1353 2015 2616 4233 2517 941 16953 623 1658 1847 1515 1731 2408 3003 1432 469 14688 361 2052 2819 2209 2550 2703 2174 1053 336 16254 155 1740 4225 3227 3657 3531 1859 649 198 19246 62 1041 4198 3756 4142 3760 1765 477 156 19358 13 420 2717 2830 2962 2380 800 184 75 12382 16 170 1843 2554 2849 2229 532 133 54 10380 6 76 1065 1874 2140 1466 298 62 21 7009 4 52 1075 2317 2715 1851 454 112 36 8615 19 37 320 943 1424 1194 446 110 45 4542 1592 1192 2261 2172 2648 2834 2681 1593 692 17671 15960 15289 28338 28598 33896 35070 28078 13679 4347 203256 5542 1020 1679 1533 1900 3066 909 279 109 16039 3968 938 904 1043 1102 1534 811 227 80 10624 1948 2161 2471 2351 2513 3841 2931 1239 403 19850 761 1430 2285 2368 2906 4432 7162 3690 1092 26120 639 1560 2538 2682 3257 3960 6048 4532 2355 27572 487 2021 3650 3788 3799 3611 3428 1684 1021 23489 189 1890 3599 3498 3803 3010 1996 911 392 19291 72 1264 3215 3185 3684 2875 1236 518 243 16281 33 768 3066 2909 3601 2839 786 321 152 14468 8 296 1906 1967 2273 1669 386 96 49 8648 0 76 1243 1564 1989 1400 256 77 44 6657 3 22 609 1027 1241 889 155 52 30 4026 0 22 411 911 1266 803 170 66 32 3687 13 47 135 393 482 364 191 102 54 1774 1454 1061 1996 2014 2336 2793 2840 2125 1367 17974 15118 14574 29695 31227 36154 37087 29305 15926 7413 216496 12580 2339 2626 2174 2731 4501 1647 527 179 29301 7181 1925 1338 1392 1561 2370 1401 414 114 17698 3493 4413 4391 3921 4406 6564 5424 2373 677 35655 1365 2530 3563 3658 4802 7526 13159 7490 2049 46144 1351 2751 3918 4036 5275 6573 10283 7047 3293 44523 1114 3681 5501 5302 5527 6019 6438 3111 1487 38180 553 3945 6416 5712 6350 5713 4173 1964 727 35546 230 3002 7441 6413 7340 6400 3095 1175 440 35525 90 1806 7266 6662 7746 6598 2553 803 304 33831 24 714 4615 4797 5236 4051 1185 276 117 21023 16 253 3080 4119 4836 3637 793 211 97 17032 12 99 1675 2901 3376 2357 452 117 54 11031 3 72 1487 3224 3981 2659 624 173 68 12294 30 84 458 1339 1903 1555 635 211 100 6317 3042 2254 4260 4185 4980 5633 5512 3721 2060 35649 31078 29865 58034 59830 70050 72160 57381 29604 11767 419752
7 2669 550 524 344 353 505 309 81 13 5347 1073 440 616 404 327 233 160 43 3 3294 742 990 1581 1181 994 683 539 215 35 6958 248 329 494 460 486 558 771 346 72 3759 204 308 489 403 391 447 680 282 71 3287 287 473 674 488 456 433 465 159 36 3484 228 673 933 595 559 552 375 109 20 4047 214 966 1595 1057 974 771 382 79 16 6054 104 976 2489 1612 1364 1033 434 80 15 8118 28 662 2124 1462 1332 914 292 39 7 6868 23 346 1806 1532 1327 915 226 37 6 6220 4 134 1198 1297 1280 833 187 16 3 4947 12 172 2175 2432 2193 1397 267 34 4 8688 5 69 1101 1478 1433 874 209 26 7 5207 1636 1756 4187 3107 2763 1951 1016 339 79 16831 7476 8850 21992 17847 16238 12108 6312 1891 392 93101 2147 518 1547 1129 755 829 322 87 22 7352 1117 393 753 579 431 285 154 55 18 3781 853 1148 1918 1454 1158 950 598 226 49 8365 231 433 1006 803 720 695 843 356 82 5167 207 434 857 739 628 627 709 375 125 4700 211 541 1273 1008 794 669 439 199 58 5191 228 791 1486 1036 955 668 361 104 18 5638 153 892 1846 1295 1158 767 309 61 13 6488 66 777 2305 1522 1542 1022 277 50 21 7577 12 342 1826 1268 1255 782 172 24 5 5681 7 138 1431 1256 1126 683 118 17 0 4791 4 70 756 905 957 631 87 11 3 3413 4 28 794 1188 1149 746 108 17 3 4027 8 25 161 364 385 265 66 16 0 1286 1361 1268 3092 2225 2053 1588 794 293 127 12804 6597 7795 21045 16781 15072 11193 5349 1878 545 86254 4818 1065 2073 1469 1112 1330 628 161 38 12694 2185 832 1366 984 761 516 315 97 20 7077 1597 2145 3500 2629 2148 1642 1134 443 85 15315 478 764 1497 1267 1200 1252 1612 697 157 8922 405 745 1352 1136 1019 1074 1389 663 198 7988 496 1014 1948 1503 1259 1104 902 355 95 8675 456 1469 2420 1630 1518 1221 732 210 39 9691 367 1857 3442 2352 2133 1539 689 138 22 12545 169 1751 4798 3138 2908 2056 715 132 36 15695 41 998 3948 2731 2582 1698 466 60 8 12546 31 486 3239 2790 2450 1600 343 58 5 11013 7 197 1953 2205 2237 1462 279 22 3 8359 14 195 2967 3616 3341 2144 374 51 9 12720 17 90 1260 1849 1820 1135 269 44 6 6485 2996 3033 7271 5333 4818 3537 1815 631 201 29640 14077 16644 43036 34630 31310 23296 11661 3771 938 179359
8 5515 2251 1231 624 517 689 435 165 62 11481 2739 928 326 211 202 257 229 98 35 5026 1145 1575 811 590 492 605 690 440 115 6458 425 1057 717 504 560 691 1205 947 279 6377 434 1181 974 666 603 648 1006 632 239 6379 564 1464 1371 755 636 682 996 577 191 7239 355 1576 1839 1072 844 905 1054 572 175 8398 231 1735 2993 1712 1390 1448 1328 680 211 11719 101 1580 4391 2539 1928 1978 1612 582 210 14926 18 837 4460 2563 1893 1739 1169 327 117 13130 8 317 3927 3233 2424 1704 951 254 96 12927 8 97 2450 2722 2045 1408 611 132 54 9525 14 82 3487 6884 6151 3529 993 201 81 21423 12 37 947 2933 3688 2388 669 137 49 10859 797 1141 2503 1937 1635 1243 1008 695 405 11357 12366 15863 32435 28940 25003 19908 13946 6445 2316 157220 4687 2073 2516 1845 1266 1631 778 248 107 15154 3255 948 721 721 545 667 646 246 75 7821 1527 1795 1252 1020 823 1140 1335 742 203 9834 532 1307 1133 1007 852 1199 1907 1264 395 9601 403 1239 1347 1091 966 1259 2003 1420 691 10423 432 1417 2033 1491 1217 1399 1771 884 397 11042 270 1424 2215 1754 1432 1488 1488 678 276 11032 166 1495 2821 2112 1904 1729 1247 486 206 12170 77 1664 4147 2701 2397 2022 1091 370 178 14642 16 821 4846 2920 2483 1818 650 202 92 13855 9 231 3898 3219 2940 1845 512 146 61 12859 3 57 1981 2387 2001 1197 292 81 42 8039 8 40 2310 4396 4246 2127 432 99 52 13711 14 22 296 1178 1451 767 209 109 45 4093 745 1039 2172 1614 1435 1310 1142 947 1016 11424 12145 15565 33679 29460 25979 21597 15506 7916 3841 165694 10201 4322 3743 2472 1786 2322 1211 409 166 26630 5993 1877 1044 926 752 921 874 342 108 12845 2673 3369 2060 1606 1316 1739 2020 1183 321 16295 956 2362 1844 1513 1412 1890 3112 2211 675 15979 837 2424 2324 1755 1573 1905 3010 2051 931 16803 994 2877 3403 2251 1851 2080 2762 1465 595 18283 633 3000 4061 2826 2277 2392 2540 1250 449 19426 398 3226 5816 3821 3293 3177 2573 1166 416 23886 171 3244 8538 5239 4323 4003 2706 953 386 29572 36 1658 9308 5483 4371 3562 1815 533 214 26984 19 554 7831 6451 5366 3549 1467 400 151 25781 9 152 4427 5110 4048 2600 905 214 99 17564 23 120 5793 11276 10398 5655 1430 299 132 35131 21 58 1243 4107 5142 3155 874 244 98 14955 1545 2178 4675 3552 3070 2553 2141 1648 1419 22778 24507 31430 66119 58407 50989 41506 29453 14363 6158 322917
9 37 3 4 10 7 13 15 3 0 97 12 3 0 5 3 5 6 0 0 36 7 3 3 6 7 24 19 11 3 82 0 0 5 8 6 12 31 20 0 94 3 3 8 4 10 17 24 16 9 92 6 13 5 18 28 25 26 3 6 129 5 5 20 22 29 35 17 9 3 145 3 16 24 35 32 31 17 3 0 163 0 10 34 31 33 27 19 0 0 159 0 3 22 21 27 21 19 0 0 124 0 7 18 21 25 28 13 3 0 109 0 0 20 27 28 22 8 0 0 104 0 3 27 38 49 35 12 0 0 153 0 0 7 13 7 11 10 0 0 43 27 58 139 97 79 57 38 15 4 514 106 136 330 345 376 364 263 95 28 2037 36 6 25 26 22 45 14 3 0 182 21 3 7 10 12 13 8 0 0 73 11 3 16 12 20 29 24 11 0 126 3 0 16 15 24 26 45 24 3 160 10 7 19 16 25 25 29 21 10 154 0 4 31 35 28 36 19 9 3 158 0 10 28 31 38 25 10 7 0 154 0 12 14 35 28 19 14 0 0 119 0 11 18 28 27 23 7 0 0 113 0 0 8 18 14 18 3 3 0 65 0 3 11 16 17 14 3 0 0 62 0 0 5 17 9 12 3 0 0 48 0 3 12 9 18 13 0 0 0 48 0 0 3 3 4 0 0 0 0 15 26 10 35 27 45 47 33 13 4 243 110 72 244 296 334 350 214 91 33 1730 78 14 30 32 29 58 31 5 0 280 30 4 7 21 11 20 11 3 3 116 21 5 18 25 27 47 44 28 3 214 3 4 23 17 30 41 73 48 7 247 12 6 25 19 36 42 55 32 16 247 7 16 36 49 56 56 43 10 8 282 8 20 46 57 69 57 30 13 3 300 3 20 43 68 57 58 25 3 0 281 0 15 49 56 63 58 26 0 0 276 3 8 30 37 38 42 24 0 0 185 0 12 24 32 44 36 17 5 0 172 0 0 23 43 36 32 11 0 0 152 0 4 31 48 64 41 12 0 0 206 0 0 7 15 11 13 6 0 0 59 49 72 179 124 125 102 68 30 12 762 215 210 573 642 711 709 478 182 58 3773

What do you think the headers mean?

Based on Australian Bureau of Statistics 2016 Census Data

STE Code
code name abbrv
1 New South Wales NSW
2 Victoria Vic
3 Queensland Qld
4 South Australia SA
5 Western Australia WA
6 Tasmania Tas
7 Northern Territory NT
8 Australian Capital Territory ACT
9 Other Territories OT

Is this tidy data? Part 3

Total weekly personal income by age and sex
state sex income_min income_max age_min age_max count
New South Wales M -Inf 0 15 19 109473
New South Wales M -Inf 0 20 24 33797
New South Wales M -Inf 0 25 34 25398
New South Wales M -Inf 0 35 44 14472
New South Wales M -Inf 0 45 54 15235
New South Wales M -Inf 0 55 64 22736
New South Wales M -Inf 0 65 74 15465
New South Wales M -Inf 0 75 84 6576
New South Wales M -Inf 0 85 Inf 2320
New South Wales M 1 149 15 19 41865

Based on Australian Bureau of Statistics 2016 Census Data

Is this tidy data? Part 4

U.S. historical crop yields by state
year state crop yield
1900 Iowa barley 28.5
1900 Kansas barley 18.0
2000 Kansas barley 35.0
1900 Iowa wheat 14.4
1900 Kansas wheat 18.2
2000 Iowa wheat 47.0
2000 Kansas wheat 37.0
U.S. historical crop yields by state
year state barley_yield wheat_yield
1900 Iowa 28.5 14.4
1900 Kansas 18.0 18.2
2000 Kansas 35.0 37.0
2000 Iowa NA 47.0

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

Is this tidy data? Part 5

U.S. historical crop yields by state
year state crop metric value
1900 Iowa barley yield 28.5
1900 Iowa barley acres 620,000.0
1900 Kansas barley yield 18.0
1900 Kansas barley acres 127,000.0
2000 Kansas barley yield 35.0
2000 Kansas barley acres 7,000.0
1900 Iowa wheat yield 14.4
1900 Iowa wheat acres 1,450,000.0
1900 Kansas wheat yield 18.2
1900 Kansas wheat acres 4,290,000.0
2000 Iowa wheat yield 47.0
2000 Iowa wheat acres 18,000.0
2000 Kansas wheat yield 37.0
2000 Kansas wheat acres 9,400,000.0
U.S. historical crop yields by state
year state barley_yield wheat_yield barley_acres wheat_acres
1900 Iowa 28.5 14.4 620000 1450000
1900 Kansas 18.0 18.2 127000 4290000
2000 Kansas 35.0 37.0 7000 9400000
2000 Iowa NA 47.0 NA 18000

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

year state crop yield acres
1900 Iowa barley 28.5 620000
1900 Kansas barley 18.0 127000
2000 Kansas barley 35.0 7000
1900 Iowa wheat 14.4 1450000
1900 Kansas wheat 18.2 4290000
2000 Iowa wheat 47.0 18000
2000 Kansas wheat 37.0 9400000

Wide and long data formats

  • Transforming data from wide to long or vice versa is referred to as pivotting
  • Sometimes the long form is referred to as molten data
  • Some people may refer to transforming data from wide to long as melting the data and vice-versa as casting

Picture credit: Garrick Aden-Buie

Evolving language

  • Earlier efforts to transform data from wide to long was in the reshape, first released on CRAN in 2005-08-05
  • It was then superseded by reshape2 released on CRAN in 2010-09-10
  • Then finally tidyr released on CRAN in 2014-07-21 *v1.0.0 released 2019-09-12

Wide to long

  • reshape::melt
  • reshape2::melt
  • tidyr::gather
  • tidyr::pivot_longer*

Long to wide

  • reshape::cast
  • reshape2::dcast
  • tidyr::spread
  • tidyr::pivot_wider*

Hadley Wickham (2020). tidyr: Tidy Messy Data. R package version 1.1.2.

Hadley Wickham (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20

Lifecycle

Canvas 1 Layer 1 experimental stable superseded deprecated

lifecycle: archived lifecycle archived lifecycle: defunct lifecycle defunct lifecycle: deprecated lifecycle deprecated lifecycle: experimental lifecycle experimental lifecycle: maturing lifecycle maturing lifecycle: questioning lifecycle questioning lifecycle: retired lifecycle retired lifecycle: soft-deprecated lifecycle soft-deprecated lifecycle: stable lifecycle stable lifecycle: superseded lifecycle superseded

  • Functions (and sometimes arguments of functions) in tidyverse packages often are labelled with a badge like on the left
  • Find the definitions of badges here
  • Check out documentations below
help(mutate, package = "dplyr")
help(mutate_each, package = "dplyr")

Lionel Henry (2020). lifecycle: Manage the Life Cycle of your Package Functions. R package version 0.2.0.

Pivotting data with tidyr Part 1

Data
df_wide <- tibble::tibble(state = c("NSW", "VIC", "ACT"),
                          `2019` = c(8130159, 6655284, 427892),
                          `2018` = c(80366651, 6528601, 423169),
                          `2017` = c(7919815, 6387081, 415874))
df_long <- pivot_longer(df_wide, cols = `2019`:`2017`, 
                        names_to = "year",
                        values_to = "population")
df_wide
state 2019 2018 2017
NSW 8130159 80366651 7919815
VIC 6655284 6528601 6387081
ACT 427892 423169 415874
pivot_longer(df_wide, 
             cols = `2019`:`2017`, 
             names_to = "year",
             values_to = "population")
pivot_wider(df_long, 
            id_cols = state,
            names_from = year, 
            values_from = population)
df_long
state year population
NSW 2019 8130159
NSW 2018 80366651
NSW 2017 7919815
VIC 2019 6655284
VIC 2018 6528601
VIC 2017 6387081
ACT 2019 427892
ACT 2018 423169
ACT 2017 415874

Values adapted from Australian Bureau of Statistics. (2020). Table 04. Estimated Resident Population, States and Territories [Time series spreadsheet]. National, state and territory population, Australia Mar 2020. Retrieved Nov 24, 2020. https://www.abs.gov.au/statistics/people/population/national-state-and-territory-population/mar-2020/310104.xls

Pivotting data with tidyr Part 2

Data
yield_long <- data.frame(year = c(1900, 1900, 2000, 1900, 1900, 2000, 2000),
                         state = c("Iowa", "Kansas", "Kansas", "Iowa", "Kansas", "Iowa", "Kansas"),
                         crop = c("barley", "barley", "barley", "wheat", "wheat", "wheat", "wheat"),
                         yield = c(28.5, 18, 35, 14.4, 18.2, 47, 37))
yield_wide <- pivot_wider(yield_long, names_from = crop, values_from = yield, names_glue = "{crop}_yield")
yield_long
year state crop yield
1900 Iowa barley 28.5
1900 Kansas barley 18.0
2000 Kansas barley 35.0
1900 Iowa wheat 14.4
1900 Kansas wheat 18.2
2000 Iowa wheat 47.0
2000 Kansas wheat 37.0
yield_wide
year state barley_yield wheat_yield
1900 Iowa 28.5 14.4
1900 Kansas 18.0 18.2
2000 Kansas 35.0 37.0
2000 Iowa NA 47.0

yield_long yield_wide

pivot_wider(yield_long, 
            id_cols = c(year, state),
            names_from = crop, 
            values_from = yield, 
            names_glue = "{crop}_yield") #<<

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

Pivotting data with tidyr Part 3

Data
yield_long <- data.frame(year = c(1900, 1900, 2000, 1900, 1900, 2000, 2000),
                         state = c("Iowa", "Kansas", "Kansas", "Iowa", "Kansas", "Iowa", "Kansas"),
                         crop = c("barley", "barley", "barley", "wheat", "wheat", "wheat", "wheat"),
                         yield = c(28.5, 18, 35, 14.4, 18.2, 47, 37))
yield_wide <- pivot_wider(yield_long, names_from = crop, values_from = yield, names_glue = "{crop}_yield")
yield_long
year state crop yield
1900 Iowa barley 28.5
1900 Kansas barley 18.0
2000 Kansas barley 35.0
1900 Iowa wheat 14.4
1900 Kansas wheat 18.2
2000 Iowa wheat 47.0
2000 Kansas wheat 37.0
yield_wide
year state barley_yield wheat_yield
1900 Iowa 28.5 14.4
1900 Kansas 18.0 18.2
2000 Kansas 35.0 37.0
2000 Iowa NA 47.0

yield_wide yield_long

pivot_longer(yield_wide, 
        cols = contains("yield"), 
        names_to = "crop", 
        names_pattern = "(.+)_yield",  
        values_to = "yield", 
        values_drop_na = TRUE)

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

Pivotting data with tidyr Part 4

crop_long
year state crop metric value
1900 Iowa barley yield 28.5
1900 Iowa barley acres 620,000.0
1900 Kansas barley yield 18.0
1900 Kansas barley acres 127,000.0
2000 Kansas barley yield 35.0
2000 Kansas barley acres 7,000.0
1900 Iowa wheat yield 14.4
1900 Iowa wheat acres 1,450,000.0
1900 Kansas wheat yield 18.2
1900 Kansas wheat acres 4,290,000.0
2000 Iowa wheat yield 47.0
2000 Iowa wheat acres 18,000.0
2000 Kansas wheat yield 37.0
2000 Kansas wheat acres 9,400,000.0
crop_wide
year state barley_yield wheat_yield barley_acres wheat_acres
1900 Iowa 28.5 14.4 620000 1450000
1900 Kansas 18.0 18.2 127000 4290000
2000 Kansas 35.0 37.0 7000 9400000
2000 Iowa NA 47.0 NA 18000

crop_long crop_wide

pivot_wider(crop_long, 
    names_from = c(crop, metric), 
    values_from = value, 
    names_glue = "{crop}_{metric}") 

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

Pivotting data with tidyr Part 5

crop_long
year state crop metric value
1900 Iowa barley yield 28.5
1900 Iowa barley acres 620,000.0
1900 Kansas barley yield 18.0
1900 Kansas barley acres 127,000.0
2000 Kansas barley yield 35.0
2000 Kansas barley acres 7,000.0
1900 Iowa wheat yield 14.4
1900 Iowa wheat acres 1,450,000.0
1900 Kansas wheat yield 18.2
1900 Kansas wheat acres 4,290,000.0
2000 Iowa wheat yield 47.0
2000 Iowa wheat acres 18,000.0
2000 Kansas wheat yield 37.0
2000 Kansas wheat acres 9,400,000.0
crop_wide
year state barley_yield wheat_yield barley_acres wheat_acres
1900 Iowa 28.5 14.4 620000 1450000
1900 Kansas 18.0 18.2 127000 4290000
2000 Kansas 35.0 37.0 7000 9400000
2000 Iowa NA 47.0 NA 18000

crop_wide crop_long

pivot_longer(crop_wide, 
      cols = -c(year, state),
      names_to = c("crop", "metric"), 
      names_pattern = "(.+)_(.+)",
      values_to = "value")

United States Department of Agriculture, National Agricultural Statistics Service. http://quickstats.nass.usda.gov/

Kevin Wright (2020). agridat: Agricultural Datasets. R package version 1.17

Separate values into columns

Data
pkg_dat <- tribble(~package, ~maintainer,
        "dplyr", "Hadley Wickham",
        "magrittr", "Lionel Henry",
        "tidyr", "Hadley Wickham",
        "stringr", "Hadley Wickham",
        "rlang", "Lionel Henry",
        "tibble", "Kirill Müller",
        "tidyselect", "Lionel Henry") %>% 
  arrange(package)
pkg_dat
package maintainer
dplyr Hadley Wickham
magrittr Lionel Henry
rlang Lionel Henry
stringr Hadley Wickham
tibble Kirill Müller
tidyr Hadley Wickham
tidyselect Lionel Henry

🎯 separate maintainer name to columns, first name and last name

pkg_dat %>% 
  separate(maintainer, 
           into = c("first_name", "last_name"),
           sep = " ")
# A tibble: 7 × 3
  package    first_name last_name
  <chr>      <chr>      <chr>    
1 dplyr      Hadley     Wickham  
2 magrittr   Lionel     Henry    
3 rlang      Lionel     Henry    
4 stringr    Hadley     Wickham  
5 tibble     Kirill     Müller   
6 tidyr      Hadley     Wickham  
7 tidyselect Lionel     Henry    

Separate values into rows

Data
author_dat <- tribble(~package, ~author,
        "dplyr", "Hadley Wickham, Romain François, Lionel Henry, Kirill Müller",
        "magrittr", "Lionel Henry, Stefan Milton Bache, Hadley Wickham",
        "tidyr", "Hadley Wickham",
        "stringr", "Hadley Wickham",
        "rlang", "Lionel Henry, Hadley Wickham",
        "tibble", "Kirill Müller, Hadley Wickham",
        "tidyselect", "Lionel Henry, Hadley Wickham") %>% 
  arrange(package)
author_dat
package author
dplyr Hadley Wickham, Romain François, Lionel Henry, Kirill Müller
magrittr Lionel Henry, Stefan Milton Bache, Hadley Wickham
rlang Lionel Henry, Hadley Wickham
stringr Hadley Wickham
tibble Kirill Müller, Hadley Wickham
tidyr Hadley Wickham
tidyselect Lionel Henry, Hadley Wickham
author_dat %>% 
  separate_rows(author, sep = ", ")
# A tibble: 15 × 2
   package    author             
   <chr>      <chr>              
 1 dplyr      Hadley Wickham     
 2 dplyr      Romain François    
 3 dplyr      Lionel Henry       
 4 dplyr      Kirill Müller      
 5 magrittr   Lionel Henry       
 6 magrittr   Stefan Milton Bache
 7 magrittr   Hadley Wickham     
 8 rlang      Lionel Henry       
 9 rlang      Hadley Wickham     
10 stringr    Hadley Wickham     
11 tibble     Kirill Müller      
12 tibble     Hadley Wickham     
13 tidyr      Hadley Wickham     
14 tidyselect Lionel Henry       
15 tidyselect Hadley Wickham