Tables

Lecture 22

Dr. Greg Chism

University of Arizona
INFO 526 - Fall 2024

Warm up

Announcements

  • Project 02 presentations are Friday, 1:00-3:00pm (scheduled final time)
  • HW 05 is due today, 5:00pm

Setup

# load packages
if(!require(pacman))
  install.packages("pacman")

pacman::p_load(countdown,
               tidyverse,
               gt,
               scales,
               colorspace,
               ggthemes)

# set theme for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 14))

# set width of code output
options(width = 65)

# set figure parameters for knitr
knitr::opts_chunk$set(
  fig.width = 7, # 7" width
  fig.asp = 0.618, # the golden ratio
  fig.retina = 3, # dpi multiplier for displaying HTML output on retina
  fig.align = "center", # center align figures
  dpi = 300 # higher dpi, sharper image
)

Data in tables

Tables vs. plots

Tables:

  • To look up or compare individual values
  • To display precise values
  • To include detail and summary values
  • To display quantitative values including more than one unit of measure

Plots:

  • To reveal relationships among whole sets of values
  • To display a message that is contained in the shape of the values (e.g., patterns, trends, exceptions)

Bachelor’s degrees

BA_degrees <- read_csv("data/BA_degrees.csv")
BA_degrees
# A tibble: 594 × 4
   field                                      year  count    perc
   <chr>                                     <dbl>  <dbl>   <dbl>
 1 Agriculture and natural resources          1971  12672 1.51e-2
 2 Architecture and related services          1971   5570 6.63e-3
 3 Area, ethnic, cultural, gender, and grou…  1971   2579 3.07e-3
 4 Biological and biomedical sciences         1971  35705 4.25e-2
 5 Business                                   1971 115396 1.37e-1
 6 Communication, journalism, and related p…  1971  10324 1.23e-2
 7 Communications technologies                1971    478 5.69e-4
 8 Computer and information sciences          1971   2388 2.84e-3
 9 Education                                  1971 176307 2.10e-1
10 Engineering                                1971  45034 5.36e-2
# ℹ 584 more rows

In the next few slides…



Degrees awarded in 2015



# A tibble: 33 × 2
   field                                                     perc
   <chr>                                                    <dbl>
 1 Agriculture and natural resources                      1.91e-2
 2 Architecture and related services                      4.80e-3
 3 Area, ethnic, cultural, gender, and group studies      4.11e-3
 4 Biological and biomedical sciences                     5.80e-2
 5 Business                                               1.92e-1
 6 Communication, journalism, and related programs        4.78e-2
 7 Communications technologies                            2.71e-3
 8 Computer and information sciences                      3.14e-2
 9 Education                                              4.84e-2
10 Engineering                                            5.16e-2
11 Engineering technologies                               9.10e-3
12 English language and literature/letters                2.42e-2
13 Family and consumer sciences/human sciences            1.30e-2
14 Foreign languages, literatures, and linguistics        1.03e-2
15 Health professions and related programs                1.14e-1
16 Homeland security, law enforcement, and firefighting   3.31e-2
17 Legal professions and studies                          2.33e-3
18 Liberal arts and sciences, general studies, and human… 2.30e-2
19 Library science                                        5.22e-5
20 Mathematics and statistics                             1.15e-2
21 Military technologies and applied sciences             1.46e-4
22 Multi/interdisciplinary studies                        2.51e-2
23 Parks, recreation, leisure, and fitness studies        2.59e-2
24 Philosophy and religious studies                       5.84e-3
25 Physical sciences and science technologies             1.59e-2
26 Precision production                                   2.53e-5
27 Psychology                                             6.20e-2
28 Public administration and social services              1.81e-2
29 Social sciences and history                            8.81e-2
30 Theology and religious vocations                       5.12e-3
31 Transportation and materials moving                    2.49e-3
32 Visual and performing arts                             5.06e-2
33 Not classified by field of study                       0      

# A tibble: 33 × 2
   field                                                     perc
   <chr>                                                    <dbl>
 1 Business                                               1.92e-1
 2 Health professions and related programs                1.14e-1
 3 Social sciences and history                            8.81e-2
 4 Psychology                                             6.20e-2
 5 Biological and biomedical sciences                     5.80e-2
 6 Engineering                                            5.16e-2
 7 Visual and performing arts                             5.06e-2
 8 Education                                              4.84e-2
 9 Communication, journalism, and related programs        4.78e-2
10 Homeland security, law enforcement, and firefighting   3.31e-2
11 Computer and information sciences                      3.14e-2
12 Parks, recreation, leisure, and fitness studies        2.59e-2
13 Multi/interdisciplinary studies                        2.51e-2
14 English language and literature/letters                2.42e-2
15 Liberal arts and sciences, general studies, and human… 2.30e-2
16 Agriculture and natural resources                      1.91e-2
17 Public administration and social services              1.81e-2
18 Physical sciences and science technologies             1.59e-2
19 Family and consumer sciences/human sciences            1.30e-2
20 Mathematics and statistics                             1.15e-2
21 Foreign languages, literatures, and linguistics        1.03e-2
22 Engineering technologies                               9.10e-3
23 Philosophy and religious studies                       5.84e-3
24 Theology and religious vocations                       5.12e-3
25 Architecture and related services                      4.80e-3
26 Area, ethnic, cultural, gender, and group studies      4.11e-3
27 Communications technologies                            2.71e-3
28 Transportation and materials moving                    2.49e-3
29 Legal professions and studies                          2.33e-3
30 Military technologies and applied sciences             1.46e-4
31 Library science                                        5.22e-5
32 Precision production                                   2.53e-5
33 Not classified by field of study                       0      

Field                                                       Percentage
Business                                                         19.2%
Health professions and related programs                          11.4%
Social sciences and history                                       8.8%
Psychology                                                        6.2%
Biological and biomedical sciences                                5.8%
Engineering                                                       5.2%
Visual and performing arts                                        5.1%
Education                                                         4.8%
Communication, journalism, and related programs                   4.8%
Homeland security, law enforcement, and firefighting              3.3%
Computer and information sciences                                 3.1%
Parks, recreation, leisure, and fitness studies                   2.6%
Multi/interdisciplinary studies                                   2.5%
English language and literature/letters                           2.4%
Liberal arts and sciences, general studies, and humanities        2.3%
Agriculture and natural resources                                 1.9%
Public administration and social services                         1.8%
Physical sciences and science technologies                        1.6%
Family and consumer sciences/human sciences                       1.3%
Mathematics and statistics                                        1.2%
Foreign languages, literatures, and linguistics                   1.0%
Engineering technologies                                          0.9%
Philosophy and religious studies                                  0.6%
Theology and religious vocations                                  0.5%
Architecture and related services                                 0.5%
Area, ethnic, cultural, gender, and group studies                 0.4%
Communications technologies                                       0.3%
Transportation and materials moving                               0.2%
Legal professions and studies                                     0.2%
Military technologies and applied sciences                        0.0%
Library science                                                   0.0%
Precision production                                              0.0%
Not classified by field of study                                  0.0%

In the next few slides…


Popular Bachelor’s degrees over the years


How should this information be displayed? And why?

In a table?

Popular Bachelor's degrees over the years
Year  Business  Health professions  Social sciences and history  Other
1971     13.7%                3.0%                        18.5%  64.8%
1976     15.5%                5.8%                        13.7%  65.1%
1981     21.4%                6.8%                        10.7%  61.0%
1986     24.0%                6.6%                         9.5%  59.9%
1991     22.8%                5.5%                        11.4%  60.3%
1996     19.5%                7.4%                        10.9%  62.3%
2001     21.2%                6.1%                        10.3%  62.4%
2005     21.6%                5.6%                        10.9%  61.8%
2006     21.4%                6.2%                        10.9%  61.5%
2007     21.5%                6.7%                        10.8%  61.1%
2008     21.4%                7.1%                        10.7%  60.7%
2009     21.7%                7.5%                        10.5%  60.2%
2010     21.7%                7.9%                        10.5%  60.0%
2011     21.3%                8.4%                        10.3%  60.0%
2012     20.5%                9.1%                        10.0%  60.4%
2013     19.6%                9.8%                         9.7%  60.9%
2014     19.1%               10.6%                         9.3%  61.0%
2015     19.2%               11.4%                         8.8%  60.6%

Or in a plot?
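
For comparison, a line plot of the same shares might be sketched as below. This is only an illustration: the values are copied from the table above for three of the years and two of the fields, and the object name degrees_long is hypothetical.

```r
library(ggplot2)

# A few values copied from the table above; degrees_long is a hypothetical name
degrees_long <- data.frame(
  year  = rep(c(1971, 1991, 2015), times = 2),
  field = rep(c("Business", "Health professions"), each = 3),
  perc  = c(0.137, 0.228, 0.192, 0.030, 0.055, 0.114)
)

# One line per field; the shape of each line carries the trend
p <- ggplot(degrees_long, aes(x = year, y = perc, color = field)) +
  geom_line(linewidth = 1) +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(x = "Year", y = "Share of degrees", color = "Field")

p
```

The plot makes the rise of Health professions easy to see, while the table is the better choice for looking up a specific year's value.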

Tables, the making of

Tables with gt

We will use the gt (Grammar of Tables) package to create tables in R.

The gt philosophy: we can construct a wide variety of useful tables with a cohesive set of table parts.

Source: gt.rstudio.com
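
To make the idea of table parts concrete, here is a minimal sketch that adds a header, column labels, and a source note to a small table. The three rows are copied from the 2015 table earlier in the lecture; the object names top_fields and tbl_parts, the title and subtitle wording, and the source note text are assumptions made for illustration.

```r
library(gt)

# Three rows copied from the 2015 table; top_fields is a hypothetical name
top_fields <- data.frame(
  field = c("Business", "Health professions and related programs",
            "Social sciences and history"),
  perc  = c(0.192, 0.114, 0.0881)
)

tbl_parts <- top_fields |>
  gt() |>                                                # table body
  tab_header(                                            # table header
    title = "Bachelor's degrees awarded in 2015",
    subtitle = "Three most popular fields"
  ) |>
  cols_label(field = "Field", perc = "Percentage") |>    # column labels
  fmt_percent(columns = perc, decimals = 1) |>           # cell formatting
  tab_source_note(source_note = "Data: BA_degrees.csv")  # table footer

tbl_parts
```

Each function call adds or modifies one part of the table, so parts can be layered on in any order after gt().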

Livecoding: Recreate this table of Bachelor’s degrees awarded in 2015.

  • Install the gt package: install.packages("gt")
BA_degrees |>
  filter(year == 2015) |>
  select(field, perc) |>
  arrange(desc(perc)) |>
  gt() |>
  tab_style(
    style = "padding-top:0px;padding-bottom:0px;",
    locations = cells_body(columns = everything())
  ) |>
  tab_style(
    style = cell_text(size = "small"),
    locations = cells_body(columns = everything())
  ) |>
  fmt_percent(
    columns = perc,
    decimals = 1
  ) |>
  cols_label(
    field = "Field",
    perc = "Percentage"
  )

Plots in tables

Should these data be displayed in a table or a plot?

Popular Bachelor's degrees over the years
Field                        1971  1976  1981  1986  1991  1996  2001  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015
Business                      14%   15%   21%   24%   23%   19%   21%   22%   21%   21%   21%   22%   22%   21%   20%   20%   19%   19%
Health professions             3%    6%    7%    7%    5%    7%    6%    6%    6%    7%    7%    8%    8%    8%    9%   10%   11%   11%
Social sciences and history   18%   14%   11%    9%   11%   11%   10%   11%   11%   11%   11%   11%   10%   10%   10%   10%    9%    9%
Other                         65%   65%   61%   60%   60%   62%   62%   62%   62%   61%   61%   60%   60%   60%   60%   61%   61%   61%

Add visualizations to your table

Example: Add sparklines to display trend alongside raw data


Popular Bachelor's degrees over the years
Field                        Trend        1971  1976  1981  1986  1991  1996  2001  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015
Business                     [sparkline]   14%   15%   21%   24%   23%   19%   21%   22%   21%   21%   21%   22%   22%   21%   20%   20%   19%   19%
Health professions           [sparkline]    3%    6%    7%    7%    5%    7%    6%    6%    6%    7%    7%    8%    8%    8%    9%   10%   11%   11%
Social sciences and history  [sparkline]   18%   14%   11%    9%   11%   11%   10%   11%   11%   11%   11%   11%   10%   10%   10%   10%    9%    9%
Other                        [sparkline]   65%   65%   61%   60%   60%   62%   62%   62%   62%   61%   61%   60%   60%   60%   60%   61%   61%   61%

Livecoding: Recreate this table of popular Bachelor’s degrees awarded over time.
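
The code that follows uses BA_degrees_other, which is not defined in these notes. One way it might be built is sketched below, assuming it holds one row per year and field, with the three highlighted fields kept and all remaining fields summed into "Other". The toy data frame BA_degrees_toy and its values are made up so the sketch runs on its own, and shortening the label to "Health professions" is inferred from the table above rather than taken from the original code.

```r
library(dplyr)
library(forcats)

# Toy stand-in for BA_degrees (hypothetical values, two years only)
BA_degrees_toy <- tibble(
  field = rep(c("Business", "Health professions and related programs",
                "Social sciences and history", "Education", "Engineering"), 2),
  year  = rep(c(1971, 2015), each = 5),
  perc  = c(0.14, 0.03, 0.18, 0.21, 0.05, 0.19, 0.11, 0.09, 0.05, 0.05)
)

BA_degrees_other_toy <- BA_degrees_toy |>
  mutate(
    # shorten the label to match the table
    field = if_else(field == "Health professions and related programs",
                    "Health professions", field),
    # keep three fields and lump everything else into "Other"
    field = fct_other(field, keep = c("Business", "Health professions",
                                      "Social sciences and history"))
  ) |>
  group_by(year, field) |>
  summarise(perc = sum(perc), .groups = "drop")

BA_degrees_other_toy
```

Applying the same steps to BA_degrees instead of the toy data should give a long data frame with columns year, field, and perc, which is the shape the sparkline code expects.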


Sparklines
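# draw one sparkline: a bare line with no axes, labels, or background
plot_spark <- function(df){
  ggplot(df, aes(x = year, y = perc)) +
    # thick line so it stays visible once the plot is shrunk to row height
    geom_line(linewidth = 20) +
    theme_void()
}

# one row per field, with that field's data and its plot stored in list-columns
BA_degrees_other_plots <- BA_degrees_other |>
  nest(field_df = c(year, perc)) |>
  mutate(plot = map(field_df, plot_spark))

BA_degrees_other |>
  # one column per year
  pivot_wider(names_from = year, values_from = perc) |>
  # empty placeholder column that will hold the sparklines
  mutate(ggplot = NA, .after = field) |>
  gt() |>
  text_transform(
    locations = cells_body(columns = ggplot),
    # replace each placeholder cell with the image of that row's plot
    fn = function(x){
      map(BA_degrees_other_plots$plot, ggplot_image, height = px(15), aspect_ratio = 4)
    }
  ) |>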
  cols_width(
    ggplot ~ px(1000)
    ) |> 
  cols_align(
    align = "left",
    columns = field
  ) |>
  fmt_percent(
    columns = where(is.numeric),
    decimals = 0
  ) |>
  cols_label(
    field  = "Field",
    ggplot = "Trend"
  ) |>
  tab_spanner(
    label = "Popular Bachelor's degrees over the years",
    columns = everything()
  ) |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_spanners()
  )

Your turn: Add color to the previous table.


Colored Sparklines
# same sparkline as before, with the line color taken from the data
plot_spark_color <- function(df){
  ggplot(df, aes(x = year, y = perc, color = line_color)) +
    geom_line(linewidth = 20) +
    theme_void() +
    # use the hex codes stored in line_color as the colors themselves
    scale_color_identity()
}

# assign one color per field, then nest and plot as before
BA_degrees_other_plots_color <- BA_degrees_other |>
  mutate(line_color = case_when(
    field == "Business" ~ "#9D6C06",
    field == "Health professions" ~ "#077DAA",
    field == "Social sciences and history" ~ "#026D4E",
    field == "Other" ~ "#A39A09"
  )) |>
  nest(field_df = c(year, perc, line_color)) |>
  mutate(plot = map(field_df, plot_spark_color))

BA_degrees_other |> 
  pivot_wider(names_from = year, values_from = perc) |>
  mutate(ggplot = NA, .after = field) |> 
  gt() |> 
  text_transform(
    locations = cells_body(columns = ggplot),
    fn = function(x){
      map(BA_degrees_other_plots_color$plot, ggplot_image, height = px(15), aspect_ratio = 4)
    }
  ) |> 
  cols_width(
    ggplot ~ px(1000)
    ) |> 
  cols_align(
    align = "left",
    columns = field
  ) |>
  fmt_percent(
    columns = where(is.numeric),
    decimals = 0
  ) |>
  tab_style(
    style = cell_text(color = "#9D6C06"),
    locations = cells_body(rows = 1, columns = field)
  ) |>
  tab_style(
    style = cell_text(color = "#077DAA"),
    locations = cells_body(rows = 2, columns = field)
  ) |>
  tab_style(
    style = cell_text(color = "#026D4E"),
    locations = cells_body(rows = 3, columns = field)
  ) |>
  tab_style(
    style = cell_text(color = "#A39A09"),
    locations = cells_body(rows = 4, columns = field)
  ) |> 
  cols_label(
    field  = "Field",
    ggplot = "Trend"
  ) |>
  tab_spanner(
    label = "Popular Bachelor's degrees over the years",
    columns = everything()
  ) |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_spanners()
  )

Making better tables

10 guidelines for better tables

  1. Offset the heads from the body
  2. Use subtle dividers rather than heavy gridlines
  3. Right-align numbers and heads
  4. Left-align text and heads
  5. Select the appropriate level of precision
  6. Guide your reader with space between rows and columns
  7. Remove unit repetition
  8. Highlight outliers
  9. Group similar data and increase white space
  10. Add visualizations when appropriate
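
Several of these guidelines map onto gt functions. The sketch below is one possible way to apply guidelines 1 through 6 and 8 to a few rows copied from the 2015 table. The object names, the 10% cutoff for what counts as a standout value, and the specific colors and padding are all assumptions chosen for illustration.

```r
library(gt)

# Values copied from the 2015 table; degrees_2015 is a hypothetical name
degrees_2015 <- data.frame(
  field = c("Business", "Psychology", "Engineering", "Library science"),
  perc  = c(0.192, 0.0620, 0.0516, 0.0000522)
)

tbl_guidelines <- degrees_2015 |>
  gt() |>
  cols_label(field = "Field", perc = "Percentage") |>
  cols_align(align = "left", columns = field) |>    # 4. left-align text
  cols_align(align = "right", columns = perc) |>    # 3. right-align numbers
  fmt_percent(columns = perc, decimals = 1) |>      # 5. appropriate precision
  tab_style(                                        # 1. offset heads from body
    style = cell_text(weight = "bold"),
    locations = cells_column_labels()
  ) |>
  tab_style(                                        # 8. highlight standout rows
    style = cell_fill(color = "lightyellow"),
    locations = cells_body(rows = perc > 0.1)       # arbitrary 10% cutoff
  ) |>
  tab_options(
    table_body.hlines.color = "gray90",             # 2. subtle dividers
    data_row.padding = px(6)                        # 6. space between rows
  )

tbl_guidelines
```

Guideline 7 (remove unit repetition) is not applied here. One option would be to move the percent sign into the column label and format the values as plain numbers.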

Table resources

Other packages

  • knitr::kable(): “Cheapest” pretty tables in R Markdown
  • Other (than HTML) outputs:
  • gtsummary: For summarizing statistical output with gt
  • Interactivity: We will work with these when we learn Shiny!
      • DT
      • reactable
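
As a point of comparison with gt, a quick table with knitr::kable() might look like the sketch below. The three rows are copied from the 2015 table, with the shares written as percentages; the object names and the column label "Percentage (%)" are assumptions made for illustration.

```r
library(knitr)

# Percentages copied from the 2015 table; degrees_2015 is a hypothetical name
degrees_2015 <- data.frame(
  field = c("Business", "Psychology", "Engineering"),
  perc  = c(19.2, 6.2, 5.2)
)

kable_out <- kable(
  degrees_2015,
  format = "pipe",                           # plain Markdown pipe table
  col.names = c("Field", "Percentage (%)"),  # unit stated once, in the header
  align = "lr",                              # left-align text, right-align numbers
  digits = 1
)

kable_out
```

kable() covers column names, alignment, and rounding in a single call, but it offers far less control over individual table parts than gt.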

Table inspiration