Set the format strings and associated summaries to be performed in a layer
Source:R/set_format_strings.R
set_format_strings.Rd
'Tplyr' gives you extensive control over how strings are presented.
set_format_strings
allows you to apply these string formats to your
layer. This behaves slightly differently between layers.
Usage
set_format_strings(e, ...)
# S3 method for desc_layer
set_format_strings(e, ..., cap = getOption("tplyr.precision_cap"))
# S3 method for count_layer
set_format_strings(e, ...)
Arguments
- e
Layer on which to bind format strings
- ...
Named parameters containing calls to
f_str
to set the format strings- cap
A named character vector containing an 'int' element for the cap on integer precision, and a 'dec' element for the cap on decimal precision.
Value
The layer environment with the format string binding added
tplyr_layer object with formats attached
Returns the modified layer object.
Details
Format strings are one of the most powerful components of 'Tplyr'. Traditionally, converting numeric values into strings for presentation can consume a good deal of time. Values and decimals need to align between rows, rounding before trimming is sometimes forgotten - it can become a tedious mess that, in the grand scheme of things, is not an important part of the analysis being performed. 'Tplyr' makes this process as simple as we can, while still allowing flexibility to the user.
In a count layer, you can simply provide a single f_str
object to specify how you want your n's, percentages, and denominators formatted.
If you are additionally supplying a statistic, like risk difference using
add_risk_diff
, you specify the count formats using the name
'n_counts'. The risk difference formats would then be specified using the
name 'riskdiff'. In a descriptive statistic layer,
set_format_strings
allows you to do a couple more things:
By naming parameters with character strings, those character strings become a row label in the resulting data frame
The actual summaries that are performed come from the variable names used within the
f_str
callsUsing multiple summaries (declared by your
f_str
calls), multiple summary values can appear within the same line. For example, to present "Mean (SD)" like displays.Format strings in the desc layer also allow you to configure how empty values should be presented. In the
f_str
call, use theempty
parameter to specify how missing values should present. A single element character vector should be provided. If the vector is unnamed, that value will be used in the format string and fill the space similar to how the numbers will display. Meaning - if your empty string is 'NA' and your format string is 'xx (xxx)', the empty values will populate as 'NA ( NA)'. If you name the character vector in the 'empty' parameter '.overall', likeempty = c(.overall='')
, then that exact string will fill the value instead. For example, providing 'NA' will instead create the formatted string as 'NA' exactly.
See the f_str
documentation for more details about how this
implementation works.
Examples
# Load in pipe
library(magrittr)
# In a count layer
tplyr_table(mtcars, gear) %>%
add_layer(
group_count(cyl) %>%
set_format_strings(f_str('xx (xx%)', n, pct))
) %>%
build()
#> # A tibble: 3 × 6
#> row_label1 var1_3 var1_4 var1_5 ord_layer_index ord_layer_1
#> <chr> <chr> <chr> <chr> <int> <dbl>
#> 1 4 " 1 ( 7%)" " 8 (67%)" " 2 (40%)" 1 1
#> 2 6 " 2 (13%)" " 4 (33%)" " 1 (20%)" 1 2
#> 3 8 "12 (80%)" " 0 ( 0%)" " 2 (40%)" 1 3
# In a descriptive statistics layer
tplyr_table(mtcars, gear) %>%
add_layer(
group_desc(mpg) %>%
set_format_strings(
"n" = f_str("xx", n),
"Mean (SD)" = f_str("xx.x", mean, empty='NA'),
"SD" = f_str("xx.xx", sd),
"Median" = f_str("xx.x", median),
"Q1, Q3" = f_str("xx, xx", q1, q3, empty=c(.overall='NA')),
"Min, Max" = f_str("xx, xx", min, max),
"Missing" = f_str("xx", missing)
)
) %>%
build()
#> # A tibble: 7 × 6
#> row_label1 var1_3 var1_4 var1_5 ord_layer_index ord_layer_1
#> <chr> <chr> <chr> <chr> <int> <int>
#> 1 n "15" "12" " 5" 1 1
#> 2 Mean (SD) "16.1" "24.5" "21.4" 1 2
#> 3 SD " 3.37" " 5.28" " 6.66" 1 3
#> 4 Median "15.5" "22.8" "19.7" 1 4
#> 5 Q1, Q3 "14, 18" "21, 28" "16, 26" 1 5
#> 6 Min, Max "10, 22" "18, 34" "15, 30" 1 6
#> 7 Missing " 0" " 0" " 0" 1 7
# In a shift layer
tplyr_table(mtcars, am) %>%
add_layer(
group_shift(vars(row=gear, column=carb), by=cyl) %>%
set_format_strings(f_str("xxx (xx.xx%)", n, pct))
) %>%
build()
#> # A tibble: 9 × 17
#> row_label1 row_label2 var1_0_1 var1_0_2 var1_0_3 var1_0_4 var1_0_6 var1_0_8
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 4 3 " 1 (33.3… " 0 ( … " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> 2 4 4 " 0 ( 0.0… " 2 (6… " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> 3 4 5 " 0 ( 0.0… " 0 ( … " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> 4 6 3 " 2 (50.0… " 0 ( … " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> 5 6 4 " 0 ( 0.0… " 0 ( … " 0 ( … " 2 (5… " 0 ( … " 0 ( …
#> 6 6 5 " 0 ( 0.0… " 0 ( … " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> 7 8 3 " 0 ( 0.0… " 4 (3… " 3 (2… " 5 (4… " 0 ( … " 0 ( …
#> 8 8 4 " 0 ( 0.0… " 0 ( … " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> 9 8 5 " 0 ( 0.0… " 0 ( … " 0 ( … " 0 ( … " 0 ( … " 0 ( …
#> # ℹ 9 more variables: var1_1_1 <chr>, var1_1_2 <chr>, var1_1_3 <chr>,
#> # var1_1_4 <chr>, var1_1_6 <chr>, var1_1_8 <chr>, ord_layer_index <int>,
#> # ord_layer_1 <dbl>, ord_layer_2 <dbl>