A very common requirement for summary tables is to calculate the risk difference between treatment
groups. add_risk_diff allows you to do this. The underlying risk difference calculations
are performed using the base R function prop.test, so before using this function,
be sure to familiarize yourself with its functionality.
Usage
add_risk_diff(layer, ..., args = list(), distinct = TRUE)
Arguments
- layer
Layer upon which the risk difference will be attached
- ...
Comparison groups, provided as character vectors where the first group is the comparison, and the second is the reference
- args
Arguments passed directly into
prop.test
- distinct
Logical - Use distinct counts (if available).
Details
add_risk_diff can only be attached to a count layer, so the count layer must be constructed
first. add_risk_diff allows you to compare the difference between treatment groups, so all
comparisons should be based upon the values within the specified treat_var in your
tplyr_table object.
Comparisons are specified by providing two-element character vectors. You can provide as many of
these groups as you want. You can also use groups that have been constructed using
add_treat_grps or add_total_group. The first element provided will be considered
the 'comparison' group (i.e. the left side of the comparison), and the second element will be considered
the 'reference' group. So if you'd like to see the risk difference of 'T1 - Placebo', you would specify
this as c('T1', 'Placebo').
Tplyr forms your two-way table in the background, and then runs prop.test
appropriately.
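To make the background calculation concrete, here is a minimal base R sketch of the kind of two-way table and prop.test call involved. The counts come from the first row of the example output on this page (carb 1, gear 3 vs. gear 4). The way Tplyr builds the table internally is an assumption here, not a description of its source code, and the object name two_way is purely illustrative.

```r
# Counts for carb == 1 in mtcars: 3 of 15 cars with gear 3, 4 of 12 with gear 4
two_way <- matrix(
  c(3, 4,    # events in the comparison group, then the reference group
    12, 8),  # non-events in the comparison group, then the reference group
  ncol = 2,
  dimnames = list(group = c("3", "4"), outcome = c("event", "no_event"))
)

# prop.test accepts a two-column matrix of successes and failures.
# The chi-squared approximation warning is suppressed here for brevity.
res <- suppressWarnings(prop.test(two_way))

comp <- unname(res$estimate[1])  # proportion in the comparison group
ref  <- unname(res$estimate[2])  # proportion in the reference group
dif  <- comp - ref               # risk difference (comparison - reference)
low  <- res$conf.int[1]          # lower confidence limit for the difference
high <- res$conf.int[2]          # upper confidence limit for the difference
```

Any additional prop.test arguments, such as conf.level or correct, would change res in the same way they do when supplied through args.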
Similar to the way that the display of layers is specified, the values and format of the
risk difference display are set using set_format_strings. Risk difference formats are set within
set_format_strings by using the name 'riskdiff'.
You have 5 variables to choose from in your data presentation:
- comp
Probability of the left hand side group (i.e. comparison)
- ref
Probability of the right hand side group (i.e. reference)
- dif
Difference of comparison - reference
- low
Lower end of the confidence interval (default is 95%, override with the args parameter)
- high
Upper end of the confidence interval (default is 95%, override with the args parameter)
Use these variable names when forming your f_str
objects. The default presentation, if no
string format is specified, will be:
f_str('xx.xxx (xx.xxx, xx.xxx)', dif, low, high)
Note - within Tplyr, you can account for negatives by allowing an extra space within your integer side settings. This will help with your alignment.
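As a rough illustration of why the extra integer space matters, the sketch below emulates the default 'xx.xxx (xx.xxx, xx.xxx)' layout with base R's sprintf. This is an approximation of the fixed-width behaviour for illustration only, not Tplyr's own formatting code, and the name default_like is hypothetical.

```r
# Risk difference for carb == 1, gear 3 vs. gear 4 (3/15 vs. 4/12 in mtcars)
res <- suppressWarnings(prop.test(x = c(3, 4), n = c(15, 12)))
dif <- unname(res$estimate[1] - res$estimate[2])

# 'xx.xxx' reserves two integer characters, a decimal point and three decimals:
# a total width of six characters, emulated here with "%6.3f"
default_like <- sprintf(
  "%6.3f (%6.3f, %6.3f)",
  dif, res$conf.int[1], res$conf.int[2]
)

# A negative difference uses one of the integer positions for its sign
substr(default_like, 1, 6)
#> [1] "-0.133"

# A positive value of the same width is padded with a leading space instead
sprintf("%6.3f", 0.2)
#> [1] " 0.200"
```

Because both strings are six characters wide, positive and negative values line up in the same column.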
If columns are specified on a Tplyr table, risk difference comparisons still only take place between
groups within the treat_var variable, but they are instead calculated treating the cols
variables as by variables. Just like the Tplyr layers themselves, the risk difference will then be
transposed, and each risk difference is displayed as a separate variable for each of the cols
variables.
If distinct
is TRUE (the default), all calculations will take place on the distinct counts, if
they are available. Otherwise, non-distinct counts will be used.
One final note: prop.test may throw quite a few warnings. This is natural, because it
alerts you when there's not enough data for the approximations to be correct. This may be unnerving
coming from a SAS programming world, but this is R trying to alert you that there isn't
enough data for the results to be statistically reliable.
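If you would rather inspect these warnings than have them printed, one option is to collect them with base R's condition handling. The sketch below calls prop.test directly on small counts taken from the example data. Wrapping a Tplyr build() call in the same way is an assumption that has not been verified here, and the variable name messages is illustrative.

```r
# Small counts: 3 of 15 vs. 4 of 12 (carb == 1, gear 3 vs. gear 4 in mtcars)
messages <- character(0)

res <- withCallingHandlers(
  prop.test(x = c(3, 4), n = c(15, 12)),
  warning = function(w) {
    # Record the warning text, then continue without printing it
    messages <<- c(messages, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# The collected warning text can now be reviewed or logged
messages
```

The test result in res is unchanged by the handler, so the estimates and confidence limits are still available.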
Examples
library(Tplyr)
library(magrittr)
## Two group comparisons with default options applied
t <- tplyr_table(mtcars, gear)
# Basic risk diff for two groups, using defaults
l1 <- group_count(t, carb) %>%
  # Compare 3 vs. 4, 3 vs. 5
  add_risk_diff(
    c('3', '4'),
    c('3', '5')
  )
# Build and show output
add_layers(t, l1) %>% build()
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> # A tibble: 6 × 8
#> row_label1 var1_3 var1_4 var1_5 ord_layer_index rdiff_3_4 rdiff_3_5
#> <chr> <chr> <chr> <chr> <int> <chr> <chr>
#> 1 1 3 ( 20.0%) 4 ( 33.3%) 0 ( 0.0… 1 "-0.133 … " 0.200 …
#> 2 2 4 ( 26.7%) 4 ( 33.3%) 2 ( 40.0… 1 "-0.067 … "-0.133 …
#> 3 3 3 ( 20.0%) 0 ( 0.0%) 0 ( 0.0… 1 " 0.200 … " 0.200 …
#> 4 4 5 ( 33.3%) 4 ( 33.3%) 1 ( 20.0… 1 " 0.000 … " 0.133 …
#> 5 6 0 ( 0.0%) 0 ( 0.0%) 1 ( 20.0… 1 " 0.000 … "-0.200 …
#> 6 8 0 ( 0.0%) 0 ( 0.0%) 1 ( 20.0… 1 " 0.000 … "-0.200 …
#> # ℹ 1 more variable: ord_layer_1 <dbl>
## Specify custom formats and display variables
t <- tplyr_table(mtcars, gear)
# Create the layer with custom formatting
l2 <- group_count(t, carb) %>%
  # Compare 3 vs. 4, 3 vs. 5
  add_risk_diff(
    c('3', '4'),
    c('3', '5')
  ) %>%
  set_format_strings(
    'n_counts' = f_str('xx (xx.x)', n, pct),
    'riskdiff' = f_str('xx.xxx, xx.xxx, xx.xxx, xx.xxx, xx.xxx', comp, ref, dif, low, high)
  )
# Build and show output
add_layers(t, l2) %>% build()
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> # A tibble: 6 × 8
#> row_label1 var1_3 var1_4 var1_5 ord_layer_index rdiff_3_4 rdiff_3_5
#> <chr> <chr> <chr> <chr> <int> <chr> <chr>
#> 1 1 " 3 (20.0)" " 4 (33.3)" " 0 ( … 1 " 0.200,… " 0.200,…
#> 2 2 " 4 (26.7)" " 4 (33.3)" " 2 (4… 1 " 0.267,… " 0.267,…
#> 3 3 " 3 (20.0)" " 0 ( 0.0)" " 0 ( … 1 " 0.200,… " 0.200,…
#> 4 4 " 5 (33.3)" " 4 (33.3)" " 1 (2… 1 " 0.333,… " 0.333,…
#> 5 6 " 0 ( 0.0)" " 0 ( 0.0)" " 1 (2… 1 " 0.000,… " 0.000,…
#> 6 8 " 0 ( 0.0)" " 0 ( 0.0)" " 1 (2… 1 " 0.000,… " 0.000,…
#> # ℹ 1 more variable: ord_layer_1 <dbl>
## Passing arguments to prop.test
t <- tplyr_table(mtcars, gear)
# Create the layer with args option
l3 <- group_count(t, carb) %>%
  # Compare 3 vs. 4, 3 vs. 5
  add_risk_diff(
    c('3', '4'),
    c('3', '5'),
    # One-sided test at the 90% confidence level, without continuity correction
    args = list(conf.level = 0.9, correct = FALSE, alternative = 'less')
  )
# Build and show output
add_layers(t, l3) %>% build()
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> Warning: Chi-squared approximation may be incorrect
#> # A tibble: 6 × 8
#> row_label1 var1_3 var1_4 var1_5 ord_layer_index rdiff_3_4 rdiff_3_5
#> <chr> <chr> <chr> <chr> <int> <chr> <chr>
#> 1 1 3 ( 20.0%) 4 ( 33.3%) 0 ( 0.0… 1 "-0.133 … " 0.200 …
#> 2 2 4 ( 26.7%) 4 ( 33.3%) 2 ( 40.0… 1 "-0.067 … "-0.133 …
#> 3 3 3 ( 20.0%) 0 ( 0.0%) 0 ( 0.0… 1 " 0.200 … " 0.200 …
#> 4 4 5 ( 33.3%) 4 ( 33.3%) 1 ( 20.0… 1 " 0.000 … " 0.133 …
#> 5 6 0 ( 0.0%) 0 ( 0.0%) 1 ( 20.0… 1 " 0.000 … "-0.200 …
#> 6 8 0 ( 0.0%) 0 ( 0.0%) 1 ( 20.0… 1 " 0.000 … "-0.200 …
#> # ℹ 1 more variable: ord_layer_1 <dbl>