Sourcing Titles and Footnotes From External Files

Working pharmaRTF Into Your Process

In this vignette, we will take you through some final features of pharmaRTF to help tie together how this package can centralize titles and footnotes within your organization, and show you the tools you need to embed this into your own internal process.

Being able to specify everything directly in a program is useful. For adhoc tables or to quickly put together an output, keeping everything central and in one spot is nice. But in a mature process, companies typically have a strategy to centralize the assignment of titles and footnotes. Keeping all of your titles and footnotes in one place has many advantages. You can global replace changes that impact numerous outputs. You can keep separate repetitive text instead of repeating it for each table. Updates take place in one file instead of having to open each individual impacted program. In short, centralizing this process is highly desirable within an organization.

pharmaRTF has the tools you need to make this process simple, and this comes from the function titles_and_footnotes_from_df. This function allows you to import your titles and footnotes from a data frame. That data frame can either come directly from your R session, or you can build a custom reader function to import the data from an external file. So let’s take a look.

A Small Note

Some of the syntax we use throughout this document uses the %>% pipe operator. This is to display support for a tidyverse style of coding. You can find the %>% operator in the magittr package.

How It Works

The idea behind titles_and_footnotes_from_df is simply that there is a set of expectations of what the data looks like that flows into the function. What that data looks like within an external file is irrelevant, as long as it can be transformed to meet the needs of titles_and_footnotes_from_df on the way in. This is possible because of the reader parameter. The reader parameter is just some function that returns a data frame, and that data frame must be compliant with the required variables and types expected by titles_and_footnotes_from_df. Those variables, types, and requirements are as follows.

Variable	Column.Types	Required
type	character	Yes
text1	character	Yes
text2	character	No
align	character	No
bold	logical	No
italic	logical	No
font	character	No
font_size	numeric	No
index	numeric	No

These variables and types should all look familiar because they’re exactly what’s required for the object hf_line. In fact, titles_and_footnotes_from_df simply takes this data and applies the add_titles and add_footnotes function to the rtf_doc object to attach titles and footnotes.

type and text1 are the only required values and this is because titles_and_foonotes_from_df needs to know where to attach the text, and needs to have text to attach. For variables that aren’t required, they may be omitted from the data returned by the reader function and the default values from the hf_line object will be used.

Custom Reader Functions

The entire concept behind titles_and_footnotes_from_df is allowing the freedom of creating your own process. By allowing the user to pass in a custom reader function to the reader parameter, the source of given titles and footnotes can be anything the user wants. You could read from flat text, CSV, Excel, Access, any remote file that R can process and return a data frame meeting the above requirements is on the table.

With this in mind, let’s consider a simple example. Assume that I have the excel file titles.xlsx. This file looks as follows:

table_number	index	type	text1	text2	align	bold	italic
14-2.01	1	title	Protocol: CDISCPILOT01	PAGE_FORMAT: Page %s of %s	split	TRUE	TRUE
14-2.01	2	title	Population: Intent-to-Treat	NA	left	TRUE	TRUE
14-2.01	3	title	Table 14-2.01	NA	center	TRUE	TRUE
14-2.01	4	title	Summary of Demographic and Baseline Characteristics	NA	center	TRUE	TRUE
14-2.01	1	footnote	[1] P-values are results of ANOVA treatment group comparison for continuous variable and Pearson’s chisquare test for categorical variables.	NA	left	FALSE	TRUE
14-2.01	2	footnote	NOTE: Duration of disease is computed as months between date of enrollment and date of onset of the first definite symptoms of Alzheimer’s disease.	NA	left	FALSE	TRUE
14-2.01	3	footnote	FILE_PATH: Source: %s	DATE_FORMAT: %H:%M %A, %B %d, %Y	split	FALSE	TRUE
14-4.01	1	title	Protocol: CDISCPILOT01	PAGE_FORMAT: Page %s of %s	split	TRUE	TRUE
14-4.01	2	title	Population: Safety	NA	left	TRUE	TRUE
14-4.01	3	title	Table 14-4.01	NA	center	TRUE	TRUE
14-4.01	4	title	Summary of Planned Exposure to Study Drug, as of End of Study	NA	center	TRUE	TRUE
14-4.01	1	footnote	[1] Includes completers and early terminators.	NA	left	FALSE	TRUE
14-4.01	2	footnote	[2] End of Study refers to week 26/Early Termination.	NA	left	FALSE	TRUE
14-4.01	3	footnote	FILE_PATH: Source: %s	DATE_FORMAT: %H:%M %A, %B %d, %Y	split	FALSE	TRUE

(For more information about the special text fields you see here, see our advanced usage vignette)

Here I have everything I need to generate my titles. I just need to make sure of two things:

My column types read in appropriately
When I read the data, I filter to the appropriate table

So overall this will be a pretty simple function. Let’s create it.

example_custom_reader <- function(..., table_number=NULL) {

  # If a column isn't populated then the type may be guessed wrong so force it
  col_types <- c('text', 'numeric', 'text', 'text', 'text', 'text', 'logical', 'logical')
  # pass through arguments from ...
  df <- readxl::read_excel(..., col_types=col_types)

  # Subset and return that dataframe
  df[df$table_number==table_number, !names(df) == 'table_number']
}

Within a few steps I have everything I need:

I use readxl::read_excel to import the data from the excel file
I filter the data to just contain rows associated with whatever table I specify
I remove the table_number column and just keep the variables that need to be sent to titles_and_footnotes_from_df

Note that here I use the ... parameter. This is because

the from.file parameter is passed forward into the reader function
the ... option on titles_and_foonotes_from_df is also passed forward, allowing you to pass whatever arguments necessary from the titles_and_footnotes_from_df call into your custom reader.

With this function now available, let’s see what the function returns.

example_custom_reader('titles.xlsx', table_number = '14-2.01')
#> # A tibble: 7 x 7
#>   index type    text1                          text2          align bold  italic
#>   <dbl> <chr>   <chr>                          <chr>          <chr> <lgl> <lgl> 
#> 1     1 title   Protocol: CDISCPILOT01         PAGE_FORMAT: … split TRUE  TRUE  
#> 2     2 title   Population: Intent-to-Treat    <NA>           left  TRUE  TRUE  
#> 3     3 title   Table 14-2.01                  <NA>           cent… TRUE  TRUE  
#> 4     4 title   Summary of Demographic and Ba… <NA>           cent… TRUE  TRUE  
#> 5     1 footno… [1] P-values are results of A… <NA>           left  FALSE TRUE  
#> 6     2 footno… NOTE: Duration of disease is … <NA>           left  FALSE TRUE  
#> 7     3 footno… FILE_PATH: Source: %s          DATE_FORMAT: … split FALSE TRUE

Here can see the data frame that the function processes and returns. In this example, the data processing was very simple - but in a more advanced case this function could read from multiple sheets, merge in default titles and footnotes, read from different data sources - whatever suits your process.

Putting It Together

Now with an understanding of how a custom reader function works, let’s tie it all together. Let’s assume that our table has been programmed, and that the huxtable table is ready in our session (see the huxtable tips vignette to for tips on how to prepare your table). The huxtable table is our session looks like this:

rowlbl1	rowlbl2	0_C			0_S
		Completers at Week 24			Safety Population [1]
		Placebo\line (N=60)	Xanomeline\line Low Dose\line (N=28)	Xanomeline\line High Dose\line (N=30)	Placebo\line (N=86)	Xanomeline\line Low Dose\line (N=84)	Xanomeline\line High Dose\line (N=84)
Average daily dose (mg)	n	60	28	30	86	84	84
	Mean	0.0	54.0	77.0	0.0	54.0	71.6
	SD	0.00	0.00	0.58	0.00	0.00	8.11
	Median	0.0	54.0	76.9	0.0	54.0	75.1
	Min	0.0	54.0	76.1	0.0	54.0	54.0
	Max	0.0	54.0	78.6	0.0	54.0	78.6

Cumulative dose at end of study [2]	n	60	28	30	86	84	84
	Mean	0.0	9918.6	14089.5	0.0	5347.3	7551.0
	SD	0.00	603.84	481.01	0.00	3680.35	5531.04
	Median	0.0	9936.0	14080.5	0.0	4455.0	5778.0
	Min	0.0	7884.0	12960.0	0.0	108.0	54.0
	Max	0.0	11448.0	15417.0	0.0	11448.0	15417.0

Note: The data here comes from the CDISC Pilot, which you can read about here. Also, see our replication of the tables from this CDISC pilot in our Github repository.

Here’s how you would go about creating your rtf_doc object. (For more information about the functions being used here, see our advanced usage vignette)

doc <- rtf_doc(ht, header_rows = 2) %>% titles_and_footnotes_from_df(
  from.file='../data/titles.xlsx',
  reader=example_custom_reader,
  table_number='14-4.01') %>%
  set_column_header_buffer(top=1) %>%
  set_font_size(10)

write_rtf(doc, file="table16.rtf")

Within the titles_and_footnotes_from_df function you can see the parameters used:

from.file is the location of our titles/footnotes
reader is the function we created within this vignette
table_number is the number of the table we are creating

In the function call, from.file and table_number are passed forward into the custom reader. From there, validation of the incoming data frame happens, any missing variables have their defaults applied, and then add_titles and add_footnotes are called on the titles and footnotes within the data respectively. You can see in the resulting table that the titles and footnotes have been applied

And there you have it. By using this process, the addition of titles and footnotes to your rtf_doc greatly simplifies, and this code becomes very reusable from program to program, allowing you to focus on the nuance of your table creation rather than the redundant application of creating titles and footnotes. As mentioned, it’s necessary to be able to manually specify your titles and footnotes as needed - but automating the process of applying them is much more desirable when your studies reach a certain scale.

Sourcing Titles and Footnotes From External Files

Mike Stackhouse

2020-06-17

Working pharmaRTF Into Your Process

A Small Note

How It Works

Custom Reader Functions

Putting It Together

Further Reading