Standards in Different Regulatory Agencies
Motivation
The xportr package is designed to help clinical programmers create CDISC-compliant xpt files. It provides functionality to associate metadata with a local R data frame, perform dataset-level validation checks, and convert the result into a transport v5 file (xpt). However, the technical requirements for xpt files can differ across regulatory agencies. This vignette aims to provide a clear and concise summary of those differences. Further updates will come with later package releases.
The following sections summarize the technical specifications set out in the FDA, NMPA, and PMDA guidelines.
File name - character
XPT
The first character must be an English letter (A, B, C, . . ., Z) or underscore (_). Subsequent characters can be letters, numeric digits (0, 1, . . ., 9), or underscores. You can use uppercase or lowercase letters. Blanks cannot appear in SAS names. Special characters, except for the underscore, are not allowed.
FDA
The dataset in the transport file should have the same name as the transport file. Variable names, as well as variable and dataset labels, should include American Standard Code for Information Interchange (ASCII) text codes only. Dataset names should contain only lowercase letters and numbers, and must start with a letter.
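The two rule sets above can be read as simple pattern checks. The following is a minimal sketch in Python, not part of xportr: the function name `check_dataset_name` and the message strings are hypothetical, and the patterns are a literal reading of the rules quoted above rather than an official validator.

```python
import re
from pathlib import Path

# XPT (SAS) naming rule quoted above: a letter or underscore first,
# then letters, digits, or underscores.
XPT_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# FDA rule quoted above: lowercase letters and digits only,
# starting with a letter.
FDA_DATASET_NAME = re.compile(r"[a-z][a-z0-9]*")


def check_dataset_name(dataset_name: str, file_path: str) -> list[str]:
    """Return a list of problems; an empty list means no findings."""
    problems = []
    if not XPT_NAME.fullmatch(dataset_name):
        problems.append("not a valid XPT name")
    if not FDA_DATASET_NAME.fullmatch(dataset_name):
        problems.append("FDA: use lowercase letters and digits, starting with a letter")
    # FDA: the dataset should be named the same as the transport file.
    if Path(file_path).stem != dataset_name:
        problems.append("FDA: dataset name differs from the transport file name")
    return problems
```

For example, `check_dataset_name("adsl", "adsl.xpt")` returns an empty list, while an uppercase name such as `"ADSL"` paired with `adsl.xpt` is flagged both for case and for the file-name mismatch.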
Variable name
XPT
The name can contain letters of the Latin alphabet, numerals, or underscores. The name cannot contain blanks or special characters except for the underscore. The name must begin with a letter of the Latin alphabet (A–Z, a–z) or the underscore.
FDA
Variable names, as well as variable and dataset labels, should include American Standard Code for Information Interchange (ASCII) text codes only. Variable names should contain only uppercase letters and numbers, and must start with a letter.
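The difference between the two rule sets is easiest to see on concrete names: the XPT rule accepts lowercase letters and underscores, while the FDA rule does not. The sketch below is illustrative Python, not xportr functionality; `classify_variable_name` and its three return labels are hypothetical names chosen for this example.

```python
import re

# XPT rule quoted above: Latin letters, numerals, or underscores,
# beginning with a letter or an underscore.
XPT_VARIABLE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# FDA rule quoted above: uppercase letters and numbers only,
# starting with a letter.
FDA_VARIABLE = re.compile(r"[A-Z][A-Z0-9]*")


def classify_variable_name(name: str) -> str:
    """Return 'ok' (meets both rules), 'xpt-only', or 'invalid'."""
    if FDA_VARIABLE.fullmatch(name):
        return "ok"  # every FDA-valid name is also XPT-valid
    if XPT_VARIABLE.fullmatch(name):
        return "xpt-only"
    return "invalid"


for candidate in ["USUBJID", "usubjid", "_TEMP", "AGE GRP", "1AGE"]:
    print(candidate, "->", classify_variable_name(candidate))
```

Names in the `xpt-only` group can be written to a transport file but would not meet the FDA wording quoted above.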
Label character
FDA
Variable names, as well as variable and dataset labels, should include American Standard Code for Information Interchange (ASCII) text codes only. Do not submit study data with the following special characters in variable and dataset labels:
1. Unbalanced apostrophe, e.g., “Parkinson’s”
2. Unbalanced single and double quotation marks
3. Unbalanced parentheses, braces or brackets, e.g., (, { and [
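The label rules above can be approximated with a character scan. This is a rough sketch in Python under stated assumptions, not an xportr function: `label_problems` is a hypothetical helper, an odd count of a quote character is treated as "unbalanced", and brackets are checked with a simple stack. The guidance itself does not define how balance is to be tested.

```python
def label_problems(label: str) -> list[str]:
    """Return the FDA label rules (as quoted above) that a label appears to break."""
    problems = []

    # ASCII text codes only.
    if not label.isascii():
        problems.append("non-ASCII character")

    # Assumption: an odd number of quote characters means "unbalanced".
    if label.count("'") % 2 == 1:
        problems.append("unbalanced apostrophe or single quote")
    if label.count('"') % 2 == 1:
        problems.append("unbalanced double quote")

    # Stack-based check for parentheses, braces, and brackets.
    closers = {")": "(", "}": "{", "]": "["}
    stack = []
    balanced = True
    for ch in label:
        if ch in "({[":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                balanced = False
                break
    if stack or not balanced:
        problems.append("unbalanced parentheses, braces or brackets")

    return problems
```

A label such as `"Age (years)"` passes, whereas `"Age (years"` and `"Parkinson's Disease Flag"` are each reported with one finding.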
Values character
FDA
Variable values are the most broadly compatible with software and operating systems when they are restricted to ASCII text codes (printable values below 128). Use UTF-8 for extending character sets; however, the use of extended mappings is not recommended. Transcoding errors, variable length errors, and lack of software support for multi byte UTF-8 encodings can result in incorrect character display and variable value truncation.
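The truncation risk mentioned above comes from the gap between character count and byte count in UTF-8. The Python sketch below illustrates this; the helper names `is_printable_ascii` and `utf8_overflow` are hypothetical, and "printable values below 128" is interpreted here as code points 32 through 126.

```python
def is_printable_ascii(value: str) -> bool:
    """True if every character is a printable ASCII code (32 to 126)."""
    return all(32 <= ord(ch) < 127 for ch in value)


def utf8_overflow(value: str, width: int) -> int:
    """Number of bytes by which the UTF-8 encoding exceeds a fixed byte width."""
    return max(0, len(value.encode("utf-8")) - width)


term = "Sj\u00f6gren"  # 7 characters; the o-umlaut needs 2 bytes in UTF-8
print(len(term), len(term.encode("utf-8")))
print(is_printable_ascii(term), utf8_overflow(term, 7))
```

A field sized by character count (7 here) is one byte too short for the UTF-8 encoding (8 bytes), which is the kind of variable length error the guidance warns about.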
PMDA
If variables were collected in Japanese and there is a risk of losing information by translating them into English, descriptions in Japanese are necessary and appropriate, and data written in Japanese (hereinafter referred to as Japanese data) may be submitted. In the Japanese dataset, only the Japanese items should be in Japanese; the rest should be alphanumeric (ASCII) data, as in the alphanumeric dataset.