Adoption and Future
datasetjson - Read and write CDISC Dataset JSON formatted datasets in R and Python
2025-11-07
Submission Pilot - RConsortium
Pilot5 - DatasetJSON
Pilot5DatasetJson
Goals
- Prove that Dataset-JSON can be accepted by the FDA
- Deliver a publicly accessible submission
- Expand on the work of Pilot 1 and 3
Dataset-JSON API and Compressed Dataset-JSON
Dataset-JSON API Standard (Part 1 of 2)
- A REST-based standard API specification (OAS 3.1) for the exchange of Dataset-JSON datasets
- The API supports full CRUD operations, but a read-only implementation is valid
- Primarily implements JSON, but also supports streaming NDJSON
- Many API clients and servers may never work with Dataset-JSON as a file format
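The streaming NDJSON point can be illustrated with a minimal Python sketch. It assumes the NDJSON layout in which the first line holds the dataset metadata and each later line holds one row as a JSON array. The `read_ndjson` helper, the simplified metadata object, and the sample values are hypothetical and exist only for illustration; they are not part of the API standard or of the datasetjson package.

```python
import io
import json
from typing import Any, Dict, Iterator, List, TextIO, Tuple


def read_ndjson(stream: TextIO) -> Tuple[Dict[str, Any], Iterator[List[Any]]]:
    """Return (metadata, row iterator) for an NDJSON text stream.

    Assumption: line 1 is the metadata object, every later line is one row.
    Rows are parsed lazily, so the full dataset never has to sit in memory.
    """
    first = stream.readline()
    if not first.strip():
        raise ValueError("empty NDJSON stream")
    metadata = json.loads(first)

    def rows() -> Iterator[List[Any]]:
        for line in stream:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)

    return metadata, rows()


# Hypothetical, heavily simplified sample; real metadata carries more attributes.
SAMPLE = "\n".join([
    json.dumps({"name": "DM", "columns": [{"name": "USUBJID"}, {"name": "AGE"}]}),
    json.dumps(["01-001", 34]),
    json.dumps(["01-002", 51]),
]) + "\n"

metadata, rows = read_ndjson(io.StringIO(SAMPLE))
column_names = [column["name"] for column in metadata["columns"]]
records = [dict(zip(column_names, row)) for row in rows]
print(records)
```

Because each row is parsed as it arrives, the same loop could consume an HTTP response body line by line instead of an in-memory string.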
Dataset-JSON API Standard (Part 2 of 2)
- A User Guide is available to support implementing the API
- The OAS 3.1 formatted specification can be used to generate code for clients and servers
- Successfully completed Public Review; the final standard is expected to be published in December 2025
- A POC API implementation will be available in December 2025
Compressed Dataset-JSON (Part 1 of 2)
- Based on the NDJSON format, which makes it easy to process large datasets
- Content is compressed using the standard DEFLATE algorithm, which is widely supported by programming languages, including SAS, R, and Python
- DEFLATE is already widely used and supported: PNG, DOCX, web traffic, etc.
- Compression is also supported by the API
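The combination of NDJSON and DEFLATE described above can be sketched with Python's standard `zlib` module. This is only an illustration of the idea: it assumes a plain zlib-wrapped DEFLATE stream, while the exact byte layout of a `.dsjc` file is defined by the standard itself. The helper names and the sample rows are hypothetical.

```python
import json
import zlib
from typing import Iterable, Iterator, List


def compress_lines(lines: Iterable[str]) -> bytes:
    """DEFLATE-compress NDJSON lines one at a time (streaming write)."""
    compressor = zlib.compressobj(level=9)
    out = bytearray()
    for line in lines:
        out += compressor.compress((line + "\n").encode("utf-8"))
    out += compressor.flush()
    return bytes(out)


def iter_decompressed_lines(blob: bytes, chunk_size: int = 64) -> Iterator[str]:
    """Decompress in small chunks and yield complete lines as they appear."""
    decompressor = zlib.decompressobj()
    pending = b""
    for start in range(0, len(blob), chunk_size):
        pending += decompressor.decompress(blob[start:start + chunk_size])
        # Everything before the last newline is complete; keep the remainder.
        *complete, pending = pending.split(b"\n")
        for raw in complete:
            yield raw.decode("utf-8")
    pending += decompressor.flush()
    if pending:
        yield pending.decode("utf-8")


# Hypothetical sample: one simplified metadata line followed by 200 rows.
lines: List[str] = [json.dumps({"name": "VS", "columns": [{"name": "USUBJID"}, {"name": "VSTESTCD"}]})]
lines += [json.dumps([f"01-{i:03d}", "SYSBP"]) for i in range(200)]

raw = ("\n".join(lines) + "\n").encode("utf-8")
blob = compress_lines(lines)
restored = list(iter_decompressed_lines(blob))

print(f"{len(raw)} bytes raw, {len(blob)} bytes compressed")
```

Reading in chunks keeps memory use bounded regardless of dataset size, which is the practical benefit of pairing a line-oriented format with a streaming compressor.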
Compressed Dataset-JSON (Part 2 of 2)
- SDTM package is 15 times smaller and ADaM package is 18 times smaller (larger datasets see greater compression)
- All formats (JSON, NDJSON, and Compressed Dataset-JSON) represent the same underlying information
- Uses the .dsjc file extension
- Successfully completed Public Review; the final standard is expected to be published in December 2025