While eg_read() and eg_write() handle raw
file transfers, you probably spend most of your time working with data
files that you want to load directly into R as data frames. egnyte
provides a set of functions that handle the download-read or
write-upload workflow in a single step.
Prerequisites
Before you can read or write files, you need to authenticate. See
vignette("configuration") or
vignette("authorization") if you haven’t set that up
yet.
Reading Data Files
egnyte provides read functions for common data formats. Each function:
- Downloads the file from Egnyte to a temporary location
- Reads it using the appropriate R package
- Returns the data as a data frame (or tibble)
- Cleans up the temporary file
CSV Files
# Read a CSV file
dat <- eg_read_csv("/Shared/Data/analysis.csv")This uses readr::read_csv() under the hood. You can pass
any additional arguments that read_csv() accepts:
# Specify column types
dat <- eg_read_csv(
"/Shared/Data/analysis.csv",
col_types = cols(
id = col_integer(),
name = col_character(),
value = col_double()
)
)
# Skip rows, select columns, etc.
dat <- eg_read_csv(
"/Shared/Data/analysis.csv",
skip = 2,
col_select = c(id, name, value)
)Delimited Files
For files with delimiters other than commas:
# Tab-delimited file (default)
dat <- eg_read_delim("/Shared/Data/data.tsv")
# Pipe-delimited file
dat <- eg_read_delim("/Shared/Data/data.txt", delim = "|")
# Semicolon-delimited (common in European CSVs)
dat <- eg_read_delim("/Shared/Data/european.csv", delim = ";")Excel Files
# Read an Excel file (first sheet by default)
dat <- eg_read_excel("/Shared/Data/workbook.xlsx")
# Read a specific sheet by name
dat <- eg_read_excel("/Shared/Data/workbook.xlsx", sheet = "Summary")
# Read a specific sheet by position
dat <- eg_read_excel("/Shared/Data/workbook.xlsx", sheet = 3)Additional readxl::read_excel() arguments work here
too:
# Specify a range
dat <- eg_read_excel(
"/Shared/Data/workbook.xlsx",
range = "B2:F100"
)
# Skip rows
dat <- eg_read_excel(
"/Shared/Data/workbook.xlsx",
skip = 5
)SAS Files
# Read a SAS7BDAT file
dat <- eg_read_sas("/Shared/Data/dataset.sas7bdat")
# Read a SAS transport file (.xpt)
dat <- eg_read_xpt("/Shared/Data/dataset.xpt")These use haven::read_sas() and
haven::read_xpt(). Haven preserves SAS attributes like
variable labels and formats.
Stata Files
# Read a Stata file
dat <- eg_read_stata("/Shared/Data/dataset.dta")SPSS Files
# Read an SPSS file
dat <- eg_read_spss("/Shared/Data/dataset.sav")R Objects (RDS)
# Read an RDS file
obj <- eg_read_rds("/Shared/Data/model.rds")RDS files can contain any R object, not just data frames. This is useful for saving fitted models, lists, or other complex objects.
Writing Data Files
The write functions work in reverse - they take an R object, write it to a temporary file, and upload it to Egnyte.
CSV Files
# Write a data frame to CSV
eg_write_csv(dat, "/Shared/Data/results.csv")By default, this will fail if the file already exists:
# Overwrite an existing file
eg_write_csv(dat, "/Shared/Data/results.csv", overwrite = TRUE)You can pass additional arguments to
readr::write_csv():
# Don't include column names
eg_write_csv(dat, "/Shared/Data/results.csv", col_names = FALSE)
# Use a different NA representation
eg_write_csv(dat, "/Shared/Data/results.csv", na = ".")Delimited Files
# Write a tab-delimited file
eg_write_delim(dat, "/Shared/Data/results.tsv")
# Write with a different delimiter
eg_write_delim(dat, "/Shared/Data/results.txt", delim = "|")Excel Files
# Write a single sheet
eg_write_excel(dat, "/Shared/Data/results.xlsx")
# Write multiple sheets by passing a named list
eg_write_excel(
list(
"Summary" = summary_df,
"Details" = details_df,
"Raw" = raw_df
),
"/Shared/Data/workbook.xlsx"
)SAS Transport Files
# Write a SAS transport file
eg_write_xpt(dat, "/Shared/Data/results.xpt")Note: egnyte can write XPT files but not native SAS7BDAT files. If you need SAS7BDAT, you’ll need to use SAS itself or another tool.
Stata Files
# Write a Stata file
eg_write_stata(dat, "/Shared/Data/results.dta")SPSS Files
# Write an SPSS file
eg_write_spss(dat, "/Shared/Data/results.sav")R Objects (RDS)
# Save any R object
eg_write_rds(fitted_model, "/Shared/Data/model.rds")
# Control compression
eg_write_rds(large_data, "/Shared/Data/data.rds", compress = "xz")Optional Dependencies
The format-specific functions require additional packages that aren’t installed by default with egnyte:
| Function | Required Package |
|---|---|
eg_read_csv(), eg_write_csv()
|
readr |
eg_read_delim(), eg_write_delim()
|
readr |
eg_read_excel() |
readxl |
eg_write_excel() |
writexl |
eg_read_sas(), eg_read_xpt(),
eg_write_xpt()
|
haven |
eg_read_stata(), eg_write_stata()
|
haven |
eg_read_spss(), eg_write_spss()
|
haven |
eg_read_rds(), eg_write_rds()
|
(base R - no extra package) |
If you try to use a function without the required package installed, egnyte will prompt you to install it:
The readr package is required for this function.
Would you like to install it? (yes/no)
A Few Tips
Working with Large Files
The read functions download the entire file before reading it. For large files, this can take a while and use significant memory. A few suggestions:
- If you only need specific columns, use the
col_selectargument (for CSV/delim files) to avoid loading unnecessary data - Consider whether you really need the whole file, or if you can work with a sample
- For very large datasets, you might want to use
eg_read()to download the file once, then work with it locally
Passing Arguments Through
All the format-specific functions accept ... arguments
that get passed to the underlying read/write function. If you’re
familiar with readr::read_csv() or
haven::read_sas(), you can use all the same options:
# readr options
dat <- eg_read_csv(
"/Shared/Data/data.csv",
col_types = "ccdd",
locale = locale(decimal_mark = ","),
na = c("", "NA", "N/A", ".")
)
# haven options
dat <- eg_read_sas(
"/Shared/Data/data.sas7bdat",
encoding = "latin1"
)File Extensions Matter
egnyte uses the file extension to determine the format when downloading. Make sure your files have the correct extension:
-
.csvfor CSV files -
.xlsxor.xlsfor Excel files -
.sas7bdatfor SAS data files -
.xptfor SAS transport files -
.dtafor Stata files -
.savfor SPSS files -
.rdsfor R objects
Comparison with Base File Transfer
Here’s the difference between using the format-specific functions
vs. the base eg_read()/eg_write():
Using eg_read_csv() (recommended for data files):
# One step - download and read
dat <- eg_read_csv("/Shared/Data/data.csv")Using eg_read() (manual approach):
# Two steps - download, then read
temp_file <- eg_read("/Shared/Data/data.csv")
dat <- readr::read_csv(temp_file)
unlink(temp_file) # Clean upThe format-specific functions handle all this for you and clean up temporary files automatically.
Next Steps
- Learn about authentication options in
vignette("authorization") - See raw file transfer in
vignette("file-transfer")