Data is available only upon formal request and subject to approval.
Approved users receive a secure institute account and work with the data exclusively in our Trusted Research Environment (TRE) via remote desktop.
Request data (Email to us)Data integration description here.
patid# CSV to REDCap Data Integration Workflow
Automated R-based workflow for standardizing and importing CSV data into REDCap databases with validation and type conversion.
## Overview
This workflow handles the complete pipeline from raw CSV files to REDCap import, including:
- Auto-detection of CSV delimiters
- Column name standardization
- REDCap field alignment and type conversion
- Automated API import
## Prerequisites
### Required R Packages
```r
install.packages(c(
"dplyr",
"janitor",
"readr",
"stringr",
"lubridate",
"httr",
"jsonlite"
))
```
### Required Files
- **CSV data files**: Source data to be imported
- **REDCap data dictionary** (`Redcap_dictionary.csv`): Export from your REDCap project
- **REDCap API token**: Generated from your REDCap project settings
## Configuration
### 1. Set File Paths
```r
folder_path <- "PATH/TO/YOUR/CSV/FOLDER"
```
### 2. Configure API Credentials
**Security Best Practice**: Never hardcode API tokens in scripts. Use environment variables:
```r
# Set environment variable (run once in R console)
Sys.setenv(REDCAP_API_TOKEN = "your_token_here")
# Use in script
api_token <- Sys.getenv("REDCAP_API_TOKEN")
api_url <- "https://your-redcap-server/api/"
```
## Workflow Steps
### 1. Data Loading
The script automatically:
- Detects CSV delimiter (comma or semicolon)
- Handles UTF-8 encoding
- Loads all CSV files from the specified folder
### 2. Column Standardization
Automatically maps common column names to REDCap system fields:
| Source Column | REDCap Field |
|---------------|--------------|
| `subject` | `subjid_drv` |
| `repeat_number` | `redcap_repeat_instance` |
| `visitid` / `visit_id` | `redcap_event_name` |
| `site` | `redcap_data_access_group` |
**Note**: Column names are cleaned using `janitor::make_clean_names()` (lowercase, underscores, no special characters).
### 3. Data Type Alignment
The workflow reads your REDCap data dictionary and applies validation rules:
- **date_dmy**: Converts to date format (supports multiple input formats)
- **integer**: Converts to integer
- **number**: Converts to numeric
System fields are preserved and not subject to validation conversion.
### 4. REDCap Import
Uses the REDCap API to import data with:
- Flat JSON format
- Normal overwrite behavior (updates existing records)
- Count return for verification
## Usage Example
```r
# Source the script
source("data_integration_workflow.R")
# Load and standardize all CSV files
# (already done automatically when script runs)
# Apply REDCap dictionary alignment to a specific dataset
DM <- data_list[[1]] # First CSV file
DM_aligned <- align_by_text_validation(DM, redcap_dict)
# Import to REDCap
api_token <- Sys.getenv("REDCAP_API_TOKEN")
api_url <- "https://your-redcap-server/api/"
result <- import_to_redcap(DM_aligned, api_token, api_url)
```
## Data Preprocessing Features
### Special Value Handling
- Empty strings converted to `NA`
- "f.A." values converted to `NA` (common data export artifact)
### Date Format Support
The workflow supports multiple date input formats:
- `dmy`: 31/12/2024
- `dmY`: 31/12/24
- `Ymd`: 2024-12-31
- `Y-m-d`: 2024-12-31
- `d-m-Y`: 31-12-2024
## Error Handling
The import function provides:
- HTTP status code reporting
- Success/failure messages
- Return of API response for debugging
## Security Considerations
⚠️ **Important Security Practices**:
1. **Never commit API tokens to version control**
2. **Use environment variables for credentials**
3. **Restrict file permissions on scripts containing tokens**
4. **Use separate tokens for development/production**
5. **Regularly rotate API tokens**
## Troubleshooting
### Common Issues
**Import fails with validation errors**:
- Check REDCap data dictionary field names match CSV columns
- Verify date formats match expected validation
- Ensure required fields are present
**Column mapping doesn't work**:
- Column names are case-sensitive after cleaning
- Check for typos in source CSV headers
- Review `make_clean_names()` output
**Date conversion fails**:
- Verify date format in source data
- Add additional `orders` to `parse_date_time()` if needed
- Check for invalid dates (e.g., 31/02/2024)
## Workflow Customization
### Adding Custom Column Mappings
```r
# In standardize_columns() function, add:
if ("your_column" %in% names(df)) {
df <- df %>% rename(redcap_field = your_column)
}
```
### Adding Custom Validation Types
```r
# In align_by_text_validation() function, add:
else if (validation == "custom_type") {
df[[var]] <- your_conversion_function(df[[var]])
}
```
## Author
**Sowjanya Batchu**
## License
Include appropriate license information for your organization.
---
## Related Documentation
- [REDCap API Documentation](https://redcap.vanderbilt.edu/api/help/)
- [janitor Package](https://sfirke.github.io/janitor/)
- [httr Package](https://httr.r-lib.org/)
| Version | Language | Type | Relation | Author | Date |
|---|---|---|---|---|---|
| Global v1 (R v1) | R | Multi-file Archive | Initial Implementation | lakshmi.batchu | 2026-02-06 |
| Global v2 (R v2) selected | R | Multi-file Archive | Refinement/Bug Fix ← Global v1 | lakshmi.batchu | 2026-02-06 |