Reproduction of PEACHES breastfeeding analysis (legacy REDCap export → TRACE)

Dataset Info

Published on: 2026-01-15

Variables: 10

Data Access

Data is available only upon formal request and subject to approval.

Approved users receive a secure institute account and work with the data exclusively in our Trusted Research Environment (TRE) via remote desktop.

To request data, contact us by email.

Reuse & Usage Terms

Data is not downloadable (TRE access only).
Approved users receive a personal institute account.
Tools available: RStudio, Jupyter, Python, Stata, etc.
Data resides in your TRE home directory.
Re-use/publication per Data Use Agreement (DUA).
No redistribution of the data.

Description

Scientific aim

This application aims to reproduce and document an existing analytical workflow used to evaluate associations between early-life breastfeeding exposures and subsequent child health outcomes in the PEACHES cohort.

The primary exposure is breastfeeding, operationalized as (i) duration measures (any, full, and partial breastfeeding and related duration constructs) and (ii) breastfeeding status across the first months of life (month-specific indicators). The primary outcomes are growth-related measures, specifically age-standardized BMI (zBMI) at multiple follow-up ages and changes in zBMI between follow-up windows. Secondary outcomes include selected morbidity indicators collected in follow-up assessments (e.g., diarrheal episodes and asthma/wheeze-related variables).

From an epidemiological perspective, the workflow represents a longitudinal observational analysis in which breastfeeding exposure is treated as an early-life determinant, and outcomes are assessed at multiple later time points. The analysis pipeline includes preprocessing steps and confounder handling intended to estimate adjusted associations between breastfeeding measures and outcomes while accounting for maternal, pregnancy, and birth characteristics.

Because the original analysis was performed on a REDCap export outside TRACE (prior to TRACE onboarding) and the original analyst is no longer available, this reproduction serves two purposes: (1) to preserve the analytical provenance (exact scripts, decisions, and outputs), and (2) to enable verification and future reuse of the workflow within a controlled and auditable research environment.

Purpose / Why do you request access?
This application is submitted to migrate an existing analysis workflow into TRACE and make it reproducible within the Trusted Research Environment (TRE). The analysis was originally performed on a REDCap export downloaded before TRACE onboarding. To support reproducible research, we replicate the process in TRACE, including (1) documenting the variable set, (2) recreating a formal TRACE access request for the relevant variables, and (3) executing the analysis pipeline inside TRACE with traceable outputs.

What data do you need?
We request access to the variables required by the breastfeeding analysis pipeline, covering:

maternal baseline characteristics and pregnancy factors,
birth characteristics,
breastfeeding exposure variables (status and duration),
child growth variables (zBMI at multiple ages, zBMI changes),
selected follow-up outcomes and time variables (e.g., diarrhea frequency, asthma/wheezing measures).

The exact variable list expected by the pipeline is documented in the attached analysis code archive.

Important note on variable naming / mapping
The legacy analysis dataset contains renamed and/or derived variable names compared to the official PEACHES REDCap data dictionary. This request therefore also serves a documentation purpose: the attached ZIP provides the scripts and README describing the expected variable names, and (where feasible) notes on how they correspond to original REDCap fields or derived constructs.
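One way to keep such a mapping auditable is a small lookup table that is checked against the variables granted in TRACE. The sketch below is an illustration only, not part of the attached archive: the legacy names on the left are hypothetical placeholders (the names the pipeline actually expects are documented in the README and scripts), and their pairing with REDCap fields is a guess from the field names. The REDCap field names themselves are taken from the available variables listed on this page.

```python
# Hypothetical legacy-name -> REDCap-field mapping (illustration only).
# Left side: placeholder legacy names, NOT taken from the archive.
# Right side: REDCap field names listed on this page; None marks a
# derived construct that has no single REDCap source field.
LEGACY_TO_REDCAP = {
    "maternal_age": "alt_mu_geburt_mp",
    "parity": "para_mp",
    "delivery_mode": "geb_art_mp",
    "child_sex": "geschl_screen",
    "birth_weight": "u1_gew_screen",
    "gdm": "gdm_gdm_diagn",
    "derived_zbmi_change": None,
}

# The ten variables listed under "Available Variables" on this page.
GRANTED_FIELDS = [
    "alt_mu_geburt_mp", "para_mp", "geb_art_mp", "gew_vor_ss_mp",
    "gew_letztes_mp", "geschl_screen", "u1_gew_screen",
    "bmi_mu_vor_ss_screen", "gdm_gdm_diagn", "entsch_abgleich_rauchen_ss",
]


def check_mapping(mapping, granted_fields):
    """Sort legacy names into mapped, not-granted, and derived groups."""
    granted = set(granted_fields)
    report = {"mapped": [], "not_granted": [], "derived": []}
    for legacy_name, redcap_field in mapping.items():
        if redcap_field is None:
            report["derived"].append(legacy_name)
        elif redcap_field in granted:
            report["mapped"].append(legacy_name)
        else:
            report["not_granted"].append(legacy_name)
    return report


report = check_mapping(LEGACY_TO_REDCAP, GRANTED_FIELDS)
for group, names in report.items():
    print(group, names)
```

Any name that ends up under `not_granted` would point to a variable the pipeline expects but the access request does not yet cover.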

How will the data be used?
Data will be accessed and analyzed exclusively within the TRACE TRE. No participant-level raw data will be downloaded. Outputs consist of statistical summaries, model coefficients, and figures (e.g., forest plots) and will be handled according to the Data Use Agreement (DUA).

Analysis plan (high-level workflow)

Set up the TRE environment (R + packages).
Run preprocessing and confounder handling scripts.
Run outcome analyses (e.g., zBMI and immunological outcomes).
Aggregate analysis outputs using a helper script and generate final plots.
Compare the reproduced outputs to the previously reported results where applicable.
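The final comparison step could be done with a simple tolerance check per model term. The following is a minimal sketch under stated assumptions: the function name, the term labels, the tolerance, and all numbers are made-up placeholders for illustration and are not results from the PEACHES analysis.

```python
import math


def compare_estimates(reported, reproduced, abs_tol=1e-3):
    """Compare two {term: estimate} dicts and label each term.

    Returns (term, reported, reproduced, status) tuples, where status is
    "match", "differs", or "missing" (term absent from one side).
    """
    rows = []
    for term in sorted(set(reported) | set(reproduced)):
        old = reported.get(term)
        new = reproduced.get(term)
        if old is None or new is None:
            status = "missing"
        elif math.isclose(old, new, abs_tol=abs_tol):
            status = "match"
        else:
            status = "differs"
        rows.append((term, old, new, status))
    return rows


# Placeholder values only; these are not study results.
reported = {"term_a": 0.120, "term_b": -0.045, "term_c": 0.300}
reproduced = {"term_a": 0.1204, "term_b": -0.060, "term_d": 0.010}

for row in compare_estimates(reported, reproduced):
    print(row)
```

An absolute tolerance is used here because rounding in previously reported tables typically limits agreement to a fixed number of decimals; a suitable value depends on how the original results were reported.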

Attached materials / where to find details
The complete analysis workflow is provided as a ZIP archive on the dataset page under “Analysis Code”. It includes R scripts, a Python helper script, and a README describing execution order and expected inputs/outputs.

Available Variables (10)

Event: Geburt (birth)

Mutterpass (maternity record)

alt_mu_geburt_mp
para_mp
geb_art_mp
gew_vor_ss_mp
gew_letztes_mp

Screening

geschl_screen
u1_gew_screen
bmi_mu_vor_ss_screen

Event: Schwangerschaft (pregnancy)

Diagnosis Gdm

gdm_gdm_diagn

Finale Entscheidung Rauchen In Ss (final decision on smoking during pregnancy)

entsch_abgleich_rauchen_ss

Analysis Code

Version: v1 (R, multi-file archive)

Created by jonathan.christ · 2026-01-15 12:30


📦 Archive contents

File	Type	Size (bytes)
breastfeeding-analysis-main/analysis.py	other	2966
breastfeeding-analysis-main/breastfeeding_final_data.Rdata	other	468530
breastfeeding-analysis-main/forest_plot_total_new.R	script	6792
breastfeeding-analysis-main/initial_setup.R	script	634
breastfeeding-analysis-main/myfunctions.R	script	3323
breastfeeding-analysis-main/preprocess_breastfeeding.R	script	10571
breastfeeding-analysis-main/readme.txt	documentation	1307
breastfeeding-analysis-main/treatment_breastfeeding.R	script	1022
breastfeeding-analysis-main/universal_analysis_immuno_total.R	script	8107
breastfeeding-analysis-main/universal_analysis_zBMI_total.R	script	7406

Entry point: breastfeeding-analysis-main/universal_analysis_immuno_total.R
Uncompressed size: 510658 bytes
Files: 10

🧾 README

Explanation of structure and files:

breastfeeding_final_data.Rdata
  - finished dataset
  - loaded by treatment_breastfeeding.R (step 2)

1) Run initial_setup.R to install all required packages.

2) treatment_breastfeeding.R - first actual script to run
  - loads the data, starts preprocessing, and drops rows with missing confounders
  - runs myfunctions.R and preprocess_breastfeeding.R (both are loaded from within treatment_breastfeeding.R)

#Analysis
3) universal_analysis*
  - *zBMI: runs analysis for zBMI outcome
  - *immuno: runs analysis for immunological outcomes

4) Gather all outputs from one outcome analysis in a folder and name the folder after the outcome (e.g. ZBMI).

Set the folder variable inside the Python script analysis.py to that folder name (e.g. ZBMI) and run the script. It returns a CSV file. Remove the first column of the CSV file (the index column) and check that values are present. Then move the CSV file into the main folder, or adjust the path inside the R script accordingly.

5) forest_plot_total_new.R
  - uses the CSV file produced by the Python script to create the forest plots

Before running forest_plot_total_new.R, set the file name and the outcome name as described in the comments in the file.

Note: If you get a deprecation error, re-run each line individually.
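For orientation, the gathering in step 4 can be pictured roughly as follows. This is a hedged sketch and not the contents of analysis.py: it assumes that each per-model output in the outcome folder is a CSV file with a header row, and the function name, the `source` column, and the output file name are hypothetical. Because it writes with Python's standard `csv` module, no index column is produced in this variant.

```python
import csv
from pathlib import Path


def gather_outputs(folder, out_file):
    """Concatenate every *.csv in `folder` into `out_file`.

    A `source` column records which file each row came from.
    Returns the number of data rows written.
    """
    folder = Path(folder)
    rows = []
    fieldnames = ["source"]
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            # Collect the union of column names in first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                rows.append({"source": path.stem, **row})
    # Columns missing from a given input file are left empty (restval).
    with open(out_file, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Example call (hypothetical paths; keep the output outside the folder
# so that a re-run does not pick up its own result file):
# gather_outputs("ZBMI", "ZBMI_results.csv")
```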

Version History (detailed)

Version	Language	Type	Relation	Author	Date
Global v1 (R v1)	R	Multi-file Archive	Initial Implementation	jonathan.christ	2026-01-15

Contact

Marcel Müller (Publisher)

Project
PEACHES