Data is available only upon formal request and subject to approval.
Approved users receive a secure institute account and work with the data exclusively in our Trusted Research Environment (TRE) via remote desktop.
Scientific aim
This application aims to reproduce and document an existing analytical workflow used to evaluate associations between early-life breastfeeding exposures and subsequent child health outcomes in the PEACHES cohort.
The primary exposure is breastfeeding, operationalized as (i) duration measures (any, full, and partial breastfeeding and related duration constructs) and (ii) breastfeeding status across the first months of life (month-specific indicators). The primary outcomes are growth-related measures, specifically age-standardized BMI (zBMI) at multiple follow-up ages and changes in zBMI between follow-up windows. Secondary outcomes include selected morbidity indicators collected in follow-up assessments (e.g., diarrheal episodes and asthma/wheeze-related variables).
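To make the exposure and outcome constructs concrete, the following is a minimal sketch of how a duration measure could be derived from month-specific indicators and how a change in zBMI between two follow-up windows could be computed. The follow-up ages, the example values, and the rule "count consecutive months from birth" are illustrative assumptions; the pipeline's actual derivations are defined in the attached scripts and may differ.

```python
from typing import Optional, Sequence


def zbmi_change(z_early: Optional[float], z_late: Optional[float]) -> Optional[float]:
    """Change in zBMI between two follow-up windows (later minus earlier)."""
    if z_early is None or z_late is None:
        return None  # the change is undefined if either measurement is missing
    return z_late - z_early


def months_breastfed(monthly_status: Sequence[bool]) -> int:
    """Count consecutive breastfed months starting from month 1.

    Assumption for illustration: duration ends at the first month
    in which the child is no longer breastfed.
    """
    months = 0
    for breastfed in monthly_status:
        if not breastfed:
            break
        months += 1
    return months


# Made-up record for one child; ages and values are placeholders.
zbmi_by_age = {"2y": 0.25, "3y": 0.75, "5y": None}
change_2y_3y = zbmi_change(zbmi_by_age["2y"], zbmi_by_age["3y"])
change_3y_5y = zbmi_change(zbmi_by_age["3y"], zbmi_by_age["5y"])
duration = months_breastfed([True, True, True, False, False, False])
```

Returning `None` for an undefined change keeps missing follow-ups visible instead of silently treating them as zero change.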
From an epidemiological perspective, the workflow represents a longitudinal observational analysis in which breastfeeding exposure is treated as an early-life determinant, and outcomes are assessed at multiple later time points. The analysis pipeline includes preprocessing steps and confounder handling intended to estimate adjusted associations between breastfeeding measures and outcomes while accounting for maternal, pregnancy, and birth characteristics.
Because the original analysis was performed on a REDCap export outside TRACE (prior to TRACE onboarding) and the original analyst is no longer available, this reproduction serves two purposes: (1) to preserve the analytical provenance (exact scripts, decisions, and outputs), and (2) to enable verification and future reuse of the workflow within a controlled and auditable research environment.
Purpose / Why do you request access?
This application is submitted to migrate an existing analysis workflow into TRACE and make it reproducible within the Trusted Research Environment (TRE). The analysis was originally performed on a REDCap export downloaded before TRACE onboarding. To support reproducible research, we replicate the process in TRACE, including (1) documenting the variable set, (2) recreating a formal TRACE access request for the relevant variables, and (3) executing the analysis pipeline inside TRACE with traceable outputs.
What data do you need?
We request access to the variables required by the breastfeeding analysis pipeline, covering:
maternal baseline characteristics and pregnancy factors,
birth characteristics,
breastfeeding exposure variables (status and duration),
child growth variables (zBMI at multiple ages, zBMI changes),
selected follow-up outcomes and time variables (e.g., diarrhea frequency, asthma/wheezing measures).
The exact variable list expected by the pipeline is documented in the attached analysis code archive.
Important note on variable naming / mapping
The legacy analysis dataset contains renamed and/or derived variable names compared to the official PEACHES REDCap data dictionary. This request therefore also serves a documentation purpose: the attached ZIP provides the scripts and README describing the expected variable names, and (where feasible) notes on how they correspond to original REDCap fields or derived constructs.
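One way to keep such a mapping auditable is to store it as data and check it against the variables the pipeline expects. The sketch below is only an illustration: every variable name is a placeholder, and the mapping format (a dictionary with `None` for undocumented origins) is an assumption, not the structure used in the attached README.

```python
from typing import Dict, Iterable, List, Optional

# Placeholder mapping from legacy analysis names to their origin.
# None marks a variable whose origin has not been documented yet.
LEGACY_TO_SOURCE: Dict[str, Optional[str]] = {
    "legacy_var_a": "redcap_field_a",  # hypothetical direct rename
    "legacy_var_b": "DERIVED",         # hypothetical derived construct
    "legacy_var_c": None,              # origin still unknown
}


def undocumented(expected: Iterable[str],
                 mapping: Dict[str, Optional[str]]) -> List[str]:
    """Expected variables that have no documented origin in the mapping."""
    return sorted(name for name in expected if mapping.get(name) is None)


def missing_from_dataset(expected: Iterable[str],
                         available: Iterable[str]) -> List[str]:
    """Expected variables that are absent from the delivered dataset."""
    available_set = set(available)
    return sorted(name for name in expected if name not in available_set)


# Placeholder inputs: what the pipeline expects vs. what was delivered.
expected_vars = ["legacy_var_a", "legacy_var_b", "legacy_var_c", "legacy_var_d"]
available_columns = ["legacy_var_a", "legacy_var_b", "legacy_var_c"]

unmapped = undocumented(expected_vars, LEGACY_TO_SOURCE)
absent = missing_from_dataset(expected_vars, available_columns)
```

Running both checks before the analysis starts separates two different problems: names that lack provenance documentation and names that the approved data extract does not contain.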
How will the data be used?
Data will be accessed and analyzed exclusively within the TRACE TRE. No participant-level raw data will be downloaded. Outputs consist of statistical summaries, model coefficients, and figures (e.g., forest plots) and will be handled according to the Data Use Agreement (DUA).
Analysis plan (high-level workflow)
Set up the TRE environment (R + packages).
Run preprocessing and confounder handling scripts.
Run outcome analyses (e.g., zBMI and immunological outcomes).
Aggregate analysis outputs using a helper script and generate final plots.
Compare the reproduced outputs to the previously reported results where applicable.
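The final comparison step could be sketched as a tolerance-based check of reproduced estimates against previously reported ones. The term names, the numbers, and the absolute tolerance below are made up for illustration; the appropriate tolerance depends on how the reported results were rounded.

```python
import math
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[str, Optional[float], Optional[float]]


def compare_estimates(reported: Dict[str, float],
                      reproduced: Dict[str, float],
                      abs_tol: float = 1e-3) -> List[Mismatch]:
    """Return (term, reported, reproduced) for every term that is missing
    on one side or differs by more than abs_tol."""
    mismatches: List[Mismatch] = []
    for term in sorted(set(reported) | set(reproduced)):
        old = reported.get(term)
        new = reproduced.get(term)
        if old is None or new is None:
            mismatches.append((term, old, new))
        elif not math.isclose(old, new, rel_tol=0.0, abs_tol=abs_tol):
            mismatches.append((term, old, new))
    return mismatches


# Made-up coefficients; these are not results from the PEACHES analysis.
reported = {"exposure_a": 0.120, "exposure_b": -0.045}
reproduced = {"exposure_a": 0.1204, "exposure_b": -0.030}

mismatches = compare_estimates(reported, reproduced)
```

An empty list means every term was reproduced within tolerance; any entry names the term and shows both values so the discrepancy can be documented.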
Attached materials / where to find details
The complete analysis workflow is provided as a ZIP archive on the dataset page under “Analysis Code”. It includes R scripts, a Python helper script, and a README describing execution order and expected inputs/outputs.
alt_mu_geburt_mppara_mpgeb_art_mpgew_vor_ss_mpgew_letztes_mpgeschl_screenu1_gew_screenbmi_mu_vor_ss_screengdm_gdm_diagnentsch_abgleich_rauchen_ss
Explanation of structure and files:
breastfeeding_final_data.Rdata - finished dataset; it is loaded by the next script.
1) Run initial_setup.R to install all required packages.
2) treatment_breastfeeding.R - first actual script to run. It loads the data, starts preprocessing, and drops rows with missing confounders. It runs myfunctions and preprocess_breastfeeding (.R), which are loaded in treatment_breastfeeding.R.
Analysis:
3) universal_analysis* - *zBMI runs the analysis for the zBMI outcome; *immuno runs the analysis for the immunological outcomes.
4) Gather all outputs from an outcome analysis in one folder and name the folder accordingly (e.g. ZBMI). Set the folder variable inside the Python script to that name (e.g. the outcome, such as ZBMI) and run the Python script "analysis.py". This returns a CSV file. Remove the first column (the index) from the CSV file and check that values are present. Move the CSV file into the main folder or adjust the path inside the R script accordingly.
5) forest_plot_total_new.R - uses the CSV produced by the Python script to create the forest plots. Run it with the specified filename and change the outcome name before running (as commented in the file).
Note: If you get a deprecation error, re-run each line individually.
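The manual clean-up in step 4 (removing the index column and checking that values are present) could be scripted. The following is a sketch under assumptions: the function name `strip_index_column`, the file names, and the column layout are hypothetical, and it is not part of the attached "analysis.py".

```python
import csv
import tempfile
from pathlib import Path


def strip_index_column(src: Path, dst: Path) -> int:
    """Copy src to dst without its first column; return the number of data rows.

    Raises ValueError if the file is empty, has no data rows,
    or contains an empty cell after the index column is removed.
    """
    data_rows = 0
    with src.open(newline="") as fin, dst.open("w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            raise ValueError(f"{src.name} is empty")
        writer.writerow(header[1:])
        for row_no, row in enumerate(reader, start=1):
            kept = row[1:]
            if not kept or any(cell.strip() == "" for cell in kept):
                raise ValueError(f"missing value in {src.name}, data row {row_no}")
            writer.writerow(kept)
            data_rows += 1
    if data_rows == 0:
        raise ValueError(f"{src.name} has a header but no data rows")
    return data_rows


# Demonstration on a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "ZBMI_raw.csv"
    clean = Path(tmp) / "ZBMI.csv"
    raw.write_text(",term,estimate\n0,exposure_a,0.1\n1,exposure_b,0.2\n")
    n_rows = strip_index_column(raw, clean)
    cleaned_lines = clean.read_text().splitlines()
```

Writing to a separate output file leaves the original aggregation result untouched, which keeps the step traceable inside the TRE.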
| Version | Language | Type | Relation | Author | Date |
|---|---|---|---|---|---|
| Global v1 (R v1) | R | Multi-file Archive | Initial Implementation | jonathan.christ | 2026-01-15 |