“Puberty age gap”: new method of assessing pubertal timing and its association with mental health problems
To promote reproducible open research practices, all analyses conducted as part of this article are made publicly available in a Git repository (“https://github.com/Niousha-Dehestani/Puberty-age”). In addition, more detail about participants, measurements, and analysis is provided in the supplementary materials.
Participants
Participants were drawn from the ongoing, longitudinal Adolescent Brain Cognitive Development (ABCD) Study (https://abcdstudy.org/). Data were collected from ~11,500 children at baseline (47% females, aged 9–10 years) from 21 sites across the United States, with annual data collection thereafter (see Supplementary Table 1 for demographic information). Data from the baseline to the 3rd annual follow-up waves (Release 4.0) were used in the current analyses. We excluded participants who had a mismatch between their biological sex (collected per visit with the salivary sample) and their self-reported gender, as well as those with missing data for biological sex (see Supplementary Information (SI) Appendix S1 for details of data cleaning procedures). The exact sample sizes used in each analysis are reported in detail below.
Measures
Pubertal Development Scale
The Pubertal Development Scale (PDS) measures observable physical signs of puberty. It includes items on height, body hair, and skin change in both sexes, together with sex-specific items such as onset of menarche in females. Items are rated on a four-point scale whose highest rating (4) is “already complete”. Onset of menarche was a binary variable (yes/no response) that was converted to one for “no” and four for “yes”. The PDS can be collected via either self- or parent-report, and each of these measures has its own limitations. Parent report has good correspondence to clinician ratings, though this correspondence is lower in males [32]. Conversely, self-report is less accurate for individuals who are in the lower or upper pubertal stages, as they tend to report toward the mid stages. Therefore, some studies have recommended using the parent-report PDS, especially in late childhood or early adolescence [33], and accordingly, the current study used this version. In this study, we used individual PDS items rather than the average PDS score typically used in prior literature [34]. This allowed us to flexibly model the contribution of each item in normative models, consistent with previous literature that has found that separate PDS items exhibit differential relationships with, e.g., brain structure [34].
Hormones
DHEA and testosterone (TST) levels were measured via salivary hormone samples assayed by Salimetrics. Although estradiol was also measured, it was not used in the current analyses because it was available only for females and had excessive missingness (n = 780). The hormone data were cleaned following a recently published protocol [35], which involved removing the confounding effects of collection time, duration of collection, wake-up time on collection day, having exercised before collection, and caffeine intake (with a linear mixed-effect model; see SI for more details). Further information on the reliability and overall quality of the hormone data can be found in recent work [32].
Body Mass Index (BMI)
BMI was calculated as the average of two weight and height measurements per visit, assessed by the researcher. Next, BMI standard deviation scores (BMI z-scores) were calculated relative to age and sex, with reference to the CDC 2000 Growth Charts [36].
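As a rough illustration of the two steps above (a sketch, not the authors' code), the snippet below averages repeated measurements, computes BMI, and applies the standard LMS z-score formula used with the CDC growth charts. The function names, the metric units, and the L, M, S values in the usage lines are assumptions for illustration only. Real LMS parameters must be looked up in the CDC 2000 tables by sex and age in months.

```python
import math
from statistics import fmean


def bmi_from_repeats(weights_kg, heights_m):
    """Average the repeated measurements, then compute BMI in kg/m^2.

    Assumes metric units; convert first if the raw data use other units.
    """
    weight = fmean(weights_kg)
    height = fmean(heights_m)
    return weight / height ** 2


def lms_zscore(bmi, L, M, S):
    """LMS z-score: ((bmi / M) ** L - 1) / (L * S), or ln(bmi / M) / S when L == 0."""
    if L == 0:
        return math.log(bmi / M) / S
    return ((bmi / M) ** L - 1) / (L * S)


# Hypothetical child with two weight and two height measurements at one visit.
bmi = bmi_from_repeats([35.0, 35.4], [1.40, 1.41])
# Placeholder L, M, S values (NOT actual CDC reference values).
z = lms_zscore(bmi, L=-2.0, M=17.0, S=0.12)
print(round(bmi, 2), round(z, 2))
```

A BMI equal to the reference median M yields a z-score of zero, which is a quick sanity check on any lookup-table implementation.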
Sociodemographic variables
Five categories of race/ethnicity were coded: White, Black, Hispanic, Asian, and Other/Multi-race. Additionally, household income and education, collected via parent report, were used as measures of family socioeconomic status (SES).
Mental health problems
The Child Behavior Checklist (CBCL) (age 6–18 form [37]) was used to measure parent-reported mental health problems. The CBCL includes eight syndrome scales (Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Aggressive Behavior, and Rule-breaking Behavior), as well as three broad summary scales of externalizing, internalizing, and total problems.
Statistical analysis
Calculating pubertal timing
To calculate the pubertal timing model, this study used Release 4.0 of the ABCD data, from baseline to the 3rd annual follow-up wave. For details on data processing, such as the procedure for dealing with missing values, see SI Appendix S1 and Table S1. Inheriting fundamental concepts from the literature on “brain age” [26], “puberty age” was computed using supervised machine learning. The model aimed to learn the relationship between physical and hormonal measurements of puberty (specifically, each PDS item, DHEA, and TST levels) and chronological age, separately in males and females. To remove the potential impact of familial relations and repeat assessments of each individual, the sample was first stratified to randomly keep only a single observation for each family (i.e., across waves and siblings). Additionally, to ensure the model was trained on a typically developing sample, we used the CBCL DSM-oriented scales (consistent with DSM diagnostic categories) to exclude individuals above the threshold (symptom score >60) for affective, anxiety, somatic, oppositional defiant, and conduct problems, as well as ADHD. The remaining participants, who had symptom scores <60, were included in the assessment of the performance of the puberty age model (typically developing (TD) sample N = 4949 (2439 females)). The model used a train, validation, and test split in which a subset of the TD sample (90% of TD) was used for training and validation (hyperparameter tuning) of the model. Thereafter, the optimal model was used to predict chronological age from pubertal measurements in an independent test sample (10% of TD). This procedure was repeated over 10 folds to provide out-of-sample age predictions for the complete TD sample. In addition, the model trained on the complete TD sample was used for age prediction in the non-TD sample. The effect of age was subsequently regressed from all predictions to adjust for the bias created by regression toward the mean (RTM) [38].
For more details, see SI, Appendix S3, Figs. S2 and S3. The bias-adjusted prediction of an individual’s chronological age from pubertal measurements is termed “puberty age”. Further, the residuals of the prediction model (after subtracting chronological age from puberty age) are referred to as the “puberty age gap”, which we use as a dimensional measure indicative of relative pubertal timing. A positive puberty age gap is interpreted as a sign of earlier pubertal development compared to an age- and sex-matched group, while a negative gap reflects relatively delayed pubertal development (see Appendix 1, Fig. S1).
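The sample preparation and bias adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary keys (`family_id` and the scale names) and all function names are hypothetical, the `< 60` cut-off follows the wording "symptom scores <60", and the age-bias step shown (regressing the raw gap on chronological age and keeping the residual) is one common variant. The exact procedure used in the study is given in [38] and the SI.

```python
import random

# Hypothetical keys for the six CBCL DSM-oriented scales named in the text.
DSM_SCALES = ("affective", "anxiety", "somatic", "adhd", "oppositional", "conduct")


def one_per_family(rows, seed=0):
    """Randomly keep a single observation per family (across waves and siblings)."""
    rng = random.Random(seed)
    by_family = {}
    for row in rows:
        by_family.setdefault(row["family_id"], []).append(row)
    return [rng.choice(group) for group in by_family.values()]


def is_typically_developing(row, threshold=60):
    """True when every DSM-oriented score is below the threshold."""
    return all(row[scale] < threshold for scale in DSM_SCALES)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def puberty_age_gap(age, predicted_age):
    """Return (bias-adjusted puberty age, puberty age gap).

    The raw gap (prediction minus age) is regressed on age and the fitted
    age trend is removed, so the adjusted gap is uncorrelated with age.
    """
    raw_gap = [p - a for p, a in zip(predicted_age, age)]
    slope, intercept = fit_line(age, raw_gap)
    gap = [g - (intercept + slope * a) for g, a in zip(raw_gap, age)]
    adjusted_age = [a + g for a, g in zip(age, gap)]
    return adjusted_age, gap


# Toy predictions that are pulled toward the sample mean (regression toward the mean).
ages = [9.0, 10.0, 11.0, 12.0, 13.0]
predictions = [10.2, 10.4, 11.1, 11.5, 11.8]
adjusted, gaps = puberty_age_gap(ages, predictions)
print([round(g, 3) for g in gaps])
```

Under this variant, a positive adjusted gap marks an individual whose pubertal measurements look older than expected for their chronological age.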
We implemented a generalized additive model (GAM) for the prediction of age (response variable) from multivariate pubertal measurements (hormones [TST, DHEA] and/or each of the PDS items). As a GAM fits smooth nonlinear curves in the form of spline functions, it is expected to outperform commonly used linear methods of pubertal timing measurement, given the nonlinear relations previously reported [30]. Inner loop validation and hyperparameter tuning were performed by a grid search for the optimal regularization penalty on each GAM term, minimizing the estimated prediction error in the training sample (generalized cross-validation (GCV) score). The whole available sample was used for testing (N = 9919 (4725 females)). First, outer loop cross-validation with 10 folds was used to test the model for the TD sample. In other words, we divided the TD sample into 10 folds, each containing 10% of the TD sample. The best parameters from the grid search were then used to train a model on 9 folds and test it on the remaining fold. This process was repeated 10 times, and the accuracy of the model was calculated as the average across these 10 repetitions. Additionally, a model was trained on the complete TD sample and tested on the non-TD sample to provide puberty age estimates for the complete (TD and non-TD) sample. The complete set of calculated puberty age measures was adjusted for the RTM age bias prior to subsequent analyses.
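For readers less familiar with GAMs, the model and tuning criterion described above can be written in a standard textbook form. This is a sketch under assumptions: the source does not state the spline basis or the exact penalty, so the second-derivative penalty and the symbols below ($f_j$, $\lambda_j$, $A_\lambda$) are illustrative rather than taken from the study.

```latex
\begin{align}
\text{age}_i &= \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + \varepsilon_i, \\
\big(\hat{\beta}_0, \hat{f}_1, \dots, \hat{f}_p\big) &= \arg\min
  \sum_{i=1}^{n} \Big(\text{age}_i - \beta_0 - \sum_{j=1}^{p} f_j(x_{ij})\Big)^{2}
  + \sum_{j=1}^{p} \lambda_j \int f_j''(x)^{2}\, dx, \\
\mathrm{GCV}(\lambda) &= \frac{n \,\mathrm{RSS}(\lambda)}{\big(n - \operatorname{tr}(A_\lambda)\big)^{2}}.
\end{align}
```

Here $x_{ij}$ is the $j$-th pubertal feature (a PDS item or hormone level) for observation $i$, each $f_j$ is a smooth spline term, $\lambda_j$ is the regularization penalty searched over in the grid, $\mathrm{RSS}(\lambda)$ is the residual sum of squares of the fitted model, and $A_\lambda$ is the influence (hat) matrix whose trace gives the effective degrees of freedom.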
To understand the contribution of different indices to the measurement of pubertal timing, we implemented three alternative models to estimate “puberty age”. The first approach used only hormones (DHEA and TST) as features to predict chronological age, the second used only PDS items, and the third combined hormones and PDS items. The predictions of age were named hormonal puberty age, physical puberty age, and combined puberty age, respectively. In all three models, the residuals of age prediction indexed the “puberty age gap”, which reflects pubertal timing. We also used the partial dependence functions of the GAM to assess the importance of each feature in the combined puberty age model (for more detail, see SI, Appendix S4 and Fig. S5).
Comparison of “puberty age” models
We compared the performance of the three alternate puberty age models using Pearson’s correlation between predicted and chronological age and the mean absolute error (MAE), averaged over the complete sample (TD and non-TD). Additionally, we used the Wilcoxon signed-rank test (a non-parametric alternative to the paired t-test) on the absolute errors of model predictions to statistically compare the performance of different models.
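The three comparison metrics can be sketched in plain Python as below. This is an illustrative implementation, not the study's code: the function names are hypothetical, and the Wilcoxon p-value uses the large-sample normal approximation with a tie correction and no continuity correction, which may differ slightly from the exact method used by a statistics package.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between chronological and predicted age."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def wilcoxon_signed_rank(a, b):
    """Paired Wilcoxon signed-rank test.

    Returns (w_plus, w_minus, two_sided_p). Zero differences are dropped;
    assumes at least one non-zero difference remains.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        run = j - i + 1
        tie_term += run ** 3 - run
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


# Toy absolute errors from two hypothetical models on the same five individuals.
errors_model_a = [1.0, 4.0, 3.0, 8.0, 5.0]
errors_model_b = [0.0, 6.0, 0.0, 12.0, 0.0]
print(wilcoxon_signed_rank(errors_model_a, errors_model_b))
```

In practice, a library routine such as `scipy.stats.wilcoxon` would normally be preferred. The hand-written version is shown only to make the paired, rank-based logic explicit.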
“Physical puberty age” compared to a linear pubertal timing model
The current study also compared the out-of-sample performance and accuracy of the physical puberty age model to the commonly employed method of linearly regressing age from the average PDS score [19, 22]. The physical model was selected (rather than the hormonal or combined model) to ensure a fair comparison of two approaches that measure pubertal timing from the same input features (i.e., PDS). To conduct this comparison, a linear regression model was fitted to regress chronological age from the total PDS score (train), and the fitted model coefficients were then applied to unseen data to measure timing (test). Similar to the puberty age model design, a 10-fold cross-validation design was used to measure traditional pubertal timing for the whole sample. This linear model provided an implementation of the traditional model in an out-of-sample prediction paradigm. Model performance was assessed based on the out-of-sample prediction accuracies (quantified by the absolute error of predictions). As before, we used the Wilcoxon signed-rank test to investigate the statistical differences in model performance.
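A minimal sketch of this out-of-sample linear baseline is shown below, assuming a single predictor (the total PDS score) and a simple random 10-fold split. The function names and the toy data are hypothetical, and the fold assignment here ignores family structure, which the study handled separately.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def out_of_sample_predictions(pds_total, age, k=10, seed=0):
    """Predict each person's age from a model fitted without their fold."""
    predictions = [None] * len(age)
    for test_idx in kfold_indices(len(age), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(age)) if i not in held_out]
        slope, intercept = fit_line(
            [pds_total[i] for i in train_idx],
            [age[i] for i in train_idx],
        )
        for i in test_idx:
            predictions[i] = intercept + slope * pds_total[i]
    return predictions


# Toy data: 40 individuals with made-up PDS totals and ages.
rng = random.Random(42)
pds = [rng.uniform(5, 15) for _ in range(40)]
ages = [8.0 + 0.4 * s + rng.gauss(0, 0.5) for s in pds]
preds = out_of_sample_predictions(pds, ages)
abs_errors = [abs(p - a) for p, a in zip(preds, ages)]
print(round(sum(abs_errors) / len(abs_errors), 3))
```

The per-individual absolute errors produced this way are the quantities that a paired test can compare against the errors of the physical puberty age model.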
“Puberty age gap” associations with mental health problems
We used linear mixed-effect models (LMM) to investigate associations between each alternate “puberty age gap” measure and different dimensions of mental health problems, in males and females separately (N = 9919 (4725 females)). The following formula was tested for each syndrome dimension (i.e., Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Aggressive Behavior, and Rule-breaking Behavior), as well as for the three broad scales of total, externalizing, and internalizing problems:
Mental health problems ~ puberty age gap + age + (1 | site)
Age was included as a confound (fixed effect) and the data collection site was modeled as a random effect. We corrected for multiple tests controlling for False Discovery Rate (FDR) at 5% and reported the FDR corrected p-values in the results. The different “puberty age gap” LMMs were compared based on the Akaike Information Criterion (AIC) and a cut-off of 2 was used to indicate evidence for a better model, i.e., the model with an AIC that is at least 2 units smaller is considered a comparatively better model. Furthermore, to investigate whether associations between pubertal timing and mental health problems differed by age, we investigated the interaction effect of age and puberty age gap in predicting mental health problems using the following LMM:
Mental health problems ~ puberty age gap + age + puberty age gap:age + (1 | site)
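The multiple-testing correction and the AIC decision rule described above can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR method (the source says only "FDR"), and the p-values and AIC values are made up. Which set of tests was corrected together is not specified here and should be taken from the original analysis code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def better_by_aic(aic_a, aic_b, cutoff=2.0):
    """Return 'a' or 'b' when one AIC is at least `cutoff` smaller, else None."""
    if aic_b - aic_a >= cutoff:
        return "a"
    if aic_a - aic_b >= cutoff:
        return "b"
    return None


# Made-up p-values, e.g. one per CBCL scale for a given puberty age gap measure.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)

# Made-up AIC values for two competing LMMs fitted to the same outcome.
print(better_by_aic(1523.4, 1527.9))
```

Returning `None` when the AIC difference is under the cut-off keeps "no clear winner" distinct from a preference for either model.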
Finally, given the known association of pubertal timing with BMI, SES, and race/ethnicity [21], supplementary analyses repeated primary models while accounting for these variables as covariates (see SI, Appendix S5 and Table S1, for details of these analyses). This approach avoided potential complexity in our primary analyses due to the collinearity of confounding variables with our main variables of interest.