Sex, drugs, and early emerging risk: Examining the association between sexual debut and substance use across adolescence

Sexual debut, or first intercourse, predicts problem behaviors such as substance use. This association could reflect a direct effect of debut itself, general developmental trends, or the fact that some youth are more predisposed to a wide array of problem behaviors (e.g., risky sex, substance use). Understanding the association between sexual debut and substance use thus requires methods that can distinguish between these various accounts. In this study the association between sexual debut and substance use was investigated in a longitudinal sample of Mexican-origin youth (N = 674) assessed annually from 5th (Mage = 10.86 years, SD = 0.51) through 12th grade (Mage = 17.69 years, SD = 0.48). The longitudinal aspect of the data allowed the direct effect of sexual debut on substance use to be tested while accounting for long-term trends in substance use, and stable individual differences in those trends based on early risk and debut timing. Substance use increased over time, and early risk and debut were consistently associated with more substance use. Sexual debut also modestly predicted an increase in substance use after accounting for these effects, however. Taken together, results provide some evidence consistent with each of the potential explanations for the association between sexual debut and substance use across adolescence.


Introduction
Sexual debut, or first sexual intercourse, is a normative part of adolescent development [1]. However, sexual debut is also associated with certain problematic outcomes, such as substance use [2][3]. Thus, there is a tension in the literature between recognizing that sexual debut is a normal part of development that should not be pathologized, and understanding why sexual debut is correlated with problem behaviors [3][4]. This issue has important implications-for example, for sex education policies-which increases the need for rigorous evidence about the PLOS ONE | https://doi.org/10.1371/journal.pone.0228432 February 6, 2020 1 / 18 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 Despite the strengths of the discordant twin design, they have limitations: they often rely on retrospective reports years after the event, feature only one or two occasions of measurement, are based primarily on European/European-origin samples, and require a sample of many twin pairs meaningfully discordant on the timing of sexual debut. Thus, it is critical to test the generalizability of their conclusions using other methods, particularly longitudinal methods [3,22,23,24]. Longitudinal studies initiated before sexual activity and substance use are relatively normative (e.g., reach 50% prevalence) are particularly useful as it is possible to test the direct effect from sexual debut to substance use while controlling for long-term age-related changes, and long standing between-person differences in those trends [3].

Present study
The associations between sexual debut and substance use were examined in a sample of Mexican-origin youth assessed annually from 5 th through 12 th grade. Individual differences in the timing of debut, and childhood variables related to both risky sexual activity and substance use (effortful control, aggressiveness, externalizing problems, peer deviance, and parental monitoring) [3,4,25], were also incorporated into the analyses, which allowed the within-person effects of sexual debut to be examined while accounting for between person differences that are related to substance use and adolescent sexual behavior over time. Gender differences were also considered given findings regarding differences that have been observed regarding the timing of sexual debut and rates of substance use [7,14,26,27,28].
The longitudinal design of this study, paired with the array of risk variables collected at baseline, helps extend and clarify existing research by providing another means of separating out the within-and between-person effects of sexual debut on substance use. Additionally, the focus on Mexican-origin adolescents helps generalize existing findings regarding associations between sexual debut and substance use to a rapidly growing demographic group in the United States [26,29]. Although trends in sexual activity are largely similar across ethnic groups in the United States [2,3], Latino youth tend to debut somewhat earlier than European and Asian American youth [30][31]. Latino adolescents may initiate substance use earlier as well [7,26,27,28]. On the other hand, there are distinctive features of Latino culture (e.g., familism) that are associated with reduced risky behavior and that help promote positive youth development [32][33]. Accordingly, it is important to examine whether previously observed ethnic group differences in debut timing, substance use, and cultural traditions also translate into differences in the associations between sexual debut and substance use.

Participants and procedures
Participants come from the California Families Project, a longitudinal study of 674 Mexicanorigin youth (50.0% girls) and their parents. To recruit participants a simple, unweighted random sample of children was drawn from the rosters of students in the Sacramento and Woodland, CA, school districts. The focal child had to be in the 5 th grade, living with their biological mother, and of Mexican origin (i.e., of Mexican ancestry such that either they or previous generations of their family were born in Mexico). Twenty nine percent of focal children were born in Mexico. Participants were interviewed in their homes in Spanish or English, depending on their preference. Parents were not present in the room when their child was interviewed. The first assessment occurred when the youth were in 5 th grade (M age = 10.86 years, SD = 0.51), and subsequent follow-up assessments were conducted annually until 12 th grade (M age = 17.69 years, SD = 0.48). Retention rates across waves were high, ranging between 85% and 90% (see Table 1), though due to a data collection programming error substance use was not assessed

Measures
Sexual debut. An 11-item sexual behavior questionnaire developed for the California Families Project was administered to participants beginning in the 9 th grade. For the purposes of this study only a single item was used, "Have you had sexual intercourse during the past 12 months?" Participants responded with either "yes" or "no." This item was converted into a series of dichotomous "sexual debut" variables by re-coding the initial item such that for a given grade, "yes" responses were only assigned to participants who reported having sexual intercourse during the past 12 months in that grade and no prior grades. This approach resulted in four separate variables that captured five distinct levels of sexual activity initiation: no debut (52.5% of participants), debut reported in 9 th grade (11.1%), 10 th grade (12.6%), 11 th grade (14%), and 12 th grade (9.8%). More boys reported debut in 9 th grade than girls (40 versus 25), but this gap narrowed over time, with 31 boys and 28 girls reporting debut in 12 th grade. In total 153 of 315 boys (49%), and 128 of 316 girls (41%), reported sexual debut at some point during the study.
Substance use variables. The psychometric properties of the substance use and early risk questionnaires were evaluated using graded item response models [34]. Although it could be argued that the substance use experimentation and frequency variables (see below) are more consistent with a formative measurement model than the reflective model assumed in item response theory (e.g., lifetime experimentation with substances is a function of its indicators, not vice versa), these models still provide a useful means of holistically quantifying the degree of interrelatedness between items and are thus simply used descriptively to those ends here. Summary statistics from these models are presented in text to provide a concise overview of scale functioning. For ease of interpretation discrimination values are converted to standardized factor loadings, and information values are converted to estimates of reliability [35]. The average reliability estimates presented specifically capture the average information provided by a scale (converted to reliability) between -2 and 2 standard deviations from the mean. Two substance use variables were measured using the Alcohol, Tobacco, and Other Drug Use scale [36]. Substance Experimentation was measured with nine items that assessed whether youth had ever tried a wide range of substances (cannabis, tobacco, hard drugs, and multiple forms of alcohol), and if they ever had experiences becoming intoxicated or getting "high". Participants responded either "yes" or "no" to each item. Responses were summed to generate a total lifetime experimentation score (possible range between 0 and 9). Items were highly intercorrelated, with the within-wave average factor loadings ranging from λ = .76 to .88 across time. The average reliability between -2 and 2 standard deviations was low in earlier waves, but increased over time, growing from r xx = .21 in 5 th grade to r xx = .62 in 12 th grade. The lower reliability values here primarily reflect the fact that 1) in earlier years there was very little substance use (i.e., variability), and 2) the information functions were asymmetric such that reliability was considerably lower below the mean than above. That is, the scale provides more information about youth that use substances than those that use little to no substances. As this is consistent with the aim the scale the asymmetric information functions, and the correspondingly moderate average reliability estimates, are not necessarily problematic here.
Substance Use Frequency was measured using nine different items from the Alcohol, Tobacco, and Other Drug Use scale that inquired about the regularity of substance use and intoxication/getting high over the past 3 months. Participants responded a 5 point scale that ranged from 0 ("never") to 4 ("almost every day") to each item. Responses across the scale were summed to generate a total substance use frequency score (possible range between 0 and 36). Average within-wave factor loadings ranged from λ = .84 to .95 across time. The average reliability between -2 and 2 standard deviations was low in earlier waves, but increased over time, growing from r xx = .05 in 5 th grade to r xx = .48 in 12 th grade. Again, few endorsements in earlier waves and asymmetric information functions contributed to these reliability estimates.
Substance Use Intentions were measured using a 9-item scale that assesses willingness to use substances, and plans to use those substances in the next year [37]. Participants responded on either a 3 or 4 point scale that ranged from 0 "Do not plan to/Definitely will not/Not at all willing" to either 3 "Very willing", or 4 "Do plan to/Definitely will." Scores for this measure were computed by summing the individual items (possible range between 0 and 33). The average within-wave factor loadings ranged from λ = .85 to .91 across time. The average reliability between -2 and 2 standard deviations was low in earlier waves, but increased over time, growing from r xx = .27 in 5 th grade to in r xx = .68 12 th grade. Information functions were again skewed to the left.
Finally, Access to Substances was measured using a 7-item scale adapted from the Drug Availability Scale used by the National Household Survey on Drug Abuse [38]. This scale assessed the availability of a range of substances in the youths' environment. Participants responded on a 4 point scale that ranged from 1 ("extremely difficult/impossible") to 4 ("extremely easy"). Responses across the scale were summed to generate a total substance use frequency score (possible range between 0 and 21). The average within-wave factor loadings ranged from λ = .84 to .92 across time. The average reliability between -2 and 2 standard deviations was low in earlier waves, but increased over time, growing from r xx = .51 in 5 th grade to .84 in 12 th grade. Information functions were skewed to the left in the earlier grades, but by 9 th grade the majority of information was provided with 2 standard deviations from the mean.
Early risk. Several measures were used to assess key pre-existing differences between youth on risk factors for problem behaviors [3,4,25]. These variables were collected in both 5 th and 7 th grade, as substance use and sexual activity are rare during this developmental period [3,7]. Effortful control and trait aggression were both measured using the Early Adolescent Temperament Questionnaire-Revised [39]. The effortful control scale captures individual differences in activation and inhibitory control (average within-wave factor loadings from λs = .35 to .49 across time and rater; average reliability from r xx = .62 to .73 across time and rater), and the aggression scale captures individual differences in hostile actions and hostile reactivity (average within-wave factor loadings from λs = .70 to .74 across time and rater; average reliability from r xx = .65 to .84 across time and rater). Mother and self-reports were combined to create composite scale scores. Externalizing behaviors were assessed using a standardized psychiatric interview, the Diagnostic Interview Schedule for Children-IV [40]. An externalizing behavior composite was calculated by using the sum of symptoms of conduct disorder and oppositional defiant disorder (symptom counts were correlated at r = .39 in 5 th grade, and r = .21 in 7 th grade).
Peer deviance was assessed using a 23-item scale adapted from the Delinquent Behavior Scale [41], the Gang Membership Inventory [42], and the Delinquency Scale from the National Youth Survey [43] that measures the degree of deviant behavior, antisocial influence, and gang involvement in the target youth's peer group (average within-wave factor loadings of λ = .75 and .83; average reliability of r xx = .65 and .71). Finally, parental monitoring was measured using a 14-item scale that assesses the degree to which parents are aware of their youth's behavior and various life circumstances [44]. We focused on youth reports of parental monitoring as adolescents' perception of their parents' knowledge may be most relevant for their actual behavior [45]. Youth reports of maternal and paternal monitoring efforts were combined into a single parental monitoring variable (average within-wave factor loadings from λ = .67 to .81 across time and parents; average reliability from r xx = .83 to .93 across time and parents). Standardized risk composite scores were computed for 5 th and 7 th grade by calculating the mean of the risk variables following a z-score transformation (effortful control and parental monitoring were reverse scored).

Data analytic strategy
We used latent growth models with structured residuals (LGM-SR; see Fig 1) to disaggregate within-and between-person effects [46-47]. Latent intercept and slope factors were first specified for the observed measurement occasions. The intercept factor captures status at the first time point, and the slope factor captures the rate of change over the course of the study. Each factor also incorporates individual differences across the sample (i.e., intercept and slope variances). A residual structure was then added to the observed variables alongside the latent growth model factors (i.e., each occasion of assessment functioned as an indicator of the growth model and the residual structure). Latent, occasion-specific residual factors were specified that captured deviations between an individual's observed scores and growth-model implied scores (R5 through R12 in Fig 1). Autoregressive paths linking adjacent latent residual factors were also specified in the residual structure.
For each substance use variable, the optimal unconditional LGM-SR (i.e., a model without any external predictors) was first identified (panel "a" in Fig 1). The slope factor in these models was specified using a latent basis approach in which the first basis coefficient (i.e., slope factor loading) was fixed to 0, while the final basis coefficient was fixed to 1 [48]. The model then freely estimates the intervening basis coefficients, with the mean of the slope factor representing the average amount of change between the first and last observation. Residual factor means were fixed to zero, and residual factor variances were freely estimated. Autoregressive paths also freely varied over time.
Given that the observed trajectories were markedly non-linear (see Fig 2), the latent basis specification was selected as it provides a parsimonious means of simultaneously capturing general maturational trends as precisely as possible while adjusting for broad individual differences in substance use over time. Notably, the latent basis specification does impose the same shaped growth form on each individual [49]. Spline models have been recommended as an alternative specification to consider because of this, especially when the goal is to compare different growth functions [49]. In the present application however identifying the shape of the growth trajectory per se, and the covariates of that trajectory, was not a major goal. That is, it is well-established that substance use broadly increases from 5 th grade through 12 th grade-both on average and for most individuals (the trajectories of individuals who consistently abstain would be captured well in both latent basis and spline models)-and the growth model covariates were specifically selected because they are reliable predictors of adolescent substance use in the literature (as is precocious sexual activity). Accordingly, these potentially major limitations of the latent basis specification were somewhat less relevant here.
To identify a more parsimonious model, an iterative series of parameter constraints were tested with the baseline unconditional LGM-SR. More constrained models were compared to their precursors, and differences in fit were examined (change in chi-square, CFI, and RMSEA were all considered). Originally, all autoregressive paths were freely estimated (A1). Next, all autoregressive paths were fixed to equality (A2). If these constraints reduced model fit, then all autoregressive paths in middle school (i.e., grades 5 through 8) and high school (i.e., grades 9 through 12) were separately fixed to equality (A3). If these constraints reduced model fit, all autoregressive paths in middle school were fixed to equality, while autoregressive paths in early (grades 9 and 10) and late (grades 11 and 12) high school were fixed to equality (A4). The A4 specification reflects the fact that late high school is when rates of sexual intercourse and substance use begin to increase more dramatically. If no constraints were supported then all autoregressive paths remained unconstrained in the subsequent models (i.e., A1).
After identifying the optimal unconditional model, sexual debut (i.e., whether youth reported sexual activity for the first time) was added to the residual structure (the time-varying covariates model; TVC) as a predictor of the grades 9 through 12 residual factors (panel "b" in Fig 1). The debut variables serve as quasi time-varying covariates in which participants either never report debut (i.e., "no" at all time points), or are counted as a "yes" at one and only one occasion. Equality in these paths at different grades was tested by constraining paths and comparing fit to the unconstrained model. The four debut variables were then added as predictors of the slope factor (time varying/time invariant covariates model; TVIC), making sexual debut both a time varying, and time invariant, covariate [50]. The early risk variables were also included in the TVIC as predictors of the slope factor (see panel "c" in Fig 1). The time invariant predictors on the slope adjust for stable individual differences in substance use trajectories. This is potentially useful for isolating the direct effect of sexual debut in the residual structure as the sexual debut paths in the TVC could reflect-especially for earlier debuting youthboth time-specific effects, and the fact that youth using more substances across time (who are more likely to debut early) will consistently have more positive residuals given that the unconditional growth model parameters reflect aggregated trends across all youth. Equality constraints were again tested on the paths from debut to the residual factors.
Indeed, the TVIC is particularly informative as it simultaneously captures all three potential accounts for the association between sexual debut and substance use. First, the slope factor mean and variance (SLOPE in Fig 1, panel c) capture the general age-related trends in substance use that track with sexual development in adolescence. Second, the regression paths from debut timing and early risk to the slope factor (DEBUT GRADE and RISK in Fig 1, panel  c) capture stable between person differences in substance use, reflecting the extent to which some youth are simply more likely be sexually active and use substances across adolescence. Third, the regression paths from sexual debut to the residual factors (DEBUT in Fig 1, panel c) capture the direct effect of sexual debut, or the extent to which the act of sexual debut in a given grade is associated with increased substance use after conditioning substance use on the general age-related trends in substance use, and stable individual differences in those trends (i.e., the within-person effect of sexual debut after accounting for general maturational trends).
Differences between girls and boys were investigated in the unconditional and conditional models via the comparison of increasingly constrained multigroup models. In these analyses the baseline model was the final model from the full sample analysis (i.e., same pattern of constraints on the residual structure), with estimates allowed to freely estimate across gender. All major analyses were conducted using Mplus version 8.0 [51] with full information maximum likelihood estimation, which provides relatively unbiased parameter estimates in the face of missing data [52]. Confidence intervals were derived via percentile bootstrapping (with 1000 draws), which is particularly effective when estimating confidence intervals with skewed variables such as substance use [53]. If any of the models produced negative variances/residual variances (i.e., "Heywood cases") these variance estimates were fixed to 0, along with any path emanating from the variable with a negative variance parameter estimate. The data used in the analyses reported here, along with sample syntax files, can be found in the online supplemental material on the Open Science Framework (https://osf.io/95c8w/).

Results
Descriptive statistics for the substance use variables are reported in Table 1. The average substance use trajectories for youth reporting sexual debut at different times are presented in Fig  2. Earlier debuting youth consistently reported more experimentation, frequency of use, intent to use, and access to substances. Descriptive information for the risk composite variables, based on when youth sexually debuted, is presented in Table 2. Risk scores tended to be higher for youth with earlier sexual debuts.

Substance experimentation
The unconditional model (A3; see supplemental material for full fit information across models: https://osf.io/95c8w/) fit the data at: χ 2 = 17.88, df = 24, p = .81; RMSEA = .000; SRMR = .021; CFI = 1.00; TLI = 1.00. Parameter estimates are reported in Table 3 (see supplemental materials for unconditional multi-group model results for girls and boys: https://osf.io/95c8w/). In this and subsequent substance experimentation models the intercept factor variance was fixed to zero as it was trivial in magnitude and led to non-positive definite solutions when freely estimated. Results from the TVC and TVIC models can be found in Table 4. In both models it appeared that the sexual debut paths in the residual structure could be fixed to equality across time without a notable decline in model fit (Δχ 2 of 10.33 and 3.89; Δdf = 3; ΔRMSEA of -.001 and -.001; ΔCFIs of -.003 and -.001). Sexual debut was associated with a small but statistically significant positive effect in the TVC (b = .51; 95% CI: .31, .71), though the chi square difference test did suggest that the effect of 12 th grade debut may be smaller than the effect of debut in earlier grades. The effect of sexual debut on the residual structure remained statistically significant in the TVIC-that is, after adjusting the growth part of the model for stable individual differences in substance use trajectories based on early risk and debut timing-though the unstandardized path estimate was 37% smaller (b = .33; 95% CI: .12, .51). Sexual debut in 9 th , 10 th , and 11 th grade, and 7 th grade risk, were all significantly associated with more substance experimentation over time in the TVIC (Table 4). Neither the time varying (Δχ 2 = .14; Δdf = 1; ΔRMSEA = .000; ΔCFI = .000) nor time invariant (Δχ 2 = 11.33; Δdf = 5; ΔRMSEA = .000; ΔCFI = -.001) effects differed across boys and girls.

Substance use frequency
The unconditional model (A1; see S1) fit the data at: χ 2 = 101.04, df = 18, p < .001; RMSEA = .083; SRMR = .064; CFI = .954; TLI = .928. Parameter estimates are reported in Table 3. Results for the TVC and TVIC are reported in Table 4. In both models it appeared that the sexual debut paths in the residual structure could be fixed to equality across time without a notable decline in model fit (Δχ 2 of 6.77 and 11.45; Δdf = 3; ΔRMSEA of -.003 and -.002; ΔCFI of -.003 to -.004). Sexual debut was a modest but statistically significant predictor of the residual factors in the TVC (b = .69; 95% CI: .18, 1.06). This effect remained statistically significant in the TVIC, however, the unstandardized path estimate was reduced by 46% (b = .37; 95% CI: .05, .76), and the chi square difference test suggested that effects may be stronger for youth debuting in 11 th and 12 th grade compared to those that debuted earlier. Sexual debut in 9 th and 10 th grade, and 7 th grade risk, were all significantly associated with more frequent substance use over time in the TVIC. The multi-group TVIC for examining gender differences encountered convergence problems, so models were estimated separately for girls and boys. The effect of debut in the residual structure was statistically significant for both sexes and in the same direction, though the path estimate did appear somewhat larger for boys than girls (b of .72 versus .39).

Substance use intentions
The unconditional model (A1; see S1) fit the data at: χ 2 = 32.01, df = 18, p = .02; RMSEA = .034; SRMR = .030; CFI = .990; TLI = .985. Parameter estimates are reported in Table 3. Results for the TVC and TVIC are reported in Table 4. In both models the sexual debut paths in the residual structure could be fixed to equality across time without a notable decline in model fit (

Access to substances
The unconditional model (A1; S1) fit the data at: χ 2 = 58.68, df = 18, p < .001; RMSEA = .058; SRMR = .048; CFI = .981; TLI = .971. Parameter estimates are reported in Table 3. Results for the TVC and TVIC are reported in Table 4. In both models it appeared that the sexual debut paths in the residual structure could be fixed to equality across time without a notable decline in model fit (Δχ 2 of 14.01 and 5.57; Δdf of 3 and 5; ΔRMSEA of .00 and -.003; ΔCFI of -.004 and +.010; in the constrained TVIC the residual variance for the 6 th grade residual factor was negative, and so it and its autoregressive path to the 7 th grade residual factor were fixed to 0). There was a small but significant effect of sexual debut on the residual factors in the TVC (b = .88; 95% CI: .36, 1.49), though the chi square difference test did suggest that the effects of 11 th and 12 th grade debut may be smaller than the effects of debut in earlier grades. The effect of debut in the residual structure remained significant in the TVIC, with the unstandardized path reduced by 12% (b = .77; 95% CI: .32, 1.23). Sexual debut in 9 th and 10 th grade, and 7 th grade risk, were all significantly associated with greater substance use intentions over

Discussion
The association between sexual debut and substance use was examined at different levels of analysis using data from a sample of youth followed annually from the 5 th through the 12 th grade. Consistent with previous findings, youth with earlier sexual debuts reported more substance use across adolescence than youth who debuted later or not at all during the course of the study. Youth scoring higher on early risk factors also demonstrated higher levels of substance use across time, and were more likely to sexually debut earlier. After accounting for these risk factors for increased substance use (and problem behaviors broadly) across time, and general age-related increases in substance use, sexual debut was a modest within-person predictor of greater substance experimentation, frequency of use, intent to use, and access to substances. Effect sizes were small by conventional standards, but the size and direction of effects were consistent across the substance variables. Consistent gender differences were not detected in these effects. Thus, there was evidence for both between-and within-person associations between sexual debut and substance use. Results reinforce the importance of a general risk for adolescent problem behavior while tentatively suggesting that the act of sexual debut itself may entail some additional risk. The amount of variance in substance use that can be isolated and attributed to debut itself though is likely modest, especially relative to the between-person effects regarding prior risk factors. Early debuting youth consistently reported more use across time than later or never debuting youth, and many of these trends were evident even before sexual intercourse is more typical (i.e., earlier than 8 th grade). Higher scores on late childhood predictors of substance use and risky sexual activity were similarly related to more substance use over time, especially when measured in 7 th grade (as compared to 5 th grade). Results therefore support twin studies suggesting that associations between sexual debut and delinquent behaviors are largely between-person. It is notable though that some within-person debut effects emerged after controlling for general age-related trends, and the large between person longitudinal trends. Future research can attempt to more precisely highlight the factors accounting for this, albeit modest, predictive trend.

Limitations
The study had several limitations. The sexual behaviors questionnaire was only included in the study protocol beginning in the 9 th grade because of concerns about the sensitivity of such questions at earlier ages. Consequently, there was low sensitivity to detect youth who initiated before the 8 th or 9 th grade. This was likely a small group, however, as sexual debut before high school is uncommon in the general population [14,31]. Further, in this sample only 11% (n = 65) of youth reported sexual activity in the 9 th grade, suggesting less than 11% of the sample initiated sexual activity before high school. The 12-month reporting period of the questionnaire also means that some reports of sexual debut in 9 th grade could have occurred the previous year (i.e., 8 th grade). Relatedly, no item included in the sexual behaviors questionnaire directly asked about sexual debut. Instead, the item used simply inquired about sexual activity in the past year so that sexual debut could only be considered on an annual time scale. Although this broad time scale precludes the examination of more precise temporal trends, it does allow for the investigation of general patterns over time. Moreover, although this issue may be especially problematic for the lifetime experimentation variable, the other substance variables center on more immediate or prospective behaviors, meaning they are more likely to capture post-debut behaviors and sentiments.
Further, although the sexual debut item used specifically inquired about sexual intercourse, the term "sexual intercourse" may be somewhat ambiguous to youth [2]. However, the interview formant may have allowed for clarification of any ambiguities. A potential drawback of an interview approach is that youth may have been less willing to disclose about sensitive topics relative to a more anonymous collection format. However, youth were allowed to enter their own responses to these questions anonymously on a laptop if they wished.
There were also some limitations in the analyses. Substance use variables were sometimes skewed, especially in earlier grades, and may have been better modeled using zero inflated count models [54]. Unfortunately, serious convergence issues were encountered when estimating such models. Although less than ideal, one of the primary issues with skewed data is that standard errors may be biased, not the parameter estimates per se. The consistency of results across multiple variables and analyses helps protect against concerns of capitalizing on chance [55], especially given that some variables (e.g., substance use intentions) were less skewed than others (e.g., substance use frequency). Further, percentile bootstrapped confidence intervals tend to perform well in the presence of non-normality [53,56].
Finally, the focus here was primarily on the conceptual path running from sexual debut to substance use. We emphasized this path because much of the interest in the relevant literature is specifically about sexual debut itself as a particularly noteworthy developmental milestone that may promote various risky behaviors. However, sexual activity is often reported as occurring in situations with concurrent substance use (e.g., parties) and many youth report being under the influence of some substance during their most recent sexual encounter [13][14]. That is, substance use and sexual activity are tightly linked in many adolescent contexts, and so it is likely that associations between sexual activity and substance are bi-directional such that substance use may increase the likelihood of sexual activity, which in turns promotes an increase in substance use, etc. Future work should build on the present and related findings by exploring the bi-directional dynamics between substance use and sexual activity (especially sexual behaviors beyond simple debut) across adolescence.

Conclusion
Sexual debut was modestly but consistently associated with greater substance use, even after accounting for normative age-related increases in substance use during adolescence, and the fact that earlier debuting youth consistently reported more substance use. Results imply associations between sexual debut and substance use at both the within-and between-person level. Some youth consistently use more substances and are more sexually active than their peers (between-person effects), but substance use and sexual activity also become more widespread across adolescence in general (within-person effect), and sexual debut predicts a slight increase in substance use over and above these other two trends (within-person effect). Understanding the association between sexual debut and substance use is helpful to identify at-risk youth and problematic behaviors for intervention without broadly stigmatizing adolescent sexual activity [4].