Abstract
Pacific Islanders, including those in American Samoa, face a disproportionately high burden of gestational diabetes mellitus (GDM) and related metabolic sequelae. The CREBRF rs373863828 genetic variant, which is uniquely common among Pacific Islanders, has been paradoxically associated with higher body mass index (BMI) but lower risk of type 2 diabetes. While emerging evidence suggests this variant may influence both maternal metabolic outcomes and infant growth, studies in pregnancy and early life remain limited. The purpose of this paper is to describe the protocol for a study designed to address these gaps. The Health Outcomes in Pregnancy and Early Childhood (HOPE) Study is an observational, longitudinal cohort study that will enroll up to 180 Samoan pregnant women and their infants (target n = 150 dyads completing study protocols) in American Samoa, with follow-up through six months postpartum/postnatal. The study includes questionnaires, anthropometric measurements, and biospecimen collection. Genetic and epigenetic analyses will examine associations between maternal and infant CREBRF rs373863828 genotype, gestational diabetes status, infant body size, and cord blood DNA methylation. The study is approved by the Institutional Review Boards at the University of Pittsburgh, Yale University, and the American Samoa Department of Health, as well as the Lyndon B. Johnson Tropical Medical Center (American Samoa) Research Oversight Committee. Findings will be disseminated through peer-reviewed publications, conference presentations, and community reports.
Citation: Heinsberg LW, Loia M, Tasele S, Faasalele-Savusa K, Carlson JC, Anesi S, et al. (2025) Study protocol for the Health Outcomes in Pregnancy and Early Childhood (HOPE) Study: A mother-infant study in American Samoa. PLoS One 20(9): e0326644. https://doi.org/10.1371/journal.pone.0326644
Editor: Samantha Frances Ehrlich, University of Tennessee Knoxville, UNITED STATES OF AMERICA
Received: June 3, 2025; Accepted: August 22, 2025; Published: September 15, 2025
Copyright: © 2025 Heinsberg et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: This paper describes the design and methods of an ongoing study. No datasets have yet been analyzed or are available for this protocol paper. Upon study completion, all relevant data will be made available via dbGaP (https://www.ncbi.nlm.nih.gov/gap/) under accession number phs003874.v1.p1.
Funding: This study is supported by the National Institutes of Health (NIH), Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) under award numbers K99HD107030 and R00HD107030. This study is also supported by the University of Pittsburgh School of Nursing through faculty startup funds and the Maternal/Child Health Hub. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Pacific Islander women in the United States (U.S.) face a disproportionately higher risk of gestational diabetes mellitus (GDM) compared to women of European ancestry (9.9–14.8% versus 2–6%, respectively) [1]. In American Samoa, GDM statistics far exceed these numbers, affecting up to 40% of Samoan women [2]. GDM not only impacts maternal health but also birth outcomes and long-term health of the offspring [3,4]. For example, children born to mothers with GDM have an increased risk of metabolic irregularities that can promote the intergenerational transmission of obesity and chronic disease [5,6].
The genetic marker CREBRF rs373863828 is receiving increasing attention for its unique health effects in Pacific Islander populations [7–9] and its potential to improve our understanding of GDM and related child health outcomes. The variant’s minor allele (A) is uniquely common among Pacific Islanders (minor allele frequency ~0.26) and is associated with greater body mass index (BMI) but paradoxically lower odds of type 2 diabetes among adults [8]. These associations have been replicated across multiple Pacific Islander populations [9–13], and early evidence suggests the variant may also protect against GDM [14].
Although research on early life, a critical window for long-term metabolic programming, is still limited, evidence suggests that CREBRF may also influence infant growth and body composition. In a sample of young children from Samoa recruited at birth and followed through 2 years, the presence of the A allele (AA/AG genotypes) was associated with greater bone and lean mass in infants [15] and greater height in toddlers [16] compared to those without the A allele (GG genotype), which may begin to explain its longer-term metabolic health benefits [17–19]. Additional findings indicate that the association with height persists into early childhood and that, by around age four, associations with more clinically visible obesity-related phenotypes, such as greater weight and abdominal circumference, also begin to emerge [20].
Despite these findings, most CREBRF research has focused either on maternal genetic contributions to GDM or offspring genetic influences on growth, without considering the complex interplay between maternal-offspring genetics. This is an important gap because pregnancy is a time of dynamic biological adaptation during which the maternal metabolic system shifts to prioritize nutrient delivery for fetal development [21]. This shift is thought to be shaped by the mother’s metabolic status and genetic makeup alongside hormonal factors produced by the placenta, which generally shares the same genetic profile as the fetus [22]. This suggests that GDM may not emerge in isolation, but instead from suboptimal maternal-fetal resource allocation influenced by both maternal and fetal genetics [22,23]. Fetal growth is similarly shaped by this dual influence—indirectly by maternal genetics, which govern the intrauterine environment [24], and directly by fetal genetics, which affect insulin and hormone production both before and after birth [25]. In the case of CREBRF, we hypothesize that maternal-fetal co-occurrence of the A allele synergistically provides greater protection against GDM compared to maternal genetics alone, leading to improved anthropometric outcomes in children via fetal programming. Supporting this broader framework, other studies have shown that maternal-offspring genotype combinations (e.g., at IRS1 rs1801278 [26]) can modulate GDM risk and, in turn, infant outcomes.
Beyond genetic contributions, epigenetic mechanisms such as DNA methylation may also play a key role in shaping infant health. As a critical regulator of gene expression and a central mechanism in fetal metabolic programming, DNA methylation is influenced by both genetic and environmental factors during fetal development [27,28]. Given its importance in human health, it is essential to understand how maternal-infant CREBRF genotype combinations affect DNA methylation, and how these genetic effects may interact with metabolic exposures, such as GDM, to shape the infant epigenome. Likewise, and more broadly, it is also important to understand how DNA methylation at birth can impact infant growth and development as this may help identify biological pathways through which maternal-fetal genetic interactions and prenatal exposures influence long-term metabolic health.
To address these gaps, we have designed the Health Outcomes in Pregnancy and Early Childhood (HOPE) study, which aims to understand the health and wellness of American Samoan women and their children, with a specific focus on GDM and infant growth. The primary objective is to examine how maternal-infant CREBRF rs373863828 genotypes jointly contribute to GDM status, infant body size, and genome-wide DNA methylation in cord blood at birth. Secondary objectives include integrating social, environmental, and behavioral data to understand how these contextual factors shape health outcomes, as well as exploring additional contributors to maternal and child health in this population, including genetic factors (e.g., BTNL9 rs200884524, another Pacific Islander-specific variant linked to cardiometabolic health [29]), genomic factors (e.g., microbiome composition, which may indirectly influence infant growth and immune development [30]), and environmental factors (e.g., per- and polyfluoroalkyl substance concentrations, which have been linked to metabolic disruption [31]).
Materials and methods
Study design and overview
This ongoing, longitudinal, observational study (recruitment started April 2025) aims to enroll up to n = 180 Samoan pregnant women and their index offspring from American Samoa, with a target of n = 150 mother-infant dyads completing study protocols. Eligible women will be enrolled during their third trimester of pregnancy, and each dyad will be followed through the infant’s first six months of life. Participants will complete five study visits: one during pregnancy, one shortly after birth, and follow-up visits at 2, 4, and 6 months postpartum. As part of the study, mothers will complete questionnaires about themselves, their families, and their infants; physical measurements and biospecimens will be collected from both mothers and infants; and additional health data will be abstracted from medical records to supplement study assessments. Further details on study procedures are provided below.
Ethics
The HOPE study has received Institutional Review Board (IRB) approval from both the University of Pittsburgh and Yale University, with the University of Pittsburgh serving as the IRB of record through a multi-site IRB agreement (STUDY24020055). Local and territorial approval has been granted by the American Samoa IRB. In addition, the study has been approved by the Lyndon B Johnson Tropical Medical Center (LBJ) Research Oversight Committee for all hospital-related activities.
Setting
American Samoa is an unincorporated territory of the U.S. located approximately halfway between Hawaii and New Zealand in the South Pacific Ocean [32]. It consists of several islands, the largest and most populated being Tutuila, which is home to >90% of residents [32]. Tutuila measures roughly 21 miles in length and 3 miles across at the widest points.
The American Samoan population consists predominantly of Samoan individuals. Given its mountainous terrain, American Samoa is quite urbanized with the main population center being Tafuna, home to ~8,000 residents [33]. According to the U.S. Census Bureau, the 2020 population of American Samoa was 49,710, a decline from 55,519 in 2010 [34], reflecting significant out-migration, largely to the mainland U.S., and a decline in birth rates. American Samoa was recently classified as a high-income economy by the World Bank (a transition from the upper-middle-income category the previous year due to a revision in population estimates based on the 2020 census [35]), although the gross national income of $18,017 per capita in 2022 [36] is far lower than the $78,035 in the U.S. as a whole [37].
American Samoa’s healthcare system, designated as medically underserved by the U.S. Health Resources and Services Administration (HRSA) [38], includes one tertiary care facility, the Lyndon B Johnson Tropical Medical Center (LBJ), and four federally qualified community health centers (FQHCs) operated by the Department of Health. Residents are automatically enrolled in Medicaid through a waiver under the Social Security Act, bypassing individual eligibility assessments [39]. All prenatal care in the territory is centralized through five locations: the four FQHCs as well as the obstetrics/gynecology clinic at LBJ Hospital. While prenatal care is distributed across these sites, LBJ Hospital is the only delivery facility on the island, with over 98% of births occurring there annually.
The health landscape in American Samoa has been shaped by complex historical dynamics, including foreign influence and multifaceted internal political, economic, and social systems. For example, as U.S. nationals without full political representation—and therefore no direct voice in federal decisions—American Samoans face structural barriers that affect healthcare access and broader resource allocation. Historical shifts in food systems, coupled with limited local infrastructure and geographic isolation, have further contributed to persistent health challenges.
Despite these challenges, the Samoan cultural values of fa’aSamoa (the Samoan way) emphasize respect, reciprocity, and family and community ties [40], offering a powerful foundation for the future. This study, in partnership with the American Samoan community, incorporates Samoan frameworks such as Teu le va (maintaining respectful relationships [41]) and Talanoa (“talk story”, a dialogic approach rooted in storytelling [42,43]). By grounding the research in these values, we aim to ensure that knowledge sharing between researchers and the Samoan community occurs in a respectful and culturally safe environment, and that the resulting solutions to improve health outcomes for women and children are aligned with Samoan norms.
Participants
HOPE study participants will be recruited from FQHC and LBJ prenatal care clinics and through flyers, social media advertising, and clinician referrals. Inclusion criteria for dyads include: (1) maternal age ≥ 18 years; (2) maternal report that the child has four Samoan grandparents (an empirically validated method of assessing ancestry in this setting [8], included due to the genetic focus of this study); (3) pregnancy gestation at or beyond 35 weeks (to limit heterogeneity related to extremely preterm birth); (4) singleton pregnancy (to avoid heterogeneity in growth patterns seen in multiples); (5) maternal plans to give birth at LBJ and reside in American Samoa for at least six months post-birth (to enable follow-up); (6) maternal completion of a standard-of-care 2-hour, 75g oral glucose tolerance test for GDM screening between 24 and 28 weeks’ gestation (to have a current best practice measure of GDM status); and (7) maternal intent to provide cord blood and saliva DNA specimens (to conserve limited study resources for the primary study goal, though participants are reminded at each visit that their involvement in each specific assessment is optional).
Exclusion criteria for dyads include: (a) pre-pregnancy maternal diabetes, defined by either self-report or a fasting glucose level of ≥126 mg/dL or HbA1c ≥ 6.5% during a standard prenatal visit before 12 weeks’ gestation; (b) maternal medical history of conditions that may interfere with maternal weight gain or fetal growth, such as prior bariatric surgery or congenital anomalies; or (c) insulin use as a first-line treatment for GDM (to reduce treatment-related heterogeneity).
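The inclusion and exclusion criteria above can be read as a single screening rule. The following Python sketch is illustrative only: the `Screening` record and all of its field names are hypothetical and are not drawn from the study's data dictionary, and clinical judgments (e.g., what counts as an interfering medical condition) are reduced to pre-assessed yes/no fields.

```python
from dataclasses import dataclass


@dataclass
class Screening:
    """Hypothetical screening record; field names are illustrative assumptions."""
    maternal_age_years: int
    samoan_grandparents: int            # child's grandparents reported as Samoan (0-4)
    gestation_weeks: float
    singleton: bool
    plans_lbj_birth_and_residence: bool # birth at LBJ and >= 6 months residence planned
    completed_ogtt_24_28_weeks: bool    # 2-hour, 75 g OGTT at 24-28 weeks
    intends_cord_blood_and_saliva: bool
    prepregnancy_diabetes: bool         # exclusion (a)
    interfering_medical_history: bool   # exclusion (b)
    insulin_first_line_for_gdm: bool    # exclusion (c)


def is_eligible(s: Screening) -> bool:
    """Return True when all inclusion criteria hold and no exclusion applies."""
    included = (
        s.maternal_age_years >= 18
        and s.samoan_grandparents == 4
        and s.gestation_weeks >= 35
        and s.singleton
        and s.plans_lbj_birth_and_residence
        and s.completed_ogtt_24_28_weeks
        and s.intends_cord_blood_and_saliva
    )
    excluded = (
        s.prepregnancy_diabetes
        or s.interfering_medical_history
        or s.insulin_first_line_for_gdm
    )
    return included and not excluded
```

Keeping inclusion and exclusion as two separate expressions mirrors the way the criteria are listed and makes it easy to report which side of the rule a screened participant failed.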
For those eligible and interested, an orientation visit will be conducted by bilingual (English and Samoan) research assistants. Participant understanding will be assessed, and informed written consent will be obtained from the mother that covers her participation and that of her unborn child. Participants will receive a small baby gift at the time of their infant’s birth and monetary compensation for each completed visit; compensation amounts are aligned with the local cost of living and vary based on visit length.
Study procedures
The HOPE study consists of five study visits: one before birth (Visit 1), one after birth (Visit 2), and follow-ups at 2, 4, and 6 months (Visits 3–5). Visits are estimated to range from 20–90 minutes based on visit-specific research activities outlined in Table 1. Whenever possible, visits will be conducted in person at our research center, the participant’s home, or during standard clinical appointments (based on participant preferences). When in-person visits are not feasible, questionnaire data will be collected by phone. All data will be collected by trained Samoan research staff from the Obesity, Lifestyle and Genetic Adaptations (OLaGA; www.olaga.org) research center in American Samoa, who are bilingual in Samoan and English. Many have prior healthcare training or experience, and all receive training in research ethics and study procedures before engaging in data collection.
Physical (anthropometric and clinical) measurements.
At Visits 1, 3, 4, and 5, research assistants will collect maternal height, weight, heart rate, blood pressure, and HbA1c measurements. Maternal height will be measured using a portable stadiometer (SECA, Hamburg, Germany), and weight will be recorded with a digital scale (Tanita Corporation of America, IL, USA) with participants wearing lightweight clothing. Anthropometric measurements will be taken in duplicate and averaged for analyses. Blood pressure and heart rate will be measured after a 10-minute seated rest period, with a 3-minute rest between readings, using an automated blood pressure monitor (Omron Healthcare). Three measurements will be taken and the last two will be averaged for analyses. A point-of-care device (PTS Diagnostics A1cNow+™ Systems) will be used to measure HbA1c via capillary blood collected from a finger stick after the finger is cleaned with an alcohol swab and allowed to dry. In the postpartum period only (Visits 3, 4, and 5), maternal body composition will be assessed via bioelectrical impedance analysis (Omron HBF-306C) to estimate fat mass and body fat percentage; this will be performed in only those participants without metal implants or pacemakers.
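As a minimal illustration of the averaging rules described above (duplicate anthropometric measurements averaged; three blood pressure readings taken, with the last two averaged), a Python sketch might look as follows. The function names and the tuple layout of a reading are assumptions made for this example; the study's own analyses are planned in R.

```python
def mean_of_duplicates(first: float, second: float) -> float:
    """Average two duplicate anthropometric measurements."""
    return (first + second) / 2.0


def blood_pressure_summary(readings):
    """Average the last two of three (systolic, diastolic) readings.

    `readings` is a list of three (systolic, diastolic) tuples in the order
    taken; the first reading is discarded, as described in the protocol.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three blood pressure readings")
    last_two = readings[1:]
    systolic = sum(r[0] for r in last_two) / len(last_two)
    diastolic = sum(r[1] for r in last_two) / len(last_two)
    return systolic, diastolic
```

For example, readings of 130/85, 120/80, and 124/78 mmHg would be summarized as 122/79 mmHg.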
For infants, anthropometric measurements will be collected during Visits 2–5. Infant length and weight will be assessed with a length board and digital scale (SECA, Hamburg, Germany), with infants weighed in clean diapers after zeroing the scale for diaper weight. Head circumference will be measured at the widest possible point, and abdominal circumference will be measured above the belly button to avoid the umbilical cord stump in early assessments, using a standard tape measure (SECA, Hamburg, Germany). Measurements will be taken in duplicate and averaged for analyses. Age- and sex-standardized BMI z-scores will be calculated based on World Health Organization Child Growth Standards [44,45]. Additionally, abdominal circumference-to-length ratio will be calculated as an indicator of abdominal visceral fat, determined by dividing abdominal circumference by length. Skinfold thicknesses at the triceps, biceps, subscapular, iliac crest, and thigh sites will be measured on the left side of the body (to minimize potential variability due to limb dominance) using a Harpenden caliper (West Sussex, UK), and subcutaneous fat mass (mm²) will be estimated from the sum of all skinfold measurements.
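The derived infant measures described above reduce to simple arithmetic. The Python sketch below is illustrative: the function names, dictionary keys, and units are assumptions for this example. It computes raw BMI only; converting BMI to an age- and sex-standardized z-score requires the WHO Child Growth Standards reference tables, which are not reproduced here.

```python
SKINFOLD_SITES = ("triceps", "biceps", "subscapular", "iliac_crest", "thigh")


def abdominal_circumference_to_length_ratio(abdominal_cm: float, length_cm: float) -> float:
    """Abdominal circumference divided by length (both in cm)."""
    if length_cm <= 0:
        raise ValueError("length must be positive")
    return abdominal_cm / length_cm


def sum_of_skinfolds(skinfolds_mm: dict) -> float:
    """Sum skinfold thicknesses (mm) across the five measured sites."""
    missing = [site for site in SKINFOLD_SITES if site not in skinfolds_mm]
    if missing:
        raise ValueError(f"missing skinfold sites: {missing}")
    return sum(skinfolds_mm[site] for site in SKINFOLD_SITES)


def raw_bmi(weight_kg: float, length_cm: float) -> float:
    """Weight (kg) divided by length (m) squared; not yet a z-score."""
    length_m = length_cm / 100.0
    return weight_kg / (length_m ** 2)
```

Requiring all five sites before summing avoids silently producing a smaller total when one site was not measured.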
Biospecimen and environmental sample collection.
Saliva: Saliva specimens will be collected from both mothers and infants using Oragene saliva collection kits for DNA and RNA stabilization (DNA Genotek). Mothers will provide saliva by spitting directly into the tube (DNA, OGR-600) or swabbing their mouth using a kit-supplied absorbent sponge (RNA, ORE-100) under the supervision of trained study staff. Infant saliva specimens will be collected by trained study staff using a kit-supplied absorbent sponge (DNA, OC-175; RNA ORE-100). Once collected, the samples will be mixed with the stabilizing reagent within the tubes to ensure DNA and RNA integrity. Specimens will be securely stored at room temperature until they are shipped in batches to the University of Pittsburgh for DNA and RNA extraction and purification following manufacturer protocols (DNA Genotek). Extracted DNA from maternal and infant saliva samples will be analyzed for CREBRF rs373863828 and BTNL9 rs200884524 using established TaqMan® assays (Applied Biosystems).
Umbilical cord blood: Umbilical cord blood specimens will be collected immediately following delivery by trained hospital staff at LBJ Hospital. After clamping and cleaning the umbilical cord, approximately 6 mL of blood will be drawn from the umbilical vein using an 18-gauge needle and sterile syringe. The specimen will then be aseptically transferred into a PAXgene Blood DNA tube (Fisher Scientific, Catalog B761165, 2.5 mL) and a PAXgene Blood RNA tube (Fisher Scientific, Catalog 2302101, 2.5 mL). Samples will be frozen at −20°C for 24 hours and then transferred to −80°C storage for long-term preservation. Specimens will be shipped in batches on dry ice to the University of Pittsburgh for DNA and RNA extraction, analysis, and storage. Extracted DNA from cord blood will undergo epigenome-wide DNA methylation analysis using the EPIC v2.0 chip (Illumina).
Secondary samples: Several secondary samples will be collected by trained study staff and banked for future research. Capillary blood samples will be obtained from mothers (finger stick) and infants (heel stick) using sterile lancets and Mitra devices (Trajan Scientific Americas Inc.) following manufacturer protocols, with a total volume of up to 120 microliters per sample. The filled devices will be stored at −80°C and batch-shipped to the University of Pittsburgh for future per- and polyfluoroalkyl substance profiling. Infant fecal specimens will be collected using OMNIgene-gut kits (OMR-200, DNA Genotek) following manufacturer protocols, with stool obtained from used diapers before being stored at −80°C for future microbiome profiling. Finally, home water samples will be collected from a household tap (with the tap room location [e.g., kitchen, bathroom] recorded) using pre-cleaned 250 mL polypropylene bottles. Study staff collecting samples will be instructed to avoid using hand lotion on the day of collection and to wear clean nitrile gloves to reduce contamination risk. To monitor for field contamination, blank control samples pre-filled with deionized water will be opened and re-capped at select homes during sampling. All samples will be sealed, placed on ice, and frozen at −20°C within 24 hours and stored for future per- and polyfluoroalkyl substance profiling. All samples will be de-identified to protect participant confidentiality.
Study questionnaires.
Participants will complete a series of questionnaires (available in both English and Samoan, side-by-side) that gather information across several key domains including demographic, health, social, behavioral, and environmental factors. Demographic and household questions include topics such as education, relationship status, income, and other family and living characteristics. Health-related questions focus on pregnancy history, pre-pregnancy health, and pregnancy-related conditions, addressing both physical and psychosocial aspects. Behavioral and lifestyle questionnaires assess social support, tobacco and alcohol use, physical activity, nutrition, and sleep health. Environmental questions explore household and community context. Infant-focused sections capture information on feeding practices, sleep patterns, and developmental milestones. Participants will also be invited to share feedback on their study experience and indicate whether they are interested in receiving their genetic results. Additional details about the questionnaires are provided in Table 2. All questionnaire data will be collected using REDCap, a secure, web-based platform hosted on password-protected institutional servers [65,66]. Data will be de-identified to protect participant confidentiality.
Medical record review and extraction.
Clinical data will be extracted from participants’ medical records to supplement questionnaire responses and provide additional context. Extracted information will include GDM status, laboratory results (e.g., lipid levels), medication use, anthropometric measurements, birth-related details, and infant growth. Only personnel who are authorized, trained in privacy and data security (including the Health Insurance Portability and Accountability Act), and approved by LBJ Hospital will be permitted to access and extract this information.
Participant feedback and referrals to clinical services.
Study-related measurements will not be added to the participants’ medical records. However, participants may choose to receive a personalized “results booklet,” which will be maintained and updated throughout the study for those who opt in. This booklet will include maternal physical measurements (e.g., weight, height, heart rate, blood pressure, and HbA1c) and infant measurements (e.g., weight, length, head circumference, and abdominal circumference) and growth curves (using WHO growth charts). Before providing this information, research assistants will emphasize that all measurements are being collected for research purposes only and do not replace standard clinical care.
For unexpected health findings, participants will be referred to local clinicians at LBJ Hospital. Referral criteria include HbA1c levels ≥6.5%, blood pressure ≥140/90 mmHg (local referral threshold), and severe depression symptoms (Patient Health Questionnaire-9 (PHQ-9) score ≥20 or thoughts of self-harm). In cases of severe hypertension (≥160/110 mmHg) during pregnancy, an urgent preeclampsia evaluation referral will be provided.
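The referral thresholds above can be summarized as a small decision rule. The Python sketch below is illustrative and rests on two labeled assumptions: a blood pressure threshold is treated as met when either the systolic or the diastolic value reaches it, and the function and flag names are hypothetical. It is not a clinical tool.

```python
def referral_flags(hba1c_percent=None, systolic=None, diastolic=None,
                   phq9_score=None, self_harm_thoughts=False, pregnant=False):
    """Return a list of referral reasons based on the stated thresholds."""
    flags = []
    if hba1c_percent is not None and hba1c_percent >= 6.5:
        flags.append("hba1c")
    if systolic is not None and diastolic is not None:
        # Assumption: either component reaching its threshold triggers referral.
        if pregnant and (systolic >= 160 or diastolic >= 110):
            flags.append("urgent_preeclampsia_evaluation")
        elif systolic >= 140 or diastolic >= 90:
            flags.append("blood_pressure")
    if self_harm_thoughts or (phq9_score is not None and phq9_score >= 20):
        flags.append("depression")
    return flags
```

Checking the severe-hypertension threshold first ensures that a pregnant participant with a reading of 160/110 mmHg or higher receives the urgent referral rather than the routine one.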
Upon study completion, participants may request their own and/or their infant’s CREBRF rs373863828 and/or BTNL9 rs200884524 genotype results. If requested, study staff will reinforce that these results are for research purposes only, may not be fully accurate, and cannot replace certified clinical testing. Participants will also be informed that the genetic data are still being studied and may not be directly useful at this time. The study team will reinforce that recommendations for maintaining a healthy lifestyle remain the same regardless of genotype. Participants will also be encouraged to contact the study team in the future if they have additional questions or would like to be connected with further counseling about their results.
Data management and statistical analysis
Data management, cleaning, and quality control.
All study data will undergo rigorous quality control procedures to ensure accuracy, consistency, and reliability. During data collection, quality assurance will include weekly to monthly reports, data capture review, and routine checks of data entry and protocol adherence. Automated validation rules and branching logic will be used within REDCap to help minimize entry errors in real time. All modifications to study records will be documented through REDCap’s audit trail functionality. Data management includes secure storage on encrypted, password-protected servers, version-controlled datasets, and restricted access to identifiable information.
All statistical analyses will be conducted in R [67]; examples of specific packages are given below but are subject to change as additional packages are developed. During the analysis phase, data quality diagnostics will be conducted to identify outliers, assess variable distributions, evaluate associations among variables, and examine missing data patterns. If needed, multiple imputation and sensitivity analyses will be employed to assess the impact of missingness.
For omics data, potential biases will be addressed through standardized laboratory controls, including the use of technical replicates, batch-specific standards, and internal efficiency measures. DNA methylation data will be cleaned and quality controlled using our established workflows [68,69], which currently include application of the minfi, lumi, Enmix, funNorm, and ewastools R packages [70–75], though these may be updated as new tools become available. This preprocessing will include batch correction and other normalization techniques to reduce technical variability and increase rigor.
Primary and secondary outcomes and data analysis overview.
Aim-specific primary outcomes of the HOPE study include: (1) maternal GDM status, (2) infant anthropometric and body composition measures (length, weight, age- and sex-standardized BMI z-score, head circumference, abdominal circumference, abdominal circumference-to-length ratio, and fat mass estimated from skinfold thicknesses), and (3) infant cord blood DNA methylation. Secondary maternal outcomes include BMI, heart rate, blood pressure, and HbA1c as well as postpartum fat mass estimated via bioelectrical impedance analysis. Across aims, the primary predictor/exposure of interest is maternal-infant CREBRF rs373863828 genotype combination. However, analyses will also examine maternal and infant genotypes following an iterative modeling strategy (e.g., modeling outcomes separately for maternal genotype and infant genotype, followed by joint models including both, and then models including an interaction term to test for synergistic effects). Based on the expected genotype distributions, maternal and infant genotypes will be operationalized based on the presence or absence of the minor (A) allele (e.g., AA/AG vs. GG). Linear regression models will be used for continuous outcomes (e.g., anthropometric measures), while binary logistic regression will be used for binary outcomes (e.g., GDM status), adjusting for relevant covariates (detailed below). Multiple testing correction will be made based on the correlation structure of the data using the meff function from the poolr R package [76].
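To make the genotype operationalization concrete, the sketch below codes each genotype by presence or absence of the minor (A) allele and builds the terms used across the iterative models: maternal carrier status, infant carrier status, and their product as an interaction term. It is written in Python for illustration only (the planned analyses are in R), and the function and key names are hypothetical.

```python
VALID_GENOTYPES = {"AA", "AG", "GG"}


def carries_a_allele(genotype: str) -> int:
    """Return 1 for AA/AG (A allele present) and 0 for GG."""
    normalized = "".join(sorted(genotype.strip().upper()))  # "GA" -> "AG"
    if normalized not in VALID_GENOTYPES:
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return 1 if "A" in normalized else 0


def dyad_model_terms(maternal_genotype: str, infant_genotype: str) -> dict:
    """Build predictor terms for one mother-infant dyad."""
    maternal = carries_a_allele(maternal_genotype)
    infant = carries_a_allele(infant_genotype)
    return {
        "maternal_carrier": maternal,
        "infant_carrier": infant,
        "interaction": maternal * infant,  # 1 only when both carry the A allele
    }
```

With this coding, the interaction term equals 1 only for dyads in which both mother and infant carry the A allele, which is the co-occurrence the synergy hypothesis concerns.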
For DNA methylation analyses, epigenome-wide association studies (EWAS) will be conducted to assess associations between DNA methylation and both maternal-infant genotype combinations (primary, modeled as described above) and participant characteristics (secondary) using the limma R package [77]. M values (logit-transformed beta values) will be used in linear regression models to identify site-specific associations, adjusting for appropriate covariates. Analyses will be performed both with and without adjustment for cell type heterogeneity [78,79] inferred using established deconvolution methods [80,81]. Multiple testing correction will be applied using a methylome-wide significance threshold (p < 9 × 10⁻⁸). Differentially methylated regions (DMRs) will be identified using EWAS summary statistics and the dmrff R package [82], and gene set enrichment analysis (GSEA), Gene Ontology (GO) enrichment analysis, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis will be performed using the missMethyl R package [83,84].
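As a brief illustration of the beta-to-M transformation and the significance threshold mentioned above, the Python sketch below uses the common base-2 logit convention, M = log2(beta / (1 − beta)). The base of the logarithm, the clipping constant `eps`, and the function names are assumptions for this example; the study's own pipeline relies on R packages.

```python
import math

METHYLOME_WIDE_ALPHA = 9e-8  # threshold stated in the protocol


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a methylation beta value (0-1) to an M value.

    Betas are clipped to [eps, 1 - eps] (an assumption) so that values of
    exactly 0 or 1 do not produce infinite M values.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def m_to_beta(m_value: float) -> float:
    """Inverse transformation: beta = 2^M / (1 + 2^M)."""
    return 2.0 ** m_value / (1.0 + 2.0 ** m_value)


def is_methylome_wide_significant(p_value: float) -> bool:
    """Apply the methylome-wide threshold to a single CpG-level p value."""
    return p_value < METHYLOME_WIDE_ALPHA
```

A beta of 0.5 maps to M = 0, and M values are symmetric around zero, which is one reason they are often preferred to beta values in linear models.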
For analyses focused on maternal secondary outcomes or infant outcomes, GDM status will be included as a covariate, with subgroup analyses explored to assess differential effects. Additional covariates (e.g., infant sex, maternal age, gestational age, and social drivers of health) will be included in regression models as appropriate. Additional exploratory analyses are planned to support secondary study goals and future sample collections.
Power.
The study sample of up to 180 maternal-infant dyads (accounting for anticipated attrition to achieve a final analytic sample of approximately n = 150 dyads) provides adequate power for the planned analyses. With n = 150 dyads, we have ≥80% power to detect a moderate odds ratio (OR) of 1.75 for the association between maternal-offspring genotype combination and GDM, and a small to moderate effect size (Cohen’s d = 0.28) for associations with continuous infant anthropometric measures. Importantly, these detectable effect sizes are consistent with or smaller than those reported in prior studies of CREBRF and metabolic traits in Pacific Islander populations [8,12,14,15]. Power calculations for DNA methylation analyses indicate sufficient power to detect moderate differences at individual CpG sites, even at genome-wide significance thresholds (e.g., ~3% difference at up to 25% of sites and ~7% difference at up to 75% of sites), providing critical foundational data for future studies.
Discussion
This study will support a hypothesis-driven examination of how maternal and infant genetics, particularly the CREBRF rs373863828 variant, relate to maternal and infant health in American Samoa. This work is significant because, unlike most genetic variants, which have small effect sizes and are often overshadowed by social and behavioral factors, the CREBRF variant has demonstrated large and consistent effects [7,8,12,14,85], suggesting potential as a future screening or intervention target. As such, this study may help identify biological factors that contribute to health outcomes among Pacific Islanders, which may in turn inform population-specific strategies to improve those outcomes. Additionally, the study is designed to contribute novel epigenetic and other omic data, offering insight into potential mechanisms linking maternal and infant factors to health outcomes. Importantly, while this variant is enriched in Pacific Islander populations, its involvement in key metabolic pathways suggests broader relevance for understanding metabolic health in diverse populations.
Despite its significance, this study has some limitations related to feasibility, timeline, and resource constraints. For example, recruitment is limited to the third trimester of pregnancy, restricting our ability to examine early pregnancy or pre-conception factors. Similarly, paternal data will not be collected, preventing evaluation of extended family health behaviors or factors that may influence outcomes. In addition, while the use of non-invasive biospecimens was designed to reduce participant burden and distress and increase compliance, it may constrain the types of biomarkers that can be assessed. Finally, the study follows infants only through six months of age, limiting the ability to assess longer-term developmental trajectories.
Despite these constraints, several key strengths enhance the rigor and potential impact of this study. The focus of this work on a historically excluded population group with a high rate of GDM will ensure that findings contribute to reducing health disparities and improving the representation of Pacific Islander communities in genetic and environmental health research. In addition, the longitudinal design supports a comprehensive assessment of factors that shape early infant growth. Finally, the multimodal data collection approach including questionnaires, biospecimens, and genetic and epigenetic analyses enhances the study’s ability to investigate multiple biological pathways influencing maternal and infant health.
Therefore, this study represents a critical first step toward a biologically and culturally grounded understanding of maternal/child health outcomes in American Samoa. It lays the groundwork for future studies that may include earlier pregnancy recruitment, more detailed assessments of environmental and behavioral exposures, and extended follow-up into childhood. Taken together, this line of research may help guide the development of targeted, community-informed interventions to promote maternal and child health in American Samoa and other underrepresented populations globally.
Acknowledgments
We thank the participants, both current and future, for their time and contributions to this work. We are deeply grateful to the Lyndon B. Johnson (LBJ) Tropical Medical Center Research Oversight Committee and hospital staff for their invaluable feedback on the development of this protocol and logistical considerations. Special thanks to the clinical staff at LBJ who support the study and assist with cord blood collection; your collaboration is essential to the success of this work. We also extend our heartfelt appreciation to the prenatal care clinics at LBJ and the Department of Health, and to their staff, for being so welcoming, kind, and supportive. Your generosity and partnership make this study possible. We would also like to thank Sydney Harris, Nicole Bender, and Michelle Heller from the University of Pittsburgh School of Nursing, Department of Health Promotion and Development, for their incredible support throughout the design of this study, including help with supply procurement, payment setup, and other essential infrastructure. Finally, we are grateful to the American Samoa Community Cancer Coalition (ASCCC) for their partnership and support in expanding and sustaining the local infrastructure that enables this work. ASCCC is a non-profit organization that has worked for the last 20 years to develop innovative ways to address the cancer burden. This has included obtaining a U24 to create the Indigenous Samoan Partnership to Initiate Research Excellence (INSPIRE) to build research capacity and assess functional health literacy [86]. This led to the first National Institutes of Health R01 study awarded in American Samoa, entitled Puipui Malu Manatu (protecting memories), a study to determine Alzheimer’s Disease and Related Dementia prevalence.
References
- 1. Gregory CWE, Danielle ME. Trends and Characteristics in Gestational Diabetes: United States, 2016–2020. 2022.
- 2. Hawley NL, Faasalele-Savusa K, Faiai M, Suiaunoa-Scanlan L, Loia M, Ickovics JR, et al. A group prenatal care intervention reduces gestational weight gain and gestational diabetes in American Samoan women. Obesity (Silver Spring). 2024;32(10):1833–43. pmid:39256170
- 3. Sheiner E. Gestational diabetes mellitus: long-term consequences for the mother and child grand challenge: how to move on towards secondary prevention? Front Clin Diabetes Healthc. 2020;1:546256. pmid:36993989
- 4. Gillman MW. Early infancy - a critical period for development of obesity. J Dev Orig Health Dis. 2010;1(5):292–9. pmid:25141932
- 5. Mantzorou M, Papandreou D, Pavlidou E, Papadopoulou SK, Tolia M, Mentzelou M. Maternal gestational diabetes is associated with high risk of childhood overweight and obesity: A cross-sectional study in pre-school children aged 2-5 years. Medicina (Kaunas). 2023;59(3).
- 6. Xiang AH. Diabetes in pregnancy for mothers and offspring: reflection on 30 years of clinical and translational research: the 2022 Norbert Freinkel Award lecture. Diabetes Care. 2023;46(3):482–9.
- 7. Zhang JZ, Heinsberg LW, Krishnan M, Hawley NL, Major TJ, Carlson JC, et al. Multivariate analysis of a missense variant in CREBRF reveals associations with measures of adiposity in people of Polynesian ancestries. Genet Epidemiol. 2023;47(1):105–18. pmid:36352773
- 8. Minster RL, Hawley NL, Su C-T, Sun G, Kershaw EE, Cheng H, et al. A thrifty variant in CREBRF strongly influences body mass index in Samoans. Nat Genet. 2016;48(9):1049–54. pmid:27455349
- 9. Burden HJ, Adams S, Kulatea B, Wright-McNaughton M, Sword D, Ormsbee JJ, et al. The CREBRF diabetes-protective rs373863828-A allele is associated with enhanced early insulin release in men of Māori and Pacific ancestry. Diabetologia. 2021;64(12):2779–89. pmid:34417843
- 10. Hanson RL, Safabakhsh S, Curtis JM, Hsueh WC, Jones LI, Aflague TF, et al. Association of CREBRF variants with obesity and diabetes in Pacific Islanders from Guam and Saipan. Diabetologia. 2019;62(9):1647–52.
- 11. Krishnan M, Major TJ, Topless RK, Dewes O, Yu L, Thompson JMD, et al. Discordant association of the CREBRF rs373863828 A allele with increased BMI and protection from type 2 diabetes in Māori and Pacific (Polynesian) people living in Aotearoa/New Zealand. Diabetologia. 2018;61(7):1603–13. pmid:29721634
- 12. Naka I, Furusawa T, Kimura R, Natsuhara K, Yamauchi T, Nakazawa M, et al. A missense variant, rs373863828-A (p.Arg457Gln), of CREBRF and body mass index in Oceanic populations. J Hum Genet. 2017;62(9):847–9. pmid:28405013
- 13. Ohashi J, Naka I, Furusawa T, Kimura R, Natsuhara K, Yamauchi T, et al. Association study of CREBRF missense variant (rs373863828:G > A; p.Arg457Gln) with levels of serum lipid profile in the Pacific populations. Ann Hum Biol. 2018;45(3):215–9. pmid:29877158
- 14. Krishnan M, Murphy R, Okesene-Gafa KAM, Ji M, Thompson JMD, Taylor RS, et al. The Pacific-specific CREBRF rs373863828 allele protects against gestational diabetes mellitus in Māori and Pacific women with obesity. Diabetologia. 2020;63(10):2169–76. pmid:32654027
- 15. Arslanian KJ, Fidow UT, Atanoa T, Unasa-Apelu F, Naseri T, Wetzel AI. A missense variant in CREBRF, rs373863828, is associated with fat-free mass, not fat mass in Samoan infants. Int J Obes. 2020;45(1):45–55.
- 16. Oyama S, Duckham RL, Arslanian KJ, Kershaw EE, Strayer JA, Fidow UT. Body size and composition of Samoan toddlers aged 18-25 months in 2019. Ann Hum Biol. 2021;48(4):346–9.
- 17. Bassett DR Jr. Skeletal muscle characteristics: relationships to cardiovascular risk factors. Med Sci Sports Exerc. 1994;26(8):957–66. pmid:7968429
- 18. Mengeste AM, Rustan AC, Lund J. Skeletal muscle energy metabolism in obesity. Obesity (Silver Spring). 2021;29(10):1582–95. pmid:34464025
- 19. Hong S, Chang Y, Jung HS, Yun KE, Shin H, Ryu S. Relative muscle mass and the risk of incident type 2 diabetes: A cohort study. PLoS One. 2017;12(11):e0188650.
- 20. Berry SD, Walker CG, Ly K, Snell RG, Atatoa Carr PE, Bandara D, et al. Widespread prevalence of a CREBRF variant amongst Māori and Pacific children is associated with weight and height in early childhood. Int J Obes (Lond). 2018;42(4):603–7. pmid:28928463
- 21. Fowden AL, Moore T. Maternal-fetal resource allocation: co-operation and conflict. Placenta. 2012;33(Suppl 2):e11–15.
- 22. Sferruzzi-Perri AN, López-Tello J, Fowden AL, Constancia M. Maternal and fetal genomes interplay through phosphoinositol 3-kinase(PI3K)-p110α signaling to modify placental resource allocation. Proc Natl Acad Sci U S A. 2016;113(40):11255–60.
- 23. Sinsheimer JS, Elston RC, Fu WJ. Gene-gene interaction in maternal and perinatal research. J Biomed Biotechnol. 2010;2010:853612. pmid:20798776
- 24. Beaumont RN, Warrington NM, Cavadino A, Tyrrell J, Nodzenski M, Horikoshi M. Genome-wide association study of offspring birth weight in 86 577 women identifies five novel loci and highlights maternal genetic effects that are independent of fetal genetics. Hum Mol Genet. 2018;27(4):742–56.
- 25. Hughes AE, De Franco E, Freathy RM, Flanagan SE, Hattersley AT. Monogenic disease analysis establishes that fetal insulin accounts for half of human fetal growth. J Clin Invest. 2023;133(6):e165402.
- 26. Wu L, Fang C, Zhang J, Ye Y, Zhao H. The Association between Maternal/Fetal Insulin Receptor Substrate 1 Gene Polymorphism rs1801278 and Gestational Diabetes Mellitus in a Chinese Population. Gynecol Obstet Invest. 2021;86(1–2):177–84. pmid:33895751
- 27. Zhu Z, Cao F, Li X. Epigenetic programming and fetal metabolic programming. Front Endocrinol (Lausanne). 2019;10:764. pmid:31849831
- 28. Gharipour M, Craig JM, Stephenson G. Epigenetic programming of obesity in early life through modulation of the kynurenine pathway. Int J Obes (Lond). 2025;49(1):49–53. pmid:39424650
- 29. Carlson JC, Krishnan M, Rosenthal SL, Russell EM, Zhang JZ, Hawley NL. A stop-gain variant in BTNL9 is associated with atherogenic lipid profiles. HGG Adv. 2023;4(1):100155.
- 30. Robertson RC, Manges AR, Finlay BB, Prendergast AJ. The human microbiome and child growth - first 1000 days and beyond. Trends Microbiol. 2019;27(2):131–47.
- 31. Goodrich JA, Walker DI, He J, Lin X, Baumert BO, Hu X, et al. Metabolic Signatures of Youth Exposure to Mixtures of Per- and Polyfluoroalkyl Substances: A Multi-Cohort Study. Environ Health Perspect. 2023;131(2):27005. pmid:36821578
- 32. Overview of the State - American Samoa. 2020. [cited 2025 Apr 15]. Available from: https://mchb.tvisdata.hrsa.gov/Narratives/Overview/be7a1b90-b6cb-4716-ac6e-a27b897be87a
- 33. 2020 Census Report - American Samoa: Census Population by Village [Internet]. [cited 2025 Apr 15]. Available from: https://www2.census.gov/programs-surveys/decennial/2020/data/island-areas/american-samoa/population-and-housing-unit-counts/american-samoa-phc-table02.pdf
- 34. 2020 Census Report - American Samoa: Population of American Samoa 2010 and 2020 [Internet]. [cited 2025 Apr 15]. Available from: https://www2.census.gov/programs-surveys/decennial/2020/data/island-areas/american-samoa/population-and-housing-unit-counts/american-samoa-phc-table01.pdf
- 35. World Bank Group country classifications by income level for FY24 (July 1, 2023- June 30, 2024) [Internet]. [cited 2025 Apr 15]. Available from: https://blogs.worldbank.org/en/opendata/new-world-bank-group-country-classifications-income-level-fy24
- 36. The World Bank. GDP per capita (current US$) - United States, American Samoa [Internet]. [cited 2024 Feb 9]. Available from: https://data.worldbank.org/indicator/NY.GDP.PCAP.CD?locations=AS
- 37. World Bank. GDP per capita (current US$) - United States [Internet]. [cited 2025 Apr 15]. Available from: https://data.worldbank.org/indicator/NY.GDP.PCAP.CD?locations=US
- 38. Health Resources & Services Administration. Medically Underserved Areas/Populations [Internet]. [cited 2025 Apr 15]. Available from: https://data.hrsa.gov/ExportedMaps/MUA/HGDWMapGallery_MUA.pdf
- 39. MACPAC. Medicaid and CHIP in American Samoa. [cited 2025 Apr 15]. Available from: https://www.macpac.gov/wp-content/uploads/2019/06/Medicaid-and-CHIP-in-American-Samoa.pdf
- 40. Alefaio-Tugia S. Fa’asinomaga: Identity of the Heart. In: Alefaio-Tugia S, editor. Pacific-Indigenous Psychology: Galuola, A NIU-Wave of Psychological Practices [Internet]. Cham: Springer International Publishing; 2022. pp. 3–21. Available from:
- 41. McCutchan-Tofaeono J. Teu le va: “We over me” a brief overview of mental health amongst Samoans in American Samoa. Psychol Oceania Caribbean. 2022.
- 42. Vaioleti T. Talanoa: Differentiating the Talanoa Research Methodology from Phenomenology, Narrative, Kaupapa Maori and Feminist Methodologies. Te Reo. 2013;56/57:191–212.
- 43. Matagi CE, Worthington JK, Palakiko DM. Using Talanoa, a Pan Pacific Indigenous Approach, to Identify Solutions to Public Health Issues. Hawaii J Health Soc Welf. 2023;82(10 Suppl 1):14–7.
- 44. Lazarus M, James B. MUACz: Generate MUAC and BMI z-Scores and Percentiles for Children and Adolescents [Internet]. Available from: https://cran.r-project.org/web/packages/MUACz/index.html
- 45. World Health Organization. Body mass index-for-age (z-scores) [Internet]. [cited 2025 Apr 15]. Available from: https://www.who.int/toolkits/child-growth-standards/standards/body-mass-index-for-age-bmi-for-age
- 46. Ware JE, Kosinski M, Dewey JE, Gandek B, Kisinski M, Ware JE. How to score and interpret single-item health status measures: a manual for users of the SF-8™ Health Survey. 2001. Available from: https://api.semanticscholar.org/CorpusID:78751834
- 47. Yiengprugsawan V, Kelly M, Tawatsupa B. SF-8™ Health Survey. In: Michalos AC, editor. Encyclopedia of Quality of Life and Well-Being Research [Internet]. Dordrecht: Springer Netherlands; 2014. pp. 5940–2. Available from:
- 48. Williams DR, Yan Yu, Jackson JS, Anderson NB. Racial Differences in Physical and Mental Health: Socio-economic Status, Stress and Discrimination. J Health Psychol. 1997;2(3):335–51. pmid:22013026
- 49. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13.
- 50. Mew EJ, Lowe SR, Galea’i A, Iopu F, Anderson J, Naseri J, et al. Cross-cultural adaptation of mental health screening instruments for Samoan adolescents. PLOS Mental Health. 2025;2(2):1–28.
- 51. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav. 1983;24(4):385–96. pmid:6668417
- 52. Brown KW, Ryan RM. The benefits of being present: mindfulness and its role in psychological well-being. J Pers Soc Psychol. 2003;84(4):822–48. pmid:12703651
- 53. Zimet GD, Dahlem NW, Zimet SG, Farley GK. The Multidimensional Scale of Perceived Social Support. J Personal Assess. 1988;52(1):30–41.
- 54. O’Brien K, Wortman CB, Kessler RC, Joseph JG. Social relationships of men at risk for AIDS. Soc Sci Med. 1993;36(9):1161–7. pmid:8511645
- 55. Chasan-Taber L, Schmidt MD, Roberts DE, Hosmer D, Markenson G, Freedson PS. Development and validation of a pregnancy physical activity questionnaire. Med Sci Sports Exerc. 2004;36(10):1750–60.
- 56. Chasan-Taber L, Park S, Marcotte RT, Staudenmayer J, Strath S, Freedson P. Update and novel validation of a pregnancy physical activity questionnaire. Am J Epidemiol. 2023;192(10):1743–53. pmid:37289205
- 57. Craig CL, Marshall AL, Sjöström M, Bauman AE, Booth ML, Ainsworth BE, et al. International physical activity questionnaire: 12-country reliability and validity. Med Sci Sports Exerc. 2003;35(8):1381–95. pmid:12900694
- 58. Colby S, Zhou W, Allison C, Mathews AE, Olfert MD, Morrell JS. Development and validation of the short healthy eating index survey with a college population to assess dietary quality and intake. Nutrients. 2020;12(9):2611.
- 59. Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. 1991;14(6):540–5. pmid:1798888
- 60. Coates J, Swindale A, Bilinsky P. Household Food Insecurity Access Scale (HFIAS) for Measurement of Food Access: Indicator Guide [Internet]. 2007 [cited 2025 Apr 16]. Available from: https://www.fantaproject.org/monitoring-and-evaluation/household-food-insecurity-access-scale-hfias
- 61. Squires J, Bricker D. Ages & Stages Questionnaires®. 3rd ed. APA PsycTests. Available from:
- 62. Sadeh A. A brief screening questionnaire for infant sleep problems: validation and findings for an Internet sample. Pediatrics. 2004;113(6):e570–7. pmid:15173539
- 63. Sadeh A, Mindell JA, Luedtke K, Wiegand B. Sleep and sleep ecology in the first 3 years: a web-based study. J Sleep Res. 2009;18(1):60–73. pmid:19021850
- 64. Mindell JA, Gould RA, Tikotzy L, Leichman ES, Walters RM. Norm-referenced scoring system for the Brief Infant Sleep Questionnaire - Revised (BISQ-R). Sleep Med. 2019;63:106–14. pmid:31610383
- 65. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–81. pmid:18929686
- 66. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: Building an international community of software platform partners. J Biomed Inform. 2019;95:103208. pmid:31078660
- 67. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2021. Available from: https://www.R-project.org/
- 68. Liu S, Liu D, Bender CM, Erickson KI, Sereika SM, Shaffer JR, et al. Associations between DNA methylation and cognitive function in early-stage hormone receptor-positive breast cancer patients. medRxiv: the Preprint Server for Health Sciences. 2024:2024.11.17.24317299.
- 69. Arockiaraj AI, Liu D, Shaffer JR, Koleck TA, Crago EA, Weeks DE. Methylation Data Processing Protocol and Comparison of Blood and Cerebral Spinal Fluid Following Aneurysmal Subarachnoid Hemorrhage. Front Genet. 2020;11:671.
- 70. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30(10).
- 71. Du P, Kibbe WA, Lin SM. lumi: a pipeline for processing Illumina microarray. Bioinformatics. 2008;24(13):1547–8.
- 72. Fortin JP, Triche TJJ, Hansen KD. Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics. 2017;33(4):558–60.
- 73. Heiss JA, Just AC. Identifying mislabeled and contaminated DNA methylation microarray data: an extended quality control toolset with examples from GEO. Clin Epigenetics. 2018;10:73. pmid:29881472
- 74. Xu Z, Niu L, Li L, Taylor JA. ENmix: a novel background correction method for Illumina HumanMethylation450 BeadChip. Nucleic Acids Res. 2016;44(3):e20. pmid:26384415
- 75. Xu Z, Niu L, Taylor JA. The ENmix DNA methylation analysis pipeline for Illumina BeadChip and comparisons with seven other preprocessing pipelines. Clin Epigenetics. 2021;13(1):216. pmid:34886879
- 76. Cinar O, Viechtbauer W. The poolr package for combining independent and dependent p values. J Stat Soft. 2022;101(1):1–42.
- 77. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47. pmid:25605792
- 78. Bakulski KM, Feinberg JI, Andrews SV, Yang J, Brown S, McKenney LS, et al. DNA methylation of cord blood cell types: Applications for mixed cell birth studies. Epigenetics. 2016;11(5):354–62. pmid:27019159
- 79. Qi L, Teschendorff AE. Cell-type heterogeneity: Why we should adjust for it in epigenome and biomarker studies. Clin Epigenetics. 2022;14(1):31. pmid:35227298
- 80. Gervin K, Salas LA, Bakulski KM, van Zelm MC, Koestler DC, Wiencke JK, et al. Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data. Clin Epigenetics. 2019;11(1):125. pmid:31455416
- 81. Schmidt M, Maié T, Dahl E, Costa IG, Wagner W. Deconvolution of cellular subsets in human tissue based on targeted DNA methylation analysis at individual CpG sites. BMC Biol. 2020;18(1):178. pmid:33234153
- 82. Suderman M, Staley JR, French R, Arathimos R, Simpkin A, Tilling K. Dmrff: identifying differentially methylated regions efficiently with power and control. Available from: https://www.biorxiv.org/content/10.1101/508556v1
- 83. Phipson B, Maksimovic J, Oshlack A. missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform. Bioinformatics. 2016;32(2):286–8.
- 84. Maksimovic J, Oshlack A, Phipson B. Gene set enrichment analysis for genome-wide DNA methylation data. Genome Biol. 2021;22(1):173. pmid:34103055
- 85. Rivara A, Heinsberg LW, Carlson JC, Pomer A, Naseri T, Reupena MS, et al. Psychosocial correlates of HbA1c among adult Samoans without diabetes. PLOS Mental Health. 2025.
- 86. Tofaeono V, Ka’opua LSI, Sy A, Terada T, Taliloa-Vai Purcell R, Aoelua-Fanene S, et al. Research Capacity Strengthening in American Samoa: Fa’avaeina le Fa’atelega o le Tomai Sa’ili’ili i Amerika Samoa. Br J Soc Work. 2020;50(2):525–47. pmid:32280149