Development of an affirming and customizable electronic survey of sexual and reproductive health experiences for transgender and gender nonbinary people

To address pervasive measurement biases in sexual and reproductive health (SRH) research, our interdisciplinary team created an affirming, customizable electronic survey to measure experiences with contraceptive use, pregnancy, and abortion for transgender and gender nonbinary people assigned female or intersex at birth and cisgender sexual minority women. Between May 2018 and April 2019, we developed a questionnaire with 328 items across 10 domains including gender identity; language used for sexual and reproductive anatomy and events; gender affirmation process history; sexual orientation and sexual activity; contraceptive use and preferences; pregnancy history and desires; abortion history and preferences; priorities for sexual and reproductive health care; family building experiences; and sociodemographic characteristics. Recognizing that the words people use for their sexual and reproductive anatomy can vary, we programmed the survey to allow participants to input the words they use to describe their bodies, and then used those customized words to replace traditional medical terms throughout the survey. This process-oriented paper aims to describe the rationale for and collaborative development of an affirming, customizable survey of the SRH needs and experiences of sexual and gender minorities, and to present summary demographic characteristics of 3,110 people who completed the survey. We also present data on usage of customizable words, and offer the full text of the survey, as well as code for programming the survey and cleaning the data, for others to use directly or as guidelines for how to measure SRH outcomes with greater sensitivity to gender diversity and a range of sexual orientations.


Introduction
The ways in which we conduct research have implications for data quality and inferential value. [1] The quality and completeness of participant-reported information is intimately related to participants' direct experience. [2][3][4] Participant experience, in turn, is influenced by whether participants feel respected, confident in and trusting of the study investigators, and invested in the study topic. [5][6][7][8][9][10] One way that researchers can establish trust with participants is by designing research questions that resonate with participants' lived experiences. Gender identity (defined as one's internal sense of being a man, woman, both, neither of these, or something else) is a powerful determinant of one's lived experience. Gender identity can be consistent with or different from the sex that someone was assigned at birth. Sex assigned at birth is typically based on external genitalia, and is recorded as female, intersex, or male. "Transgender" is an umbrella term for people whose gender identity differs from the sex assigned to them at birth, while "cisgender" is a term for people whose gender identity aligns with their sex assigned at birth. "Nonbinary" is an umbrella term for gender identities that are not exclusively man or woman; rather, they could be a blend of both, or neither. Other words that people use for nonbinary identities include agender, bigender, gender-expansive, or genderqueer. An estimated 4.5% of the United States population, or 11.3 million people, [11] identifies as a sexual and/or gender minority (SGM). [12] At least 1.4 million transgender and gender nonbinary (TGNB) people are included in this group, and almost certainly more. [13] Gender identity and sexual orientation, however, are distinct.
Gender identity refers to a person's sense of self, while sexual orientation (often labeled as asexual, bisexual, gay, lesbian, pansexual, queer, straight, or many others) encompasses how someone identifies sexually, to whom someone is attracted romantically and/or sexually, and with whom someone engages sexually. Sexual orientation and its constituent domains of identity, attraction, and behavior are, independently and in combination, strong determinants of a person's lived experience.
Gender identity and sexual orientation are often conflated. Much sexual and reproductive health (SRH) research has made assumptions about the gender identity and sexual orientation of research participants and their sexual partners that raise concerns about data quality. [3-5, 14, 15] These problematic assumptions include: (1) research participants described as "women" explicitly include only cisgender women, thereby ignoring transgender women and nonbinary people; (2) the sexual and/or romantic partners of "women" are only cisgender men (and not cisgender women, transgender men, transgender women, nonbinary people, and/or those of another gender identity); and (3) sexual activity is assumed to refer only to sex that could lead to pregnancy or specific presentations of sexually transmitted infections, ignoring other forms of sex that people have. Examples of these assumptions are easily found in widely used demographic, public health, and SRH surveys, both nationally and internationally. [14] These assumptions can induce bias in SRH research in at least two ways. First, they can induce selection bias if researchers do not appropriately conceptualize the target population and/or define eligibility criteria with sufficient detail to recruit a sample from this target population. For instance, when designing a study to evaluate risk of unintended pregnancy, the target population should include all people capable of pregnancy. However, due to lack of awareness, researchers may not consider pregnancy as a possibility for anyone other than a cisgender woman. Consequently, SRH researchers imprecisely describe eligibility criteria as "women of reproductive age" instead of more relevant criteria: the presence of a uterus in someone whose endogenous or exogenously supported hormonal milieu can carry a pregnancy.
The data may systematically miss factors related to chance of pregnancy among transgender men and nonbinary people, who are already known to face substantial barriers to preventative health care. [16] As a result, SRH research across subject areas may be systematically missing segments of the target population, while the health needs of a marginalized community remain inadequately characterized.
Even when SRH researchers accurately define eligibility criteria and enroll an unbiased sample, study questions that make heteronormative (i.e., the belief that all people are heterosexual [17]) and cisnormative (i.e., the expectation that all people are cisgender [18]) assumptions or use imprecise language about sexual activity can introduce measurement bias. As one example, the National Survey of Family Growth (NSFG) in 2015-2017 assumed involvement of a "he": "And what about your (husband/partner) at the time? At the time you had your procedure, had he had all the children he wanted?" [emphasis added]. [19] The use of the pronoun "he" makes clear that the study investigators assume that the respondent is in a heterosexual relationship, and that the respondent's partner uses he/him/his pronouns. Modules within the national Behavioral Risk Factor Surveillance System (BRFSS) include examples of imprecision regarding sexual activity. In the 2017 Preconception Health/Family Planning module, a question asks: "Did you or your partner do anything the last time you had sex to keep you from getting pregnant?" [20] Given the framing of the question, the investigators were interested only in sexual activity that can lead to pregnancy. However, the question does not specify the kind of "sex." It might be interpreted in different ways depending on what "sex" means to a given participant; this could include sexual activity that leads to pregnancy and sexual activity that cannot lead to pregnancy (e.g., sex between two cisgender women where no sperm is released in or near a vagina). 
These question design shortcomings could lead participants to (1) skip questions that seem irrelevant to their personal experiences; (2) answer a question differently than intended due to different definitions between participants and study investigators; or (3) drop out of a study that does not allow them to accurately convey their experiences or that reflects fundamental misunderstandings about their lives. Taken together, these situations could lead to more missing data, more response misclassification, or both.
As an interdisciplinary team of clinicians, researchers, and advocates, we recognize these potential biases and are concerned about their potential impact on SRH data and on participants. Because of cisnormative and heteronormative assumptions, participants may feel that SRH research is irrelevant or offensive and that it erases many lived experiences, perpetuating critical knowledge gaps regarding the needs of an underserved population. Thus, we set out to co-create a survey to improve the assessment of SRH experiences of SGM people. Nearly all perinatal, contraception, and abortion research to date has focused exclusively on individuals assigned female sex at birth (AFAB) who are presumed to be cisgender and heterosexual. We sought to fill in the gaps within available research and methodologies. The objective of this process-oriented paper is to describe the collaborative development of an electronic, quantitative survey co-created by interdisciplinary research and community advisory teams to improve the relevance, precision, and affirming nature of SRH research for SGM people, and to provide the full text of the final survey for others to use and tailor for their own research.

Composition of study team
We formed an interdisciplinary team of researchers with diverse gender identities and sexual orientations including a communications specialist, an epidemiologist, an obstetrician-gynecologist, a family medicine physician, an internist, qualitative researchers, a social worker, psychologists, and a reproductive health advocate. Each member of the team contributed expertise necessary for developing a customizable survey to measure and affirm SRH experiences across the gender spectrum and acknowledge a diversity of sexual orientations.

Formative qualitative research
To inform selection and development of survey domains, we conducted 27 in-depth interviews between October 2017 and January 2018 with stakeholders in the field of SRH research and care for TGNB people AFAB. As described in detail elsewhere, [21] these stakeholders included clinicians, researchers, advocates, and patients, including those who identified as TGNB across all categories. To guide survey development, we focused analysis on responses to questions about SRH research gaps and priority SRH topics. Participants highlighted several priority issues including broader sexual health information, fertility and family building, sexually transmitted infections, pregnancy prevention, and the need for evidence-based patient-education materials. [21]

Recruitment and involvement of a community advisory team
In April 2018, we posted recruitment messages on social media groups (S1 File) and other community websites designed and run by TGNB people to recruit a community advisory team (CAT) for the study. The messages encouraged interested people to contact the study team. Approximately 20 candidates expressed interest; we selected five individuals to maximize CAT diversity in terms of gender identity, racial/ethnic identity, geography, and age. Included members identified as genderqueer, genderfluid, nonbinary, and transgender man, as well as Ashkenazi, Asian, Black, Latinx, and White, and resided in the Northeastern, Southern, and Western regions of the United States. All are co-authors of this manuscript. We provided each member with information detailing the expected task and time contributions as well as the schedule for compensation. Over the 12-month survey development period, we paid each CAT member $750 for their time and expertise. CAT members participated in quarterly one-hour virtual meetings; provided high-level feedback on survey domains; provided detailed feedback on question wording, answer choices, and ordering; revised and informed recruitment strategies; and helped prioritize planned analyses.

Iterative review and editing of survey questions
Research team expertise, findings from a literature review, formative qualitative data, [21] and consultations with CAT members informed survey domain selection. Survey domains (Table 1) and questions used and/or modified existing measures where possible from the U.S. Transgender Survey (USTS), the Behavioral Risk Factor Surveillance System (BRFSS), [22] compiled measurement work from the National Institutes of Health Sexual & Gender Minority Research Office, [23,24] the Guttmacher 2014 Abortion Patient Survey, [25] the Nurses' Health Study 3, [26] the Growing Up Today Study, [27] Pregnancy Attitudes Timing and How (PATH) questions, Pregnancy Risk Assessment Monitoring System (PRAMS), [28] the Texas Policy Evaluation Project, [29] and guidance for clinicians regarding preconception care. [30] Within each survey domain, we created revised and/or new questions to measure the concept of interest without heteronormative and/or ciscentric bias in question wording (S2 and S3 Files). After finalizing the survey domains, the CAT and research team drafted the survey questions and structure. The research team then submitted survey materials to The Population Research in Identity and Disparities for Equality (PRIDE) Study (pridestudy.org) Research Advisory Committee (RAC) (pridestudy.org/team) and PRIDEnet Participant Advisory Committee (PAC) (pridestudy.org/pridenet) for review and input as part of a formal ancillary study collaboration with The PRIDE Study (pridestudy.org/collaborate). The PRIDE Study, based at Stanford University, is a community-engaged, dynamic, online longitudinal cohort of SGM people that is made possible by lesbian, gay, bisexual, trans, and queer (LGBTQ+) community involvement in every step of the research process. Over approximately twelve months between May 2018 and April 2019, the study team conducted multiple rounds of revisions of survey question wording and order based on feedback from CAT members and the RAC and PAC.
This work included making definitions for clinical terms more accessible, shifting the framing of questions of sexual attraction, and adding precision to questions of sexual activity (Table 2).

Programming and testing the survey
To create a highly customized survey that could be distributed widely, we used Qualtrics (Qualtrics LLC; Provo, UT) to develop an electronic questionnaire with participant-customized language for candidate words as well as complex display and skip logic. We recognized that people use varied words for their sexual and reproductive anatomy, and that for some, the words used to describe their bodies may induce either gender dysphoria or feelings of empowerment, depending on how well the words align with a person's sense of their own bodies. [31,32] Consequently, we programmed the survey to allow participants to input words that they use to describe their bodies, and then have those customized words replace traditional medical terms throughout the survey. In using customizable language, we aimed to create a more personalized, understandable survey that affirmed respondents' lived experiences.
To operationalize this, we programmed questions early in the survey that asked participants to provide the words they use to talk about their bodies (breasts, penis, sperm, uterus, vagina); physiological processes (menses, pregnancy); and medical procedures and treatments (abortion, contraception). The research team selected these nine customizable words because these words are known to be sensitive for particular groups, appeared frequently in the survey, and are used often by clinicians and researchers to discuss SRH issues. For each customizable word, participants indicated a preference for (1) the medical term (e.g., vagina), (2) a customized word input by the participant (e.g., front hole), or (3) a preference not to say (in which case, the medical term displayed by default) (Table 2, Row 1). We provided definitions for each customizable word that were gender-neutral and written at an accessible reading level. For those participants who provided their own word, this word was used throughout the survey each time the candidate medical term would have been used. For instance, if someone preferred "front hole" to the original candidate term "vagina," any question that used "vagina" would appear as "front hole" for that participant. Individual survey questions used up to three customizable words, which led to lengthy combinatorial display logic to ensure that each participant saw the correct words based on their stated customized words (Fig 1). We include Stata code for collapsing multiple copies of customizable-word questions to a single variable in the data cleaning phase in S3 File.
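The word substitution and the later collapsing step can be illustrated with a minimal Python sketch. It is not the authors' Qualtrics logic or their Stata code (which is in S3 File). It assumes that each customizable term in a question is shown either as the medical term or as the participant's own word, so that a question with three customizable terms has 2 x 2 x 2 = 8 display variants, and that each participant answers at most one copy. The function names, the question template, and the column names are hypothetical.

```python
from itertools import product

# The nine customizable medical terms named in the text.
MEDICAL_TERMS = ["breasts", "penis", "sperm", "uterus", "vagina",
                 "menses", "pregnancy", "abortion", "contraception"]


def displayed_word(term, preferences):
    """Return the word to display for one medical term.

    `preferences` maps a medical term to the participant's own word. A missing
    or blank entry stands for "medical term" or "prefer not to say", and in
    both cases the medical term is displayed.
    """
    custom = preferences.get(term, "")
    return custom.strip() or term


def render(template, preferences):
    """Fill {term} placeholders in a question template (hypothetical format)."""
    words = {term: displayed_word(term, preferences) for term in MEDICAL_TERMS}
    return template.format(**words)


def display_variants(terms_in_question):
    """Enumerate medical-vs-custom combinations for the terms in one question."""
    return list(product(("medical", "custom"), repeat=len(terms_in_question)))


def collapse_copies(row, copy_columns):
    """Collapse the copies of one question into a single answer.

    Assumes (this sketch's assumption) that display logic showed each
    participant at most one copy, so at most one column is non-missing.
    """
    answers = [row[c] for c in copy_columns if row.get(c) is not None]
    if len(answers) > 1:
        raise ValueError("more than one copy of the question was answered")
    return answers[0] if answers else None
```

For example, `render("Have you ever had pain in your {vagina}?", {"vagina": "front hole"})` shows the participant's own word, while an empty preference dictionary leaves the medical term in place. The growth of `display_variants` with each added term is one way to see why questions with four or more customizable words become hard to program by hand.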
We conducted extensive survey testing to ensure that participants were shown the correct questions based on gender identity, medical history, and customizable words. To measure current gender identity, we followed established guidelines to ask two questions: a multiple choice current gender identity question and a question to assess sex assigned at birth. [33][34][35] However, given the difficulty of representing all gender identities in a multiple choice question, community members emphasized the importance of allowing participants to first freely self-identify with a write-in response, followed by a multiple choice "select all that apply" question, and asking about sex assigned at birth last. The final measure of gender identity that we used first asked participants to self-identify current gender identity with an open-text question, and then to select all that apply from a list of gender identities that included: agender, cisgender man, cisgender woman, genderqueer, man, nonbinary, transgender man, transgender woman, Two-Spirit (specify if desired), woman, another gender (specify if desired), and prefer not to say. Participants then reported sex assigned at birth with answer choices: female, male, not listed (specify if desired), and prefer not to say. We followed a similar process for our sexual orientation questions; we modified a commonly used measure of sexual orientation and expanded it to reflect a greater diversity of sexual orientations. The modified question that we used reads: "Do you consider yourself to be: asexual, bisexual, gay, lesbian, pansexual, queer, questioning, same-gender-loving, straight/heterosexual, or another sexual orientation." The survey prompted participants to select all that apply, in contrast to commonly used questions that allow only a single answer and ask only about attraction to binary gender identities.
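A "select all that apply" question is typically stored as one indicator variable per answer choice, which is one reason a survey's variable count exceeds its question count. The sketch below shows that encoding for the gender identity list above. It is an illustration only: the column-naming scheme and function names are assumptions, not the layout of the study dataset.

```python
# Answer choices for the "select all that apply" gender identity question,
# as listed in the text (write-in specifications omitted).
GENDER_OPTIONS = [
    "agender", "cisgender man", "cisgender woman", "genderqueer", "man",
    "nonbinary", "transgender man", "transgender woman", "Two-Spirit",
    "woman", "another gender", "prefer not to say",
]


def column_name(prefix, option):
    """Build a variable name for one answer choice (hypothetical scheme)."""
    return prefix + "_" + option.lower().replace("-", "_").replace(" ", "_")


def encode_select_all(selected, options=GENDER_OPTIONS, prefix="gender"):
    """Return one 0/1 indicator per answer choice for a single participant."""
    unknown = set(selected) - set(options)
    if unknown:
        raise ValueError(f"unlisted options: {sorted(unknown)}")
    return {column_name(prefix, o): int(o in selected) for o in options}
```

A participant who selects both "nonbinary" and "genderqueer" keeps both identities in the data, rather than being forced into a single category.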

Recruitment of study participants
The target population for this survey included sexual and/or gender minorities (SGM) who were assigned female or intersex at birth. Eligible study participants lived in the United States or its territories, were assigned female or intersex at birth, could read and understand English, were 18 years or older, and either (1) were of transgender, nonbinary, or gender-expansive experience with any sexual orientation, or (2) identified as a sexual minority cisgender woman. We recruited participants via two approaches. First, we distributed the survey to all members of The Population Research in Identity and Disparities for Equality (PRIDE) Study (pridestudy.org). At the time of survey launch on The PRIDE Study, the cohort had 13,900 enrolled participants. The survey appeared on The PRIDE Study participant dashboard, advertised as a study on sexual and reproductive health. Any interested participant within The PRIDE Study could click on the survey and begin the screening questions for eligibility. Second, we recruited participants from the general public via postings on social media, emails to community listservs, fliers at LGBTQ+ community events, and word of mouth and boosted snowball sampling as facilitated through the social media of CAT members and their social networks. While we recruited both populations of interest through The PRIDE Study dashboard, for those recruited from the general public we limited recruitment to TGNB individuals (not cisgender sexual minority women) between the ages of 18 and 45 years, to focus on those most likely to be of reproductive age.
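The eligibility rules above can be written as a single screening function. The following Python sketch restates them under stated assumptions: the field names, the two recruitment-source labels, and the use of pre-derived boolean flags for TGNB experience and sexual minority cisgender woman identity are all hypothetical, and the study's actual screener was implemented in Qualtrics.

```python
from dataclasses import dataclass


@dataclass
class ScreenerResponse:
    """Screening answers for one person (field names are hypothetical)."""
    age: int
    lives_in_us_or_territories: bool
    reads_english: bool
    sex_assigned_at_birth: str  # e.g., "female", "intersex", "male"
    is_tgnb: bool  # transgender, nonbinary, or gender-expansive experience
    is_sexual_minority_cis_woman: bool
    source: str  # "pride_study" or "general_public"


def is_eligible(r):
    """Apply the eligibility criteria described in the text."""
    meets_base_criteria = (
        r.age >= 18
        and r.lives_in_us_or_territories
        and r.reads_english
        and r.sex_assigned_at_birth in {"female", "intersex"}
    )
    if not meets_base_criteria:
        return False
    if r.source == "general_public":
        # General-public recruitment was limited to TGNB people aged 18-45.
        return r.is_tgnb and r.age <= 45
    # The PRIDE Study dashboard recruited both populations of interest.
    return r.is_tgnb or r.is_sexual_minority_cis_woman
```

Under these rules, a 50-year-old nonbinary person assigned female at birth would be eligible through The PRIDE Study dashboard but not through general-public recruitment, which matches the age-limit frustration some participants later reported.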

Ethical review
The Institutional Review Board at Stanford University (#: 49215, 48707) and at the University of California, San Francisco (#:18-24934) reviewed and approved the study. The PRIDE Study Research Advisory Committee (RAC) and The PRIDE Study Participant Advisory Committee (PAC) reviewed, provided input, and approved the design and conduct of this study. All participants provided written informed consent, recorded in an electronic survey form, before beginning the study survey.

Final survey instrument
The final survey included 328 survey questions, corresponding to 1,423 variables in the dataset resulting from multiple copies of customized word questions, and multiple 'select all that apply' question structures. The final survey domains are listed in Table 1, and the survey is included in Appendix 1.

Participant characteristics
A total of 5,005 people initiated the survey; of these, 3,110 were determined to be eligible and completed the survey (Fig 2). The majority of participants were under the age of 40 years, and reported multiple gender identities and sexual orientations (Table 3). Participants resided across the United States.

Participant response to customized words & survey design
Across all nine customizable medical terms offered in the survey, 708 (23%) of 3,110 participants who responded to the preferred word questions provided at least one customized response, and 315 (10%) provided two or more. The three medical terms for which participants most frequently provided a customized word were "breasts" (514; 17%), "vagina" (258; 8%), and "period" (212; 7%). In an open-ended question at the end of the survey, participants were provided space to share any feedback with the research team. Participants provided detailed feedback on study eligibility and exclusion criteria, survey content, and technical issues related to survey programming and format. Regarding eligibility criteria, some participants expressed frustration with upper age limits for the sample recruited from the general public. In terms of content, participants identified answer options that they felt were missing and expressed appreciation for question wording and the option for customizable language. Participants shared comments that highlighted the impact of the collaborative, affirming, customized nature of the survey. Selected responses are listed in Table 4.
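The reported percentages follow directly from the counts and the denominator of 3,110 respondents. A short Python check, using only the numbers given above (the dictionary labels are this sketch's own), is:

```python
RESPONDENTS = 3110  # participants who responded to the preferred word questions

# Counts of participants who provided a customized word, as reported in the text.
CUSTOM_WORD_COUNTS = {
    "at least one term": 708,
    "two or more terms": 315,
    "breasts": 514,
    "vagina": 258,
    "period": 212,
}


def percent(count, total):
    """Return a count as a whole-number percentage of the total."""
    return round(100 * count / total)


shares = {label: percent(n, RESPONDENTS) for label, n in CUSTOM_WORD_COUNTS.items()}
for label, share in shares.items():
    print(f"{label}: {CUSTOM_WORD_COUNTS[label]} of {RESPONDENTS} ({share}%)")
```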

Conclusions
Recognizing the exclusion of SGM people from most traditional SRH research and the additional bias imposed by measurement error from imprecise survey measures, we developed a customizable, electronic, SRH-related survey to be affirming, empowering, and relevant to SGM participants. The resulting survey and lessons learned may be useful to researchers measuring health outcomes tied to sexual behavior, sexuality, and/or reproduction. In appendices, we offer the final text of the questionnaire, as well as programming details and code for cleaning the resulting data, to advance the field of survey design by creating a more inclusive and personalized research experience. Findings specific to the study research questions on the family planning needs and experiences of TGNB people, as well as cisgender sexual minority women, will be presented in manuscripts that are currently in development. Core lessons learned included the essential role of community input from initial conceptualization to final implementation and the importance of centering the participant experience in survey design. This survey design process and the resulting survey also have limitations. Engaging with multiple stakeholders and rounds of language revisions was lengthy and expensive. Due to the prohibitively complex nature of programming questions with four or more customizable piped-in words, we had to restrict our questions to at most three customizable terms. At times, this artificially constrained the questions we ideally would have asked or forced the use of medical terms, even when participants had told us this was not their preference. Importantly, our survey should reduce measurement bias in SRH research through more inclusive and precise questions and response options. The survey in and of itself, however, does not directly address the problem of selection bias in SRH research.
Investigators need to be mindful of gender diversity and differences in sexual orientation when defining study eligibility criteria to directly reduce selection bias. However, the hope is that more inclusive and precise surveys will indirectly attenuate selection bias through creating more inclusive environments that foster participation from SGM participants, and simultaneously reduce drop-off from surveys once initiated. We note a number of strengths of the process and resulting questionnaire. Chiefly, the ability to use individualized, affirming, customized language for sexual and reproductive body parts and processes may avoid gender dysphoria evoked for some by medical terms. Further, modified measures of gender identity, sexual orientation, and pregnancy desires and experiences were developed to center the experiences of TGNB people (many with marginalized sexual orientations too) and to offer new, inclusive approaches to measurement of core SRH events.
Table 4. Selected participant responses.
"I love LOVED the use of my preferred language for body parts in the questions. After I entered that language I just assumed that it would be researched and that was that. Seeing it used to take care of me personally as a participant was really meaningful. It was a tiny way that I felt affirmed."
"I didn't realize the language questions were going to make the whole survey read so awkwardly. I would have just left it as "birth control" b/c I know what you mean by that, instead of trying to explain what language I use, which made the questions read confusingly. Also, I was kind of upset that the survey was advertised as for trans folks, but was really just for AFAB people. / / That said, I really appreciated the chance to skip over sections like the one about sexual assault. Thanks."
"I really appreciated the wide range of options available for answering most questions. It made me feel way less frustrated than most surveys where I end up checking things that don't really fit because of the lack of options."
"I recently went through an egg retrieval procedure. It was challenging on every possible level. I am disappointed that this survey did not ask any questions about fertility preservation and assisted reproduction. Aren't these a part of reproductive health too, especially for trans people?"
Future research could expand the methodologies we utilized in a number of ways. For instance, new research could build upon the customizable word method by asking participants to use their preferred word in context by filling in the blank in a sample sentence, an exercise that could lead to more specific and accurate data on words used in specific contexts. Ideally, researchers could then track substitute words with their appropriate use across settings to streamline the development of new, affirming research instruments.
The design process and final questionnaire can be used to measure epidemiological outcomes with greater sensitivity to gender diversity and diversity of sexual orientations. Future work should test the ability of these measures to reduce self-selection and non-response biases. We hope that this survey development process and resultant survey measures will inspire fellow researchers to think more inclusively and to innovate in more expansive ways to continue advancing the field of survey research, particularly for historically marginalized populations.