Validation of Three Early Ejaculation Diagnostic Tools: A Composite Measure Is Accurate and More Adequate for Diagnosis by Updated Diagnostic Criteria

Purpose: To validate three early ejaculation diagnostic tools, and propose a new tool for diagnosis in line with proposed changes to diagnostic criteria. Significant changes to diagnostic criteria are expected in the near future. Available screening tools do not necessarily reflect proposed changes.
Materials and Methods: Data from 148 diagnosed early ejaculation patients (M age = 42.8) and 892 controls (M age = 33.1 years) from a population-based sample were used. Participants responded to three different questionnaires (Premature Ejaculation Profile; Premature Ejaculation Diagnostic Tool; Multiple Indicators of Premature Ejaculation). Stopwatch measured ejaculation latency times were collected from a subsample of early ejaculation patients. We used two types of responses to the questionnaires depending on the treatment status of the patients: 1) responses regarding the situation before starting pharmacological treatment and 2) responses regarding current situation. Logistic regressions and Receiver Operating Characteristics were used to assess ability of both the instruments and individual items to differentiate between patients and controls.
Results: All instruments had very good precision (Areas under the Curve ranging from .93-.98). A new five-item instrument (named CHecklist for Early Ejaculation Symptoms, CHEES) consisting of high-performance variables selected from the three instruments had validity (Nagelkerke R² range .51-.79 for backwards/forwards logistic regression) equal to or slightly better than any individual instrument (i.e., had slightly higher validity statistics, but these differences did not achieve statistical significance). Importantly, however, this instrument was more in line with proposed changes to diagnostic criteria.
Conclusions: All three screening tools had good validity. A new 5-item diagnostic tool (CHEES) based on the three instruments had equal or somewhat more favorable validity statistics compared to the other three tools, but is more in line with recently proposed diagnostic criteria.


Introduction
Recently, sustained efforts have been made to address problems that affect diagnosis of early ejaculation (EE), chief among them the lack of a universally agreed-upon definition and of valid screening tools that capture diagnostically relevant features [1]. Concerns have been raised that definitions and diagnostic criteria that rely on subjective indicators of EE (e.g. ejaculation-related distress) inflate prevalence estimates and result in over-diagnosis in normally-functioning men [2]. Thus, the International Society for Sexual Medicine (ISSM) has proposed a definition which, in addition to patient-reported outcomes (ejaculatory control, negative consequences), takes ejaculation latency time (ELT) "prior to or within about one minute" (p. 2949) into account [3]. The American Psychiatric Association (APA) has since followed suit, and the proposed criteria for EE in the upcoming 5th Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; proposed EE diagnostic criteria can be seen at http://www.dsm5.org/Lists/ProposedRevision/DispForm.aspx?ID=174) now include ELT ("approximately one minute") during partnered sexual activity. The pursuit of an accurate definition of EE has also been hampered by the fact that its causal mechanisms are poorly understood, although current evidence suggests that various neurotransmitters (including serotonin and dopamine) as well as genetic factors play a role in EE etiology (for a more detailed review of the pathophysiology of EE, see [1]).
Various methods of definition and diagnosis have been suggested in the literature, with EE traditionally being diagnosed by a physician in accordance with DSM-IV (the DSM-5's predecessor) or the World Health Organization's International Classification of Diseases (ICD-10) criteria [1,2]. However, clinician-based diagnoses have been criticized for being imprecise [3]. To address this, a common measure in recent studies has been the stop-watch measured intra-vaginal ELT (IELT, [4]), which typically generates population-wide prevalence estimates of less than 2% [5], compared to ~30% when using patient-reported outcome-based definitions [6]. While this measure has some appealing properties, not least statistically as it forms a continuous, linear variable, it alone does not capture all aspects considered diagnostically relevant by the APA and the ISSM. Furthermore, given that multiple stop-watch timed measures must be collected in order to estimate a mean ELT, it is also impractical for clinical purposes. However, self-reported and stop-watch measures of ELTs are highly correlated [7], as we shall also show in the present study.
There are a number of questionnaires available for EE diagnosis, of which we shall focus on three recently published ones here (see Supporting Information): the Premature Ejaculation Diagnostic Tool (PEDT, [8]), the Premature Ejaculation Profile (PEP, [9]), and finally, Multiple Indicators of Premature Ejaculation (MIPE [10], note that MIPE is unnamed in the aforementioned reference), the latter a modified version of an unpublished questionnaire developed by Grenier and Byers [11]. Relatively few studies have been conducted to assess and validate these questionnaires, but reports suggest satisfactory reliability and validity for all three instruments. However, unlike MIPE, neither PEDT nor PEP has items inquiring about ejaculatory latency (although both have been compared to self-report IELT data to some extent, [9,12]). The purpose of the present study was to: 1. Validate these three questionnaires using responses from both EE patients and population controls to compare their validity; and 2. Investigate whether a more valid measure of EE could be created by selecting variables from all three questionnaires (i.e. variables that best differentiated between patients and controls).
Some of the patients who participated in the present study had received or were currently receiving pharmacological treatment for their EE problems. They reported on their ejaculatory functioning both as it was currently, and how it was prior to the commencement of the treatment. We expected using the pre-medication values for the patients to result in clearer differences between patients and controls as some of the patients' values were expected to have been improved as a result of the treatment.

Ethics Statement
The research plan was approved by the Ethics Committee of Åbo Akademi University (data collection from control group) and the Ethics Committee of the Hospital District of South-West Finland (data collection from the EE patients). Written informed consent was obtained from all participants.

Participants
Analyses were based on the responses of 148 men diagnosed with EE (hereafter: "EE patients") (M age = 42.8, SD = 10.7) and 892 controls (M age = 33.1, SD = 4.8, p < .0001), who had information available on the instruments used (age was not associated with any of the EE measures when analyzed separately for patients and controls, in line with previous research [13]). The control group was drawn from a population-based sample of Finnish twins and siblings of twins.
Data were collected from both EE patients and controls in 2012. Prior to data collection, we obtained updated postal addresses from the Central Population Registry of Finland for all prospective participants. Invitations to participate in the study were sent by postal mail, and data were collected through a secure online connection. All aspects of the study, including its voluntary nature, were explained in the invitation letter. Participants were instructed to log onto the questionnaire using a personal eight-digit code. If a potential participant had not responded two weeks after the initial contact, a reminder letter was sent. Participants were eligible to participate in a lottery, in which they could win one of three gift vouchers to a national travel agency (one lottery for EE patients and one for the controls, each involving two €500 gift cards and one €1000 gift card).
The group of EE patients was invited to participate in a stopwatch study, in which participants were to record data regarding ejaculatory latency with a stopwatch. Participants interested in this study were sent an instruction letter and a stopwatch by postal mail, and given the option to complete this part of the data collection using either a secure internet diary or a paper diary, which was to be returned by mail upon completion. Participants were requested to record ELT data from at least five, and a maximum of ten, occasions of sexual intercourse (anal or vaginal). In order to motivate participation, participants were reimbursed with a nominal sum (€10 per intercourse, for a maximal reimbursement of €100 per participant). In total, 24 participants eventually submitted stopwatch-measured ELT data.
In total, we contacted 419 individuals who had sought treatment for EE at two clinics in Finland (EE patients). They were identified from the second author's patient registry. Of those we approached, 178 responded (four did not wish to participate, and 26 did not complete the parts necessary for the present study). An additional four invitation letters could not be delivered, suggesting that these individuals had moved or died shortly before the contact attempt. This resulted in a final response rate of 43.3% for the EE patients.
Regarding the control group, 2559 individuals were contacted by postal mail. In total, 1054 individuals completed the questionnaire. A total of 1173 responded and, of these, 124 did not wish to participate or did not complete the necessary parts of the questionnaire. In addition, five individuals could not be reached (the invitation letters sent to them were returned to the researchers, presumably because they had, for example, moved or died), resulting in a final response rate of 46% for the control group. The control group n used in the analyses differs somewhat from these figures because only one individual per family was included in the statistical analyses, to control for between-subjects dependence arising from the controls being derived from a sample of twins and siblings of twins.

Instruments
Prior to the study, all instruments were back-translated into Finnish. Note that we used two different responses from the patients depending on their treatment status: 1) regarding the situation prior to commencing medication (completed by those who had received or were currently receiving medication for EE), and 2) regarding the current situation (completed by all, including controls).
Premature Ejaculation Profile. The PEP consists of four items that participants respond to on a 5-grade Likert scale ranging from 1 to 5. Lower scores indicate more severe EE-related problems. PEP has been shown to have good psychometric properties (test-retest reliabilities ranging from .66 to .83 across items in two different samples, [9]). In the present study, it was found to have good reliability, with Cronbach's α being .73 for patients before treatment, and .85 at present (α controls = .82).
Premature Ejaculation Diagnostic Tool. PEDT consists of five items responded to on a 5-grade Likert scale from 1-5, with higher values indicating more EE problems. It has also been shown to have good psychometric properties (e.g. Cronbach's α = .77, [12]). In the present study, Cronbach's α for patients was .75 before medical treatment (.89 at present), and .88 for controls.
Multiple Indicators of Premature Ejaculation. MIPE consists of seven items, of which five are responded to on 5-grade, and two on 3-grade, scales. MIPE has not been subject to case-controlled validity analyses, but extensive confirmatory factor analyses suggest good reliability [10]. In the present study, Cronbach's α for EE patients was .59 before medical treatment (.84 at present) and .77 for controls. The item measuring self-reported ELT (including response options) is found in the Appendix.
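The Cronbach's α values reported for the three instruments follow the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of the sum score). The following is a minimal sketch of that calculation, not the authors' SPSS procedure; the function name and the toy response matrix are hypothetical and chosen only for illustration.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; assumes complete data (no missing items).
    """
    k = len(rows[0])
    items = list(zip(*rows))                       # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of the sum score
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data (made up): three respondents answering three perfectly consistent items.
toy = [[1, 1, 1], [3, 3, 3], [5, 5, 5]]
alpha = cronbach_alpha(toy)
```

With perfectly consistent items, as in the toy data, α reaches its maximum of 1; real questionnaire data yield lower values such as those reported above.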
Stop-Watch Measured Ejaculation Latency Time. We collected stop-watch measured ejaculation latency time data from 24 EE patients (each providing 3-10 recorded ELTs, M = 8.00 observations, SD = 2.11). These data were used to compute geometric mean ELTs [14]. Based on these, ELT values were imputed for the other EE patients using information from PEP, PEDT and MIPE scores in the expectation maximization procedure in SPSS 21.0.
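The geometric mean is the exponential of the mean of the log-transformed latencies, which makes it less sensitive than the arithmetic mean to occasional long recordings. A minimal sketch of the per-patient calculation is given below; the function name and the example latencies are hypothetical, and the sketch assumes all latencies are strictly positive (how zero latencies were handled in the study is not stated here).

```python
import math


def geometric_mean_elt(latencies_sec):
    """Geometric mean of one participant's stopwatch ELTs, in seconds.

    Illustrative helper; requires strictly positive latencies.
    """
    if not latencies_sec or any(t <= 0 for t in latencies_sec):
        raise ValueError("latencies must be a non-empty list of positive values")
    mean_log = sum(math.log(t) for t in latencies_sec) / len(latencies_sec)
    return math.exp(mean_log)


# Made-up recordings of 30, 60 and 120 seconds: geometric mean is 60 seconds,
# whereas the arithmetic mean would be 70 seconds.
example = geometric_mean_elt([30, 60, 120])
```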

Statistical analyses
To control for dependence between observations of men belonging to the same family, we randomly excluded 135 men whose brothers also participated, so that only one person per family was represented. We used independent and paired samples t-tests to compare means on the ejaculatory function measures between EE patients and controls, and of EE patients in pre-medication and current settings. Pearson correlations were used to investigate the co-variation of different variables. We used logistic regression analyses (Forward and Backward selection procedures using the likelihood ratio criterion with a p-value of .05 for entry and .10 for removal from the model; both types of selection procedures were used to increase the robustness of the selection procedure) to identify items from among all items of the three EE instruments that best differentiated between patients and controls. These analyses were done separately using pre-medication and current scores for the EE patients. Items that remained in the final models using both Forward and Backward selection were retained for further analyses. We used Receiver Operating Characteristics (ROC) analyses to assess the ability of different instruments to differentiate between the EE patients and controls using the Area Under Curve (AUC) value.
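The random exclusion of family members can be illustrated as follows. This is a sketch under assumptions: the identifiers, the (participant, family) pair layout, and the fixed seed are hypothetical, and the actual randomization procedure used in the study is not described beyond the text above.

```python
import random


def one_per_family(participants, seed=0):
    """Keep one randomly chosen participant per family.

    `participants` is a list of (participant_id, family_id) pairs
    (hypothetical layout). Returns the sorted ids that are kept.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_family = {}
    for pid, fam in participants:
        by_family.setdefault(fam, []).append(pid)
    return sorted(rng.choice(members) for members in by_family.values())


# Made-up example: families A and C each contribute two brothers, B only one.
people = [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "C")]
kept = one_per_family(people, seed=1)  # three ids, one from each family
```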

Ejaculatory functioning in EE patients and controls
The values on the different instruments correlated highly for both EE patients (both pre-medication and currently) and controls, indicating that all three instruments measured the same construct (Table 1). EE patients' ejaculatory functioning improved from pre-medication to the current situation, and the ejaculatory functioning measured in the control group was better both compared to the EE patients' pre-medication measurement as well as compared to their current values (Table 2). This was the case independent of the instrument used.

Ability of different ejaculatory function instruments to differentiate between EE patients and controls
Next, we investigated how well the different instruments as well as the self-reported ELT were able to differentiate between EE patients and controls using ROC analysis (Table 3). The AUC values were consistently very high and did not differ between the three instruments. As expected, the values were in each instance higher (as indicated by the non-overlap of the confidence intervals) when pre-medication (vs. current) values were used for the EE patients. Using only the self-reported ELT also resulted in good differentiation between the groups even if it was worse compared to MIPE and PEP when pre-medication values were used for EE patients.
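The AUC can be read as the probability that a randomly chosen patient scores higher (i.e., more symptomatic) than a randomly chosen control, with ties counted as one half. The sketch below computes the AUC through this pairwise (Mann-Whitney) equivalence; it is an illustration rather than the ROC routine used in the study, it does not produce confidence intervals, and the scores shown are made up. It assumes that higher scores indicate more severe problems, so instruments scored in the opposite direction (such as PEP) would need to be reversed first.

```python
def auc(patient_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of patients and controls.

    Counts the share of (patient, control) pairs in which the patient scores
    higher; tied pairs contribute 0.5. Illustrative helper only.
    """
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


# Made-up scores: three of the four patient-control pairs are ordered correctly.
example_auc = auc([3, 5], [1, 4])  # 0.75
```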

Identifying individual items best able to differentiate between EE patients and controls
Next, we used logistic regression analyses to identify from among the items of all three instruments those best able to differentiate between EE patients and controls. Both forward and backward stepwise (likelihood ratio) logistic regressions were run; items that were selected by both procedures into the final model were selected for further analyses. Table 4 shows the final models. All models had high Nagelkerke R². As expected, models using pre-medication values differentiated between patients and controls better than models based on post-medication values. Using the pre-medication values, five variables were included in both the final models using forward and backward selection. Using current values, five variables (not exactly the same) were also included in both models.
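Nagelkerke R² rescales the Cox and Snell R² so that a perfectly fitting logistic model reaches 1. A minimal sketch of the standard formula is shown below, taking the log-likelihoods of the intercept-only and fitted models as inputs; the function name and the numbers in the example are hypothetical and are not taken from Table 4.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke R-squared from two log-likelihoods and the sample size n.

    ll_null:  log-likelihood of the intercept-only model
    ll_model: log-likelihood of the fitted model
    """
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)  # upper bound of Cox-Snell
    return cox_snell / max_cox_snell


# Made-up log-likelihoods for a sample of 100 observations.
r2 = nagelkerke_r2(ll_null=-50.0, ll_model=-20.0, n=100)
```

A model that does not improve on the intercept-only model yields 0, and a model with a log-likelihood of 0 (perfect prediction) yields 1.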

Using sum scores based on the selected individual items to differentiate between EE patients and controls
Next, we calculated a sum score (the PEP items were reversed prior to this) based on the pre-medication and current items that were included in both the forward and backward logistic regressions, respectively. Both variables had significantly higher values among the EE patients. Table 5 shows (using the pre-medication values for the patients) the ability of the different instruments to identify EE patients. As the instruments are intended for screening purposes, we want to keep the sensitivity of the instrument high; that is, we do not want to lose potential EE sufferers by using too high a threshold. If we, for example, want to identify 90% of EE patients, the cut-off scores are the following (% of controls wrongly identified as patients with these cut-offs in parentheses): PEP: 10 (6%), PEDT: 17 (6%), MIPE: 21 (6%), New: 17 (5%). The differences are small, with the new instrument keeping the false positive rate at its lowest.
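The logic of reversing PEP items and choosing a cut-off for a target sensitivity can be sketched as follows. This is an illustration under assumptions: the function names and the toy sum scores are hypothetical, a person is taken to screen positive when the sum score is at or above the cut-off, and higher sum scores are assumed to indicate more severe symptoms after reversal.

```python
def reverse_item(score, low=1, high=5):
    """Reverse a Likert item so that higher values mean more severe problems."""
    return low + high - score


def cutoff_for_sensitivity(patient_sums, control_sums, target=0.90):
    """Highest cut-off that still identifies at least `target` of the patients.

    Returns (cutoff, sensitivity, false_positive_rate).
    """
    for cutoff in sorted(set(patient_sums), reverse=True):
        sensitivity = sum(s >= cutoff for s in patient_sums) / len(patient_sums)
        if sensitivity >= target:
            fpr = sum(s >= cutoff for s in control_sums) / len(control_sums)
            return cutoff, sensitivity, fpr
    raise ValueError("no cut-off reaches the target sensitivity")


# Made-up sum scores for ten patients and ten controls.
patients = [20, 19, 18, 17, 17, 16, 15, 14, 13, 10]
controls = [5, 6, 7, 8, 13, 9, 10, 11, 12, 14]
result = cutoff_for_sensitivity(patients, controls)  # (13, 0.9, 0.2)
```

In the toy data, a cut-off of 13 identifies nine of the ten patients while flagging two of the ten controls; lowering the target sensitivity would allow a higher cut-off and fewer false positives.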

Discussion
Here, we show that all three tested instruments (PEP, PEDT, MIPE) are valid tools for separating EE patients from controls, all having very good specificity and sensitivity. In addition, an important characteristic of any good screening tool is, besides the obvious ability to discriminate between untreated patients and controls, the ability to detect change in the trait it is designed to measure. Again, all three screening tools were able to detect significant change between pre-medication and current status in the group of EE patients. However, a combination measure featuring two items each from PEP and PEDT, and one item from MIPE, covers factors considered important for diagnosis outlined in recent proposals for EE diagnostic criteria [1,2] while maintaining equally good validity and precision (even somewhat higher validity scores). Furthermore, the new instrument also had higher correlations with both imputed and non-imputed stopwatch measures of ELT compared to the other three instruments. We propose the name Checklist for Early Ejaculation Symptoms (CHEES) for this brief, composite instrument, which is presented with proposed diagnostic cutoff scores in Table 6.
In order to comply with ISSM and DSM-5 definitions, the ejaculatory latency should ideally be one minute or less for diagnosis. An ELT cut-off has been considered necessary for diagnosis because definitions that rely solely on subjective perceptions tend to inflate prevalence rates of EE, and unnecessarily label normal ejaculatory function as pathological. It should be noted, however, that some confusion remains regarding the optimal ELT cutoff for EE diagnosis. As noted previously, both the ISSM and the DSM-5 definitions suggest a cutoff of "about" one minute (or less), allowing for some flexibility for diagnosing physicians. Hence, greater emphasis could be placed on the fifth question of the newly proposed instrument, in order to avoid misdiagnosing individuals with comparatively long ejaculatory latencies. Indeed, about 30% of men express subjective concerns regarding their ejaculatory function [6], whereas only 1-2% of men in the general population have consistent ELTs of less than a minute [5,13]. Four different "subtypes" of EE have been proposed in the literature, with two of these ("natural variable premature ejaculation" and "premature-like ejaculatory dysfunction") being described as temporary, circumstantial (i.e., non-chronic) problems or psychological misrepresentations of normal ejaculatory function [15]. On the other hand, it can be argued that a short ejaculatory latency can, ultimately, only be considered to be dysfunctional if it occurs before penetration (as the ultimate function of ejaculation is to impregnate a female, and there is currently no evidence to support that conception is less likely if the ELT is short).
As EE is treated to improve recreational, rather than procreational, sexual activity, it is questionable whether it is defensible to give greater weight to ELT over patient-reported outcomes in EE diagnosis [16]. However, it is arguably problematic to prescribe EE medication that can potentially be harmful to an individual who presents a subjective complaint of EE but has a relatively long ejaculatory latency. Hence, we suggest that patients should score at least 4 (between 1 and 5 minutes) on the fifth question in Table 5.
Clinicians are also advised to inquire about ELT separately when meeting with patients, and pay special attention to other EE indicators in borderline cases (i.e. if the patient scores a 4 on the aforementioned questionnaire item).
In the literature, (I)ELT has been proposed as a necessary parameter for accurate measurement of EE [1,3-5]. However, the (I)ELT as a measure has some drawbacks. Firstly, it is costly and difficult to manage in large-scale studies, although some clinical trials using stopwatch measured ELT data have amassed study samples close to 2,000 individuals (e.g. [17]). Concerns have also been raised that it is unsuitable in the clinical setting, since some individuals may be reluctant to record the duration of sexual intercourse with a stopwatch, and many clinicians consider stopwatch measurement of ELT impractical [18]. These concerns are supported by the results of the present study, as about five out of six participants who volunteered for a survey study chose not to participate in a study involving actual recording of stopwatch data, even though they received financial reimbursement for doing so. This, in addition to research findings suggesting that self-estimated ejaculatory latencies are accurate reflections of stopwatch measured ejaculatory latencies in general, underlines the need for accurate diagnostic tools that are simple to use in both clinical and research settings.
It is important to note that the control group in the present study is a population-based sample (i.e. not a selected group of non-affected individuals, but a group inclusive also of men who display EE symptoms). In the light of this, the sensitivity and specificity of the measures, especially the newly proposed measure, would likely be even more convincing had we used a non-affected control group. Limitations to be considered for this study are the relatively low number of participants who provided stopwatch data (24 patients), as well as the retrospective measures of ejaculatory function prior to use of medication in the patient group. Data were not recorded in laboratory settings but by the participants themselves, and this may be a source of bias. The response rate was somewhat low, but fully comparable to other sexuality-related surveys in which large cohorts have been approached on a single occasion without pre-screening [13,19,20]. Furthermore, all EE patients had been diagnosed with premature ejaculation by a physician specializing in sexual medicine.

Conclusions
In conclusion, three commonly used diagnostic tools for EE were shown to have good reliability and validity. A composite measure can be derived from these tools to form a new, valid tool which is adequate considering proposed changes to diagnostic criteria. Studies assessing the newly proposed questionnaire in independent samples are welcome.

Table 6. Proposed new diagnostic screening tool: CHecklist for Early Ejaculation Symptoms (CHEES; questions should be responded to considering the past 6 months). (Table body not recovered; the only surviving column header is "Response options".)
Supporting Information
Table S1. Three instruments for diagnosis and measurement of early ejaculation. Note. All variables have been converted to a 1-5 or 1-3 scale. PEP = Premature Ejaculation Profile [9]; PEDT = Premature Ejaculation Diagnostic Tool [8]; MIPE = Multiple Indicators of Premature Ejaculation [10]. For PEP, higher scores indicate better function; for MIPE and PEDT, lower scores indicate better function. Items in boldface type have the greatest effect sizes and constitute the new proposed diagnostic tool. (DOCX)