A quantitative method for measuring the relationship between an objective endpoint and patient reported outcome measures

Patient reported outcome measures (PROMs) have become increasingly important for assessing the effectiveness of a drug or medical device. In order for a PROM to be claimed in labeling, the PROM has to be valid, reliable and able to detect a change if the targeted disease status changes. One approach to assessing the quality of a PROM is to investigate the association between the PROM and an objective clinical endpoint measuring the status of a disease/condition. However, methods for assessing the association between continuous and discrete variables are limited, especially for correlated measurements. In this paper, we propose a method to assess such association in any type of sample, with or without correlation. The method involves estimating the probability that a subject's reported outcome reveals the status of the subject's disease/condition (called truth hereafter). This is a conditional probability of revealing truth given the location of the subject's objective outcome relative to the subject-specific latent threshold in the objective endpoint. A consistent estimator for the probability is derived. The operating characteristics of the consistent estimator are illustrated using simulation. Our method is applied to hypothetical clinical trial data generated for an ophthalmic device as an illustration.


Introduction
Patient reported outcome measures (PROMs) have become increasingly important in measuring the effectiveness of a drug or medical device. Between 1997 and 2002, about 30% of new drug labels were found to include patient reported outcomes (PROs) [1]. Later, between 2006 and 2010, about 24% of new molecular entities and biologic license applications were granted patient reported outcome (PRO) claims [2]. The authors of this paper also noticed that PROM claims in approved medical devices had been steadily increasing since 2012. In the meantime, many efforts have been made to advance the use of PROMs in drug or medical device development and regulatory decision making. Recent major challenges were reported from the Food and Drug Administration's perspective [3]. The National Institutes of Health (NIH) also funded the establishment of a PROM Information System (PROMIS) [4,5,6]. Some recent literature focuses on the interpretation of PRO analysis results [7,8].

Citation: Ahn C, Fang X, Silverman P, Zhang Z (2018) A quantitative method for measuring the relationship between an objective endpoint and patient reported outcome measures. PLoS ONE 13(10): e0205845. https://doi.org/10.1371/journal.pone.0205845

In order for a PROM to be claimed in the labeling of a drug or medical device, the PROM has to be valid, reliable and able to detect a change if the status of the targeted disease or condition changes [9]. The statistics most frequently and broadly used in PROM validation, such as the Pearson correlation coefficient and the intra-class correlation coefficient (ICC) [10], assess the association among PROM items or between a PROM and other established measurement(s). These correlation coefficients have been used to examine various validities (e.g. construct, convergent/divergent, criterion) of PROMs. They have also been used to investigate a PROM's ability to detect a change [31]. Some authors have also used them to explore the relationship of a PROM with other measurements [32,33,34,35].
However, these correlation coefficients (1) may not be appropriate in correlated samples such as repeated measures, (2) may not be reliable for endpoints with different scales (e.g. categorical scale vs. continuous scale), and (3) do not have an intuitive clinical meaning, because neither the coefficients nor their changes translate directly into a clinical quantity. It is difficult to draw a line for an acceptable association based on these popular coefficients, most likely because of this lack of clinical meaning.
The challenge here is to develop a meaningful and reliable methodology to measure the relationship between an objective continuous endpoint ($X$) and the dichotomized endpoint ($G$) of an ordinal PROM and, if the association is strong enough, to use only the PROM to make inference about the effectiveness of the therapy or to use the PROM to support the primary inference in a clinical trial setup. This paper provides such a new quantitative statistic measuring the conditional association (denoted as $Q$ hereafter) between paired endpoints $(X, G)$, and a method to translate the ordinal PROM scales to the continuous objective measurement. The conditional association is used because the outcome of $G$ is conditional on the outcome of $X$: the PROM is always administered after the treatment takes effect. The dichotomized endpoint ($G$) may represent mixed Bernoulli random variables with the same parameter but opposite meaning, which is explained in the Methods section of this paper. Section 2 describes the definition of the conditional association parameter $Q$, the data structure used in this paper and how to estimate $Q$. The derivation of the estimator of $Q$ is also presented in that section. Section 3 shows simulation results for the estimator ($\hat{Q}$) of $Q$ and an application of this new methodology to hypothetical clinical trial data. The discussion and conclusion are presented in Section 4.

Methods
This section shows how the parameter $Q$ works in assessing the quality of a PROM using repeated measures from a single subject. It starts with minimal notation and the theoretical construct of $Q$, followed by the characteristics and estimation procedure of $Q$, and the derivation of the consistent estimator of $Q$. The derivation of the estimator is deliberately placed after the estimation process so that it is more accessible to readers. The section ends with how to obtain the inference for the PROM in multiple subjects.
In general, a single italic lower-case letter represents a nonrandom variable and a single italic upper-case letter represents a random variable unless stated otherwise (such as the parameter $Q$). The non-italic $\mathrm{PROM}_z$ (not a random variable) represents the scale $z$ of the unidimensional PROM. $Q_{iz}$ is the probability of $\mathrm{PROM}_z$ revealing the disease status of Subject $i$ according to his/her latent minimum objective threshold $a_{iz}$, given the subject's objective outcome $x_i \ge a_{iz}$ or $x_i < a_{iz}$. Note that $Q_{iz}$ is not a random variable; it is a parameter to be estimated. The italic $\mathit{PROM}$ is the random variable for the subjective PRO measurement, and the italic $\mathit{PRO}$ is the realization of $\mathit{PROM}$. The italic $\mathit{PRO}_z$ represents the patient reported outcome equal to the scale $z$ of the PROM. Other notations are defined in Appendix A.

Theoretical construct of parameter $Q_{iz}$
As illustrated in Fig 1 below, the theoretical construct of $Q_{iz}$ is that there is a latent minimum threshold $a_{iz}$ of disease status, in terms of the objective disease measurement of Subject $i$, which triggers $\mathit{PRO}_z$ ($z = 1, \ldots, 7$ in Fig 1) upon the PROM question according to the association parameter $Q_{iz}$, given the subject's objective outcome $x_i \ge a_{iz}$. Although the PROM question and scales don't change with subject, the sub-index $i$ is used hereafter to indicate that $\mathit{PROM}_i$ is the PROM random variable for Subject $i$, for clarity and without loss of generality. Subject $i$ will give $\mathit{PROM}_i \ge z$ with probability $Q_{iz}$ when $x_i \ge a_{iz}$, and will give $\mathit{PROM}_i < z$ with probability $Q_{iz}$ when $x_i < a_{iz}$. Note that $\mathit{PRO}_i$ always depends on where $X_i$ is realized relative to the minimum latent threshold $a_{iz}$.
Fig 1 illustrates the relationship between the continuous objective endpoint $X_i$ (such as increase in hemoglobin count (HC)) and a unidimensional 7-scale $\mathit{PROM}_i$ (such as fatigue improvement). The upper divided rectangular block illustrates a 7-scale unidimensional PROM, and the lower line $X$ illustrates the continuous objective measurement, with the letter O indicating the baseline location of a subject. Each scale of the PROM (such as 5 = improved) for Subject $i$ has its own minimum latent objective threshold (such as $a_{i5}$), indicated by a connecting arrow between the two measurements. The PROM will be realized as $\mathit{PRO}_z$ with probability $Q_{iz}$ by Subject $i$ upon the PROM question if $x_i \ge a_{iz}$, which determines the conditional association of $\mathit{PROM}_i$ with the continuous objective endpoint $X_i$ for Subject $i$ at $\mathrm{PROM}_z$.
Note that the event of "$\mathrm{PROM}_z$ revealing the disease status of Subject $i$" includes two true events: (1) $\mathit{PROM}_i \ge z$ if $x_i \ge a_{iz}$ (true positive), and (2) $\mathit{PROM}_i < z$ if $x_i < a_{iz}$ (true negative). If there is no conditional association between $\mathit{PROM}_i$ and $X_i$, both $\Pr(\mathit{PROM}_i \ge z \mid X_i \ge a_{iz})$ and $\Pr(\mathit{PROM}_i < z \mid X_i < a_{iz})$ are equal to the pure chance rate of 50%. Therefore, in this paper we search for the minimum threshold $a_{iz}$ such that Subject $i$ will give $\mathit{PROM}_i \ge z$ with probability $Q_{iz}$ when $x_i \ge a_{iz}$, and likewise will give $\mathit{PROM}_i < z$ with the same probability $Q_{iz}$ when $x_i < a_{iz}$. If the probabilities of the two possible "truths" are not equal, their estimation requires many more assumptions (see the derivation section for details) and is not considered in this paper. It is also necessary to point out that the two probabilities are not complementary to each other.

Characteristics of parameter $Q_{iz}$
By definition, $Q_{iz}$ varies with $\mathrm{PROM}_z$ and with subject. Therefore, no linear relationship between the PROM and the objective endpoint $X$ is assumed for any subject. The clinical meaning of $Q_{iz}$ is inherited from its definition: it is the rate of revealing the truth, conditional on disease status (the actual disease status of Subject $i$ relative to his/her minimum latent objective threshold for $\mathrm{PROM}_z$). A 50% rate of revealing truth is equivalent to the subject flipping a fair coin to determine his/her $\mathit{PRO}_z$ upon the PROM question; thus, a rate of 50% indicates that $\mathrm{PROM}_z$ is not able to reveal the subject's disease status. In general, the higher the rate of revealing truth, the better the quality of $\mathrm{PROM}_z$, because a higher rate indicates a higher probability that $\mathrm{PROM}_z$ reveals a subject's disease status upon the PROM question.
The use of $Q_{iz}$ to reveal the actual status of a subject's disease has not been discussed in the literature. Rasch promoted a probability model for a true positive response [36]. However, because negative agreement was not considered, the Rasch positive probability did not measure the probability of revealing truth from a PROM. Our approach is related to latent variable models for similar problems [37,38] in the sense that $a_{iz}$ can be regarded as a latent variable. On the other hand, we do not assume a particular distribution for $a_{iz}$, which makes our approach different from most latent variable models. It is also noteworthy that $Q_{iz}$ measures an indirect agreement between a continuous endpoint and a dichotomized version of an ordinal endpoint. Most traditional methodologies for measuring agreement, as described in [39], are developed for two measures of the same type: both categorical or both continuous. In the case of different types of endpoints, ranks within each endpoint replace the original values to make the two endpoints the same type (as in the Spearman correlation coefficient). In addition, the estimation of $Q_{iz}$ (1) can be applied to correlated data, and (2) takes into consideration the uncertainty of the "gold standard" and involves a series of 2-by-2 tables in order to select one for the estimate (see the toy example below). Therefore, $Q_{iz}$ can also be viewed as a new agreement statistic between a continuous endpoint and a binary endpoint, with or without correlation among samples.

Data and corresponding random variables
The data considered in this paper consist of pairs of observations $(x_{ik}, g_{ik})$ for Subject $i$ at clinical visit $k$, where $k = 1, \ldots, t$. Here $x_{ik}$ is a continuous outcome representing disease status; it could be the value at visit $k$ or the change from baseline to visit $k$, such as the change in hemoglobin count from baseline. The outcome $g_{ik}$ is the dichotomized version of the PRO collected at visit $k$, such as $g_{ik} = 1$ if $\mathit{PROM}_i \ge 5$ and $g_{ik} = 0$ otherwise. The change from baseline in $\mathit{PROM}_i$ is not considered here, because (1) each latent threshold of a $\mathrm{PROM}_z$ corresponds to $\mathrm{PROM}_z$ itself rather than to its change, and (2) a change in PROs from baseline does not carry a fixed clinical meaning; its meaning depends on the baseline PRO. For example, on the 7-point scale shown in Fig 1, a one-unit change from "much worse" to "worse" may not be meaningful to a subject, while a one-unit change from "neither" to "improved" carries clinical meaning to the subject.
The corresponding random variables are denoted as $(X_{ik}, G_{ik})$. Upon the PROM question, Subject $i$ will give $g_{ik} = 1$ (positive) with probability $Q_{iz}$ when $x_{ik} \ge a_{iz}$, and will give $g_{ik} = 0$ (negative) with probability $Q_{iz}$ when $x_{ik} < a_{iz}$, as illustrated in Fig 2 below.
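The response model just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `draw_g`, the threshold 0.4, the objective outcome 1.0 and the value 0.8 for the probability of revealing truth are hypothetical and are not taken from the paper's data.

```python
import random

def draw_g(x, a_iz, q_iz, rng):
    """Draw the dichotomized PRO g for one visit (hypothetical helper).

    With probability q_iz the response reveals the truth:
    g = 1 when x >= a_iz, g = 0 when x < a_iz.
    Otherwise the opposite value is reported.
    """
    truth = 1 if x >= a_iz else 0
    reveals_truth = rng.random() < q_iz
    return truth if reveals_truth else 1 - truth

rng = random.Random(0)
# Assumed values: threshold 0.4, Q = 0.8, objective outcome above the threshold.
draws = [draw_g(1.0, 0.4, 0.8, rng) for _ in range(10_000)]
share_positive = sum(draws) / len(draws)  # should be close to 0.8
```

The same parameter governs both sides of the threshold: below the threshold the draws would be 0 with probability 0.8, which is the "same parameter but opposite meaning" property of the dichotomized endpoint.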

Estimation of $Q_{iz}$
This subsection shows how to estimate $Q_{iz}$ using a toy example. The derivation of the estimator of $Q_{iz}$ can be found in the next subsection. In order to estimate $Q_{iz}$, it is necessary to first search for $a_{iz}$. Because $a_{iz}$ is the minimum latent threshold in the objective measurement for $\mathrm{PROM}_z$, the search can be done using a pre-selected set of values $\{a_j, j = 1, \ldots, m\}$ between the possible minimum and maximum objective measurements, based on current medical knowledge for the entire target population (such as the normal range of human hemoglobin count). The pre-selected values $a_j$ are not random; they are fixed and ideally determined before the realization of $X_{ik}$. For example, the range of human blood hemoglobin concentration can be set from 5 g/dL to 20 g/dL so that $a_{iz}$ is believed to lie in the range for any subject; if the step between $a_j$ and $a_{j+1}$ is 1 g/dL, then the number of searching points, $m$, is equal to 16. The size of the step is determined by how precise $a_{iz}$ is expected to be. Again, this searching set is not considered random because it doesn't change with study or subject and may not change for decades, like the normal range of human blood pressure. Table 1 shows a toy example of how to estimate $Q_{iz}$. Note that the number of searching points $m$ need not be equal to the number ($t$) of clinical visits, although we make them equal for illustration purposes. At each $a_j$, the outcomes $x_{ik}$ ($k = 1, \ldots, t$) are compared to $a_j$ one at a time. Then the number of potential true positive (TP) and the number of potential true negative (TN) responses can be summarized per Table 2.
Fig 2. $g_{ik}$ comes from two Bernoulli random variables with the same parameter but opposite meaning, depending on where $X_{ik}$ is realized: $x_{ik} < a_{iz}$ or $x_{ik} \ge a_{iz}$.
For example, in the 1st data row of Table 1 there are 9 $x_i \ge 5.0$ (positive) and only 6 $g_i$ equal to one (PRO positive); therefore the TP is equal to 6 (see next paragraph for more details). The total number of such 2-by-2 tables is equal to $m$, as the total number of distinct $a_j$ is $m$. The derivation in the next subsection shows that the maximum of $R_{ij} = (\mathrm{TP}+\mathrm{TN})_{ij}/t$ is a consistent estimator of $Q_{iz}$.
Table 1 shows how to use the pre-determined set of $a_j$ ($j = 1, \ldots, m$) to calculate $R_{ij}$ at each $a_j$ based on two sets of 9 pairs of observations $(x_{i1}, g_{i1}), \ldots, (x_{i9}, g_{i9})$ from Subject $i$. The only difference between the two sets of samples is the value of the 2nd binary outcome $g_{i2}$ (0 vs. 1). If $\mathit{PRO}_i$ is positive, $g_{ik} = 1$; otherwise $g_{ik} = 0$. The pre-determined set of $a_j$ ($j = 1, \ldots, 9$) is listed in the 2nd column of Table 1. At each $a_j$, one can compare the 9 objective outcomes $(x_{i1}, \ldots, x_{i9})$ to $a_j$ one at a time, and obtain the numbers of potential TP, FP, TN and FN per Table 2 above. Thus, each data row of Table 1 displays the four statistics TP, FN, FP, and TN corresponding to $a_j$. The estimate of $Q_{iz}$ for Subject $i$ at $\mathrm{PROM}_z$ is the maximum of $R_{ij}$. In this paper, if there are multiple tied maximums of $R_{ij}$, the median of the corresponding $a_j$ is used as the estimate of $a_{iz}$, because at each maximum of $R_{ij}$ the corresponding $a_j$ could be an estimate of $a_{iz}$.
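The search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `estimate_q`, the nine data pairs and the seven search points are hypothetical and are not the values in Table 1.

```python
from statistics import median

def estimate_q(x, g, grid):
    """Return (q_hat, a_hat) for one subject (hypothetical helper).

    x: objective outcomes x_i1..x_it
    g: dichotomized PROs (0 or 1), same length as x
    grid: pre-selected search points a_1..a_m
    """
    t = len(x)
    r = []
    for a_j in grid:
        tp = sum(1 for xk, gk in zip(x, g) if xk >= a_j and gk == 1)
        tn = sum(1 for xk, gk in zip(x, g) if xk < a_j and gk == 0)
        r.append((tp + tn) / t)  # R_ij at this search point
    q_hat = max(r)
    # Median of the a_j at tied maximums is used as the estimate of a_iz.
    tied = [a_j for a_j, r_j in zip(grid, r) if r_j == q_hat]
    return q_hat, median(tied)

# Assumed toy data: 9 visits and 7 search points.
x = [3.1, 3.8, 4.4, 5.5, 6.2, 6.9, 7.3, 8.0, 8.6]
g = [0, 0, 0, 1, 0, 1, 1, 1, 1]
grid = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
q_hat, a_hat = estimate_q(x, g, grid)
```

With these assumed values, the search point 5.0 classifies eight of the nine pairs as TP or TN, so the sketch returns an estimate of 8/9 for the probability of revealing truth and 5.0 for the threshold.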

Derivation of the estimator of $Q_{iz}$
As illustrated in Fig 1 above, $Q_{iz}$ doesn't change its magnitude whether $x_i \ge a_{iz}$ or $x_i < a_{iz}$, although $Q_{iz}$ changes its meaning from a conditional true positive rate (when $x_i \ge a_{iz}$) to a conditional true negative rate (when $x_i < a_{iz}$). This is a reasonable setup because the event $\mathit{PROM}_i \ge z$ is a composite event including $\mathit{PRO}_{iz}$, $\mathit{PRO}_{i,z+1}$, etc. For example, the event $\mathit{PROM}_i \ge 5$ includes $\mathit{PRO}_i = 5$, 6, or 7. When $x_i$ is far above $a_{iz}$, Subject $i$ may just give a higher $\mathit{PRO}_i$ (say 7), and this event counts as one event of $\mathit{PROM}_i \ge 5$. This illustrates the fact that $Q_{iz}$ can be independent of the distance between $x_i$ and $a_{iz}$. Because we search for $a_{iz}$ such that $\Pr(\mathit{PROM}_i \ge z \mid X_i \ge a_{iz}) = \Pr(\mathit{PROM}_i < z \mid X_i < a_{iz})$, and $Q_{iz}$ doesn't change its magnitude on either side of $a_{iz}$, we have $Q_{iz} = \Pr(\mathit{PROM}_i \ge z \mid X_i \ge a)\ \forall a \ge a_{iz}$ and $Q_{iz} = \Pr(\mathit{PROM}_i < z \mid X_i < b)\ \forall b < a_{iz}$ (see Fig 2 for the illustration), where $a$ and $b$ are two arbitrary values in the objective measurement. Although the clinical meaning of $Q_{iz}$ changes from a conditional positive rate to a conditional negative rate according to $x_i \ge a_{iz}$ or $x_i < a_{iz}$, its magnitude doesn't change. We use $a$ and $b$ here to indicate that $Q_{iz}$ does not change its magnitude with any subset of $X_i \ge a_{iz}$ or $X_i < a_{iz}$. Also, the derivation of the $Q_{iz}$ estimator doesn't assume independence among $X_{i1}, \ldots, X_{it}$. The cumulative distribution function of $X_{i1}$ is denoted as $F_{i1}$. Because $x_{i1}$ is obtained at the 1st clinical visit before the realization of $X_{i2}, \ldots, X_{it}$, the cumulative distribution function of $X_{ik}$ (denoted as $F_{ik}$, $k > 1$) is the marginal cumulative distribution function, which can be obtained by integrating out $X_{i1}, \ldots, X_{i,k-1}$ from the joint distribution $F_{X_{i1}, \ldots, X_{ik}}$ for Subject $i$. The use of the general form of $F_{ik}$ in the derivation takes correlated samples into consideration.
The joint distribution $F_{X_{i1}, \ldots, X_{ik}}$ applies to random variables with or without correlation. Therefore, the $X_{ik}$ ($k = 1, \ldots, t$) are not assumed to be independent of each other, and each $X_{ik}$ may have a different marginal distribution.
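The derivation of the estimator of $Q_{iz}$ starts with the probabilities of getting a TN and a TP at Visit $k$, which are presented in Expressions (1)-(4) below, where $\bar{Q}_{iz} = 1 - Q_{iz}$.
• When $a_j < a_{iz}$:
$$\Pr\nolimits_{ijk}(\mathrm{TN}) = \Pr(X_{ik} < a_j \text{ and } G_{ik} = 0) = Q_{iz} F_{ik}(a_j) \tag{1}$$
$$\Pr\nolimits_{ijk}(\mathrm{TP}) = \Pr(a_j < X_{ik} < a_{iz} \text{ and } G_{ik} = 1) + \Pr(a_{iz} < X_{ik} \text{ and } G_{ik} = 1) = \bar{Q}_{iz}\,[F_{ik}(a_{iz}) - F_{ik}(a_j)] + Q_{iz}\,[1 - F_{ik}(a_{iz})] \tag{2}$$
With the same argument, one can have the following:
• When $a_j \ge a_{iz}$:
$$\Pr\nolimits_{ijk}(\mathrm{TN}) = \Pr(X_{ik} < a_{iz} \text{ and } G_{ik} = 0) + \Pr(a_{iz} < X_{ik} < a_j \text{ and } G_{ik} = 0) = Q_{iz} F_{ik}(a_{iz}) + \bar{Q}_{iz}\,[F_{ik}(a_j) - F_{ik}(a_{iz})] \tag{3}$$
$$\Pr\nolimits_{ijk}(\mathrm{TP}) = \Pr(a_j < X_{ik} \text{ and } G_{ik} = 1) = Q_{iz}\,[1 - F_{ik}(a_j)] \tag{4}$$
Consequently, the expectation of TP+TN can be shown in Expressions (5) and (6), where $E$ is the expectation operator.
• When $a_j \le a_{iz}$:
$$E_{ij}(\mathrm{TP}+\mathrm{TN}) = tQ_{iz} - (Q_{iz} - \bar{Q}_{iz}) \sum_{k=1}^{t} [F_{ik}(a_{iz}) - F_{ik}(a_j)] \tag{5}$$
• When $a_j > a_{iz}$:
$$E_{ij}(\mathrm{TP}+\mathrm{TN}) = tQ_{iz} - (Q_{iz} - \bar{Q}_{iz}) \sum_{k=1}^{t} [F_{ik}(a_j) - F_{ik}(a_{iz})] \tag{6}$$
If $a_j$ is equal to $a_{iz}$, both Expressions (5) and (6) reduce to $tQ_{iz}$. Therefore, when $Q_{iz} \ne 0.5$, $R_{ij} = (\mathrm{TP}+\mathrm{TN})/t$ is an unbiased estimator of $Q_{iz}$ only if $a_j = a_{iz}$, and TP+TN follows the binomial distribution when $a_j = a_{iz}$ because its expectation follows the expectation of the binomial random variable ($tQ_{iz}$). We further notice that $E_{ij}(\mathrm{TP}+\mathrm{TN})$ attains its maximum at $a_{iz}$ when $Q_{iz} > 0.5$ (i.e. $Q_{iz} - \bar{Q}_{iz} > 0$), based on the sign of the derivative of $E_{ij}(\mathrm{TP}+\mathrm{TN})$ with respect to $a_j$. When $Q_{iz} > 0.5$, the derivative of $E_{ij}(\mathrm{TP}+\mathrm{TN})$ is positive to the left of $a_{iz}$ (see Expression 5) and becomes negative to the right of $a_{iz}$ (see Expression 6). Therefore, $E_{ij}(\mathrm{TP}+\mathrm{TN})$ not only reaches its maximum at $a_{iz}$, but also equals $tQ_{iz}$ there. This is why the unbiased estimate of $Q_{iz}$ is chosen as the maximum of $R_{ij}$. Similarly, $E_{ij}(\mathrm{TP}+\mathrm{TN})$ attains its minimum at $a_{iz}$ when $Q_{iz} < 0.5$.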
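The behaviour of the expectation of TP+TN over the search points can be checked numerically. The sketch below assumes normal marginal distributions for the five visits, a threshold of 0.4 and a probability of revealing truth of 0.8; these values and the function name `expected_tp_tn` are hypothetical and chosen only for illustration. Each visit contributes the probabilities of a TN and a TP implied by the response model.

```python
from statistics import NormalDist

def expected_tp_tn(a_j, a_iz, q, marginals):
    """E(TP+TN) at search point a_j, summed over visits (hypothetical helper)."""
    total = 0.0
    for dist in marginals:  # dist plays the role of the marginal CDF at visit k
        f_j, f_z = dist.cdf(a_j), dist.cdf(a_iz)
        if a_j <= a_iz:
            # TN below a_j (prob q), TP between a_j and a_iz (prob 1 - q),
            # TP above a_iz (prob q).
            total += q * f_j + (1 - q) * (f_z - f_j) + q * (1 - f_z)
        else:
            # TN below a_iz (prob q), TN between a_iz and a_j (prob 1 - q),
            # TP above a_j (prob q).
            total += q * f_z + (1 - q) * (f_j - f_z) + q * (1 - f_j)
    return total

# Assumed setup: five visits with normal marginals (mean, variance).
marginals = [NormalDist(mu, var ** 0.5)
             for mu, var in zip((0, 0.5, 1, 1.5, 2), (1, 1.3, 1.6, 1.9, 2.2))]
grid = [round(-2 + 0.1 * j, 1) for j in range(71)]
curve = [expected_tp_tn(a, 0.4, 0.8, marginals) for a in grid]
peak_at = grid[curve.index(max(curve))]
```

Under these assumptions the curve peaks at the search point equal to the threshold, where it takes the value $t$ times the probability of revealing truth (here 5 × 0.8 = 4).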
In practice, it is reasonable to assume that a PROM has a non-negative association with the objective endpoint, because the potential direction of the PROM is usually obvious. If a negative association is expected, one can transform the objective outcome in order to have a non-negative association. As an example of a negative association, if the PROM is a price satisfaction survey and the continuous objective endpoint is the cost of medical expense, then one can multiply the cost by -1 so that a higher negative cost (smaller cost) is in the positive direction. Therefore, $Q_{iz}$ can be assumed to be $\ge 0.5$. If $Q_{iz} = 0.5$ (pure chance), $\mathrm{PROM}_z$ may not be able to reveal the truth; consequently, there is no conditional association between the PROM and the objective measurement $X$ at $\mathrm{PROM}_z$. This is because $Q_{iz}$ is defined as the probability of revealing truth at $\mathrm{PROM}_z$; $Q_{iz} = 0.5$ is equivalent to Subject $i$ flipping a fair coin to get $\mathit{PRO}_z$ by pure chance.
As discussed above, based on Expressions (5) and (6), the unbiased estimator of $Q_{iz}$ is $\hat{Q}_{iz} = \max\{R_{ij};\ j = 1, \ldots, m\}$ if $a_{iz}$ is in the searching set $\{a_j, j = 1, \ldots, m\}$. In practice, many tied maximums of $R_{ij}$ may occur, especially when $t$ is small and $m$ is large. In this case, the median of the $a_j$ at the tied maximums is taken as the estimate of $a_{iz}$. Because of this search, $\hat{Q}_{iz}$ is a consistent estimator rather than an unbiased one. The variance of $\hat{Q}_{iz}$ is a nuisance quantity because the validation of a PROM is usually drawn from multiple subjects rather than from Subject $i$ alone. Nonetheless, a variance estimate can be obtained as $\hat{Q}_{iz}(1 - \hat{Q}_{iz})/t$ because TP+TN follows a binomial distribution with parameters $t$ and $Q_{iz}$ when $a_j = a_{iz}$. Further, because $t$ is usually small, the exact binomial confidence interval for $Q_{iz}$ is used for $\hat{Q}_{iz}$ in the simulation study.
It is necessary to point out that if the two probabilities (say $Q_{-iz}$ for negative truth and $Q_{+iz}$ for positive truth) are not equal, many more assumptions are needed to estimate $Q_{-iz}$ and $Q_{+iz}$. Using our method, when $Q_{-iz}$ and $Q_{+iz}$ are both greater than 0.5, we have $\sum_{k=1}^{t} [r F_{ik}(a_{iz}) + \bar{F}_{ik}(a_{iz})]\, Q_{+iz} = \max_{a_j} E_{ij}(\mathrm{TP}+\mathrm{TN})$, where $\bar{F}_{ik} = 1 - F_{ik}$ and $r = Q_{-iz}/Q_{+iz}$. We can estimate $a_{iz}$ using the $a_j$ at which the maximum of TP+TN is reached, but $r$ and the $F_{ik}$ ($k = 1, \ldots, t$) are unknown. Even if we further assume that $r$ is known, we still cannot estimate $Q_{+iz}$ because we don't know these $F_{ik}$. Unless we further assume the distribution function of $X_{ik}$ at each clinical visit $k$, we cannot obtain a consistent estimate of $Q_{-iz}$ and $Q_{+iz}$. We feel that these further assumptions on knowing $r$ and $F_{ik}$ ($k = 1, \ldots, t$) are not practical, especially in medical device clinical trials. Therefore, in this paper we only search for the threshold at which the two probabilities are equal.
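The exact binomial confidence interval mentioned above is commonly computed as the Clopper-Pearson interval. The sketch below is one possible standard-library implementation using bisection; the function names are hypothetical, and treating the maximizing TP+TN count as the number of binomial successes is an assumption made for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(successes, t, level=0.95):
    """Clopper-Pearson interval for a binomial proportion (hypothetical helper)."""
    alpha = 1 - level
    # Lower limit: p with P(Y >= successes | p) = alpha/2 (0 if successes == 0).
    lower = 0.0 if successes == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(successes - 1, t, p)))
    # Upper limit: p with P(Y <= successes | p) = alpha/2 (1 if successes == t).
    upper = 1.0 if successes == t else _bisect(
        lambda p: binom_cdf(successes, t, p) - alpha / 2)
    return lower, upper

# Assumed example: TP + TN = 8 out of t = 10 visits at the maximizing search point.
lower, upper = exact_ci(8, 10)
```

For 8 successes out of 10, this gives an interval of roughly 0.44 to 0.97, which shows how wide the subject-level interval is when the number of visits is small.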

Inference of $Q_{iz}$ in multiple subjects
So far, $Q_{iz}$ is estimated based on $t$ repeated pairs of measurements from Subject $i$ for $\mathrm{PROM}_z$. If one wants to know the population $Q_z$ for the PROM and the objective measurement $X$ at $\mathrm{PROM}_z$ in a target patient population, $Q_z$ can be assessed by the mean of $\hat{Q}_{iz}$ across subjects with its 95% CI. For example, the lower bound of the 95% confidence interval of $Q_z$ must be greater than a desired probability of revealing truth in order for one to believe that $\mathrm{PROM}_z$ is able to reveal disease status for the majority of subjects in the patient population.
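A population-level summary of this kind might be computed as below. The per-subject estimates are made up for illustration, the function name `mean_with_ci` is hypothetical, and the interval uses a normal approximation, which is an assumption of this sketch; the paper does not state which interval formula was used.

```python
from statistics import NormalDist, mean, stdev

def mean_with_ci(q_hats, level=0.95):
    """Mean of per-subject estimates with a normal-approximation CI."""
    n = len(q_hats)
    centre = mean(q_hats)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half = z * stdev(q_hats) / n ** 0.5
    return centre, centre - half, centre + half

# Assumed per-subject estimates from eight hypothetical subjects.
q_hats = [0.8, 0.9, 1.0, 0.7, 0.9, 0.8, 1.0, 0.9]
centre, lower, upper = mean_with_ci(q_hats)
exceeds_chance = lower > 0.5  # compare the lower limit with the desired probability
```

The comparison in the last line uses 0.5 as the desired probability only as an example; a stricter target can be substituted.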
The ability of $\mathit{PROM}_i$ to detect a change in the objective endpoint $X_i$ could be confirmed by a statistically significant change from $a_{iz}$ to $a_{iz'}$ obtained by different dichotomizations of $\mathit{PRO}_i$. Note that the magnitude of $a_{iz}$ changes when $\mathit{PRO}_i$ is dichotomized differently. For example, $\mathit{PRO}_i$ can be dichotomized at scale 7 by "at least very much improvement or otherwise" or at scale 6 by "at least much improvement or otherwise". This change of dichotomization represents a one-unit change of $\mathit{PRO}_i$ from scale 6 to 7, and thus the change from $a_{i6}$ to $a_{i7}$ measures the ability of $\mathit{PROM}_i$ to detect the change in the objective endpoint $X_i$. $a_{iz}$ is expected to be larger when $\mathit{PRO}_i$ is dichotomized by "at least very much improvement or otherwise" than when it is dichotomized by "at least much improvement or otherwise". This is because "at least very much improvement" is more difficult to reach, and thus its minimum threshold is expected to be higher than that for "at least much improvement". One can obtain the estimate of the change from $a_{iz}$ to $a_{iz'}$ from each of $n$ different subjects, and test whether the mean change is greater than 0.
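The effect of changing the dichotomization can be sketched for a single subject as follows. The six observations, the search grid and the function name `threshold_estimate` are hypothetical; the threshold estimate is the median of the search points that maximize TP+TN, as described in the estimation subsection.

```python
from statistics import median

def threshold_estimate(x, pro, z, grid):
    """Estimate the threshold after dichotomizing the PRO at scale z."""
    g = [1 if p >= z else 0 for p in pro]  # g = 1 if PRO >= z
    # Count TP + TN at each search point: a pair counts when the side of the
    # search point (x >= a) agrees with the dichotomized response (g == 1).
    hits = [sum((xk >= a) == (gk == 1) for xk, gk in zip(x, g)) for a in grid]
    best = max(hits)
    return median(a for a, h in zip(grid, hits) if h == best)

# Assumed data for one subject: change in the objective endpoint and 7-scale PRO.
x = [2, 5, 9, 14, 20, 26]
pro = [4, 5, 5, 6, 7, 7]
grid = list(range(0, 31))
a_6 = threshold_estimate(x, pro, 6, grid)
a_7 = threshold_estimate(x, pro, 7, grid)
change = a_7 - a_6
```

With these assumed values the estimated threshold moves from 12 at scale 6 to 17.5 at scale 7, so the per-subject change is 5.5. Repeating this for each of the subjects gives the sample of changes on which the test of a positive change would be performed.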

Simulation
Simulation data from Subject $i$ are used to illustrate the characteristics of $\hat{Q}_{iz}$, especially to show that $\hat{Q}_{iz}$ is a consistent estimator of $Q_{iz}$. The simulation is not meant to align with a real clinical trial; the use of $\hat{Q}_{iz}$ in a clinical trial is presented after the simulation using hypothetical clinical data. Because $Q_{iz}$ is defined at the subject level, the simulation uses one treatment for a disease in one subject only. The primary endpoint is an objective endpoint measuring the change of the disease status from baseline to 3 months. The PROM is a 7-scale disease-related satisfaction PROM such as the one illustrated in Fig 1. In order to include different means and standard deviations, the simulation uses 5 different means [$\mu = (0, 0.5, 1, 1.5, 2)$] and 5 associated variances [$\sigma^2 = (1, 1.3, 1.6, 1.9, 2.2)$] as two building blocks to construct various multivariate normal distributions for the objective endpoint. For example, if $t = 10$ then $X_{ik}$ ($k = 1, \ldots, 10$) follows the multivariate normal distribution with stacked mean vector $(\mu, \mu)$ and a variance-covariance matrix whose diagonal elements are $\sigma^2$ repeated in the same way and whose off-diagonal elements are $\rho\sigma_l\sigma_s$. Other setups are described as follows:
a. The correlation coefficient ($\rho$) between $X_{ik}$ and $X_{ik'}$ is 0.3, 0.5, or 0.8.
b. $a_{i7}$ (the minimum objective threshold for "at least very much improved") is equal to 1.2, $a_{i3}$ is equal to -0.3 and $a_{i5}$ is equal to 0.4.
d. Number of repeated measurements for the subject is $t$ = 5, 10, 20, 40.
e. Pre-selected $a_j$ ranges from -2 to 5.0 with an increasing step of 0.1; therefore $m = 71$. Because the minimum of two standard deviations below the five means is -2 and the maximum of two standard deviations above the five means is 5, this range is wide enough to include all underlying true values of $a_{i3}$, $a_{i5}$, and $a_{i7}$.
f. Number of simulations is 10,000.
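The size of the search set follows directly from its range and step. A small helper such as the hypothetical `search_grid` below can be used to build the set and confirm the number of search points; the helper itself is an illustration and not part of the paper.

```python
def search_grid(lo, hi, step):
    """Return the pre-selected search points from lo to hi inclusive."""
    m = int(round((hi - lo) / step)) + 1
    # Index-based construction avoids accumulating floating-point error.
    return [round(lo + j * step, 10) for j in range(m)]

simulation_grid = search_grid(-2.0, 5.0, 0.1)   # -2.0, -1.9, ..., 5.0
hemoglobin_grid = search_grid(5.0, 20.0, 1.0)   # 5, 6, ..., 20 g/dL
print(len(simulation_grid))   # 71
print(len(hemoglobin_grid))   # 16
```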
For each combination of $\rho$ (0.3, 0.5, 0.8), $a_{iz}$ (-0.3, 0.4, 1.2), and $Q_{iz}$ (0.5, 0.6, 0.7, 0.8, 0.9), the $t$ (5, 10, 20, 40) pairs of outcomes $(x_{ik}, g_{ik})$ ($k = 1, \ldots, t$) are sampled as follows. First, $x_{ik}$ ($k = 1, \ldots, t$) is drawn from the corresponding multivariate normal distribution. If $x_{ik} \ge a_{iz}$, $g_{ik}$ is drawn from Bernoulli($Q_{iz}$); otherwise $g_{ik}$ is drawn from Bernoulli($1 - Q_{iz}$). Then an estimate of $Q_{iz}$ is calculated from the $t$ pairs of outcomes using the method described above, and its 95% CI is calculated using the exact binomial confidence interval because of the small samples. These steps are repeated 10,000 times for each underlying value of $Q_{iz}$ and $t$, and then the mean of these 10,000 $\hat{Q}_{iz}$ and the coverage probability of the 95% CIs for $Q_{iz}$ are obtained. Figs 3-5 show three examples in which the mean of these 10,000 $\hat{Q}_{iz}$ converges to $Q_{iz}$ regardless of the values of $\rho$ and $a_{iz}$. As the number of clinical visits increases for Subject $i$, the mean of $\hat{Q}_{iz}$ approaches the underlying true value of $Q_{iz}$. The converging pattern exists for every value of $Q_{iz}$ (0.6, 0.7, 0.8, 0.9) except for $Q_{iz} = 0.5$. This is not a surprise because when $Q_{iz} = 0.5$ there is no association between $\mathit{PROM}_i$ and $X_i$ at $\mathrm{PROM}_z$. As shown in Expressions (5) and (6), when $Q_{iz} = 0.5$ every $R_{ij}$ ($j = 1, \ldots, m$) is an unbiased estimator of $Q_{iz}$. A separate simulation using the median of $R_{ij}$ as $\hat{Q}_{iz}$ is performed when $Q_{iz} = 0.5$. The mean $\hat{Q}_{iz}$ ranges from 0.50 to 0.52 (converging to 0.5) for different combinations of $\rho$, $a_{iz}$, and $t$. In practice, the simulation results for $Q_{iz} = 0.5$ in Figs 3-5 can be used as a reference to set a minimum acceptable $Q_{iz}$ value. Table 3 shows that the mean $\hat{Q}_{i7}$ is a fairly close estimate of $Q_{i7}$ under different values of $t$ (5, 10, 20, 40). It is found that the probability of the 95% CI including the true value of $Q_{i7}$ (coverage probability) is at least 95%, due to the use of the exact binomial confidence interval.
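A reduced version of this simulation can be sketched with the Python standard library. Several choices below are assumptions made for illustration: the function name `simulate_mean_q_hat` is hypothetical, the correlated normal outcomes are generated with a shared-factor construction (which yields pairwise covariance $\rho\sigma_l\sigma_s$) instead of a multivariate normal library call, the five mean and variance pairs are cycled across visits, and only 300 repetitions are run instead of 10,000.

```python
import random
from statistics import mean

def simulate_mean_q_hat(q, a_iz, t, rho, n_sim, seed=0):
    """Mean of the estimated probability of revealing truth over n_sim runs."""
    rng = random.Random(seed)
    mus = [0, 0.5, 1, 1.5, 2]
    sds = [v ** 0.5 for v in (1, 1.3, 1.6, 1.9, 2.2)]  # square roots of the variances
    grid = [round(-2 + 0.1 * j, 1) for j in range(71)]
    estimates = []
    for _ in range(n_sim):
        shared = rng.gauss(0, 1)  # common factor inducing correlation rho
        x = [mus[k % 5] + sds[k % 5]
             * (rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0, 1))
             for k in range(t)]
        # g ~ Bernoulli(q) when x >= a_iz, otherwise Bernoulli(1 - q).
        g = [int(rng.random() < (q if xk >= a_iz else 1 - q)) for xk in x]
        # R at each search point: share of pairs that are TP or TN.
        r = [sum((xk >= a) == (gk == 1) for xk, gk in zip(x, g)) / t
             for a in grid]
        estimates.append(max(r))
    return mean(estimates)

m5 = simulate_mean_q_hat(0.8, 0.4, 5, 0.5, 300)
m40 = simulate_mean_q_hat(0.8, 0.4, 40, 0.5, 300)
```

In this sketch the mean estimate with 40 visits is expected to lie closer to the assumed true value of 0.8 than the mean estimate with 5 visits, in line with the convergence pattern described above.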

Case study: Hypothetical clinical trial data
The probability $Q_{iz}$ of revealing truth for Subject $i$ at $\mathrm{PROM}_z$ has been applied to hypothetical clinical trial data in order to assess the conditional association parameter in multiple subjects. The purpose of the trial is to improve near vision by a medical device. Each subject had a test device implanted and was followed up at Months 3, 6, 12, 18, 24 and 30 post procedure. At each follow-up visit, a subject had his/her uncorrected near visual acuity (UCNVA) measured using an ETDRS chart at 40 cm/16 in, and answered a unidimensional PROM question with 7 possible outcomes as shown in Fig 1. The question in the PROM was "How satisfied are you with your near vision without reading glasses after the treatment?" The change from baseline in UCNVA is considered as the continuous objective clinical endpoint, with a larger change indicating better near vision. The outcome of the satisfaction question is the PRO, which can be dichotomized in 3 ways for every subject: $\ge 5$ or otherwise, $\ge 6$ or otherwise, $\ge 7$ or otherwise. The mean $\hat{Q}_{iz}$ ($z$ = 5, 6, or 7) is used to assess the probability of $\mathrm{PROM}_z$ revealing the status of the visual acuity in the targeted population.
The pre-determined threshold searching set $\{a_j, j = 1, \ldots, m\}$ ranges from -20 to 60 letters with an increasing step of 1. This set contains $m = 81$ searching points for the minimum threshold $a_{iz}$ ($z$ = 5, 6, or 7). It is believed that the threshold-searching set is large enough to contain the true value of $a_{iz}$ for $\mathrm{PROM}_z$ for every subject in the target population. Table 4 below shows the mean of $\hat{Q}_{iz}$ (probability of revealing truth) and the mean $\hat{a}_{iz}$ in the change of UCNVA. As expected, the highest satisfaction level has the lowest mean probability of revealing the truth about uncorrected near visual acuity and the largest threshold in the change of UCNVA: 21 more letters correctly identified from baseline. The associated 95% CIs for $Q_{iz}$ well exclude 0.5, indicating that $Q_{iz}$ is greater than 0.5 for the majority of subjects; consequently, the probability of $\mathrm{PROM}_z$ revealing subjects' uncorrected near visual acuity is established. Since $\mathrm{PROM}_z$ has > 83% probability (based on the lower limits) of revealing the status of UCNVA, it may be used as a binary endpoint for the primary inference for uncorrected near visual acuity. Table 5 shows the median of $\hat{a}_{iz} - \hat{a}_{iz'}$ when the satisfaction level changes. $\hat{a}_{iz} - \hat{a}_{iz'}$ is found to have a highly skewed distribution; therefore p-values are reported from a nonparametric signed rank test, and the median rather than the mean is used as the reference statistic. One can observe that:
1. When the PRO dichotomization increases from $\ge 5$ to $\ge 6$, the majority of subjects have no change (median = 0) in their uncorrected near visual acuity; this means that the PRO change from scale 5 to scale 6 may not represent a change in the uncorrected near visual acuity of the majority of subjects.
2. When the PRO dichotomization increases from $\ge 6$ to $\ge 7$ or from $\ge 5$ to $\ge 7$, the majority of subjects have a positive change (median = 9 or 21, respectively) in their uncorrected near visual acuity; this means that PRO changes from a lower score to 7 represent a change in the uncorrected near visual acuity of the majority of subjects.
These observations indicate that a change of one PROM unit might not be adequate, in this case, to translate into a change in uncorrected near visual acuity. An increase of at least two (2) PROM units indicates that the majority of subjects have a positive increase in their uncorrected near visual acuity. Consequently, in this clinical trial the ability of this PROM to detect a change in uncorrected near vision function is supported for a change of two (2) PROM units rather than one (1) PROM unit, or for a change of the PRO score to 7 in the majority of subjects. It is noted that the number of samples from each subject is ≤ 6 in this trial, which limits the capability of this method to search for $a_{iz}$.
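The per-subject threshold search over the set of 81 points can be illustrated with a minimal sketch. It does not reproduce the estimator derived in the paper: it assumes, purely for illustration, that the estimated threshold is the smallest searching point that maximizes the agreement between the dichotomized PRO and the indicator that the objective outcome reaches the threshold, and that the estimated probability is that agreement proportion. The function name `search_threshold` and the six observations are hypothetical.

```python
from typing import Sequence, Tuple


def search_threshold(
    x: Sequence[float], g: Sequence[int], grid: Sequence[float]
) -> Tuple[float, float]:
    """Return (a_hat, q_hat) for one subject.

    x    : objective outcomes per visit (e.g. change from baseline in UCNVA, letters)
    g    : dichotomized PRO per visit (1 if the PRO is >= z, else 0)
    grid : pre-determined threshold searching set {a_j}

    ASSUMPTION (not taken from the paper): a_hat is the smallest a_j that
    maximizes the proportion of visits where (x >= a_j) agrees with g.
    """
    if len(x) != len(g) or len(x) == 0:
        raise ValueError("x and g must be non-empty and of equal length")
    best_a, best_q = grid[0], -1.0
    for a in grid:
        agree = sum(1 for xi, gi in zip(x, g) if (xi >= a) == (gi == 1))
        q = agree / len(x)
        if q > best_q:  # strict '>' keeps the smallest maximizer
            best_a, best_q = a, q
    return best_a, best_q


# Searching set from -20 to 60 letters in steps of 1 (m = 81 points).
grid = list(range(-20, 61))

# Hypothetical subject with six visits; PRO dichotomized at one scale z.
x = [4, 9, 12, 15, 22, 25]
g = [0, 0, 1, 1, 1, 1]
a_hat, q_hat = search_threshold(x, g, grid)
```

For this made-up subject, the searching points 10, 11 and 12 all separate the visits equally well, and the sketch returns the smallest of them. With at most six visits per subject such ties are common, which is one way to see why a small number of repeated measurements limits the search for the threshold.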

Concluding remarks
The conditional probability $Q_{iz}$ of revealing the true status of Subject $i$'s disease at PROM$_z$ is a new quantitative statistic for assessing the conditional association between a unidimensional PROM$_i$ and a continuous objective endpoint $X_i$ measuring the disease status. The probability $Q_{iz}$ of revealing truth is estimated for each subject using paired observations $(x_{ik}, g_{ik})$ measured repeatedly at different clinical visits (such as Months 3, 6, 12, etc.). $Q_{iz}$ reveals truth with respect to the latent minimum objective threshold $a_{iz}$ (i.e. $x_{ik} \ge a_{iz}$ or $x_{ik} < a_{iz}$). When PROM$_i$ is not associated with the objective endpoint $X_i$, $Q_{iz}$ is equal to the pure chance of 0.5. Because $Q_{iz}$ is a probability measure, this situation is like a subject flipping a fair coin to obtain his/her PRO regardless of the status of his/her disease. When a PROM is used as a measure for a disease/condition in a clinical trial, the probability of revealing truth must be at least statistically greater than the pure chance of 0.5. The threshold searching set $\{a_j : j = 1, \ldots, m\}$ can be pre-determined using the current clinical standard for the possible minimum and maximum objective measurements in the target population. For example, the human hemoglobin concentration ranges from 5 g/dL to 20 g/dL. The number $m$ can be determined based on how precise $a_{iz}$ is expected to be.
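The link between the desired precision and the number of searching points $m$ can be made concrete with a small sketch. The helper name `threshold_grid` and the step sizes are illustrative assumptions; only the hemoglobin range of 5 g/dL to 20 g/dL comes from the text.

```python
import math
from typing import List


def threshold_grid(lo: float, hi: float, step: float) -> List[float]:
    """Build an equally spaced threshold searching set {a_j} on [lo, hi].

    The step is the precision to which the threshold can be located;
    the number of points is m = floor((hi - lo) / step) + 1.
    """
    if step <= 0 or hi < lo:
        raise ValueError("require step > 0 and hi >= lo")
    # Small tolerance guards against floating-point error in the division.
    m = math.floor((hi - lo) / step + 1e-9) + 1
    return [lo + j * step for j in range(m)]


# Hemoglobin concentration, 5 to 20 g/dL (step sizes are hypothetical).
coarse = threshold_grid(5.0, 20.0, 1.0)  # m = 16
fine = threshold_grid(5.0, 20.0, 0.5)    # m = 31
```

Halving the step from 1 to 0.5 g/dL roughly doubles $m$ from 16 to 31, so a finer expected precision for the threshold translates directly into a larger searching set.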
In practice, a clinical trial has $n$ subjects and thus $n$ estimates of $Q_{iz}$ ($i = 1, \ldots, n$). For PROM$_z$ to be used in a target population, the majority of $Q_{iz}$ ($i = 1, \ldots, n$) have to be greater than the pure chance of 0.5; equivalently, the mean/median of the $Q_{iz}$ should be greater than 0.5. Although a mean/median of the $Q_{iz}$ greater than 0.5 indicates an association between the PROM and the objective endpoint $X$ beyond chance in the target population, a higher quality PROM should have a larger mean/median of the $Q_{iz}$. Let $\delta$ denote the minimum value of the mean/median of the $Q_{iz}$ ($i = 1, \ldots, n$) that is an acceptable probability for PROM$_z$ to reveal the status of the disease in the majority of subjects. To confirm that the majority of subjects have $Q_{iz}$ greater than $\delta$, one can simply test whether the mean/median of the $Q_{iz}$ among the $n$ subjects is greater than $\delta$.
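One way to carry out the comparison of the mean with $\delta$ is a one-sided lower confidence bound, sketched below. The sketch assumes a normal approximation for the mean of the per-subject estimates, which is a simplification chosen here (a t-based or nonparametric interval may be preferable for small $n$); the function names and the eight estimates are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence


def lower_confidence_bound(q_hats: Sequence[float], alpha: float = 0.05) -> float:
    """One-sided (1 - alpha) lower bound for the mean of per-subject estimates.

    ASSUMPTION: normal approximation for the sample mean.
    """
    n = len(q_hats)
    if n < 2:
        raise ValueError("need at least two subjects")
    se = stdev(q_hats) / sqrt(n)          # standard error of the mean
    z = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    return mean(q_hats) - z * se


def exceeds_delta(q_hats: Sequence[float], delta: float, alpha: float = 0.05) -> bool:
    """True when the lower bound for the mean lies above delta."""
    return lower_confidence_bound(q_hats, alpha) > delta


# Hypothetical estimates for eight subjects (made-up values).
q_hats = [0.83, 1.0, 0.83, 0.67, 1.0, 0.83, 1.0, 0.83]
passes_chance = exceeds_delta(q_hats, delta=0.5)
passes_strict = exceeds_delta(q_hats, delta=0.9)
```

With these made-up values the bound clears the pure-chance level of 0.5 but not a stricter $\delta$ of 0.9, which illustrates how the choice of $\delta$ controls the required quality of the PROM.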
When the PRO is dichotomized differently, with one PROM unit increased at a time, one can obtain the associated estimate of the change of the minimum threshold in the objective measurement for each subject, $\hat{a}_{iz} - \hat{a}_{iz'}$ ($i = 1, \ldots, n$). If the mean of these estimates across subjects is statistically significantly greater than 0, then the PROM has the ability to detect a change in the objective endpoint. If $\hat{a}_{iz} - \hat{a}_{iz'}$ ($i = 1, \ldots, n$) has a skewed distribution, one should use the median of the estimates instead, so that the test implies that the majority of $a_{iz} - a_{iz'}$ ($i = 1, \ldots, n$) are greater than 0.
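A median-based check of the threshold differences can be sketched with an exact one-sided sign test. This is a deliberately simpler stand-in, chosen here because it needs only the standard library; it is not the signed rank test, which also uses the magnitudes of the differences. The function name and the ten differences are hypothetical.

```python
from math import comb
from statistics import median
from typing import Sequence


def sign_test_greater(diffs: Sequence[float]) -> float:
    """Exact one-sided sign test p-value for H1: median difference > 0.

    Zero differences are dropped, as is conventional for the sign test.
    Under H0 the number of positive differences is Binomial(n, 0.5).
    """
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0  # no information against H0
    k = sum(1 for d in nonzero if d > 0)
    # P(K >= k) under Binomial(n, 0.5)
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


# Hypothetical per-subject threshold differences, in letters (made-up values).
diffs = [9, 12, 0, 21, 5, 8, 0, 15, 3, 10]
med = median(diffs)
p_value = sign_test_greater(diffs)
```

Here all eight non-zero differences are positive, so the p-value is $1/2^8$. Because the sign test ignores magnitudes, it is generally less powerful than a signed rank test on the same data.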
The limitations of using $Q_{iz}$ include the following: (1) it is applicable only to a unidimensional PROM, or to a PROM item of interest in a multi-dimensional PROM instrument, and only when a valid continuous objective measure of the disease status exists; and (2) if the number of repeated measurements is small, the estimator of $Q_{iz}$ is more biased. In the latter case, one can adjust the minimum acceptable probability of revealing truth in order to have confidence that PROM$_z$ reveals truth. Further research may focus on a quantitative method for measuring the conditional association between a multi-dimensional PROM and a pertinent objective measurement.

Appendix A: Notations
• Sub-indexes $i$ and $j$ represent Subject $i$ and threshold searching point $j$ within a clinical visit $k$ ($i = 1, \ldots, n$, $j = 1, \ldots, m$, and $k = 1, \ldots, t$). The letter $z$ denotes the $z$th scale of the PROM (PROM$_z$).
• $a_{iz}$ is a fixed parameter defined as the minimum latent threshold in terms of the objective measurement for Subject $i$ at PROM$_z$. It is defined for the $z$th scale and Subject $i$. For example, if the PROM has 5 different scales, then there are five different values of $a_{iz}$ for the subject.
• $a_j$ is the $j$th searching point for $a_{iz}$, and it belongs to a fixed pre-selected threshold searching set $\{a_j : j = 1, \ldots, m\}$ (such as the normal range of hemoglobin count with an increasing step of 0.5). $a_j$ is nonrandom and does not change with subject. The set is selected based on the current clinical standard of normal range.
• $X$ is the random variable for the continuous objective measurement of the status of a subject's disease/condition, and lower case $x$ is an outcome/realization of $X$.
• $G^{1}_{ik}$ is the Bernoulli random variable that equals 1 with probability $Q_{iz}$ when $x_{ik} \ge a_{iz}$.
• $G^{0}_{ik}$ is the Bernoulli random variable that equals 0, also with probability $Q_{iz}$, when $x_{ik} < a_{iz}$.
• $G_{ik}$ represents a mixture of two Bernoulli random variables with the same parameter $Q_{iz}$ (but opposite meaning): $G^{1}_{ik}$ (if $x_{ik} \ge a_{iz}$) or $G^{0}_{ik}$ (if $x_{ik} < a_{iz}$).
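The last three definitions can be written compactly as follows. The display only restates the notation above; the final identity, using the indicator function $\mathbf{1}\{\cdot\}$, is a direct consequence of the two cases and is added here as a reading aid.

```latex
G_{ik} =
\begin{cases}
  G^{1}_{ik}, & x_{ik} \ge a_{iz}, \\
  G^{0}_{ik}, & x_{ik} < a_{iz},
\end{cases}
\qquad
P\bigl(G^{1}_{ik} = 1\bigr) = Q_{iz},
\quad
P\bigl(G^{0}_{ik} = 0\bigr) = Q_{iz},
\qquad\text{so that}\qquad
P\bigl(G_{ik} = \mathbf{1}\{x_{ik} \ge a_{iz}\}\bigr) = Q_{iz}.
```

In words, whichever side of the threshold the objective outcome falls on, the dichotomized PRO agrees with it with the same probability $Q_{iz}$.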