Complete hazard ranking to analyze right-censored data: An ALS survival study

doi:10.1371/journal.pcbi.1005887

Fig 1.

Workflow.

Data were randomly divided into training and testing datasets for both survival status and clinical features. Because the challenge evaluation used the first 3 months of data as prediction input, only features recorded during the first 3 months of patient enrollment in the trial were included as training features. (Step 1) Using the survival data and the Kaplan-Meier (K-M) estimator, we ranked all individuals and assigned each a ranking score. (Step 2) Each feature was further expanded into four meta-features. (Step 3) After calculating the correlation between the meta-features and the survival results, we selected the most strongly correlated features for the training and testing data. (Step 4) The training ranking was set as the training target to generate the model. This model was subsequently applied to the testing features for prediction. The prediction results were compared with the death report data of the test set to compute the concordance index, shown as ‘C’ in the figure (bottom right).
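The concordance index used in the final comparison can be sketched as follows. This is a minimal, illustrative version of Harrell's C for right-censored data, not the challenge's scoring code; the function name `concordance_index` and the toy data are assumptions. A pair counts as comparable when the patient with the shorter time has an observed death, and as concordant when that patient also received the higher predicted risk.

```python
def concordance_index(times, events, risk):
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    times  -- follow-up time per patient
    events -- 1 if death was observed, 0 if right-censored
    risk   -- predicted risk score (higher = expected to die sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable only if i has an observed death strictly before
            # j's last recorded time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third is censored.
times = [5, 8, 6, 12]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.6]
c = concordance_index(times, events, risk)
```

In this toy case four pairs are comparable and three are ordered correctly, so `c` is 0.75. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.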

Table 1.

The summary of the PRO-ACT dataset used in this study.

Fig 2.

Comparison of performance and predictive features availability.

A: The results of 2-fold cross-validation are shown for each regression method. Cox: Cox proportional hazards regression model; RSF: random survival forest; GPR: Gaussian process regression; Linear: linear (Lasso) regression; RF: random forest. For the Cox model and the random survival forest, the time-event data were used as the training target, while our ranking values were used for the other three regression methods. The distribution of the concordance index is represented by the width of the violin shape: the wider the shape, the more samples are concentrated at that value. The three red horizontal lines in each violin show the lowest, mean, and highest concordance for each method. B: The eleven features used and their availability. Weight = weight in kg; mouth = composite score of ALSFRS questions 1–3; fvc1/fvc2 = value (in liters) of the first/second attempt in the fvc measurement; fvc = average of the fvc values across measurement attempts; Q3_swallowing = ALSFRS question 3 score; chloride = chloride concentration in blood; ALSFRS_total = sum of ALSFRS question scores; Q8_walking = ALSFRS question 8 score; respiratory = question 10 in ALSFRS and question 1 in ALSFRS-R; leg = composite score combining ALSFRS questions 8 and 9.

Fig 3.

Testing of different feature numbers.

The performance of all tested methods as the number of features increases (x axis). Twenty rounds of 2-fold cross-validation were performed using 1 to 15 features. Performance gradually increases as features are added to the training set, and it improves significantly when the number of features increases from 1 to 6. No closely related feature was excluded. (A) Cox model. (B) Rank-based Gaussian process regression. (C) Rank-based linear regression. (D) Rank-based random forest.
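Limiting a model to the k most relevant features, as varied along the x axis, might look like the sketch below. It assumes relevance is measured by the absolute Pearson correlation with the ranking target, which the caption does not specify; the helper names `pearson` and `top_k_features` and all numeric values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant feature carries no ranking information
    return cov / (sx * sy)


def top_k_features(features, target, k):
    """Return the k feature names with the largest |correlation| to target."""
    ordered = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ordered[:k]


# Toy data (hypothetical values; feature names borrowed from Fig 2B).
features = {
    "fvc": [3.1, 2.4, 1.9, 1.2],
    "weight": [80, 62, 75, 70],
    "chloride": [101, 103, 101, 103],
}
rank_target = [1.0, 2.0, 3.0, 4.0]  # higher = higher hazard
selected = top_k_features(features, rank_target, k=2)
```

With these toy values `selected` is `["fvc", "chloride"]`. Repeating the selection for k = 1 to 15 inside each cross-validation fold would give one performance curve per method.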

Fig 4.

Feature adoption ratios in repeated cross-validation tests.

The adoption ratio of a feature is the proportion of models that select the feature across all 20 rounds of cross-validation tests when the models are allowed to take only a limited number of features. Because the data were split randomly in every test, the models might pick different sets of features. The most significant features were expected to have high adoption ratios. Features such as fvc, ALSFRS Total, and weight were adopted in every training process; features such as fvc1, mouth, and Q3_Swallowing were picked in more than half of the tests. (A) Feature adoption ratios when models were allowed to take at most 15 features. In total, 22 features were picked at least once. (B) Feature adoption ratios when models were allowed to take at most 6 features. In total, 11 features were picked at least once.
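The adoption ratio defined above reduces to a simple count. The following sketch assumes each trained model reports the list of features it selected; the function name `adoption_ratios` and the example selections are hypothetical.

```python
from collections import Counter


def adoption_ratios(selected_per_model):
    """Fraction of trained models that selected each feature.

    selected_per_model -- list of feature-name collections, one per model
    """
    counts = Counter()
    for selected in selected_per_model:
        counts.update(set(selected))  # count each feature once per model
    total = len(selected_per_model)
    return {name: counts[name] / total for name in counts}


# Toy example (hypothetical selections from four trained models).
runs = [
    ["fvc", "weight", "mouth"],
    ["fvc", "weight", "chloride"],
    ["fvc", "weight", "mouth"],
    ["fvc", "weight", "fvc1"],
]
ratios = adoption_ratios(runs)
```

Here `fvc` and `weight` have a ratio of 1.0, `mouth` 0.5, and `chloride` and `fvc1` 0.25. Features never selected do not appear in the result.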

Fig 5.

Features correlation.

The colors in the heat map show the Pearson correlation between each pair of features. A dark color indicates a strong correlation, either positive (red) or negative (blue). Closely correlated features are clustered into groups by hierarchical clustering. The black boxes mark several significant clusters among the features. A: Respiratory-related features; B: Upper and lower limb-related features and bulbar onset; C: FVC-related features; D: Mouth-related features and limb onset.

Fig 6.

Ranking method based on the Kaplan-Meier estimator.

Each panel above represents the pairwise comparison between two participants. To compare two patients A(t1) and B(t2), assuming t1 < t2, there are four possibilities (shown in 6A-6D). (A) Survival time is known for patient B but is censored for patient A. The probability of patient A dying between t1 and t2 is ρ_t1,t2, estimated with the Kaplan-Meier estimator. In this case, we add ρ_t1,t2 to the rank of patient A and (1 − ρ_t1,t2) to the rank of patient B. (B) Survival times for patients A and B are both censored. The probability of patient A dying between t1 and t2 is again ρ_t1,t2. Because the outcomes of both A and B are uncertain, we assume that after time point t2, A and B are equally likely to survive longer than the other, each with probability P = (1 − ρ_t1,t2)/2. Therefore, ρ_t1,t2 + (1 − ρ_t1,t2)/2 and (1 − ρ_t1,t2)/2 are added to the ranks of A and B, respectively. (C & D) Patient B survives longer than patient A, so 1 is added to the rank of A. Yellow dots = death events of completely observed samples; blue dots = last observations of right-censored samples.
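The four pairwise cases can be sketched in code. This is an illustrative reading of the caption, not the authors' implementation. It assumes that ρ_t1,t2 is the conditional probability (S(t1) − S(t2))/S(t1), where S is the Kaplan-Meier survival estimate; that formula is an assumption introduced here, as are the function names `km_survival` and `hazard_ranks`, the toy data, and the choice to skip pairs with equal times.

```python
def km_survival(times, events):
    """Return a function S(t): Kaplan-Meier estimate of surviving past t."""
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    steps = []  # (death time, survival probability just after it)
    surv = 1.0
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        steps.append((t, surv))

    def S(t):
        value = 1.0
        for step_time, step_surv in steps:
            if step_time <= t:
                value = step_surv
            else:
                break
        return value

    return S


def hazard_ranks(times, events):
    """Pairwise rank scores following cases A-D of the caption.

    Assumption (not from the source): rho = (S(t1) - S(t2)) / S(t1).
    Pairs with equal times are skipped, a simplification of this sketch.
    """
    S = km_survival(times, events)
    n = len(times)
    ranks = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue
            # a = patient with the shorter time t1, b = the longer time t2
            a, b = (i, j) if times[i] < times[j] else (j, i)
            if events[a] == 1:
                ranks[a] += 1.0  # cases C and D: A died first
                continue
            s1 = S(times[a])
            rho = (s1 - S(times[b])) / s1 if s1 > 0 else 0.0
            if events[b] == 1:  # case A: A censored, B observed
                ranks[a] += rho
                ranks[b] += 1.0 - rho
            else:  # case B: both censored
                ranks[a] += rho + (1.0 - rho) / 2.0
                ranks[b] += (1.0 - rho) / 2.0
    return ranks


# Toy data (hypothetical): times in months, event 1 = death, 0 = censored.
times = [2, 4, 6, 8]
events = [1, 0, 1, 0]
ranks = hazard_ranks(times, events)
```

For this toy input the ranks are 3.0, 1.25, 1.5, and 0.25. Each pair distributes exactly one unit between its two patients, so the ranks sum to the number of compared pairs (6 here), and a higher rank indicates a higher estimated hazard.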
