Fig 1.
H&E Image Interpretation and Protein Expression Inference: overview of the data and method. (A) A schematic overview of HIPI. An H&E slide is processed in tiles by a deep-learning prediction model that generates multiplexed spatial expression levels of 16 proteins. (B) An overview of the data and preprocessing steps. Pairs of H&E and CyCIF images taken from adjacent tissue slices are aligned and processed to generate image tiles with corresponding expression levels. (C) Alignment of adjacent H&E and CyCIF images using a global linear transformation followed by local non-linear registration. The inset zooms in on a tile to demonstrate the effect of the local refined alignment. (D) Deriving tile-level expression from CyCIF cell data. Cell locations and expression were taken from [25]. The expression is then aggregated at the tile level. The figure illustrates expression of the Ki67 marker on slice 25 (yellow indicates high expression). (E) Train (* red), validation (** green) and test (*** blue) data splits for sample CRC01, illustrated on a single slice. (F) Seven additional CRC samples from different patients.
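The tile-level aggregation in panel D can be sketched as follows. The caption does not state the aggregation function, the tile size, or the coordinate convention, so the mean, the `tile_size` parameter, and the (x, y) pixel layout used here are illustrative assumptions, and `aggregate_to_tiles` is a hypothetical helper rather than the authors' code.

```python
import numpy as np

def aggregate_to_tiles(xy, expr, tile_size, grid_shape):
    """Average per-cell expression of one marker over a regular tile grid.

    xy         : (n_cells, 2) cell centroids as (x, y) pixel coordinates (assumed layout)
    expr       : (n_cells,) expression of a single marker per cell
    tile_size  : tile edge length in pixels (assumed square tiles)
    grid_shape : (rows, cols) of the tile grid
    Returns a (rows, cols) array; tiles containing no cells are NaN.
    """
    xy = np.asarray(xy, dtype=float)
    expr = np.asarray(expr, dtype=float)
    rows, cols = grid_shape

    # Map each cell to the tile that contains its centroid.
    col = np.floor(xy[:, 0] / tile_size).astype(int)
    row = np.floor(xy[:, 1] / tile_size).astype(int)

    # Ignore cells that fall outside the grid.
    inside = (row >= 0) & (row < rows) & (col >= 0) & (col < cols)
    flat = row[inside] * cols + col[inside]

    # Per-tile sum and cell count, then the mean (assumed aggregation).
    sums = np.bincount(flat, weights=expr[inside], minlength=rows * cols)
    counts = np.bincount(flat, minlength=rows * cols)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    return means.reshape(rows, cols)
```

Leaving empty tiles as NaN keeps them distinguishable from tiles whose cells have low expression; a z-score over tiles, as shown in later figures, would then be computed with NaN-aware statistics.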
Fig 2.
Selected results on sample CRC01 slice 96 for five markers: Keratin, Ki67, CD45, PD1, PDL1.
(A) Tile-level CyCIF expression (z-score). (B) Model prediction for each tile (z-score). (C) Pearson correlation between measured and predicted values of the selected markers. Correlations are calculated separately for the train, validation and test tiles of each slice of sample CRC01 (slice 96 is marked with a star). (D) Pearson correlation between measured and predicted values across all slices and all markers on the test set.
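The per-marker correlations in panels C and D can be illustrated with a minimal sketch. It assumes the measured and predicted tile values are arranged as arrays with one row per tile and one column per marker; the function name and array layout are hypothetical.

```python
import numpy as np

def per_marker_pearson(measured, predicted):
    """Pearson correlation between measured and predicted values, per marker.

    measured, predicted : (n_tiles, n_markers) arrays (assumed layout)
    Returns an (n_markers,) array of correlation coefficients.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    # Center each marker column.
    m = measured - measured.mean(axis=0)
    p = predicted - predicted.mean(axis=0)

    # Covariance divided by the product of the standard deviations.
    denom = np.sqrt((m ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    return (m * p).sum(axis=0) / denom
```

To obtain separate values for the train, validation and test tiles of a slice, the same function would be applied to each subset of rows in turn.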
Fig 3.
Selected results on sample CRC15 for five markers: Keratin, Ki67, CD45, PD1, PDL1.
(A-B) Similar to Fig 2A and 2B. (C) Pearson correlation between measured and predicted values for the test set of each slide of sample CRC01 and all external CRC samples (sample CRC15 is marked with a star). (D) Pearson correlation between measured and predicted values across all slices and all markers on the test CRC samples.
Fig 4.
Selected cell marker occurrence in CRC01–96.
(A) Occurrence and co-occurrence of CD45, CD4/FOXP3/CD8a, PDL1, PD1/PDL1 and CD68/PDL1/CD8a/PD1 in measured tile-level data. (B) Occurrence predicted by HIPI. (C) Confusion matrices classifying different marker co-occurrences. Matrices are normalized column-wise by the total number of predictions for each class. Each inline number therefore gives the fraction of tiles predicted to be in a given class that truly belong to that class.
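The column-wise normalization in panel C can be sketched as below. The sketch assumes rows index the true class and columns the predicted class, with classes encoded as integers from 0 to `n_classes - 1`; the function name is hypothetical.

```python
import numpy as np

def column_normalized_confusion(true_labels, pred_labels, n_classes):
    """Confusion matrix whose columns each sum to 1 (or 0 if a class is never predicted).

    Rows are true classes and columns are predicted classes (assumed orientation),
    so entry [i, j] is the fraction of tiles predicted as class j that truly
    belong to class i. The diagonal is the per-class precision.
    """
    true_labels = np.asarray(true_labels, dtype=int)
    pred_labels = np.asarray(pred_labels, dtype=int)

    # Count tiles for every (true, predicted) pair.
    counts = np.zeros((n_classes, n_classes), dtype=float)
    np.add.at(counts, (true_labels, pred_labels), 1)

    # Divide each column by the number of predictions for that class.
    col_totals = counts.sum(axis=0, keepdims=True)
    return np.divide(counts, col_totals,
                     out=np.zeros_like(counts), where=col_totals > 0)
```

With this normalization a diagonal entry near 1 means that predictions of that class are rarely wrong, but it says nothing about how many true members of the class were missed; that would require row-wise normalization instead.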
Fig 5.
Cell marker occurrence in CRC15.
See Fig 4 for panel description.