Nuclear IHC enumeration: A digital phantom to evaluate the performance of automated algorithms in digital pathology

Automatic and accurate detection of positive and negative nuclei from images of immunostained tissue biopsies is critical to the success of digital pathology. The evaluation of most nuclei detection algorithms relies on manually generated ground truth prepared by pathologists, which is unfortunately time-consuming and suffers from inter-pathologist variability. In this work, we developed a digital immunohistochemistry (IHC) phantom that can be used for evaluating computer algorithms for enumeration of IHC positive cells. Our phantom development consists of two main steps, 1) extraction of the individual as well as nuclei clumps of both positive and negative nuclei from real WSI images, and 2) systematic placement of the extracted nuclei clumps on an image canvas. The resulting images are visually similar to the original tissue images. We created a set of 42 images with different concentrations of positive and negative nuclei. These images were evaluated by four board certified pathologists in the task of estimating the ratio of positive to total number of nuclei. The resulting concordance correlation coefficients (CCC) between the pathologist and the true ratio range from 0.86 to 0.95 (point estimates). The same ratio was also computed by an automated computer algorithm, which yielded a CCC value of 0.99. Reading the phantom data with known ground truth, the human readers show substantial variability and lower average performance than the computer algorithm in terms of CCC. This shows the limitation of using a human reader panel to establish a reference standard for the evaluation of computer algorithms, thereby highlighting the usefulness of the phantom developed in this work. Using our phantom images, we further developed a function that can approximate the true ratio from the area of the positive and negative nuclei, hence avoiding the need to detect individual nuclei. The predicted ratios of 10 held-out images using the function (trained on 32 images) are within ±2.68% of the true ratio. Moreover, we also report the evaluation of a computerized image analysis method on the synthetic tissue dataset.


INTRODUCTION
Modern imaging techniques generate very large images and the wide-spread utilization of image data in many areas has created significant challenges associated with processing, analysis, and management of these large datasets. The need to handle large volumes of image data is pervasive today in nearly every industry, government, and institutional sector.
For example, in digital pathology, a single digital whole slide image produces a file ranging in size from 2 to 60 gigabytes (GBs). When an entire case is considered (typically between 4 to 30 different slides), the raw data size can exceed 100 GB. Considering the daily slide volumes of a typical academic pathology department, the volume of data generated by a fully digital pathology practice can be enormous.
It is important to point out that the information present in medical images has immediate as well as long-term relevance: The pathology images associated with a case have immediate diagnostic relevance for that case. However, some information in these images becomes relevant only when it is considered as part of a large cohort. Subtle characteristics of disease can only be identified when a large set of images are analyzed collectively. Therefore, there has been growing interest in Big Data analytics in information sciences over the last few years [1]. While extraction and quantification of relevant [2] and task-specific information [3] from images has long been an active area of research within the image processing community, the emergence of Big Data poses additional challenges particularly in processing resources/speed and scalability.
One of the most prominent Big Data challenges in histopathological image processing is how to transmit, store, and, most importantly, manage this large amount of data efficiently.
Conventional image compression methods were designed to act as an input-output filter used for compression enabling access to the image at a single image quality, size, resolution, and spatial extent that was envisioned at the time of compression [4]. However, these conventional methods are impractical for Big Data image analytics. Many Big Data image analytics applications employ distributed storage and computing, and transmission of large amounts of data between the storage and computing nodes creates a major processing bottleneck. Furthermore, Big Data databases are often mined for different tasks and the image quality, resolution, and spatial extent relevant for different tasks can be vastly different [5]. In this work, we propose an interactive image compression framework that acts as part of a Big Data histopathological image processing system providing efficient and scalable interaction with histopathology images. The proposed framework is based on the JPEG2000 image compression standard [6] and the JPEG2000 Interactive Protocol (JPIP) [7]. Through the use of an example application, we illustrate that when image compression is tightly integrated with the image processing methods, both compression and computational efficiency can be significantly improved. So, the aim of this study is to develop and validate a novel image compression paradigm where information most relevant to the task at hand is stored and transmitted preferentially.

II. SUBJECTS AND METHODS
The JPEG2000 image compression standard was designed with the central theme of scalability. In addition to resolution and quality scalability, JPEG2000 provides spatial and image component accessibility. This rich family of scalability features enables JPEG2000 code-streams to be easily parsed to extract subsets of compressed data that represent a desired region of interest with a desired set of color components, at a desired resolution and with a selected level of quality. The JPIP protocol [7] builds on this strong foundation to allow interactive access to JPEG2000 compressed images over networks. The block diagram in Figure 1 illustrates the basic architecture of a JPIP server and client. [ Figure 1.] As suggested in the figure, the client sends request to the server about its desired region, resolution, and color component of interest and the server responds by transmitting the compressed data that is necessary to obtain an image with the desired attributes. While the JPIP framework has been shown to be useful for remote browsing of very large images, it can also provide the essential ingredients for image processing applications that employ distributed storage and computing. The JPIP client can be integrated within the image processing method and communicate the current region, resolution, and color components of interest to the server which, in turn, transmits the corresponding compressed data to the client. A region decompressor can then decompress this compressed data and return it to the image processing method. Note that this framework not only minimizes the amount of data transfer between the storage and computing nodes but also minimizes the computational resources consumed by the compression engine since only data required by the image processing method is transmitted and decompressed. Using the Kakadu Software Development Kit [8], we have created a MATLAB® (Mathworks, Natick, Massachusetts) interface which allows seamless integration of a JPIP client with image processing methods.
Using this interface, we have implemented a distributed image analysis method. This method is presented in the next section.

Application to Hot-Spot Detection in Ki-67 Stained Breast Tissue
Ki-67 is a nuclear protein expressed exclusively during the active cell cycle phases with no expression in quiescent cells [9]. Its presence appears to be necessary for cell proliferation, although its exact function is unclear [10]. In breast cancer, Ki67 has shown promise as an independent prognostic marker and as a predictive marker of responsiveness or resistance to chemotherapy or hormone therapy [11]. According to the published recommendations of the Breast Cancer working group [12], Ki-67 score or index is defined as the percentage of positively stained cells within the total number of malignant cells scored. In Ki-67 scoring, hotspots are generally defined as areas in which Ki-67 staining is most prevalent; or those areas with the highest number of positively staining nuclei within the invasive component.
From image analysis perspective, hotspot detection can be considered as a density approximation problem. In the past, hotspot detection was partially addressed by a few studies [13][14][15]. Some of these earlier methods [14,15] were only tested on small region-ofinterest images and their extension to whole slide images is not trivial due to computational challenges. In this work we present an efficient method that approximates the hotspots from a whole slide image within reasonable time using our proposed compression frame-work.
The hotspot detection component of this framework is generalization of [16,17] to whole slide images.
Ki-67 positive cells manifest themselves as brown hue cells in images of breast tissues. The large variations in specimen preparation, staining, and imaging as well as true biological heterogeneity of breast tissue often results in variable brown intensities in Ki-67 stained images [17]. These variations affect the segmentation accuracy of Ki-67 nuclei. We performed segmentation of breast tissue images using an efficient version of [17]. This method exploits the intrinsic properties of CIE L * a * b * color space to translate this complex problem into an automatic entropy based thresholding problem. The method in [17] consists of three main components; clustering of RGB color pixels into three clusters based on cluster centroids, color space transformation in the CIE L * a * b * color space, and entropy thresholding to segment the Ki-67 positive nuclei. Computationally, clustering is the most expensive component among the three. The absence of a closed form solution for clustering and the need for iterative refinement turns [17] into a computational bottleneck. Moreover, the method was originally designed for region of interest (ROI) images with an assumption that each ROI image has some Ki-67 positive nuclei. However, its block-by-block application to a whole slide image usually contains blocks where Ki-67 positive nuclei are completely absent.
In those situations, the method erroneously starts to consider negative nuclei as Ki-67 positive nuclei. To reduce the computational complexity and the number of false positives, we propose an efficient version of [17] which extends its application to whole slide images.
The efficient version consists of three main steps; automatic ROI selection from a whole slide image, extraction of parameters from the selected ROI, and application of [17] to whole slide images with the extracted parameters.

1) Automatic Selection of an ROI from a whole slide image
The aim of this step is to automatically select an ROI image with some Ki-67 positive nuclei.
To accomplish this, we manually cropped 25 ROI images with different concentration of Ki-67 positive nuclei from 25 different whole slide images. The size of these ROI images varies between 1K×0.5K and 9K×5K. Nine out of 25 whole slides images were acquired at Cleveland Clinic while the rest were acquired at The Ohio State University. From the 25 manually cropped ROI, we extracted: cluster centroids, the color transformation matrices, and the thresholds resulting from entropy thresholding. Figure 2 shows the resulting cluster centroids as piecewise linear functions. Each function consist of nine points, the first three points correspond to first cluster centroid, the next three points represent the second cluster centroid, and the last three points represent the third cluster centroid. Although all ROI images contain Ki-67 positive nuclei, there exists a huge variation in cluster centroids across images. However, the cluster centroids seems to differ less if the ROI images (containing Ki-ROI from the whole slide image and apply the resulting cluster centroids to whole slide image.
To accomplish this we computed the average centroid matrix, ̅ as: the training set, we grouped RGB color pixels based on its elucdian distance from ̅ .
[ Figure 2.] Apart from the computation of the color transformation matrix, rest of the steps are the same as explained in [17]. Figure 3 shows the resulting color transformation matrices as piecewise linear function for 25 ROI images. The Eigen vectors in each color transformation matrix are first appended next to each other, hence resulting in a 9 element vector as shown in Figure   3. Interestingly, the color transformation matrix ( In summary, while ̅ and ̅̅̅̅ were kept unchanged for all 2.5x images, we computed the individual threshold using entropy thresholding. The pre-computation of ̅ and ̅̅̅̅ bring considerable computational savings as these do need to be iteratively refined for clustering. The next step in automatic selection of an ROI is to segment all 2.5x images with precomputed ̅ and ̅̅̅̅ and entropy thresholding. Once segmented, the images are reduced to 0.25x magnification using bilinear interpolation. The resulting images are again converted into binary images by setting all non-zero elements to 1. The use of bilinear interpolation followed by setting of non-zero elements to 1 ensures that any potential nuclei are not lost as a result of down sampling. The next step is to group the location of the resulting non-zero pixels into seven clusters. One may opt for different number of clusters than 7. Our choice was mainly driven by computational efficiency as it resulted in a large enough ROI region to reliably extract true cluster centroids. The automatic selection of an ROI takes nearly 8 seconds per image. As a last step, we use the convex shape of the cluster with the minimum point to centroid distance as an automatically selected ROI. Once a ROI is automatically selected, we compute the true cluster centroids according to the method in [17]. For accurate segmentation of the whole slide images, we used true cluster centroids, ̅̅̅̅ , and entropy thresholding at (4x) lower-resolution images. For images of size larger than 90K×90K pixels, lower resolution images (8x) were used.
In [16], we used an automated method to detect individual nuclei from nuclei clumps at 40x magnification. However, this method is computationally demanding. For computational efficiency, we benefit from relatively uniform size of Ki-67 positive nuclei to approximate the number of nuclei within nuclei clumps. During training, average nucleus size was manually measured and it was used during testing to approximate the number of nuclei within clusters of nuclei.
Once the nuclei centroids are approximated, we used a modified version of α-shape maps [16] to generate a heat-map from nuclei centroids. The α-shape was computed from the knearest centroids of the i th centroid. α-shape is a generalization of convex hull to non-convex shapes and attempts to define the shape of a finite set of point in the space. Each α-shape is a binary image where the points inside the α-shape are assigned the value of 1. We further compute the weighted map by using the equation: Here, Λ i is the area function which computes the area of the i th α-shape. To compute the heat map, , we compute: = However, computing α-shapes in a large size image is computationally challenging. As we are only utilizing the nuclei centroids (locations), we can afford to use a much lower resolution image to generate heat-maps without losing much information. Dividing the location of nuclei centroids by a factor of d directly corresponds to d-times lower resolution of resulting heat-map. However, dividing the location by a factor of d will not result in loss of any nuclei but will only bring them closer in proximity. In our experiments, we set d = 10 as it results in least number of overlapping nuclei. After generating the heat maps, the heat map is segmented using fast marching [18]. We used gray-scale intensity difference as weights for image pixels. The top 2% (highest values in the heat-map) were set to initialize the fast marching. Once again, this step was also performed at the same resolution as the heatmaps.  Table 1. The Table was only limited to seven images due to space constrains. The first approach (denoted as "Full

This
Tissue Image" in the table) corresponds to a "conventional approach" where the entire compressed codestream is transferred to the computing node. In this case, the average effective compression ratio (ECR) calculated as the ratio of the number of bytes used to represent the uncompressed image to the number of bytes transferred over the network is roughly 15 which correspond to the compression ratio of the original JPEG2000 codestreams stored on the server. In addition, the runtimes for decompression of the transferred data are also tabulated in Table 1. It is easy to see that the decompression runtimes in this case are high and considerably exceed the runtime of the hotspot detection method. The second approach (denoted as "Full ROI" in the table) corresponds to an intermediate integration of the hotspot detection method with the proposed framework. In this case, the tumor region is determined and a minimum enclosing rectangular region of interest around the tumor region is requested from the server for further processing. The average ECR in this case is 127, roughly an order of magnitude higher than the ECR of the conventional method.
Correspondingly, there is a considerable decrease in the decompression runtimes as well.
Finally, the third approach (denoted as "Only ROI" in the table) represents a much tighter integration of the client software with the hotspot detection method where only the tumor region is requested from the server (instead of the enclosing rectangular region). The average ECR in this case is 222 and the decompression runtimes are on average 15 times faster than the conventional approach.

IV. DISCUSSION & CONCLUSIONS
Clinicians and researchers in histopathology are increasingly generating and using whole slide images. The slide volumes of a typical academic pathology department require roundthe-clock operation of multiple scanners which can be loaded with hundreds of slides and can scan continuously [19]. Thus, the volume of data that is expected to be generated by a fully digital pathology practice is enormous. In addition to tremendous storage requirements, the large data sizes also present challenges for rapid and interactive access to image data.
High performance image rendering with low-lag times are critical to match or exceed the current productivity levels of pathologists using the light microscope. While there is an increased interest in the use of quantitative image analysis algorithms for disease detection, diagnosis, and prognosis prediction to complement the opinion of the pathologist, these algorithms require efficient access to image data. The practice of pathology in the era of Big Data requires development of advanced data compression methods. The current manuscript is geared towards creating an integrated computational framework for image compression that allows us to efficiently store, process, and provide rapid access to the high quality images. In this regards, our contribution is twofold;  We presented a task specific image compression framework which provides the user with the flexibility to perform context aware image compression. This enabled us to use a variable compression rate within the same image. As a consequence, the proposed framework minimizes the data transfer between storage and computing nodes and significantly reduces the computational resources consumed by the decompression engine.. For instance, the variable nature of our framework allows for higher compression ratios in areas outside of the tumor and relatively lower compression ratios within the tumor region. This is currently not possible with the existing compression methodologies.
 We presented an efficient method to detect hotspots from whole slide images of breast tissues. On average it took 88.6 seconds (standard deviation of 36.3 seconds) to detect hotspots from whole slide images. This processing time does not include the time required to read/transfer the image.
Natural directions for future research include exploration of the impact of quality scalability features with various image processing methods as well as incorporation of task-based image quality metrics into the proposed framework.