RaCaT: An open source and easy to use radiomics calculator tool

Purpose The widely known field ‘Radiomics’ aims to provide an extensive image based phenotyping of e.g. tumors using a wide variety of feature values extracted from medical images. Therefore, it is of utmost importance that feature values calculated by different institutes follow the same feature definitions. For this purpose, the imaging biomarker standardization initiative (IBSI) provides detailed mathematical feature descriptions, as well as (mathematical) test phantoms and corresponding reference feature values. We present here an easy to use radiomic feature calculator, RaCaT, which provides the calculation of a large number of radiomic features for all kind of medical images which are in compliance with the standard. Methods The calculator is implemented in C++ and comes as a standalone executable. Therefore, it can be easily integrated in any programming language, but can also be called from the command line. No programming skills are required to use the calculator. The software architecture is highly modularized so that it is easily extendible. The user can also download the source code, adapt it if needed and build the calculator from source. The calculated feature values are compliant with the ones provided by the IBSI standard. Source code, example files for the software configuration, and documentation can be found online on GitHub (https://github.com/ellipfaehlerUMCG/RaCat). Results The comparison with the standard values shows that all calculated features as well as image preprocessing steps, comply with the IBSI standard. The performance is also demonstrated on clinical examples. Conclusions The authors successfully implemented an easy to use Radiomics calculator that can be called from any programming language or from the command line. Image preprocessing and feature settings and calculations can be adjusted by the user.


Introduction
Features describing image texture contain valuable information about important image characteristics and are applied in multiple disciplines. They can be used for object identification or the definition of region of interests (ROI) in e.g. radar or satellite images [1,2]. In medical images, textural information extracted from tumor regions has shown to provide valuable information about prognosis, tumor staging, and treatment response [3][4][5].
For this purpose, a large amount of imaging biomarkers is extracted from the tumor region and used for classification purposes. These feature values, also named radiomic features, include besides second-order textural features, shape, first-order statistical, and intensity-histogram based features. Radiomic features are used to build machine learning models which are e.g. used for prediction or classification [6,7]. However, until now, radiomic features are not used for clinical decision making as there is a lack of standardization in the majority of the steps in the radiomics pipeline.
One of these challenges is the lack of a standardized feature definition and calculation. Feature values reported by different institutions do not necessarily follow the same feature definition nor necessarily lead to identical results when used on the same images. This problem is aimed to be solved by the image biomarker standardization initiative (IBSI) by providing mathematical feature definitions and phantom data sets with corresponding feature values in order to standardize feature definitions and calculations [8,9]. Several open-source software packages, like LifeX, IBEX, CaPTK or CGITA, calculating radiomic features have been developed and published [10][11][12][13][14]. However, although this initiative is widely known, only few radiomic feature calculators are also standardizing the image preprocessing part of the radiomic pipeline, which is essential for feature calculations. Furthermore, in the majority of the software packages not all defined features are implemented.
In order to provide a feature calculator that comes with the correct feature implementation of all features defined by the IBSI standard, we developed a Radiomics calculator tool, RaCaT, that is easy to use and does not require any programming skills. We compare the feature values obtained by RaCaT with the feature values reported by IBSI. Moreover, some known feature values were extracted from phantom images and compared with the expected values.

Description of the radiomics calculator
Radiomics Calculator, RaCaT, calculates and returns a wide range of radiomic features for all kind of medical images. It is a standalone executable written in C++ that can be called from the command line but also from any programming language. It loads and preprocesses an input image and the corresponding mask, it calculates radiomic feature values and stores them in a user-defined output file. Furthermore, it stores the used preprocessing and feature calculation information in a separate file so that the user can easily track which settings were used for feature calculation. The workflow of RaCaT is illustrated in Fig 1. code of RaCaT is available and can be downloaded, modified if required, and be built from source. A precise description of the building process and which requirements have to be met can be found on GitHub (https://github.com/ellipfaehlerUMCG/RaCat).

Implementation of the calculator
The implementation of RaCaT is highly modularized and therefore easily extendable. It consists of two basic classes and several feature group classes. The basic classes are used for reading and storing the information which is later passed to the feature group classes: One basic class reads and stores the parameters given in configuration files, necessary for image preprocessing, while the second class reads and preprocesses image and mask and stores the important image characteristics. All image preprocessing steps are implemented using the library Insight Toolkit (ITK) [13]. The following steps are implemented: 1. Image interpolation: The user can choose if the image should be interpolated using 3D or a slice-by-slice 2D interpolation. For both options, the image can be up or down sampled or it can be interpolated to isotropic voxels with 2 mm voxel size. The required interpolation algorithm can be set by the user (possibilities: trilinear, cubic spline or nearest neighbor interpolation).

Image discretization:
Before the calculation of textural features, the image is usually discretized. Two discretization methods are implemented: a discretization with a fixed number of bins and a discretization with a fixed bin width. The number of bins as well as the bin width can be set by the user. Furthermore, the option to discretize the feature group intensity volume histogram separately is implemented.

Re-segmentation:
In order to only include intensity values of a certain range in the volume-of-interest (VOI), the user can set a maximum and minimum intensity value that should be included in the VOI. Furthermore, RaCaT also supports the option to exclude outlier intensities of the VOI.
Every feature group is realized with a separate class. If classes share feature calculations, the classes inherit from each other, but every feature group is independent and can be calculated separately. Every feature is calculated with a separate function. Therefore, additional feature calculations can be added easily. RaCaT is published under the BSD 3-Clause "New" or "Revised" License, what means that users do not have to submit their changes to the RaCaT repository, but that they have to mention the copyright of RaCaT when redistributing the code. Fig 2 displays as an example the implementation of the NGTDM feature class. The attributes of the class are the NGTDM features. For each of these features, a separate function is implemented assigning the feature value to the attribute. Furthermore, the class contains one separate method for the calculation of the NGTD matrix, as well as functions to fill and store the output files. All NGTDM feature groups inherit attributes and feature calculation functions from this class. While every NGTDM feature group calculates a different NGTD matrix and has separate functions to fill and write the output files. All other textural feature classes are implemented in the same way. A more detailed documentation of the code is available on GitHub.

Documentation
The documentation is split in two parts: One part is written for users who are only interested in the use of the calculator. The second part explains more detailed the programming steps, lists classes and functions, and explains the heritages of the feature classes. Additionally, the code contains more comments which are not visible in the documentation.

Usage of the calculator
In order to run RaCaT, some essential files have to be provided to the software which are described in more detail below. The locations of these files have to be given as parameter to the executable, accompanied by specific abbreviations. All required files including the abbreviations are listed in Table 1. Fig 3 illustrates the steps which have to be performed before the feature calculation starts: • First, a configuration file has to be modified: here the desired preprocessing steps can be specified. If the same preprocessing steps are used for several images, the configuration file can be reused and has to be changed only once in the beginning. • Second, if the input image is a PET image, also a patient information file has to be provided. This patient information file contains all important parameters regarding patient demographics and PET study information required to apply scaling of image intensities (activity concentration in Bq/mL) to SUV.
• Third, the user can optionally select only certain features for calculation. He can do this by adapting a feature output definition file.
Examples of frequently used configuration and feature output definition files as well as a patient information file can be found on GitHub. Furthermore, example commands how to call the executable with different image types can be found in the supplemental (S1 Fig).

Feature calculation
RaCaT contains ten feature groups: morphological features providing information about tumor shape, a group of first-order statistical features, statistical intensity histogram features, intensity volume features and local intensity features. Furthermore, the following textural feature groups are implemented: grey-level co-occurrence matrices (GLCM) [14], grey-level runlength matrices (GLRLM) [15], grey-level size zone matrices (GLSZM) [16], grey-level distance zone matrices (GLDZM) [17], neighborhood-grey-tone difference matrices (NGTDM) [2] and neighborhood-grey-level dependence matrices (NGLDM) [18] (see Table 2). First-order, morphological and local intensity features are calculated before discretization. All other feature groups are calculated after image discretization. All textural features can be calculated slice by slice (2D) and by including the whole volume (3D). For both dimensions, different ways to merge texture matrices and features are implemented. This includes the following options: • For each 2D directional matrix, features are calculated and then averaged over the 2D directions and slices • 2D directional matrices are first merged per slice, then features are extracted from this matrix • The 2D directional matrices are merged per direction and then the average of each direction matrix is calculated. From this matrix, features are extracted.
• Before feature calculation, all 2D directional matrices are merged.
• Features are extracted from each 3D directional matrix. These features are averaged over directions.

Required input files
Image and VOI. An image and a corresponding image mask (or VOI) are required as input for the calculator. Mask and image should be aligned and have the same dimensions. The mask can either be provided as binary mask with any constant value marking the VOI (usually 1) or the VOI can be marked by intensity values of a certain range. In this case, the user can set the threshold value up to which percentage of the maximum value the voxels should be included in the mask. This can be done by changing the parameter ThresholdFor-VOI in the configuration file. Mask and image can be given in one of the following formats: nrrd, nifti, DICOM, analyze, as well as raw data. The mask can also be given as a radiotherapy (RT)-struct. If the mask is given as RT-struct, the command to call the executable is slightly different from the call used for the other formats (see Table 1). If image or mask are in DICOM format, it is important that every DICOM image series is stored in a separate folder. The name of this folder has then given to the executable (compare also with the example commands provided in the supplemental S1 Fig). In one run, RaCaT calculates the radiomic features for one image and mask. It is not possible to calculate radiomic features for several images at once. However, an example of a Python script, calling RaCaT for several images and masks is available in GitHub material.
Configuration file. In a config.ini file, the user can select the preprocessing steps that are performed before the feature calculation starts. An example for a config.ini file is displayed in Fig 4. More examples of the config.ini file including the most common used preprocessing steps, can be found in GitHub. If the user wants to calculate radiomic features for several bin width or number of bins, a separate configuration file has to be created for every configuration. As this can be time consuming, a python script how to create several configuration files with different number of bins as well as a script calling the RaCaT executable several times with different configuration files is available.
Additional file for PET images. If the image is a PET image, the program converts the intensity values from Bq/ml to SUV (Standardized Uptake Value) or SUL (Standardized Uptake value normalized to lean body mass). Here fore, some patient characteristics (weight, height, gender) as well as the net injected activity and injection time are required. Furthermore, the user has the possibility to set a scaling parameter. If this scaling parameter is set, all other values are ignored and every image intensity value is simply multiplied with this scaling parameter.
Feature output definition file. Furthermore, the user has the option to select only certain feature groups he wants to include in the calculation. This can be done in a separate file called featureOutputDefinition.ini. This is optional. The location of the feature output definition file has to be given as parameter to the calculator. If no feature output definition file is given, all available features are calculated. An example of a feature output definition file is displayed in Fig 5. Output files. The calculated feature values are stored with floating point precision in one or more comma-separated-value (csv) files. The feature names which are listed in the output files are the names proposed by the IBSI standard. To ease a further documentation, two additional output files are created: The first output is a copy of the used configuration file so that the user can easily access which preprocessing steps were included in the feature calculation. The second output file contains information about the input images and calculated feature groups. The filenames of all output files are aligned so that the user can easily track which output files are belonging to one calculation step.

Testing
To ensure that the toolbox calculates values compliant with the standard, the calculated feature values were compared with the IBSI standard. The initiative provides two phantoms that can be used for comparison: one small, artificial mathematical phantom and a CT-image with a corresponding RT-struct of the VOI. For the CT-image, several configurations with variations in discretization, resegmentation, and interpolation method are available for comparison. In order to validate RaCaT, both phantoms have been used for comparison. The calculated feature values, as well as the corresponding IBSI values are listed in supplemental S1 Table, S2 Table, and S3 Table. For every feature value, IBSI provides tolerance levels depending on the used configuration. As can be seen, almost all feature values are in the provided tolerance levels. Only for morphological features, small deviations were found. Here, the volume differed from 0.2% -1% from the volume given by the IBSI standard, while the surface had a deviation from 2%-10%. Therefore, all morphological features which are dependent of surface and volume also show slight deviations.
Therefore, morphological feature values were further compared with values obtained from a phantom scan. For this purpose, a positron emission tomography combined with computed tomography (PET/CT) scan of the NEMA image quality phantom was acquired on a Siemens Biograph mCT64 (Siemens Healthcare, Knoxville, USA) (see Fig 6). The NEMA image quality phantom consists of six spheres with diameters 37, 28, 21, 17, 13, and 10 mm which are placed in a large background compartment. Spheres were filled with a fluorodeoxyglucose (FDG) activity solution of 19.76 kBq/ml, while the background was filled with 1.94 kBq/ml, so that a sphere-to-background ratio of around 10:1 was obtained. The image was reconstructed to a voxel size of 3.1819 x 3.1819 x 2 mm using the vendor provided PSF+TOF reconstruction method with three iterations and 21 subsets (PSFTOF 3i21s). The spheres were manually delineated in the images by placing a sphere with the exact diameter on the right position in the images. Consequently, the correct shape feature values are known and can be used for comparison with the feature values calculated by RaCaT. The expected and calculated feature values are listed in Table 3. The comparison showed that for the bigger spheres (diameter 37-17 mm) the percentage deviation between calculated and expected shape feature values differs from 1-10%, with 93.75% of the features showing a deviation less than 5% (see Table 3). For the smaller spheres (13 mm and 10 mm), the deviation increased to 1-19%.

Application to clinical data
Moreover, radiomic features were extracted from two PET-images of cancer patients. Both patients were scanned on a Siemens Biograph mCT64, and the images were iteratively reconstructed using the PSF+TOF reconstruction method (PSF+TOF 3i21s) implemented in the scanner and a post-reconstruction smoothing with a 6.5 mm full-width-at-half-maximum Gaussian kernel. Images were reconstructed to a voxel size of 3.1819 mm x 3.1819 mm x 2 mm. Patient 1 was injected with 245 MBq 85 minutes before scan start, while patient 2 was injected with 229 MBq 60 minutes before scan start. Maximum intensity projection images of both patients are displayed in Fig 7. Tumors were manually delineated by an experienced radiologist. All implemented features were calculated by RaCaT and are listed in Table 4. As can be seen, feature values are changing as a function of the tumors.

Discussion
We developed a radiomics calculator that is easy to use and can be called from any programming language. It includes the most frequently used preprocessing steps and complies with the IBSI standards. It can handle several input image formats as well as different VOI types. The created output files are organized in a way that eases further processing of the feature values. In this way, the calculator can be included easily in any radiomics pipeline and the results can be used for further analysis. Furthermore, all preprocessing steps are reported, so that a valid documentation of the performed preprocessing steps can easily be extracted from the output files.            To make radiomic studies comparable across studies and institutions, it is essential that the different radiomic software packages calculate the same feature values for every defined feature. Therefore, the standardization of feature definitions and calculations is essential [19,20]. IBSI provides benchmark feature definitions and feature values extracted from phantom scans. As RaCaT follows these definitions and calculates feature values in compliance with these standard, it could be used to standardize other software packages.
Some small deviations were found in the calculation of the morphological features when compared with the IBSI standard. These deviations include the calculated volume and surface of the object and therefore all features depending on these two values. These deviations are due to a different implementation of the 3D presentation of the image mask. Also when comparing the morphological features extracted from the spheres of the NEMA image quality phantom, the deviations between ideal and calculated volume were in the majority of the cases small. Only for the smaller spheres, the deviation increased. This increase in deviation is likely more due to the partial volume effect than to mistakes in the implementation. The partial volume effect has especially an impact on smaller objects.
One limitation of RaCaT is that it does not provide any Graphical User Interface or automatic algorithm to perform segmentation tasks. It calculates radiomic features from previous performed segmentations. Moreover, after feature calculation, it provides also no further processing of the calculated features. I.e. no machine or deep learning algorithm are implemented and RaCaT can therefore not directly be used to build predictive models. However, as it can be called by any programming language, it can easily be included in any machine or deep learning script.

Further development
The following additional features will be implemented in further releases:

Additional discretization methods
Many other ways for image discretization have been proposed. Among them intensity histogram equalization and the Lloyd-max algorithm [21]. The next release will include both discretization methods.

Read several DICOM series stored in one folder
When images are extracted from the scanner, different image series are often stored in one folder. For a future release, it will be possible to read a folder containing several DICOM image series and the program will calculate for every DICOM image series the features separately.

Several tumors in one mask
Up to now, the calculator can only handle masks that come with one marked VOI. For future releases, it will be possible that more than one VOI can be marked in one mask and the feature values of the different VOIs will be calculated separately.

Additional distances for the calculation of textural matrices
In the current version of RaCaT only distance 1 is used for the calculation of textural matrices as this is the common distance for calculations. In future releases, also other distances can be set by the user.

Additional output formats
Up to now, the output is only available as .csv file. It is planned that an output in ontology format [22] is also available.

Conclusion
We implemented and tested successfully RaCaT, an easy to use Radiomics calculator that can be included in any programming language or used from the command line. The calculated features are meeting the IBSI standards. The calculator is ready to use without requiring any programming skills, but can also be downloaded, built from source and extended if needed. As the implementation of the calculator is highly modularized, it is easily extendable. A documentation including the description of how to use the calculator as well as a more extensive description of the programming concepts, can be found on GitHub.