The shape language in application to the diagnosis of cervical vertebrae pathology

In this paper the possibility of classification of X-ray images of the cervical vertebrae is studied. The images should be classified into one of the following classes—the images of healthy vertebrae and the images of vertebrae with syndesmophytes. The vertebra contours, described unambiguously by using the generalized shape language, are the basis of the analysis. As a result, the contour is represented as a chain of sinquads that determine switches. The found switches are the characteristic points of the analyzed contour. In these points additional features of the contour are determined. On the basis of these features two aforementioned classes of images are defined as fuzzy sets. Such an approach allows us to create a hierarchical algorithm of classification based on the syntactic and fuzzy description of the contour.


Introduction
In recent years the number of medical examinations that rely on analyzing X-ray images of bones has increased rapidly [1]. In general, the images are analyzed in two respects: bone density and bone structure. The width of joints is analyzed as well, and this topic has been studied relatively thoroughly [2][3][4][5][6]. Both aspects of bone image analysis provide crucial information about pathological changes and, as a consequence, play an important role in diagnosis and in assessing disease progress (see the next section for more details). Because of the large number of medical X-ray images, methods for their automatic analysis are being sought intensively. In particular, the following topics are studied:
• context-based retrieval of medical images [1,7,8],
• automatic localization of cervical vertebrae [9,10],
• analysis of contours of finger bones [11][12][13][14][15][16][17][18],
• application of image languages to analysis of radiological palm images [19,20],
• shape representation based on statistical methods [21].
The syntactic analysis of bone contours is among the methods based on geometrical features of the examined object. Syntactic pattern analysis, sometimes called syntactic pattern recognition, and syntactic scene analysis are performed on the basis of a formal, structural representation of the studied object or scene [22,23]. In syntactic object analysis, the simplest elements from which the object is constructed (so-called primitives) and the structural relations between primitives are studied, whereas in scene analysis the spatial relations between objects are analyzed as well. In the analysis of medical X-ray images, bone structures are analyzed both at the level of a single pattern and at the level of the scene.
In the first case, the shape, usually the contour of a bone, is examined, whereas in the second case the spatial relations among the bones are investigated. Syntactic methods are used in both cases: string languages are used for contour analysis [12][13][14][24], whereas graph languages are applied to the analysis of anatomical structures constituted by a group of bones, for instance in a hand [19,20]. Since biological structures are irregular, the syntactic approach is frequently aided by fuzzy methods. The prospects for assessing disease progress are discussed as well. The proposed syntactic approach, based on the shape language [25,26], is combined with fuzzy methods. As a result, a hierarchical analysis is proposed.
Syntactic methods have their own specific nature. A crucial property is that methods of this kind are extremely sensitive to pattern distortions. Therefore, the analyzed patterns, the bone contours in the considered case, should be obtained from carefully preprocessed images. Effective preprocessing of medical X-ray images is difficult and the achieved results are far from satisfactory [15]. On the one hand, muscles, bones, cartilage, and tendons have various coefficients of X-ray absorption. On the other hand, they overlap one another in a complex way. Therefore, preprocessing works well with soft-tissue objects but poorly with bones [27]. This results in false bone edges and discontinuities of contours [28]. That, in turn, makes changes such as erosions, osteophytes and syndesmophytes difficult to detect in the early stages of development. Furthermore, it is reported that the contrast of digitized spine X-ray images is low and, as a result, the image quality is poor [1,7,8]. These problems make the preprocessing of X-ray images a challenging task. The images used in the studies described in this paper were preprocessed by using the Statistical Dominance Algorithm (SDA), which is dedicated to preprocessing medical images [29,30]. Application of the algorithm yielded contours of sufficient quality to serve as the basis for the proposed approach.

Clinical background
Let us present the clinical motivation for the presented studies [31,32]. Spondyloarthritis (SpA) represents the second most prevalent group of inflammatory rheumatic diseases (ca. 2% in Caucasians) and is characterized by chronic inflammation and structural damage involving the axial and peripheral skeleton. SpA in adults comprises several diseases, i.e. ankylosing spondylitis, psoriatic arthritis, reactive arthritis, arthritis in inflammatory bowel diseases and undifferentiated spondyloarthritis. All these diseases share similar axial (sacroiliitis, spondylitis) or peripheral (arthritis, enthesitis, dactylitis) manifestations. The disease is a significant burden both for the health system and for an individual's quality of life because the patients suffer several unfavorable consequences of chronic inflammation and structural damage to the skeleton. Formation of syndesmophytes in the vertebral bodies of the spine (see Fig 1) is the key issue in structural damage in SpA. Syndesmophytosis is also referred to as osteogenesis or osteoproliferation and, from a pathophysiologic point of view, it is new bone formation. Therefore, in SpA the interaction between chronic inflammation and bone tissue results in new bone formation that is responsible for the remodeling of the spine, which becomes stiff and inflexible. The remodeling of the spine is associated with decreased quality of life, affecting daily activities, employment, family life and leisure time. The imaging assessment of growing and established syndesmophytes is of supreme importance for planning lifelong therapy. Fast-progressing syndesmophytes require complex and aggressive therapy, including non-steroidal anti-inflammatory drugs and biologic agents, e.g. Tumour Necrosis Factor inhibitors, which attenuate the chronic inflammation and may inhibit or reduce the osteogenesis.
Different scoring methods based on X-ray imaging have been developed for assessing structural damage in SpA, mainly the formation of syndesmophytes. Currently, the modified Stoke Ankylosing Spondylitis Spine Score (mSASSS) is the only system that has proved to be reliable and sensitive to change and, therefore, it is preferred in clinical practice both for detecting the disease and for following its progression. Nevertheless, the minimal time interval needed to reveal the least significant change in the progression of structural damage has been established as two years, which seems to be an excessively long period for treatment decision making, including monitoring of the effectiveness of the treatment. In other words, continuation, modification or discontinuation of expensive therapies for significant numbers of SpA patients should be based on a response-to-treatment assessment that answers the question of whether the therapy inhibits new bone progression. That is why new methods of osteogenesis follow-up are of significant importance, particularly if they could shorten the two-year interval of the mSASSS assessment.
PLOS ONE | https://doi.org/10.1371/journal.pone.0204546 October 11, 2018

The state of the art
Computer methods for investigating X-ray images that are based on the analysis of the shape of anatomical structures are not widely used because of their complexity and sensitivity to distortions. In this section some examples of approaches to the analysis of bone contours and of the related problems in digital X-ray images are briefly recalled.
In the papers [9,33] and [10] the problem of automatic localization of cervical vertebrae was considered. The presented approach is based on the generalized Hough transform, introduced in the paper [34] in order to detect curves which cannot be described by an analytic formula, as is required in the (simple) Hough transform. As a result, an arbitrary object in the image can be recognized only if it is encoded by its model, which also represents the variability of the shape of the recognized object. This approach is invariant to scale and rotation as well as noise.
An important research stream concerns the description of shape in the context of retrieval of medical images from databases. The papers [7,8] are examples of such studies. In the presented approach the vertebra contours are encoded by using polygon approximation. In this method, the description of the shape of the analyzed contour is based on three mechanisms. The first one is simplifying the shape description by keeping only the relevant features, which is achieved by selecting the points that contribute most to the shape. The second one is representing the contour in the tangent space by using the so-called turn function. The third one is a similarity measure, which is used to determine the similarity of contours described by their turn functions. These investigations were a significant contribution to shape-based retrieval techniques for biomedical images. There are a few other methods of shape description intended for retrieval from content-based databases. Let us mention them very briefly. Some of them use shape properties such as elongation, perimeter, convexity and orientation, whereas others are based on invariant moments or on multi-scale shape representation (see [8,35] and the references given there). Another approach to shape description, proposed in [21], is based on statistical methods. The aim of those studies is the creation of a database of qualitative anatomical features derived from the images and based on image characteristics. The model shape is created as the mean value of the sample shapes. A grey-scale profile is the second component on which the method is based. In that paper, the method is used to describe the shapes of vertebrae. The authors connect the problem of automatic localization of vertebrae in the X-ray image with the problem of automatic retrieval from medical databases.
It should be stressed that the problems discussed above, i.e. shape description for creating and searching content-based databases of X-ray images and automatic localization of anatomical structures in X-ray images, though related to the topic considered in this paper, are different from it. In this paper, we focus on analyzing a single vertebra contour in order not only to detect pathological changes in bones, such as osteophytes and syndesmophytes, but also to create a tool which allows us to assess the progress of the disease. The authors intend to create a software tool that compares the pathological changes in a given vertebra by using a series of X-ray images taken, let us say, every half year for a given patient. This will allow the physician to assess the speed of the changes that take place in the spine. Such an assessment is crucial for judging whether the applied therapy is effective. The most closely related problem is considered in [1]. The method is based on the aforementioned polygon approximation approach. In the context of detecting pathological bone changes, only those fragments of the vertebra contour where the pathologies can occur are described. The contour has to be described in a maximally compressed way because the detection of pathological changes is based on searching databases in which bone contours with the considered pathologies are stored. The method described in this paper (see Section Cervical vertebrae contours analysis) is based on syntactic analysis of the contour. The analysis is combined with fuzzy inference. Such an approach allows us to describe and analyze very precisely all types of pathologies which manifest in the vertebra contour.

The generalized shape language
The shape language, introduced by Jakubowski [36] as a syntactic tool for the analysis of contours, was used by us as a theoretical starting point. In the shape language, sixteen primitives are defined: eight line segments and eight circle quadrants, denoted as $s_{ij}$, $i, j \in \{1, 2, 3, 4\}$ (see the corresponding figure). The analyzed contour is divided into primitives. Then, the connected strings of primitives are described in terms of the shape language characterizations. For instance, a connected fragment that consists of primitives belonging to the same quadrant of the Cartesian plane constitutes a sinquad. The transitions between sinquads clearly define the characteristic points of the analyzed contour. Two different neighbouring sinquads constitute a biquad.
The key encodes the transitions between successive sinquads, which means that the key describes a sequence of biquads. Thus, the characteristics of the analyzed contour are described by the key. This sequence is analyzed in order to classify the recognized contours according to their geometric features. The described approach turned out to be effective in applications to manufacturing [25,37,38]. Jakubowski's approach, however, turned out not to be a proper tool for bone contour analysis. First of all, bone contours are irregular; in particular, the arcs have variable curvature and, as a consequence, they cannot be segmented into the primitives $s_{ij}$. Therefore, in [13,14] and [39], a substantial generalization was introduced. Namely, the primitives are defined as the equivalence classes of an equivalence relation. The relation is defined on the set of all smooth curves whose local geometric properties are the same at each point. Let us assume that the analyzed contour is sufficiently smooth, which means that at each of its points, except at most the points at which the primitives join, the tangent line exists and the convexity is determined. This means that at each point the first and the second derivative of the contour, considered locally as the graph of a function, can be calculated, apart from the cases in which the denominator vanishes. The properties at the point $u$ are described by a four-component vector $c(u) = [c_t(u), c_c(u), c_x(u), c_y(u)]$. The first two components, $c_t$ and $c_c$, encode information about the first and the second derivative, respectively. They are equal to "+", "-" or "0" if the value of the derivative is positive, negative or equal to zero, respectively. If the denominator of the derivative vanishes at the point $u$, then the corresponding component of the vector $c(u)$ is encoded as "V".
The components $c_x$ and $c_y$ encode information about the increment of the X and Y coordinate along the curve, respectively. Each of these components can be positive, negative or equal to zero, which is encoded as "+", "-" and "0", respectively. Each maximal fragment of the contour which has the same four characteristics, calculated numerically, at each of its points is treated as a primitive, let us say $p_{ij}$, $i, j \in \{1, 2, 3, 4\}$. Thus, each $p_{ij}$ can be treated as the equivalence class to which all curve segments with the same characteristics belong. The index $i$ corresponds to the geometrical features of the primitives, whereas the index $j$ corresponds to the number of a quadrant of the Cartesian plane. It turns out that there exist sixteen equivalence classes (see [14]), and the bi-index of the primitives $p$ is defined in such a way that $s_{ij} \in p_{ij}$ for each $i, j \in \{1, 2, 3, 4\}$. Furthermore, the way the contours are analyzed remains the same as in Jakubowski's original method although the primitives are substantially generalized. An example of the encoding of a contour is given in the corresponding figure.
It should be stressed, however, that even the generalized shape language turns out to be insufficient for classification of the analyzed cases because of the existence of various types of pathological changes. Therefore, hierarchical classification and fuzzy sets are introduced as further improvements of the method. A tree structure of the classification is proposed (see the next section) in the context of the analysis of cervical vertebra contours. In some nodes of the tree the generalized shape language is used, whereas in the others a fuzzy inference algorithm is used.
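The four-component description $c(u)$ and the division of a contour into maximal fragments with constant characteristics can be illustrated by a short sketch. It is only an illustration under stated assumptions and not the implementation used in the cited papers: the contour is assumed to be given as an ordered list of (x, y) points, the derivatives are approximated by central differences, the tolerance `eps` and the function names are hypothetical, and the mapping from a vector to the bi-index of $p_{ij}$ (given in [14]) is not reproduced. The minimal length of a primitive is not enforced either.

```python
from itertools import groupby

def sign_symbol(value, eps=1e-9):
    """Encode a number as "+", "-" or "0" (eps is a hypothetical tolerance)."""
    if value > eps:
        return "+"
    if value < -eps:
        return "-"
    return "0"

def point_vector(points, k, step=5):
    """Approximate c(u) = (c_t, c_c, c_x, c_y) at points[k] by central differences."""
    (x0, y0), (x1, y1), (x2, y2) = points[k - step], points[k], points[k + step]
    dx = (x2 - x0) / (2.0 * step)          # increment of X along the curve
    dy = (y2 - y0) / (2.0 * step)          # increment of Y along the curve
    ddx = (x2 - 2 * x1 + x0) / step ** 2   # second differences
    ddy = (y2 - 2 * y1 + y0) / step ** 2
    c_x, c_y = sign_symbol(dx), sign_symbol(dy)
    if c_x == "0":
        # dy/dx and d2y/dx2 have a vanishing denominator: encode as "V"
        c_t = c_c = "V"
    else:
        c_t = sign_symbol(dy / dx)
        # second derivative of y with respect to x for a parametric curve
        c_c = sign_symbol((dx * ddy - dy * ddx) / dx ** 3)
    return (c_t, c_c, c_x, c_y)

def segment(points, step=5):
    """Split the contour into maximal runs of points sharing the same vector."""
    vectors = [point_vector(points, k, step)
               for k in range(step, len(points) - step)]
    return [(vec, len(list(run))) for vec, run in groupby(vectors)]

# Example: samples of the parabola y = (x - 10)^2
parabola = [(x, (x - 10) ** 2) for x in range(21)]
for vector, length in segment(parabola):
    print(vector, length)
```

For the sampled parabola the sketch yields a decreasing convex run, a single point at the vertex where the first derivative is zero, and an increasing convex run.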

Cervical vertebrae contours analysis
The analyzed vertebra contours were obtained from X-ray images by using the SDA preprocessing method [30], which is dedicated to preprocessing medical X-ray images. As mentioned above, the approach proposed in this paper is a syntactic one and it differs significantly from the methods used by other authors for describing vertebra contours (see the state of the art section). The formalism introduced in the previous section was applied to the description and then to the recognition of pathological changes of cervical vertebrae. The data set was acquired from the University Hospital in Kraków, Poland. The study protocol was designed according to the guidelines of the Declaration of Helsinki and the Good Clinical Practice Declaration Statement. Special care was taken regarding personal data safety: all images were anonymized before processing. Informed consent for the publication of anonymized clinical images was obtained from the Scientific Committee of the Department of Diagnostic Imaging. As the current study is retrospective, a consent form for participants was omitted. The data set contained 166 examples of vertebrae, 33 of which were diagnosed as affected by syndesmophytes. In the experiment six vertebrae visible on the X-ray images, denoted by $K_0, K_1, K_2, K_3, K_4, K_5$, were analyzed (see Fig 4). Since they differ in their anatomical structure, each $K_i$ was treated as a separate set. In the first stage of the analysis, the obtained contours of the vertebrae were described by the primitives $p_{ij}$. This description allowed us to divide a given contour into sinquads, which represent the fragments that belong to the same quadrant of the Cartesian plane. The transitions between sinquads clearly defined the characteristic points of the analyzed contour. In the case of cervical vertebrae these points, called switches, were essential for the shape of the contour (see Fig 5).
To obtain a contour description by using the proposed primitives, the values of the components of the vector $c$ were calculated. The calculations were carried out on the basis of several neighboring points. The number of points was established experimentally so that small lesions in the outline could be noticed. As a result, the step length for the numerical calculation of the first and the second derivative was set to 5 pixels [39]. The accepted step was also equal to the minimal length of a primitive. Next, the description of contours by primitives was transformed, according to [13], into keys. For the analyzed contour, a key denotes a sequence of biquads, which are consecutive transitions between sinquads. The obtained keys form equivalence classes that in some cases are sufficient for recognition, which means that they coincide with the expected classification of the initial contour set. If the obtained equivalence classes do not distinguish the analyzed cases correctly, a fuzzy analysis based on additional features of biquads is introduced. For cervical vertebrae, the obtained keys are presented in Table 1.
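The passage from a string of primitives to sinquads, biquads and a key can be sketched as follows. This is a minimal sketch under stated assumptions rather than the encoding defined in [13]: a primitive $p_{ij}$ is represented by the pair `(i, j)`, the contour is treated as an open string (the transition from the last sinquad back to the first one is not considered), a key is represented simply as a tuple of quadrant pairs, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import groupby

def sinquads(primitives):
    """Group consecutive primitives (i, j) that share the quadrant index j."""
    return [list(run) for _, run in groupby(primitives, key=lambda p: p[1])]

def contour_key(primitives):
    """Return the sequence of biquads as pairs of neighbouring quadrant indices."""
    quadrants = [run[0][1] for run in sinquads(primitives)]
    return tuple(zip(quadrants, quadrants[1:]))

def classes_by_key(contours):
    """Collect contour names into classes of contours that share the same key."""
    classes = defaultdict(list)
    for name, primitives in contours.items():
        classes[contour_key(primitives)].append(name)
    return dict(classes)

# Hypothetical strings of primitives, for illustration only
contour_a = [(1, 1), (2, 1), (1, 2), (3, 2), (1, 3)]
contour_b = [(4, 1), (2, 2), (2, 3), (3, 3)]
print(contour_key(contour_a))   # ((1, 2), (2, 3))
print(classes_by_key({"a": contour_a, "b": contour_b}))
```

Both hypothetical contours pass through the same sequence of quadrants, so they fall into the same class although their primitives differ. This is the situation in which additional features of biquads are needed to separate the cases.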
The description of vertebrae by strings of biquads allowed us to distinguish the healthy ones (see the corresponding figures). By early changes, it is understood that a given contour is still described by a typical string of biquads but can be distinguished from the healthy ones by the features of its biquads. Thus, the equivalence class of vertebrae with serious pathological changes does not contain subclasses. Nevertheless, the equivalence class of the healthy vertebrae and the ones with early pathological changes consists of two subclasses (see Table 1). Therefore, the fuzzy analysis has to be applied only to this case. The diagram of the proposed method is presented in Fig 9.
Modifiers for the obtained membership functions were calculated in the way described in [40]. According to it, the area under each graph of a membership function was divided into one area for $n \in [0, 1]$ and the areas
$$S_{m_{true}} = \frac{(b - a)(n - 1)}{n} \quad \text{for } n \in [2, +\infty) \qquad \text{and} \qquad S_t = \frac{(b - a)\,n}{4} \quad \text{for } n \in [1, 2].$$
The value $n$ corresponds to the modifier $m$. The interval $[2, +\infty)$ characterizes the terms with a linguistic truth value greater than or equal to true, whereas $[0, 1]$ characterizes the terms with a linguistic truth value less than or equal to false. The interval $(1, 2)$ allows all the possible variants between true and false to be expressed. Thus, for the membership function $m^0_{syn}$, which identified vertebrae with syndesmophytes in the class $K_0$, the areas were calculated, which yielded rules such as the following: if $0.29 \geq m^0_{syn}$, then a vertebra does not belong to the class with syndesmophytes.
In turn, for the membership function $m^0_{non\,syn}(a)$, which identified the healthy vertebrae, the areas under the function, including the one that defines the label true, were calculated in the same way, and the corresponding rules were obtained from them. The fuzzy rules for the other classes were created analogously. The classification results for some vertebrae are presented in Table 2, in which the symbol x means that a vertebra is not visible in the X-ray image.
If the value of a membership function is equal or close to 1, then a given vertebra fully belongs to the specified class. If the value is close to 0, then it does not belong to this class. In all the cases, the obtained results coincide with the classification made by an expert but, additionally, information about the diversity within both considered classes was obtained. In the case of the healthy class, this diversity was small and resulted from anatomical differences. In the case of the class with syndesmophytes, the diversity was larger and resulted from the size of the pathological changes. Fig 12 shows three examples of vertebrae from the set $K_1$ with different values of the membership function $m^1_{syn}$. The aim of this paper was to show that the proposed method allows us to classify all vertebrae correctly. If we adopt the simplest rule, namely that a given vertebra belongs to the class for which it has the greatest value of the membership function, then we get 100% correctness. Of course, the varying degree of membership in the class with syndesmophytes is very important for the possibility of assessing the progress of a disease.
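The simplest decision rule mentioned above, i.e. assigning a vertebra to the class for which its membership function has the greatest value, can be sketched in a few lines. The class labels and the membership values below are hypothetical and serve only as an illustration; they are not taken from Table 2.

```python
def classify(memberships):
    """Return the label of the class with the greatest membership value.

    `memberships` maps a class label to a membership value in [0, 1].
    In the case of a tie, the label that appears first is returned.
    """
    return max(memberships, key=memberships.get)

# Hypothetical membership values for two vertebrae
vertebra_1 = {"syn": 0.82, "non_syn": 0.11}
vertebra_2 = {"syn": 0.05, "non_syn": 0.97}
print(classify(vertebra_1))   # syn
print(classify(vertebra_2))   # non_syn
```

The rule discards the membership value itself. For assessing the progress of a disease, the value for the class with syndesmophytes would have to be kept alongside the label.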

Concluding remarks
The method of hierarchical analysis of vertebra contours presented in this paper is based on syntactic and fuzzy pattern analysis. The method is analogous to the one applied by the authors to the analysis of contours of finger bones (see [11,12,15] and, first of all, [13]). Though it is related to several streams of studies concerning the analysis of bone contours, including the contours of vertebrae, the approach proposed in this paper rests on a different basis and is novel. The method achieved 100% accuracy provided that each pattern was assigned to the class for which it had the greater value of the membership function. This means that every pattern was classified as a healthy bone or a bone with pathological changes in the same way by the algorithm and by an expert. As mentioned in the introduction, the proposed method is considered not only in the context of detecting pathological changes in bones but also in the context of the possibility of assessing the disease progress. The fact that the value of a membership function depends on the size of a pathological change is a good starting point for creating a tool which allows us to draw inferences about the progress of the disease. This is planned to be the topic of our future work. It should also be mentioned that the proposed method is not limited to medical applications. It can potentially be effective in any problem in which the analysis of a contour according to its shape properties is one of the key tasks. Scene analysis by the cognitive vision module of autonomous robots is an example of such a problem [41][42][43]. In the mentioned papers the industrial scene is represented by using only polygonal shapes. It should be stressed, however, that such a representation can be insufficient, for instance, in oriental cities, where many domes exist. In such cases both curvilinear and straight line segments are necessary to represent the objects and, for this reason, the proposed approach can be applied in full. Contour representation and analysis in the context of scene analysis is also examined for solving superimposition problems for aerial photographs in onboard computer vision systems [44,45]. In this problem the contours of natural objects, such as river banks, and the contours of roads are analyzed in order to work out correction methods for navigation parameters. Landscape representation and analysis for autonomous robot navigation is also studied in the context of representing the contours of obstacles [46].
The proposed method is potentially applicable to this task as well.