Electrocardiogram ST-Segment Morphology Delineation Method Using Orthogonal Transformations

Differentiation between ischaemic and non-ischaemic transient ST segment events of long term ambulatory electrocardiograms is a persisting weakness in present ischaemia detection systems. Traditional ST segment level measurement is not a sufficiently precise technique due to the single point of measurement and the severe noise which is often present. We developed a robust, noise resistant, orthogonal-transformation based delineation method, which allows tracing the shape of transient ST segment morphology changes from the entire ST segment in terms of diagnostic and morphologic feature-vector time series, and also allows further analysis. For these purposes, we developed a new Legendre Polynomials based Transformation (LPT) of ST segment. Its basis functions have similar shapes to typical transient changes of ST segment morphology categories during myocardial ischaemia (level, slope and scooping), thus providing direct insight into the types of time domain morphology changes through the LPT feature-vector space. We also generated new Karhunen and Loève Transformation (KLT) ST segment basis functions using a robust covariance matrix constructed from the ST segment pattern vectors derived from the Long Term ST Database (LTST DB). As for the delineation of significant transient ischaemic and non-ischaemic ST segment episodes, we present a study on the representation of transient ST segment morphology categories, and an evaluation study on the classification power of the KLT- and LPT-based feature vectors to classify between ischaemic and non-ischaemic ST segment episodes of the LTST DB. Classification accuracy using the KLT and LPT feature vectors was 90% and 82%, respectively, when using the k-Nearest Neighbors (k = 3) classifier and 10-fold cross-validation. New sets of feature-vector time series for both transformations were derived for the records of the LTST DB, which is freely available on the PhysioNet website, and were contributed to the LTST DB.
The KLT and LPT present new possibilities for human-expert diagnostics, and for automated ischaemia detection.


Introduction
Ambulatory electrocardiogram (AECG) monitoring of long term electrocardiogram (ECG) records, obtained during the patient's normal daily activities, is important in the assessment of symptomatic and asymptomatic, or "silent", ischaemia which may lead to myocardial infarction and consequently death. Due to the long duration of records (24 hours), which means an enormous amount of data, and due to the possible presence of severe noise, automated procedures for the extraction of diagnostic and morphologic features are becoming very important. A convenient method for the representation and tracking of transient signal-shape changes is the time series of features. Automated analysis is of great help to clinicians in early assessment of cardiac ischaemia severity for the accurate interpretation of relevant clinical results and for proper treatment of the patient. Fig 1 shows two typical data segments of AECG records. A transient ST segment episode compatible with ischaemia is present in the upper data segment (Fig 1A). An increased heart rate and a transient morphology change of the ST segments of heart beats may be observed. The lower data segment (Fig 1B) is an example of severe noise, which is often present in AECG records and causes the main problems during the visual and automatic assessment of the severity of ischaemic ST segment episodes.
The KLT has been used in ECG signal analysis for noise estimation [4], visually identifying acute ischaemic episodes [5], the representation of ECG morphology [4,6], the automated detection of transient ST segment episodes during AECG monitoring [7,8], the analysis of the cardiac repolarization period (ST-T complex) [9][10][11][12], visually identifying and manually annotating the transient ischaemic and non-ischaemic ST segment episodes of the LTST DB [1], and automated ischaemic and non-ischaemic heartbeat classification [13].
The Hermite polynomials were used for estimating ECG wave features [14] and for clustering ECG complexes [15]. The Hermite, Legendre and Chebyshev polynomials were used for filtering and representing ECG morphology [16]. The feasibility of ECG feature extraction and representation of transient ST segment morphology of mice ECG using the Chebyshev-polynomial based transformation was shown [17]. Furthermore, combinations of the KLT, Legendre polynomials and a variety of other ECG features were used for transient ischaemic and non-ischaemic ST episode classification [18][19][20].
Robust methods for parameter extraction for use in intensive care units are becoming important [21]. The severe noise frequently present in AECGs necessitates the development of robust, noise resistant delineation systems to accurately trace transient changes of ST segment morphology during myocardial ischaemia in terms of diagnostic and morphologic feature-vector time series. Robust parameter extraction techniques yield nearly the same performance no matter which records are analyzed.
In this paper, we present a new robust delineation method for ST segment morphology feature extraction, and for transient ST segment morphology-change representation and tracing, in the sense of generating ST segment diagnostic and morphologic feature-vector time series using orthogonal transformations. We present a new approach for shape representation of transient ST segment morphology changes using the orthogonal Legendre polynomials, i.e., the Legendre Polynomial based Transformation (LPT) of ST segment. Furthermore, we develop new ST segment KLT basis functions derived from a robustly generated covariance matrix composed of the ST segment pattern vectors of the entire LTST DB. We then assess the representational power of the KLT- and LPT-based derivation of morphology feature-vector time series through a study on the representation of significant transient ischaemic and non-ischaemic ST segment morphology categories. We also evaluate the classification power of the KLT- and LPT-based feature vectors to distinguish between the ischaemic and non-ischaemic ST segment episodes of the LTST DB. Fig 2 shows the ECG of a normal heartbeat with marked points and intervals to estimate the ST segment diagnostic and morphologic features that are relevant to represent, monitor and characterize transient ischaemic and non-ischaemic ST segment changes. A diagnostic ST segment feature, such as the ST segment level, provides a direct measurement of raw ST segment pattern vectors in time domain at a single point at a fixed location (J + 80(60) ms), while orthogonal transformation-based ST segment morphologic feature vectors utilize information from the entire ST segment, thus providing high representational power in terms of ST segment morphology categories, as well as subtle morphology features, and differentiation between transient ischaemic and non-ischaemic ST segment events.

Methods
The motivation for a new approach using the orthogonal transformation of ST segment based on orthogonal polynomials comes from observing the shapes of the ST segment KLT basis functions [7] obtained from the European Society of Cardiology ST-T Database (ESC DB) [2,22], the standard reference for assessing the quality of AECG analyzers. These basis functions (see Fig 3) span over two ECG leads. We found that the shapes of the KLT basis functions are similar to the three main morphologic categories of ST segment morphology changes. The first and second KLT basis functions have shapes similar to a constant function and thus correspond to the elevation or depression of ST segments. The third and fourth KLT basis functions have shapes similar to a linear function and thus correspond to the slope of ST segments. The fifth (and sixth) KLT basis function has a shape similar to a quadratic function and thus corresponds to the scooping of ST segments. However, for these basis functions there is no natural mapping between the deflections of ST segment KLT feature-vector time series and the actual deflections of ST segment morphology change categories such as depression/elevation, up- and down-sloping, and scooping. Therefore, we decided to devise a set of new orthogonal basis functions that span over a single ECG lead, with similar characteristics to the KLT, and with the further advantage of strict correspondence to elevation/depression, slope change, and scooping, to better delineate the transient shapes of ST segment morphology changes. The first three Legendre polynomials (a constant, a linear function and a quadratic function) uniquely possess these shapes and they are also orthogonal. We used them to derive a new set of basis functions for the Legendre Polynomial-based Transformation of ST segment (Fig 4A) that span over a single ECG lead.
In addition, we derived a new set of ST segment KLT basis functions, which also span over a single ECG lead, from the entire collection of the records of the LTST DB using a robust covariance matrix, with outliers due to noise rejected. The LTST DB contains an approximately ten times larger ECG data set than the ESC DB, covers a considerably greater amount of "real-world" data, and spans a wide variety of significant ischaemic and non-ischaemic ST segment episodes and other ST segment morphology change events due to axis shifts and conduction changes. The shapes of the newly derived KLT basis functions (Fig 4B) are more similar to typical morphology shape changes of ST segments, very similar to the LPT basis functions, and allow more accurate single-lead tracing of ST segment morphology change categories.

The Legendre Polynomial based Transformation of ST segment and derivation of the LPT basis functions
The Legendre polynomials [23] are a class of orthogonal polynomials. They are solutions to the Legendre differential equation. The first five Legendre polynomials are:
$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}(3x^2 - 1), \quad P_3(x) = \tfrac{1}{2}(5x^3 - 3x), \quad P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3).$$
The Legendre polynomials can be generated by the following recurrence relation:
$$(j + 1)\, P_{j+1}(x) - (2j + 1)\, x\, P_j(x) + j\, P_{j-1}(x) = 0, \quad j = 1, 2, 3, \ldots$$
They are orthogonal over the range [-1, 1], satisfying the orthogonality relationship (with the zero-based indexing used in the recurrence above):
$$\int_{-1}^{1} P_n(x)\, P_m(x)\, dx = \frac{2}{2n + 1}\, \delta_{mn}, \quad n = 0, 1, 2, \ldots, \quad m = 0, 1, 2, \ldots,$$
where $\delta_{mn}$ is the Kronecker delta. The Legendre polynomials possess the desired orthogonality and the desired shapes. Orthogonality is an important property for basis functions to be uncorrelated, thus preventing information scattering among different axes of the transformed space and enabling transformation reversibility and the derivation of residual errors. The shapes of the first three Legendre polynomials (a constant, a linear function and a quadratic function) have the advantage of direct insight into the most important morphological changes of ST segments in time domain (elevation or depression, positive or negative slope, and positive or negative scooping) through the feature space, if the polynomials are used as transformation basis functions. Such a similarity in terms of polynomial shapes can also be observed in some other classes of orthogonal polynomials, such as the Chebyshev polynomials [17]; however, the Chebyshev polynomials lose the desired shapes (a constant, a linear function and a quadratic function) in their orthogonal form.
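The recurrence and the orthogonality relationship can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard zero-based indexing ($P_0 = 1$, $P_1 = x$), for which the squared norm of $P_n$ is $2/(2n+1)$, and the helper name `legendre_values` is hypothetical. Gauss-Legendre quadrature from NumPy is used only to evaluate the integrals.

```python
import numpy as np

def legendre_values(n_max, x):
    """Evaluate P_0 .. P_n_max at the points x using the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    P = np.empty((n_max + 1, x.size))
    P[0] = 1.0
    if n_max >= 1:
        P[1] = x
    for j in range(1, n_max):
        # (j + 1) P_{j+1}(x) = (2j + 1) x P_j(x) - j P_{j-1}(x)
        P[j + 1] = ((2 * j + 1) * x * P[j] - j * P[j - 1]) / (j + 1)
    return P

# An 8-node Gauss-Legendre rule integrates polynomials up to degree 15 exactly,
# which covers every product P_n * P_m with n, m <= 4.
nodes, weights = np.polynomial.legendre.leggauss(8)
P = legendre_values(4, nodes)

# gram[n, m] approximates the integral of P_n(x) P_m(x) over [-1, 1].
gram = (P * weights) @ P.T
expected = np.diag(2.0 / (2 * np.arange(5) + 1))
print(np.allclose(gram, expected))
```

The off-diagonal entries of `gram` vanish up to rounding, which is the orthogonality property that makes the polynomials usable as uncorrelated basis functions.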
The Legendre polynomials can also be generated by applying the Gram-Schmidt orthonormalization [24] to functions of the following form:
$$Q_j(x) = x^j, \quad j = 0, 1, 2, \ldots$$
on the interval [-1, 1] with respect to the weighting function, w(x) = 1. In order to derive the discrete basis functions for the orthonormal transformation of the ST segment based on the Legendre polynomials, we first constructed a discrete matrix, $\Omega = (\omega_{ij})$, which is composed of the polynomials $Q_{j-1}$, sampled at M = 32 points in the range [-1, 1]. Discrete Gram-Schmidt orthogonalization of its columns yields the matrix $\Psi = (\psi_{ij})$, where $\psi_{ij}$ are the elements of the discretized orthogonal Legendre polynomials, and $w_i = 1$ is the discretized weighting function, w(x) = 1. Fig 4A shows the first few orthonormal LPT basis functions of the matrix $\Phi_L = (\varphi_{ij})$, as the order of the polynomials increases. Due to discretization and a degree of numerical instability generally present in numerical algorithms, some loss of orthogonality in the generated basis functions is expected [25]. (Note that this is also true for the discrete KLT.) The LPT basis functions contained in the $\Phi_L$ matrix are expected to be orthonormal:
$$\Phi_L^T \Phi_L = I,$$
where I is the identity matrix. This holds to an adequately high degree of numerical accuracy. If the discrete Gram-Schmidt orthonormalization is applied to the polynomials $Q_{j-1}$, the results are orthonormal discretized Legendre polynomials which differ only slightly from their continuous analogues [25]. Due to the iterative nature of the generation algorithm, numerical errors grow with increasing basis function number. We tested the orthonormality of the derived discrete LPT basis functions by computing the elements of $\Phi_L^T \Phi_L$ and comparing them with those of the identity matrix. For the first 10 LPT basis functions of the matrix $\Phi_L$, the maximal numerical error of the diagonal elements was $3.4 \times 10^{-6}$, and for the off-diagonal elements it was $1.0 \times 10^{-5}$. The LPT expansion is thus based on mutually orthogonal Legendre polynomials used as the basis functions for the transformation.
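The discrete construction and the orthonormality test described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the 32 sample points are assumed to be uniformly spaced on [-1, 1], the monomials $x^{j-1}$ are assumed as the starting functions, modified Gram-Schmidt in double precision is used, and the function name `discrete_legendre_basis` is hypothetical. The error magnitudes it produces therefore need not match those reported in the text.

```python
import numpy as np

def discrete_legendre_basis(n_basis=10, m_points=32):
    """Orthonormalize sampled monomials 1, x, x^2, ... with modified Gram-Schmidt."""
    x = np.linspace(-1.0, 1.0, m_points)             # assumed uniform sampling
    omega = np.vander(x, n_basis, increasing=True)   # column j holds x**j
    phi = np.zeros_like(omega)
    for j in range(n_basis):
        v = omega[:, j].copy()
        for k in range(j):
            # Remove the component along each previously built basis function.
            v -= (phi[:, k] @ v) * phi[:, k]
        phi[:, j] = v / np.linalg.norm(v)            # unit weighting, w_i = 1
    return x, phi

x, phi = discrete_legendre_basis()

# Orthonormality check: phi^T phi should be close to the identity matrix.
err = phi.T @ phi - np.eye(phi.shape[1])
max_diag_err = np.max(np.abs(np.diag(err)))
max_offdiag_err = np.max(np.abs(err - np.diag(np.diag(err))))
print(max_diag_err, max_offdiag_err)
```

With this construction the first column is a constant and the second is proportional to x, matching the level and slope interpretation of the first two LPT basis functions.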
Since the Legendre polynomials were chosen intentionally due to their shape similarity to the KLT basis functions in the descending order of the associated eigenvalues, it can be reasoned that the Legendre polynomial based expansion contains most of the morphology information in the first few axes of the new coordinate system as well.

The Karhunen and Loève Transformation and the derivation of new KLT basis functions
The KLT expansion is based on mutually orthogonal eigenvectors, ordered by descending eigenvalues, of the covariance matrix associated with the pattern vectors. In comparison to other, suboptimal transformations, it approximates a pattern vector with the least mean square error through a feature vector of reduced dimension. Details about the KLT can be found elsewhere [26,27]. To avoid the problem of sensitivity of eigenvectors to noisy pattern vectors, we used the kernel-approximation method [28], by which we rejected noisy outliers and replaced pattern classes by their means, yielding a robust covariance matrix.
To construct the robust covariance matrix, we used the clean heart beats from the LTST DB that remained after preprocessing the records with the robust KLT feature-space based noise and outlier rejection procedure [7]. The procedure proved to be robust and accurate. On average, 8.51% of heart beats were rejected from each of the LTST DB records.
Then we attached the first and second lead of 86 total records from the LTST DB one after the other. The latter is justified since single lead basis functions independent of the actual physiological ECG lead are desired. Besides, physiological ECG leads are not consistently mapped to the same lead number in the records of the LTST DB. Thus we got 15,661,886 total pattern vectors from 7,830,943 clean heart beats from the database. Input pattern vectors of deviating intervals (ischaemic, non-ischaemic) and of intervals with no deviation of the records of the LTST DB were separated to form classes, which were then replaced by their means, and centralized by subtracting the mean vector obtained over all classes, thus forming a robust covariance matrix. There were 1642 ischaemic intervals, 510 non-ischaemic and 2298 intervals with no deviation. Fig 4B shows the first five newly derived ST segment KLT basis functions, F K , in the descending order of magnitude of their corresponding eigenvalues. Note the similarity between the KLT and LPT basis functions.
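The step from class means to basis functions can be sketched as below. This is a simplified illustration, not the procedure of [28]: it assumes each class is already reduced to its mean pattern vector, that the subtracted mean is the average of those class means, and that pattern vectors have 32 samples. The function name `klt_basis_from_class_means` and the synthetic class means (random level, slope and scooping) are hypothetical.

```python
import numpy as np

def klt_basis_from_class_means(class_means):
    """Eigen-decompose the covariance of class-mean pattern vectors.

    class_means has shape (n_classes, M). Returns the eigenvalues in
    descending order and the matching eigenvectors as columns.
    """
    means = np.asarray(class_means, dtype=float)
    centred = means - means.mean(axis=0)           # subtract mean over classes
    cov = centred.T @ centred / means.shape[0]     # (M, M) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Synthetic stand-in for the class means: level, slope and scooping components.
rng = np.random.default_rng(0)
n_classes, m = 200, 32
x = np.linspace(-1.0, 1.0, m)
coeffs = rng.normal(size=(n_classes, 3)) * np.array([3.0, 2.0, 1.0])
class_means = coeffs @ np.vstack([np.ones(m), x, x**2])

eigvals, basis = klt_basis_from_class_means(class_means)
print(eigvals[:5])
```

Because the synthetic data contain only three shape components, only the first three eigenvalues are non-negligible; on real data the spectrum decays more gradually.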

Derivation of ST segment diagnostic and morphologic feature-vector time series
The developed delineation method includes a preprocessing step, the derivation of traditional time domain diagnostic features, the derivation of the KLT- and LPT-based ST segment feature vectors, and the construction of the feature-vector time series. The preprocessing step includes the following essential tasks: heartbeat detection and classification, estimation of a stable fiducial point, estimation of the iso-electric level for each heartbeat, noise removal, and removal of abnormal heart beats and their neighbors. In the preprocessing step, we applied: the Aristotle arrhythmia detector [29] for detecting and classifying heart beats and estimating the stable fiducial point; an algorithm that looks for the PQ interval as the "most flat" signal interval prior to the heartbeat's fiducial point and estimates the iso-electric level [30]; removal of high-frequency noise using a 6-pole low-pass Butterworth filter with a cut-off frequency of 55 Hz; removal of baseline wander using a cubic spline approximation and subtraction technique; and removal of abnormal heart beats and their neighbors. Instantaneous heart rate, h(j), where j denotes the heartbeat number, is defined by consecutive measurements of RR intervals between heart beats. A sample of the ST segment level time series, $s_l(i, j)$, where i denotes the lead number, is defined as:
$$s_l(i, j) = a_{80(60)}(i, j) - z(i, j),$$
where $a_{80(60)}(i, j)$ is the ST segment amplitude of the j-th heartbeat, estimated at the point J + 80(60) ms, and z(i, j) is its iso-electric level. The point of measurement of the ST segment amplitude, $a_{80(60)}(i, j)$, is linearly adjusted between the points F(i, j) + 160 ms and F(i, j) + 120 ms as the heart rate h(j) varies from 120 bpm down to 100 bpm. The value of the ST segment slope, $s_s(i, j)$, is estimated simply as an amplitude difference.
Normalizing the coefficient values of the feature vectors emphasizes tiny features of the pattern vectors corresponding to higher basis functions with lower standard deviations.
This way, each feature of the feature vectors is normalized with the corresponding standard deviation, so that its standard deviation is 1. The standard deviations $\rho_k$ and $\theta_k$ of the KLT and LPT expansion coefficients are those computed on the basis of the 7,830,943 clean heart beats from the LTST DB that were used for the construction of the robust covariance matrix to derive the KLT basis functions. Table 1 shows the values of the first five standard deviations, $\rho_k$ and $\theta_k$, of the coefficients of the KLT and LPT feature-vector time series in descending order. The magnitudes of the computed standard deviations, $\rho_k$, of the KLT feature-vector coefficients are in descending order, as expected. The magnitudes of the standard deviations, $\theta_k$, of the LPT feature-vector coefficients appear to be in descending order too, as the order of the polynomials increases.
The normalization of the coefficients also allows one to express distances between the feature vectors in terms of the Mahalanobis distance measure, d(i, j), between the feature vector of the j-th heartbeat and the feature vector of the first heartbeat, as a single-dimensional compound feature useful for more comprehensive visual, as well as machine based, analysis. For the KLT, the distance is computed from the sum of squared differences of the normalized coefficients,
$$\sum_{k=1}^{N_D} \left( s_{K,k}(i, j) - s_{K,k}(i, 1) \right)^2, \qquad (18)$$
and analogously for the LPT using $s_{L,k}$, where $s_{K,k}(i, 1)$ and $s_{L,k}(i, 1)$ are the feature vectors of the first heartbeat, and $N_D$ is the Mahalanobis distance measure dimensionality.
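The normalization and the compound distance feature can be illustrated with a short sketch. Several details are assumptions and not taken from the text: the pattern vectors are treated as already centred, the standard deviations are estimated from the same synthetic data instead of from the LTST DB, and the distance is taken as the square root of the sum of squared differences (the usual Mahalanobis form when every coefficient has unit variance). The function names are hypothetical.

```python
import numpy as np

def normalized_features(patterns, basis, stds):
    """Project pattern vectors (n_beats, M) onto basis (M, N); scale by stds (N,)."""
    return (patterns @ basis) / stds

def distance_to_first(features, n_dim):
    """Distance of each beat's feature vector to that of the first beat.

    Assumes unit-variance coefficients, so the Mahalanobis distance reduces to
    the Euclidean distance over the first n_dim coefficients.
    """
    diff = features[:, :n_dim] - features[0, :n_dim]
    return np.sqrt(np.sum(diff**2, axis=1))

# Synthetic stand-ins: 500 beats, 32-sample pattern vectors, 5 basis functions.
rng = np.random.default_rng(1)
patterns = rng.normal(size=(500, 32))
basis, _ = np.linalg.qr(rng.normal(size=(32, 5)))   # any orthonormal basis
stds = (patterns @ basis).std(axis=0)               # per-coefficient std devs

features = normalized_features(patterns, basis, stds)
d = distance_to_first(features, n_dim=5)
print(features.std(axis=0), d[:3])
```

After scaling, every coefficient has unit standard deviation, so higher-order coefficients with small raw amplitudes contribute to the distance on an equal footing with the first ones.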

Results
The morphology change category can quickly be visually determined from the trend plots. The time course of the LPT coefficient time series in example A shows depression (1st coeff.), positive slope (2nd coeff.) and positive scooping (3rd coeff.), while in example B it shows elevation (1st coeff.), positive slope (2nd coeff.) and positive scooping (3rd coeff.). This is consistent with the actual transient morphology change of the two episodes, and with the shapes and signs of the first three LPT basis functions. These significant transient morphology changes are clearly manifested and also visible in the KLT coefficient time series in both cases, but with the opposite sign of the first three KLT coefficient time series. This is due to the close similarity of the first three KLT basis functions to the constant, linear and quadratic functions, but with the opposite sign of the three basis functions (see Fig 4).

Evaluation of the classification between transient ischaemic and nonischaemic ST segment episodes
Since ECG ST segment morphology delineation provides fundamental features to be used in subsequent automatic analysis, we performed an evaluation study on the classification power of the KLT- and LPT-based feature vectors, $s_K(i, j)$ and $s_L(i, j)$, to classify between the transient ischaemic and non-ischaemic ST segment episodes from the LTST DB. We set out to classify episodes of changed ST segment morphology from the entire LTST DB into the class of episodes caused by ischaemia and the class of non-ischaemic heart-rate related episodes.
We used five established classifiers: k-Nearest Neighbors (kNN), Classification Tree (CT), Quadratic Discriminant Analysis (QDA), Support Vector Machines (SVM) with a second-order polynomial kernel using the least squares method, and Naive Bayes Classifier (NBC) with the distribution estimated by kernel smoothing density estimation. Mathworks Matlab Statistics Toolbox algorithm implementations were used. Each episode of the LTST DB annotated in each single ECG lead according to the annotation protocol B [1] was represented by a mean feature vector of the KLT coefficients and a mean feature vector of the LPT coefficients, derived from an interval of 20 seconds around the episode extreme. Classification power was tested using three different subsets of transformation coefficients: coefficients 1-3, coefficients 1-5 and coefficients 1-8. These feature vectors were used as input data for classification performance evaluation by 10-fold cross-validation with 10 repetitions. Classification was performed separately with the KLT and LPT feature vectors. The resulting input data consisted of 1130 instances of ischaemic ST segment episodes and 234 instances of non-ischaemic heart-rate related ST segment episodes. Classification performance evaluation results are shown in Table 2. The highest classification performance was obtained using the kNN classifier with k = 3 and the KLT coefficients 1-8: Se KLT = 91%, Sp KLT = 85%, and CA KLT = 90%.
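The evaluation protocol can be sketched in a few lines. The study used Matlab Statistics Toolbox implementations; the following is an independent NumPy illustration, not a reproduction. It runs a single repetition of unstratified 10-fold cross-validation for a 3-nearest-neighbour classifier on synthetic, well separated 8-dimensional data with the same class sizes as in the text (1130 and 234). Treating the ischaemic class as the positive class for sensitivity (Se) is an assumption, as are the function names.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training vectors (labels are 0 or 1)."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest].sum(axis=1)
    return (2 * votes > k).astype(int)

def cross_validate(x, y, k=3, folds=10, seed=0):
    """One repetition of k-fold cross-validation; returns (Se, Sp, CA)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    pred = np.empty_like(y)
    for fold in np.array_split(idx, folds):
        train = np.setdiff1d(idx, fold)          # all indices not in this fold
        pred[fold] = knn_predict(x[train], y[train], x[fold], k)
    se = np.sum((pred == 1) & (y == 1)) / np.sum(y == 1)   # sensitivity
    sp = np.sum((pred == 0) & (y == 0)) / np.sum(y == 0)   # specificity
    ca = np.mean(pred == y)                                # classification accuracy
    return se, sp, ca

# Synthetic feature vectors: label 1 = ischaemic, label 0 = non-ischaemic.
rng = np.random.default_rng(42)
x = np.vstack([rng.normal(0.0, 1.0, size=(1130, 8)),
               rng.normal(3.0, 1.0, size=(234, 8))])
y = np.concatenate([np.ones(1130, dtype=int), np.zeros(234, dtype=int)])

se, sp, ca = cross_validate(x, y)
print(se, sp, ca)
```

Repeating the call with different `seed` values and averaging the results would approximate the 10 repetitions mentioned in the text.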

Discussion and conclusions
The Legendre polynomials as basis functions of the transformation of the ECG ST segment proved to be convenient for the purposes of feature extraction and shape representation because of their simple generation process and orthogonality. Visual examination confirmed that the new LPT approach based on the Legendre polynomials is a valid representation of ST segment morphology. It has the unique additional benefit of direct insight into the clinically important morphology changes of ST segments in time domain. Clinicians can easily examine important features by visual observation of the LPT feature-vector time series trends spanning several hours on a single display. This is significantly faster than evaluating ST segment morphology changes on the level of individual heart beats or examining feature-vector time series without a clearly understandable time domain meaning. On the other hand, the developed delineation method, especially using the KLT, offers a more comprehensive estimation of ST segment features for automated systems, in contrast to the traditional time domain measurements (ST segment level and slope at fixed points in the ST segment), since the orthogonal transformation based techniques extract morphology information from the entire ST segment. Another advantage is that detection of the J point can be omitted.
The results of the evaluation study on classification performance (Table 2) show the potential of the feature vectors based on the new KLT and LPT basis functions for classification between ischaemic and non-ischaemic ST segment episodes. Besides assessing classification performance in distinguishing ischaemic from non-ischaemic ST segment episodes, our goal was also to assess the classification performance of the KLT compared to the LPT. The classifiers used in this study perform slightly better with the new KLT feature vectors than with the new LPT feature vectors (Se KLT = 91%, Sp KLT = 85%, CA KLT = 90%, compared to Se LPT = 85%, Sp LPT = 72%, CA LPT = 82%, for the best-performing classifier, kNN with k = 3, and the best-performing feature subset, coefficients 1-8). This was expected, as the LPT is based on the Legendre polynomials and, unlike the KLT, is not fitted to any "training" data. In spite of this, the classification performances are still comparable. Related studies on the classification between ischaemic and non-ischaemic ST segment episodes employed a variety of other approaches. A decision tree based classification between ischaemic and non-ischaemic ST segment episodes of the LTST DB was performed in [18]. In that study, compound features like heart rate, Legendre polynomial coefficients, and the Mahalanobis distance of the QRS complex KLT coefficients were used. These compound features were derived as differences between features of pre-episode onset, pre-episode offset, and episode extreme. Classification performances achieved on the entire set of ischaemic and non-ischaemic ST segment episodes of the LTST DB were 98.4% (Se) and 85.9% (Sp). Using the bootstrap method to assess the robustness of the performance statistics and to predict the real-world performance of the approach, the 5% confidence limits achieved were 97.8% (Se) and 81.5% (Sp).
A similar discriminant analysis based approach [19] to the classification of ischaemic and non-ischaemic ST segment episodes of the LTST DB also used compound features derived as differences between features of pre-episode onset, pre-episode offset, and episode extreme. The features used were heart rate, QT interval, ST segment level, and QRS complex and ST segment KLT coefficients. The classification performances achieved using leave-one-out cross-validation estimation were 84.5% (Se) and 86.6% (Sp). Yet another study [20] conducted a genetic algorithm-based selection to identify the nine, out of 35, most relevant features to classify ischaemic and non-ischaemic ST segment episodes of the LTST DB. The selected features were single features like the low to high frequency ratio (LF/HF ratio) of heart rate, ST segment level, and the root mean square of the ST segment shape change, and compound features (differences in features from pre-episode onset and pre-episode offset) like heart rate, ST segment slope, R wave up-slope, root mean square of the QRS complex shape change, and the Mahalanobis distances of the QRS complex and ST segment KLT coefficients. Ischaemic and non-ischaemic ST segment episodes from the LTST DB were separated into training and testing sets. The performances obtained from the testing set using the Relevance vector machine (a special case of a sparse Bayesian learning algorithm) were 88.7% (Se) and 86.8% (Sp).
Our study intentionally used non-compound features based solely on the KLT- or LPT-based feature vectors. Our goals were to assess the classification performance in distinguishing between ischaemic and non-ischaemic ST segment episodes, and to assess the classification performance of the KLT-based feature vectors as opposed to the LPT-based feature vectors. To assess the classification performance we used 10-fold cross-validation with 10 repetitions. In comparison to [18][19][20] we also tested a variety of classifiers (kNN, CT, QDA, SVM, NBC). The k-Nearest Neighbors classifier yielded the highest classification performances when using the KLT-based feature vectors, Se KLT = 91% and Sp KLT = 85%. The classification performances achieved are quite comparable to the performances obtained in [18], and higher than those obtained in [19,20]. The Mahalanobis distances of the QRS complex and ST segment KLT coefficients in [18,20] and the QRS complex and ST segment KLT coefficients in [19,20] were derived using the KLT basis functions [7] from the ESC DB. These basis functions span over two ECG leads (for these ST segment KLT basis functions see Fig 3). Consequently, we may expect that the extracted features in terms of the KLT coefficients will less accurately emphasize the presence of ischaemia, since ischaemic ST segment change shapes may be significant in one ECG lead, but less significant, or even absent, in the other ECG lead. One of the advantages of the proposed method lies in the fact that the newly developed KLT basis functions, and the LPT basis functions, span over a single ECG lead; thus the extracted features will more accurately emphasize the presence, or absence, of ischaemic ST segment shape changes. A further advantage of the proposed method lies in the reduced set of features (the KLT- and LPT-based feature vectors only) that yield high classification performance; the only prerequisite is a stable fiducial point for each heartbeat.
Furthermore, extracting the morphologic features of ST segments exclusively in terms of the KLT or LPT feature vectors is more robust in the sense that they are less susceptible to ECG signal variability and noise. Another advantage of the proposed method is the use of non-compound features that are extracted only at the extrema of ischaemic and non-ischaemic ST segment episodes. There is no need for the accurate detection of the beginnings of the episodes, which is difficult.
The LPT offers the additional asset of immediate insight into the type and shape of ST segment change in time domain. This indicates possibilities for the development of new clinical diagnostic criteria for the reliable visual detection of transient ischaemia using the LPT. Consequently, significantly lower rates of erroneously estimated ST segment features can be expected.
As for the task of the delineation of significant transient ST segment morphology changes from the entire ST segment, we conclude that the LPT basis functions provide higher accuracy in the representation of transient ST segment morphology categories, while the new KLT basis functions provide higher classification accuracy between ischaemic and non-ischaemic ST segment episodes, offering future applications such as new automated transient ischaemia detection systems.