Efficient intra mode decision for low complexity HEVC screen content compression

The screen content coding extension of High Efficiency Video Coding (HEVC-SCC) is the latest HEVC development aimed at improving the compression performance of screen content (SC) video. As in HEVC, intra mode selection in HEVC-SCC is performed over all coding unit (CU) partitions to find the one with the least rate distortion (RD) cost. Furthermore, additional intra tools are introduced to improve HEVC-SCC coding efficiency. However, these new tools cause high computational complexity, which limits the use of HEVC-SCC in practical applications. To solve this problem, an efficient intra mode decision method for HEVC-SCC that adaptively exploits the texture complexity of the SC treeblock is proposed. The texture complexity of an SC treeblock is first analyzed according to the variation degree of its luminance values. Two efficient approaches are then proposed based on the constructed model: early CU depth level determination and adaptive intra mode selection. Experimental results demonstrate that the proposed method saves 48.5% of encoder runtime while keeping nearly the same coding efficiency as the HEVC-SCC encoder.


Introduction
With the development of wireless displays and mobile technologies, screen content (SC) video has attracted more and more attention in the last few years [1][2][3][4]. SC differs from conventional camera-captured images: it mixes graphics, text and natural pictures in the same image, often at higher resolution. Because of these different characteristics, an SC video compression standard has been investigated by JCT-VC, and high efficiency video coding screen content coding (HEVC-SCC) has been developed as an extension of HEVC since March 2014 [5][6][7][8].
In HEVC-SCC, additional tools such as intra block copy (IBC) [9] and palette coding mode (PLT) [10] have been proposed to improve coding efficiency for SC video [11]. IBC is a block matching method using a fixed block size for better SC video compression, similar to inter-prediction estimation. In PLT, the samples in a coding unit (CU) are represented by a small set (the palette) of representative color values. Even though these newly added intra mode tools achieve good rate distortion (RD) performance, they will [...] modes based on a complexity parameter of the current SC treeblock to derive the adaptive thresholds for efficient intra mode decision. As far as we know, there is almost no work similar to the proposed algorithm on HEVC-SCC.
The rest of the paper is structured as follows. Section 2 provides the analysis of HEVC-SCC mode selection. Section 3 presents the proposed intra mode decision method in detail. Simulation results and conclusions are given in Sections 4 and 5, respectively.

Observations and analysis
HEVC-SCC is an extension of HEVC developed for the compression of screen content in sharing systems. Similar to HEVC, the mode decision procedure in HEVC-SCC tries all the prediction modes and depth levels to find the one with the least RD cost,

$$J_{mode} = (SSE_{luma} + \omega_{chroma} \cdot SSE_{chroma}) + \lambda_{mode} \cdot R_{mode} \qquad (1)$$

where $J_{mode}$ is the RD cost function, $SSE_{luma}$ and $SSE_{chroma}$ are the distortions between the current treeblock and its reconstructed block for the luma and chroma components, $\omega_{chroma}$ is the chroma weighting parameter, $\lambda_{mode}$ denotes the Lagrange multiplier, and $R_{mode}$ denotes the bit rate cost used by HEVC-SCC encoders. This "try all and select the best" mode decision method achieves good RD performance, but it leads to significantly high complexity. Furthermore, additional tools including IBC and PLT have been developed for HEVC-SCC to improve coding efficiency. Compared with conventional HEVC mode prediction, all the IBC and PLT modes are added to the RD search list in HEVC-SCC encoders. These techniques result in very high complexity, which keeps HEVC-SCC encoders from real-time application. In the current test model, testing IBC and PLT modes together with traditional HEVC intra prediction accounts for a large share of HEVC-SCC encoding time, because a full RD cost must be calculated for each mode. In fact, larger CUs using traditional HEVC intra modes are suitable for treeblocks in flat or homogeneous regions, while IBC and PLT modes are more likely to be chosen for treeblocks with sharper object edges or discontinuous-tone screen content [28]. This shows that the mode prediction of HEVC-SCC should be determined adaptively, based on the SC texture characteristics of the current CU. Therefore, if we can exploit texture characteristics to identify modes that are rarely used in HEVC-SCC intra prediction, their full RD cost calculation can be skipped throughout the coding process, and the complexity of HEVC-SCC can be significantly reduced.
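The "try all and select the best" rule described above can be illustrated with a minimal sketch. It assumes the usual Lagrangian form of the RD cost (luma distortion plus weighted chroma distortion plus the Lagrange multiplier times the rate); the `Candidate` type, the function names and all numeric values are hypothetical and are not taken from the reference encoder.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One tested mode with hypothetical distortion and rate figures."""
    name: str
    sse_luma: float
    sse_chroma: float
    bits: float


def rd_cost(c, lam, w_chroma=1.0):
    # J = (SSE_luma + w_chroma * SSE_chroma) + lambda * R
    return c.sse_luma + w_chroma * c.sse_chroma + lam * c.bits


def best_mode(candidates, lam, w_chroma=1.0):
    # Exhaustive decision: evaluate every candidate, keep the cheapest one.
    return min(candidates, key=lambda c: rd_cost(c, lam, w_chroma))


# Made-up numbers, for illustration only.
candidates = [
    Candidate("Planar", sse_luma=1200.0, sse_chroma=300.0, bits=40.0),
    Candidate("IBC", sse_luma=400.0, sse_chroma=100.0, bits=95.0),
    Candidate("PLT", sse_luma=350.0, sse_chroma=90.0, bits=130.0),
]

low_lambda_choice = best_mode(candidates, lam=10.0)
high_lambda_choice = best_mode(candidates, lam=30.0)
```

With these made-up figures, a larger multiplier (as used at higher QP) shifts the decision toward the mode that spends fewer bits. Every candidate still needs a full cost evaluation, which is the expense that a fast mode decision tries to avoid.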
This is reasonable since screen content video is dominated by purely horizontal and vertical patterns. It can be seen from the experimental results in Table 1 that IBC and PLT mode utilization decreases as QP increases. The reason is that a large bitrate (such as QP = 20) has a higher likelihood of selecting good matches. However, the IBC and PLT modes occur very frequently for large homogeneous and smoothly-varying regions [28]. On the other hand, the IBC and PLT modes are rarely chosen for CUs in screen content blocks that are generated by computers. If the optimal intra modes can be decided at an early stage, the wasteful search over IBC, PLT and HEVC intra modes can be skipped, and thus the complexity of HEVC-SCC can be significantly reduced.
Because of their similar spatial characteristics, the texture complexity of an SC treeblock is strongly related to the variation of its luminance values. To characterize this correlation, the variation degree of the luminance values in an SC treeblock is defined in Eq (2), where LV represents the luminance variation parameter of the current SC treeblock, $L_r(x, y)$ represents the luminance value of the current treeblock, and $L_a(x, y)$ represents the average luminance value of the current CU. If the luminance values of the current CU change drastically, a large value of LV is obtained.
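As an illustration of such a luminance variation measure, the sketch below computes the mean absolute deviation of the luma samples from their block average. This particular form (absolute rather than squared differences, normalized by the sample count) is an assumption made for illustration; the exact definition of LV is the one given by Eq (2). The function name `luminance_variation` is hypothetical.

```python
def luminance_variation(luma):
    """Mean absolute deviation of luma samples from the block average.

    `luma` is a 2-D list of luma samples. This is an assumed form of LV,
    not necessarily the exact definition used in the paper.
    """
    samples = [v for row in luma for v in row]
    average = sum(samples) / len(samples)
    return sum(abs(v - average) for v in samples) / len(samples)


# A flat block has no variation; a black/white checkerboard has a lot.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

lv_flat = luminance_variation(flat)
lv_checker = luminance_variation(checker)
```

Under this assumed definition, text-like blocks with abrupt black/white transitions produce large values, while smooth backgrounds stay near zero, which matches the qualitative behavior described above.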
Generally, the more complex an SC treeblock is, the larger the value of LV will be. According to LV, each SC treeblock is classified by Eq (3) as a homogeneous, middle texture or complex texture region: a treeblock whose LV lies below $S_h$ is homogeneous, one whose LV lies above $S_c$ is complex, and the remaining treeblocks have middle texture. Here $S_h$ and $S_c$ are texture-weight factors used as thresholds. The thresholds should reduce the HEVC-SCC computational complexity efficiently while keeping the same RD performance and high accuracy in intra mode decision. Based on extensive experiments, $S_h$ and $S_c$ are set to 30 and 60, respectively, which achieves good and consistent performance on a variety of test SC sequences with different characteristics. In this paper, the luminance variation is used to analyze the intra prediction mode characteristics of the current treeblock and to adjust the HEVC-SCC mode decision process.
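The three-way classification can be sketched as a simple threshold test. The threshold values 30 and 60 come from the text; how values exactly equal to a threshold are handled is an assumption here, and the function name `classify_treeblock` is hypothetical.

```python
S_H = 30  # threshold below which a treeblock is treated as homogeneous
S_C = 60  # threshold above which a treeblock is treated as complex


def classify_treeblock(lv, s_h=S_H, s_c=S_C):
    """Map a luminance variation value to a texture class.

    Boundary handling (lv == s_h or lv == s_c falls into "middle")
    is an assumption made for this sketch.
    """
    if lv < s_h:
        return "homogeneous"
    if lv <= s_c:
        return "middle"
    return "complex"


labels = [classify_treeblock(lv) for lv in (4.2, 45.0, 127.5)]
```

Keeping the thresholds as parameters makes it easy to experiment with other values on different content, since the text reports that 30 and 60 were chosen empirically.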

Early CU depth determination
Similar to HEVC, the ME process in HEVC-SCC encoders is performed at all depth levels to select the best mode; that is, HEVC-SCC applies the "try all and select the best" method to each SC treeblock. However, this technique leads to extremely high complexity. Exhaustive coding over a fixed depth level range is therefore inefficient, and fast CU depth determination is very desirable for HEVC-SCC encoders.
In HEVC-SCC, the maximum treeblock size is set to 64×64, and the CU depth range is 0 to 3. The CU depth range is fixed for the whole encoding process. In fact, small depth levels are chosen for treeblocks in static regions, while large depth levels are suitable for treeblocks in complex motion regions. It is observed that depth level "0" occurs very frequently for CUs in static regions. On the other hand, depth level "0" is rarely chosen for CUs with complex motion [33]. These results show that the depth range in HEVC-SCC can be determined adaptively from the complexity characteristics of SC treeblocks.
According to Eq (3), each SC treeblock is classified into one of three types: homogeneous, middle texture or complex texture. Using the test conditions and test sequences described in Section 2, the depth distribution of HEVC-SCC was analyzed for the three types of SC treeblocks. Table 2 gives the resulting depth level distribution, in which "Depth 0", "Depth 1", "Depth 2" and "Depth 3" are the depth level values of SC treeblocks. For treeblocks in homogeneous regions, about 61.2% select depth level "0" (CU size 64×64) as optimal, and about 34.0% select depth level "1" (CU size 32×32). Thus, if the maximum depth is set to "1", about 95.2% of these treeblocks are covered, and intra mode prediction at depth values "2" and "3" can be skipped. For treeblocks in middle texture regions, about 94.6% select depth level "0", "1" or "2" (CU sizes 64×64, 32×32 and 16×16), so restricting the depth range to "0" to "2" covers about 94.6% of them. For treeblocks in complex texture regions, the probability of choosing depth values "2" and "3" (16×16 and 8×8) is more than 91.6%, and thus depth levels "0" and "1" can be skipped. Based on this analysis, the depth levels that are tested in HEVC-SCC for the three types of SC treeblocks are given in Table 3. With the proposed early depth level range determination algorithm, most SC treeblocks in HEVC-SCC intra prediction can skip one or two depth values.

Adaptive intra mode selection
In HEVC-SCC encoders, two new intra modes, IBC and PLT, are applied for SC coding. Compared with HEVC, after the best conventional HEVC intra mode is selected, all IBC and PLT modes are additionally tested by the HEVC-SCC encoder. This incurs extremely high computational complexity. Fig 1 shows the runtime analysis of HEVC-SCC intra mode prediction. It is observed from Fig 1 that most of the computation is consumed by the conventional HEVC intra modes and IBC. PLT accounts for a smaller percentage at all depth levels. There is no PLT computation at depth level 0, since PLT mode is disabled at CU size 64×64. It can also be observed from the figure that IBC utilization increases as the depth value increases. This is because larger CU depth values (smaller CUs) have a higher likelihood of finding precise matches. Moreover, the newly developed tools (IBC and PLT modes) account for 56% of the coding time of HEVC-SCC intra mode coding, while the conventional HEVC intra modes account for the remaining 44%. This analysis shows that testing the IBC and PLT modes is the most computationally intensive part of HEVC-SCC encoders. Therefore, the complexity of the IBC and PLT modes should be reduced significantly while nearly the same coding performance is maintained.
In the current HEVC-SCC, the two newly developed modes, IBC and PLT, are mainly intended for effective compression of SC video; they may be inefficient for homogeneous texture regions, where the HEVC intra modes (Planar and DC) work well. IBC mode copies blocks from the reconstructed area of the same frame, and is thus an inter-like block matching coding tool. Repeated patterns are copied from that area using block vectors. It is mainly effective for regions with complex texture and sharp edges. Similarly, for PLT mode, two tables are created to store major colors on the palette predictor. It is effective for blocks dominated by a few major colors. Based on the aforementioned analysis, most of the intra modes in HEVC-SCC can be skipped when the current treeblock is homogeneous. In traditional HEVC intra coding, Planar and DC modes are often selected as the best mode for treeblocks with slowly varying values. When the best mode is Planar or DC, the treeblock is very likely to be a simple or homogeneous area. Meanwhile, both IBC and PLT modes are designed for SC video transitions and are inefficient for homogeneous regions. Thus, most of the IBC and PLT mode search can be skipped in homogeneous regions. Based on these observations, if a treeblock contains a homogeneous region, we only choose Planar and DC as candidate intra modes in HEVC-SCC coding.
To evaluate the performance of the proposed selection method for treeblocks in homogeneous regions, extensive experiments have been performed. Table 4 gives the accuracy of skipping Horizontal mode, Vertical mode, Angular modes 2-9, 11-25 and 27-34, IBC mode and PLT mode for treeblocks in homogeneous regions. The accuracy of the proposed selection method ranges from 98.3% to 99.7%. Although the proposed method misses some cases in which a skipped mode such as IBC or PLT would still be chosen as the best mode in HEVC-SCC, the loss rate, which is less than 1.7%, is negligible. Tables 5 and 6 show the intra mode distribution for treeblocks in middle texture and complex texture regions. It can be observed from Table 5 that for a treeblock in the middle texture region, the probabilities of choosing IBC mode, Planar mode, DC mode, Horizontal mode and Vertical mode are 13.9%, 26.3%, 14.0%, 18.2% and 20.4%, respectively. The probability of PLT mode and other intra modes is less than 4.9%. For the sequences "EBURainFruits" and "Kimono1", the percentage of choosing IBC mode is very low (2.5%), because they are camera-captured content, whereas IBC mode is developed for coding computer-generated content. Together, the optimal intra modes (IBC, Planar, DC, Horizontal and Vertical) account for 92.6% of the chosen modes. Thus, from the results of Table 5 we can conclude that a treeblock in the middle texture region is likely to choose IBC, Planar, DC, Horizontal or Vertical mode in HEVC-SCC encoders. For a treeblock in the complex texture region, the probabilities of choosing Planar and DC mode are no more than 2.8% in Table 6. The total probability of IBC, PLT, Horizontal, Vertical and other intra modes is 95.8%. Thus, it is not necessary to test Planar and DC in the complex texture region. Based on the above analysis, the adaptive intra mode selection that will be tested in HEVC-SCC encoders is summarized in Table 7.
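The resulting candidate lists can be sketched as a lookup by texture class. The sketch uses the standard HEVC intra mode indices (Planar 0, DC 1, Horizontal 10, Vertical 26, angular modes 2 to 34) and represents IBC and PLT as strings. Treating "other intra modes" as all angular modes except Horizontal and Vertical is an assumption, and the function names are hypothetical.

```python
PLANAR, DC, HORIZONTAL, VERTICAL = 0, 1, 10, 26

# Angular modes 2..34 without the pure horizontal and vertical directions.
OTHER_ANGULAR = [m for m in range(2, 35) if m not in (HORIZONTAL, VERTICAL)]

# Full HEVC-SCC intra search list: 35 HEVC intra modes plus IBC and PLT.
ALL_MODES = ["IBC", "PLT"] + list(range(35))


def candidate_modes(region):
    """Modes that are still tested for a treeblock of the given class."""
    if region == "homogeneous":
        return [PLANAR, DC]
    if region == "middle":
        return ["IBC", PLANAR, DC, HORIZONTAL, VERTICAL]
    if region == "complex":
        return ["IBC", "PLT", HORIZONTAL, VERTICAL] + OTHER_ANGULAR
    raise ValueError(f"unknown region: {region}")


def skipped_modes(region):
    """Modes whose RD cost evaluation is avoided for the given class."""
    kept = candidate_modes(region)
    return [m for m in ALL_MODES if m not in kept]
```

For example, `skipped_modes("complex")` leaves out only Planar and DC, whereas a homogeneous treeblock skips 35 of the 37 entries in the search list, which is where most of the saving in flat regions would come from.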

Overall method
According to the above analysis, the proposed method adjusts the steps of intra mode decision based on the texture complexity of SC treeblocks. The proposed overall method, which includes early CU depth level determination and adaptive intra mode selection, is as follows:
Step 1) Start intra mode prediction.
Step 2) Compute LV(i, j) using Eq (2), and classify the SC treeblock as a homogeneous, middle texture or complex texture region by comparing LV with the thresholds $S_h$ and $S_c$ according to Eq (3).
Step 3) Early CU depth level determination. If the current SC treeblock belongs to a homogeneous region, the depth range is "0" to "1"; if it belongs to a middle texture region, the depth range is "0" to "2"; otherwise, if it belongs to a complex texture region, the depth range is "2" to "3".
Step 4) Adaptive intra mode selection. If the SC treeblock belongs to a homogeneous region, only Planar and DC modes are used for intra coding; if it belongs to a middle texture region, the candidate modes are IBC, Planar, DC, Horizontal and Vertical; if it belongs to a complex texture region, the candidate modes are IBC, PLT, Horizontal, Vertical and the other intra modes.
Step 5) Determine the best intra mode.
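The five steps can be tied together in a compact sketch. Several simplifications are assumed here: LV is taken as the mean absolute deviation of the luma samples, threshold ties fall into the middle class, a single (depth, mode) pair is chosen per treeblock instead of a full quadtree decision, and the RD cost is supplied by the caller as a hypothetical `rd_cost(depth, mode)` function.

```python
# Depth levels and candidate modes per texture class (Steps 3 and 4).
PLAN = {
    "homogeneous": ((0, 1), ("Planar", "DC")),
    "middle": ((0, 1, 2), ("IBC", "Planar", "DC", "Horizontal", "Vertical")),
    "complex": ((2, 3), ("IBC", "PLT", "Horizontal", "Vertical", "OtherAngular")),
}


def luminance_variation(luma):
    # Assumed form of LV: mean absolute deviation from the block average.
    samples = [v for row in luma for v in row]
    average = sum(samples) / len(samples)
    return sum(abs(v - average) for v in samples) / len(samples)


def classify(lv, s_h=30, s_c=60):
    # Step 2: threshold test (tie handling is an assumption).
    if lv < s_h:
        return "homogeneous"
    return "middle" if lv <= s_c else "complex"


def decide(luma, rd_cost):
    """Steps 1 to 5 for one treeblock; returns (region, depth, mode)."""
    region = classify(luminance_variation(luma))
    depths, modes = PLAN[region]
    # Step 5: full RD check, but only over the reduced candidate set.
    pairs = [(d, m) for d in depths for m in modes]
    depth, mode = min(pairs, key=lambda p: rd_cost(*p))
    return region, depth, mode


# Toy cost function and blocks, for illustration only.
def toy_cost(depth, mode):
    return 10 * depth + len(mode)


flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
flat_decision = decide(flat, toy_cost)
checker_decision = decide(checker, toy_cost)
```

In this sketch the flat block only evaluates 4 (depth, mode) pairs, while an exhaustive search over four depths and the full mode list would evaluate far more; in a real encoder the decision is made per CU inside the quadtree rather than once per treeblock.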

Experimental results
To evaluate the performance of the proposed method, experiments are carried out on the recent HEVC-SCC reference software (SCM 5.2) [34]. The proposed method is tested on the fourteen sequences with different resolutions and motion activities listed in Table 8. Test videos are available from this location: ftp://hevc@ftp.tnt.uni-hannover.de/testsequences. All the simulations follow the SCC common test conditions (CTC) [35]. The test conditions are as follows: lossy coding, all-intra configuration, four QPs (22, 27, 32 and 37), and 120 coded frames per sequence; only YUV444 format results are reported because of space limitations. In this section, we compare the proposed low complexity method in Tables 9-11 with the original HEVC-SCC and the state-of-the-art methods [25,27,28], where the compression efficiency is measured by BDPSNR and BDBR [36]. "Dtime (%)" represents the runtime saving in percentage,

$$Dtime = \frac{Time_{proposed} - Time_{original}}{Time_{original}} \times 100\%$$

where $Time_{proposed}$ and $Time_{original}$ denote the encoding runtime of the proposed scheme and the original method for the same test sequence, respectively, so that a negative value indicates a runtime saving. Table 9 shows the individual results of the two proposed approaches, early CU depth level determination (ECUDLD) and adaptive intra mode selection (ADIMD), compared to the HEVC-SCC encoder. With the ECUDLD approach in Table 9, 37.6% of coding time is saved on average. Across the SC sequences, ECUDLD reduces computational complexity by 25.9%-54.2%. The coding time reductions are particularly high for text and graphics sequences such as "sc_SlideShow" (-54.2%), but they are still evident for animation sequences such as "sc_robot" (-25.9%). The runtime reductions for the "sc_SlideShow" and "Kimono1" sequences are larger than for the other test videos, because these sequences contain more homogeneous texture, and a number of unnecessary depth levels are skipped in homogeneous regions. Meanwhile, the PSNR loss is 0.12 dB, which is negligible.
The above result indicates that ECUDLD can efficiently skip unnecessary depth levels of HEVC-SCC. As for the ADIMD approach, about 24.0% of coding time is saved on average, with a minimum of 17.8% for "sc_robot" and a maximum of 31.7% for "sc_SlideShow". Meanwhile, the average loss of PSNR is 0.04%, which is negligible. Therefore, ADIMD saves considerable encoding time while keeping coding efficiency similar to that of HEVC-SCC.
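The runtime metric used in these tables can be computed as in the sketch below. It assumes that Dtime is the relative runtime difference with respect to the original encoder, so that savings appear as negative values, consistent with entries such as -54.2%. The timings in the example are hypothetical and are not measurements from the paper.

```python
def dtime_percent(time_proposed, time_original):
    """Relative runtime change in percent; negative means time was saved."""
    return 100.0 * (time_proposed - time_original) / time_original


# Hypothetical timings in seconds, chosen only to illustrate the formula.
original_seconds = 100.0
proposed_seconds = 51.5
saving = dtime_percent(proposed_seconds, original_seconds)
```

With these illustrative timings the result is a reduction of 48.5%, which is the size of the average saving reported for the overall method.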

Results of the overall method
In the following, we further show the coding performance of the proposed overall method, which combines ECUDLD and ADIMD, in Table 10. The proposed overall method saves 48.5% runtime on average, with a maximum of 66.3% for "sc_SlideShow" (TGM, 1280×720) and a minimum of 35.2% for "sc_robot" (A, 1280×720). The time reduction for "sc_SlideShow" is larger than for the others because "sc_SlideShow" contains more homogeneous texture, so unnecessary intra modes and depth levels are skipped in its larger homogeneous regions. Meanwhile, the average BDPSNR drop is 0.14 dB, which is negligible. Additionally, the results of the proposed method for four categories of test sequences ("text and graphics with motion (TGM)", "mixed content (M)", "animation (A)" and "camera-captured content (CC)") are shown in Table 11, which gives the coding results of the two fast sub-approaches and the overall method. The coding runtime savings for ECUDLD, ADIMD and the overall algorithm are 39.0%, 24.9% and 50.0% for "TGM" sequences, 37.3%, 21.0% and 42.4% for "M" sequences, 25.9%, 17.8% and 35.2% for "A" sequences, and 38.3%, 24.0% and 49.6% for "CC" sequences, respectively. The coding runtime reductions are particularly high for "CC" and "TGM" sequences, but they are still evident for "A" sequences. This analysis indicates that the proposed sub-approaches and the overall method achieve a consistent gain in coding speed across the four categories of sequences. From these results, it can also be observed that the proposed overall method saves encoding time with only about 0.16-1.81% RD performance loss for all the sequences. Therefore, the proposed overall method can significantly reduce the encoding time with similar RD performance.
Fig 2 shows more detailed simulation results of the proposed overall method for four typical test sequences: "sc_flyingGraphics" (TGM, 1920×1080), "EBURainFruits" (CC, 1920×1080), "sc_robot" (A, 1280×720) and "Basketball_Screen" (M, 2560×1440). As shown in Fig 2, the proposed overall method achieves consistent runtime savings from the low to the high bitrate range, with RD performance similar to HEVC-SCC. Moreover, as the bitrate decreases and the QP value increases, the coding time savings increase. This is because, as QP increases, both the probability of checking only two depth levels (0, 1) for homogeneous treeblocks using ECUDLD and the probability of testing only Planar/DC modes for screen content treeblocks using ADIMD increase.

Performance comparison with the state-of-the-art methods
In addition to the HEVC-SCC encoder, we compare the proposed overall method with recent state-of-the-art fast schemes implemented in SCM 5.2 in Figs 3 and 4. These are FIMBM [25], FMPML [27] and FIPZPA [28], which are well-known fast and efficient methods for HEVC-SCC encoders. Among these four schemes, the proposed overall method has the best RD performance. For the four categories (TGM, M, A and CC) of SC sequences, the proposed overall method achieves 35.2%-50.0% computation reduction on average while increasing BDBR by only 0.2%-1.8%. As shown in Figs 3 and 4, the FIPZPA method achieves the largest computation reduction, but its RD performance is poor. Compared to FIPZPA, the proposed overall method achieves better RD performance: about 1.4%-3.9% BDBR can be further reduced in HEVC-SCC encoders. Meanwhile, the cost in computation reduction is negligible, a 2.5% encoding time increase on average. Moreover, the proposed overall method achieves a better gain (4.0%-10.0%) in coding time compared with the FIMBM and FMPML schemes, together with better coding efficiency (0.8%-1.3% BDBR decrease). These simulation results indicate that the proposed overall method is efficient for all categories (TGM, M, A and CC) of SC sequences and outperforms the recent state-of-the-art methods for HEVC-SCC.

Conclusion
In this paper, a fast intra mode decision method is proposed to reduce the complexity of SC compression. It comprises two approaches, early CU depth determination and adaptive intra mode selection, and uses the texture complexity classification of the SC treeblock to restrict the depth range of the current CU and to skip unnecessary intra modes early. The experimental data show that the proposed method saves about 48.5% of the coding time of HEVC-SCC while keeping nearly the same RD performance.