Unequal error protection technique for video streaming over MIMO-OFDM systems

In this paper, a novel unequal error protection (UEP) technique is proposed for video streaming over multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) systems. Based on the concept of hierarchical quadrature amplitude modulation (HQAM) UEP and multi-antenna UEP, the proposed technique combines the relative protection levels (PLs) of constellation symbols and the differentiated PLs of the transmit antennas. In the proposed technique, standard square quadrature amplitude modulation (QAM) constellations are used instead of HQAM so that the QAM mapper at the transmitter side and the soft decision calculation at the receiver side remain unchanged, but the UEP benefit of HQAM is retained. The superior performance of the proposed technique is explained by the improved connections between data with various priorities and data paths with various PLs. The assumed video compression method is H.264/AVC, which is known to be commercially successful. The IEEE802.16m system is adopted as a data transmission system. With the aid of realistic simulations in strict accordance with the standards of IEEE802.16m systems and H.264/AVC video compression systems, the proposed technique HQAM-multi-antenna UEP is shown to improve the video quality significantly for a given average bit error rate when compared with previous techniques.


Introduction
Video streaming services, such as video conferencing, multimedia messaging, and video telephony, operating under unstable wireless channel environments present significant challenges [1]. To overcome unstable channel environments, various network layer unequal error protection (UEP) techniques have been proposed [2], [3], [4]. However, these network layer techniques do not directly consider the unstable channels of the physical (PHY) layer but use various abstraction methods of the wireless channels. Another classical UEP implementation method is the use of channel coding with different redundancies. An efficient implementation of numerous different redundancies involves performing puncturing after channel coding with a high redundancy as in [5] and [6], which used convolutional code and turbo code, respectively. The work in [5] considered only flat fading channels; these are not suitable for broadband communications, which are considered in this paper. Broadband  sub-channel SNRs cannot be used for multiple-antenna systems. Furthermore, normal square quadrature amplitude modulations (QAMs) have been adopted in the contemporary cellular mobile systems such as IEEE802.16m and LTE-A systems; thus, the previous HQAM UEP techniques cannot be applied in their present forms. In this paper, a novel HQAM-multi-antenna UEP technique is proposed, considering realistic multiple-antenna channels. Pilot insertion, channel coding for error correction, resource allocation, and precoding at the transmitter side are performed in strict accordance with the IEEE802.16m system standard [10]. Channel estimation at the pilot tone positions is performed using the least square (LS) method; subsequently, the channel gains at data positions are obtained via the linear interpolation technique, which is used widely in contemporary systems. The proposed technique combines the different error protection properties of QAM constellation symbols and the antenna reliability information, using both multi-antenna UEP [15], [17] and the concept of HQAM UEP [20], [21]. The standard square QAM instead of HQAM is adopted in the proposed technique; thus, the constellation in the standard need not be modified, but the benefit of HQAM is retained. We note that the spatial streams are different from the transmit antennas when precoding is used as in IEEE802.16m systems. However, we use the two terms interchangeably because precoding is not an issue in this study and the term "transmit antenna" is more easily visualized. The simulation will demonstrate the successful combination of the HQAM concept with the previous multi-antenna UEP, which offers improved video quality. A short abstract form of this paper was published as a conference paper [22]; in the present extended paper, a more detailed explanation of the proposed technique and more extensive experiment results are provided. This extended version also includes the PHY layer wireless resource usage of the previous and proposed UEP techniques in order to explain the performance improvement achieved using the proposed UEP technique.
The remainder of the paper is organized as follows. Section 2 describes a cross-layer system including the H.264/AVC application layer [1] and the IEEE802.16m physical layer [10]. Section 3 summarizes the previous two main UEP techniques: HQAM UEP and multi-antenna UEP. A novel UEP video-streaming technique exploiting both multi-antenna UEP and the idea of HQAM is described in Section 4. The improvement in video peak signal-to-noise ratio (PSNR) performance achieved using the proposed technique is confirmed in Section 5 with the aid of an extensive set of computer simulations. Section 6 presents the conclusions.

APP-PHY cross-layer system model
This section describes the application-physical (APP-PHY) cross-layer system illustrated in Fig 1. The APP layer is H.264/AVC, which produces network abstraction layer (NAL) unit streams [1]. In this study, H.264/AVC encodes a group of pictures (GOP) into 16 NAL units (an NAL unit corresponds to a picture), and cyclic redundancy check (CRC) parity bits are added to each NAL unit. The white area in the left side of Fig 1(a) indicates an NAL unit, and the shaded area at the end of each NAL unit indicates the CRC bits. The 16 NAL units are thereafter prioritized and zero-padded so that a GOP (after the CRC encoding) is an integer multiple of 4 (the number of priority levels) times the PHY packet size; here, the PHY packet size represents the size before PHY layer channel encoding with the convolutional turbo code (CTC). If the number of priority levels is changed, the prioritization in Fig 1(a) should be changed accordingly. Assuming that a GOP is composed of one I-picture and 15 P-pictures as shown in Fig 1(a), the prioritization is straightforward. Thus, the first quarter of the GOP that includes the I-picture is of the highest priority, the second quarter is of the second highest priority, and so on. Each prioritized NAL stream is divided to be of PHY packet size and stored in the data buffer of the corresponding priority level, as shown on the right side of Fig 1(a). Each PHY payload data packet is channel-encoded using CTC, and is mapped to the QAM symbols as shown in Fig 1(b). The four PHY packet streams are assigned to appropriate spatial streams using the reliability information from the receiver. Precoding in the frequency domain and OFDM modulation are performed. Finally, multiple packets of different priorities are transmitted simultaneously from multiple antennas.
Assuming four transmitter and four receiver antennas, the relationship between the transmitted signals and the received signals in the PHY layer is described in the frequency domain as follows: where x k = [x 1,k x 2,k x 3,k x 4,k ] T , x i,k is the transmitted signal from the ith transmitter antenna, y k = [y 1,k y 2,k y 3,k y 4,k ] T , y j,k denotes the received signal at the jth receiver antenna, H k is a 4 × 4 matrix in which h ji,k is the channel gain between the ith transmitter antenna and the jth receiver antenna, z k = [z 1,k z 2,k z 3,k z 4,k ] T , z j,k denotes the additive noise at the jth receiver antenna, and the subscript k denotes the OFDM subcarrier index allocated to the target user by a scheduling algorithm. In this study, we do not consider scheduling for multiple users; instead, we allocate sufficient wireless resource to a single target user considering the resource allocation unit of WirelessMAN-Advanced system standard. At the receiver side, channel gains are estimated using the linear interpolation method; thus, estimation errors are included in the considered systems, and the PHY packets of different priorities are detected using a joint detection technique such as QR-LRL, which uses the estimated channel gains. The stream reliability is calculated, and subsequently, the reliability order information is passed to the transmitter side to be used in the antenna mapping in Fig 1(c). Stream reliability assessment technique and the assumed joint signal detection technique proposed by Yun et al. are adopted, and are briefly reviewed in Section 3.1. The soft decisions calculated using the joint detection technique are passed to the CTC decoder for error correction. The CRC is thereafter checked for each NAL unit, and the acknowledgement (ACK) or negative ACK (NACK) result is passed to the APP layer to be used possibly for either retransmission request or error concealment. Frame-copy error concealment is used and retransmission is not allowed in the systems considered in the study.
Notably, we deal with single-user MIMO systems, indicating that a resource block (allocated to a specific user) occupies all the spatial streams, and that the APP layer video data (prioritized and divided) originates from a single user.

UEP using multiple antennas
Yun et al. proposed a UEP technique for multiple-antenna systems using the joint signal detection technique QR-LRL, which is known to achieve a near-ML performance [15]. Using QR decomposition H k = [h 1,k h 2,k h 3,k h 4,k ] = Q k R k where Q k is a unitary matrix and R k is an upper triangular matrix, Eq (1) is modified as follows:  Table 1, whereỹ j;k is the jth entry ofỹ k , O denotes the set of constellation points, |O| denotes its cardinality, and D [.] denotes the slicer for the adopted constellation. Although there is generally no order of  where h (i),k , i = 1, 2, 3, 4 denotes the ith column vector after the ordering. The ordering in Eq (3) is conducted so that the diagonal entries of the matrix R k,ord , which are obtained through the QR decomposition of H k,ord , satisfy the following conditions: where r (ji),k denotes jith entry of R k,ord . Yun et al. showed that the reliability order of multiple transmitter antennas is statistically the same as Eq (4) when QR-LRL detection is used, i.e., the transmitter antenna that corresponds to r (33),k (or h (3),k ) shows the highest reliability, the antenna that corresponds to r (11),k (or h (1),k ) shows the third highest reliability, and so on [15]. Note that the antenna reliability information should be calculated for each OFDM subcarrier index k. The same reliability ordering was shown to be valid even when ML signal detection is used [17]. The candidate vectors in the set S QR−LRL are used for the calculation of soft decisions fed to the CTC decoder of WirelessMAN-Advanced systems.

UEP using hierarchical QAM
The concept of HQAM constellation can also be used to implement UEP. A hierarchical 16-QAM constellation is illustrated in Fig 2, where the two most significant bits (MSBs) are bold-faced. From Fig 2, it can be observed that the minimum distance between symbols with two different MSBs is d 1 , and that for two different least significant bits (LSBs) is d 2 . With the parameter α = d 1 /d 2 , the bit error probability of the MSBs and LSBs is expressed as follows: ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi where γ = E b /N 0 . As the parameter α increases, the protection of two MSBs become stronger, but simultaneously the LSBs become more erroneous owing to the reduced d 2 . Chang et al. applied the HQAM technique to wireless video transmission, considering AWGN channels [20]. However, when the spectral bandwidth is as large as 20 MHz or 40 MHz as in contemporary communication systems [10], [11], realistic wireless channels are frequency-selective fading channels. Li et al. used the idea of HQAM in conjunction with OFDM to enable wireless video transmission over frequency-selective channels [21]. The bit level SNR at the kth subcarrier of OFDM systems is evaluated as follows: where H k is the gain of the kth subchannel of OFDM systems, and E b and N 0 denote the bit energy and noise energy, respectively. Replacing γ in Eqs (5) and (6) with γ k shown above, Li et al. successfully combined the ideas of HQAM and OFDM. The subcarriers were classified into two categories, i.e., high quality (HQ) and low quality (LQ), based on the SNR evaluation method Eq (7). Video streams with high significance are transmitted over HQ subcarriers, and similarly, video streams with low significance are transmitted over LQ subcarriers, where each HQ or LQ subcarrier uses the normal square QAM constellation. If the number of HQ subcarriers is not sufficient to accommodate the video streams with high significance, the LQ subcarriers are again divided into two categories: medium quality (MQ) and LQ. Subsequently, the HQAM UEP is applied to MQ subcarriers so that the MSBs of the MQ subcarriers are used for a part of a video stream with high significance. The quality assessment of subcarriers in Eq (7) holds for single-antenna systems, because simple one-tap equalization is optimal for OFDM systems with a single antenna. When multiple antennas are used as in IEEE802.16m systems [10] and LTE-A systems [11], unfortunately, the channel quality assessment method in Eq (7) cannot be adopted. Consequently, we require a novel technique that considers the HQAM UEP, multiple-antenna systems, and the structure of the adopted joint detection method for successful video streaming over contemporary multi-antenna communication systems. We also note that all the previous HQAM UEP techniques in their present forms cannot be applied to contemporary mobile systems such as IEEE802.16m and LTE-A systems because normal square QAMs have been adopted in the standards.

Proposed HQAM-multi-antenna UEP technique
In this section, the error performance of QAM is briefly reviewed first. Subsequently, a novel UEP technique that utilizes the concepts of HQAM UEP and multi-antenna UEP is proposed.

Review of QAM performance
When the HQAM parameter α = 1, the two distances in Fig 2 become equal, i.e., d 1 = d 2 = d = 2, resulting in the normal rectangular QAM constellation. The QAM symbols corresponding to LSB = 0 (i.e., the fourth bit is 0) are −1 + j3, −1 + j, −1 − j, −1 − j3, 1 + j3, 1 + j, 1 − j, and 1 − j3. When −1 + j3 is considered, the closest symbol with flipped LSB, i.e., LSB = 1, is −3 + j3, and the distance between the two symbols is d. The minimum distance for all the other seven symbols is readily shown to be also d. Similarly, it can be shown that the minimum distance between symbols with flipped third bits is also d. In summary, the two LSBs (the third and fourth bits) are equally protected, which is characterized by the minimum distance of d.
The QAM symbols corresponding to MSB = 0, i.e., the first bit is 0, are 1 + j3, 1 + j, 1 − j, 1 − j3, 3 + j3, 3 + j, 3 − j, and 3 − j3. The minimum distance between the symbols with flipped MSB is d when the four symbols 1 + j3, 1 + j, 1 − j, and 1 − j3 are considered, but the minimum distance is 2d when the four symbols 3 + j3, 3 + j, 3 − j, and 3 − j3 are considered. It can also be shown that the minimum distance between symbols with flipped second bits is d for four symbols and 2d for the other four symbols. In summary, even when d 1 = d 2 = d, the first two bits (MSBs) are sometimes protected more than the last two bits (LSBs). Thus, the performance difference between MSBs and LSBs is crucial when realistic systems are considered, which include rather strong error correction techniques such as CTC, channel estimation error owing to fixed pattern and linear interpolation process, realistic resource allocation, and realistic wireless channel realizations.
As will be shown in the simulation section, the average performance difference between the MSBs and LSBs even with α = 1 is significant when realistic channel coding, pilot pattern insertion, precoding, and resource allocation scheme are used in accordance with the Wire-lessMAN-Advanced standard. Using a normal QAM constellation is beneficial because the parameter α need not be calculated to satisfy the desired performance requirement. Furthermore, the QAM mapper at the transmitter and the soft decision calculation at the receiver need not be changed. Combining the multi-antenna UEP in Section 3.1 and the concept of HQAM UEP in Section 3.2, we propose the resource mapping method in Table 2. As shown in Table 2, the highest protection is achieved by the PHY packets that correspond to the 2 MSBs of QAM symbols transmitted from the first reliable antenna. It can be observed that the two MSBs of the fourth reliable antenna are assumed to provide stronger protection than the two LSBs of the first reliable antenna in the proposed technique. We reiterate that the HQAM parameter α is assumed to be 1 in the proposed technique.  Fig 3(a), H.264/AVC video is segmented into 8 priority levels instead of 4 levels as in the multi-antenna UEP in Fig 1(a); each video segment of a particular priority is fragmented into blocks that are half the size of a PHY packet, and thereafter, each block is CTC encoded. Thus, the top block stream on the right side of Fig 3(a) is of the highest priority, the second block stream from the top is of the second highest priority, and so on. Fig 3(b) shows an example of QAM mapping, in which two block streams are mapped onto the MSBs and LSBs of QAM symbols. MSBi and LSBi, i = 1, 2, 3, 4 are mapped onto symbols transmitted from the transmitter antenna of the ith reliability. For example, when MSB2 and LSB2 are 01 and 10, respectively, as in Fig 3(b), the four bits are mapped onto a complex symbol 1 − j3, according to the QAM constellation in Fig 2, and the complex symbol is transmitted from the second reliable antenna. The PHY packet streams after the QAM mapping are transmitted from the transmitter antennas as in the previous multi-antenna UEP in Fig 1(c). Owing to the QAM mapping method in the proposed technique, the ith most reliable antenna transmits the ith and (i + 4)th priority video data.

Comparison of UEP techniques in usage of wireless resources
An important aspect of wireless video streaming is how wireless resources with different protection levels (PLs) are utilized for streaming video data with different priority levels. In this section, the two previous and the proposed UEP techniques are compared in terms of how PHY layer resources are used. The number of PHY packets of a PL that are used for streaming the pth picture in the gth GOP is denoted as N(PL, p, g), where p and g denote the picture index and GOP index, respectively. Thus, the average number of PHY packets used for streaming a picture of pth priority can be calculated as follows: where G denotes the number of GOPs. We reiterate that the picture index p can also be considered as the source significance information (SSI), i.e., a smaller p indicates a higher significance of the picture. Tables 3-5 show the distribution of the PHY packets according to PL (the antenna reliability, MSBs or LSBs), and the source significance (picture index p) when the City video is considered. We note that the average numbers were rounded to the nearest integers in Tables 3-5. As smaller p indicates higher-priority video data and smaller PL denotes a resource with stronger protection, intuitive mapping is diagonal mapping from the upper left corner to the bottom right corner.
In the multi-antenna UEP technique, the I-picture uses 134 PHY packets of PL = 1 (the first reliable antenna MSBs), 5 PHY packets of PL = 2 (the second reliable antenna MSBs), 134 PHY packets of PL = 5 (the first reliable antenna LSBs), and 5 PHY packets of PL = 6 (the second most reliable antenna LSBs) as shown in Table 3. In the multi-antenna UEP technique, the video data of p = 1 are transmitted mostly using the wireless resources of the first and the fifth PLs. In HQAM UEP technique, the I-picture uses 69 PHY packets of the 1st through to the 4th PLs, as shown in Table 4. The remaining (137 − 69 = 68) resources of PL = 1 are used for streaming video data with p = 2, 3, � � �, 7 as in the first row of Table 4. From Table 5, it can be observed that the I-picture uses all 137 PHY packets of PL = 1, 130 PHY packets of PL = 2, and 10 PHY packets of PL = 3 in the proposed HQAM-multi-antenna UEP technique. We note that the numbers in Tables 3-5  When compared with the proposed HQAM-multi-antenna UEP in which Ipicture uses all PHY packets, which is protected the strongest among the PHY packets of 8 PLs, the HQAM UEP technique uses only 69 PHY packets of PL = 1 for the I-pictures, resulting in degraded video quality. Although only the I-picture is discussed in this section, the same discussion can be extended to the other pictures. It can be observed that the resource mapping of the proposed UEP is desirably more diagonal than that of the other two previous UEPs.

Simulation
In this section, the efficacy of the proposed HQAM-multi-antenna UEP technique is demonstrated with the aid of computer simulations. The 16-QAM constellation in Fig 2 was adopted with a spectral bandwidth of 20 MHz, a pilot tone pattern for OFDM systems with four antennas was used according to the standard [10], and linear interpolation was used for the channel estimation at the data subcarriers. The wireless channel is the ITU-R Pedestrian A model. A sub-band logical resource unit (SLRU) occupies 18 consecutive subcarriers and 6 OFDM symbol periods, including 16 pilots; hence, 18 × 6 − 16 = 92 subcarriers per SLRU are used for the data transmission. As 8 SLRUs are used for a physical frame with a period of 5 ms, the overall effective data transmission rate from four transmitter antennas is calculated as follows: where 640/1472 is the CTC channel coding rate (I_size_offset parameter in the standard is 20), and 31 bits of the CRC are considered. The test video streams are 256 frames of City and Soccer videos with 4CIF resolution. The GOP size is 16 (1 I-picture and 15 P-pictures as in Figs 1(a) and 3(a)), the frame rate is 10 fps, the quantization parameters (QPs) of I and P pictures are the same as 28, and no B-pictures were used. The motion vector search range is ±16 pixels with a resolution of 1/4 pixel, the number of reference pictures is 1, and CABAC was used for entropy coding. The required data transmission rates are approximately 854.6 kbps and 852.2 kbps for City and Soccer videos, respectively. Consequently, the data transmission rate in Eq (9) is sufficient for these test videos.  Table 2. Note that the average PHY packet error rates of the three UEP techniques are almost the same when they are averaged over all antennas and LSBs/MSBs, because the same joint signal detection technique and the same channel decoder are used for the three schemes. We note that a better UEP video streaming technique achieves higher video quality with a fixed overall PHY layer packet error rate, especially when the channel SNR is low.

PHY packet error performance
When the target PHY error performance is fixed at 0.01, the most (least) reliable antenna in the multi-antenna UEP requires SNR of approximately 15.8 dB (17.9 dB). At the same target PHY error rate, an MSB (LSB) type PHY packet in the HQAM UEP requires SNR of approximately 15.4 dB (17.7 dB). In the proposed UEP technique, the most reliable combination (MSBs of the first reliable antenna) requires SNR of approximately 12.5 dB, which is the lowest among the three techniques. The least reliable combination (LSBs of the fourth reliable antenna) requires SNR of 18.2 dB, which is slightly higher than that of the LSB type packet in HQAM UEP and that of the most unreliable antenna in the multi-antenna UEP. It can be concluded that the proposed UEP technique protects video data of the highest priority stronger than the previous two UEP techniques. The purpose of this section is to show that a larger number of differentiated PLs enable more flexible and powerful UEP.  UEP technique for video streaming over MIMO-OFDM systems the statistics. All three UEP techniques achieve lower NACK ratios for video data with higher priority, i.e., smaller p than for lower-priority video data. It can be observed that the NACK ratio of I-picture (p = 1) achieved using the proposed HQAM-multi-antenna UEP is the lowest among the three techniques. It can also be observed that the NACK ratio of the last picture (p = 16) achieved using the HQAM-multi-antenna UEP is the highest among the three techniques. Thus, it can be concluded that the proposed HQAM-multi-antenna UEP most severely sacrifices the video data of lower priority in order to protect video data with higher priority among the three techniques. The average NACK ratios (averaged over 16 pictures) are 0.2617, 0.3141, and 0.2734 for the multi-antenna UEP, HQAM UEP, and the proposed HQAM-multiantenna UEP techniques, respectively, for the City video. The average NACK ratios of the three UEP techniques are more or less the same, but there is PSNR video quality improvement of 2 dB or 4 dB owing to the proposed technique at fixed channel SNRs.  Fig 7, the proposed UEP technique, which combines the HQAM UEP and the multi-antenna UEP, offers the highest PSNR performance among the three techniques for all channel SNR values and for both test videos. The PSNR improvement achieved using the proposed UEP technique is more significant for Soccer video especially when the channel SNR is low. Given that the numbers of differentiated PLs of HQAM UEP, multi-antenna UEP, and the proposed UEP are 2, 4, and 8, respectively, we can conclude that the video quality improves as the number of PLs increases. Fig 7 also shows the PSNR performance when a linear detection (ZF) method is adopted, because the HQAM-multi-antenna UEP technique can be combined with ZF detection. When ZF detection is assumed, postdetection SNR is used for the assessment of antenna reliability. The performance improvement with ZF, however, is observed to be limited mainly owing to the poor error performance of the linear detection techniques. From Fig 7, it can also be stated that the most significant factor in terms of video quality is the choice between linear detection and joint detection rather than the use of HQAM UEP and/or multi-antenna UEP.

Conclusion
In this paper, a novel UEP technique that uses both the previous multi-antenna UEP and the concept of HQAM UEP is proposed. As normal square QAM constellation, which is equivalent to HQAM with α = 1, is used in the proposed technique, the calculation of parameter α to achieve the desired error performance is not necessary; furthermore, QAM mapping at the transmitter side and the soft decision calculation at the receiver side need not be changed. Although standard square QAMs are used in the proposed technique, the term HQAM in the name of the proposed technique indicates that the UEP advantage of HQAM is retained. With a set of realistic simulations of the application-physical cross-layer in strict accordance with the IEEE802.16m standard and H.264/AVC video compression standard, it was shown that the proposed HQAM-multi-antenna UEP offers a significantly improved PSNR performance  UEP technique for video streaming over MIMO-OFDM systems when compared with that of the previous UEP techniques. We also observed how wireless resources with different PLs are used for streaming videos with different significance levels in order to explain the improved video quality achieved using the proposed UEP. In an effort to even better improve the video quality than the proposed technique, a possible future work is to transmit pictures selectively and/or to allocate more power to a subset of selected transmit antennas.