Scanning double-sided documents without incurring show-through by learning to fuse two complementary images using multilayer perceptron

This paper presents a novel method for scanning duplex-printed documents without incurring the unwanted show-through artifact. The proposed method eliminates the leaked reverse-side content by fusing a white-backed scan image with a black-backed scan image of the document. The fusion is accomplished using a multilayer perceptron that has learned a fusion mapping from manually corrected document images. The main novel contributions of this work include (1) being the first to propose show-through-free scanning by fusing a white-backed scan image with a black-backed scan image of the document; (2) proposing a learning approach that uses a multilayer perceptron to learn the fusion mapping from manually corrected scan images; and (3) proposing to use the pixel value histogram of a reverse-side-printed area as well as the pixel value histogram of a duplex-printed area to quantitatively indicate show-through severity, facilitating objective comparison of the methods under consideration. The experiment results show that the proposed method eliminates show-through considerably more effectively than the two state-of-the-art methods compared.


Introduction
Document scanning has become an office routine, performed every day and everywhere to capture digital images of document pages for convenient storage, copying, transmission, processing, analysis, and recognition. One major deficiency of the existing scanning methods is that the text and image content on the reverse side of a duplex-printed document may show through the paper substrate and appear in the scan image. Fig 1 displays a part of the scan image of a duplex-printed book page where the reverse-side content (displayed in Fig 2) leaks into the scan image. The lower the paper substrate's opacity, the more severe the show-through becomes. The leaked reverse-side text and image content reduces the scan image's aesthetic quality and decreases its readability for both the human eye and Optical Character Recognition (OCR) systems.
Considerable research has been performed over the past decades to develop methods to eliminate the unwanted show-through artifact. One existing approach is to back the document with a sheet of black paper for scanning [1]. This method significantly suppresses show-through by reducing the light reflected back to the page from the backing surface. This approach, however, has undesirable side effects: it reveals the material texture of the paper substrate, produces black spots if the document contains worn-out holes, and gives rise to black borders if the document is smaller than the scan area.
Another existing way to overcome the show-through problem is to scan both the front side and the back side of the page, and then use the front-side image together with a flipped and registered version of the reverse-side image to remove the show-through [2][3][4][5][6][7][8][9].
This class of methods does not produce the side effects of [1], but it requires accurate registration of the two scan images. Such registration is often difficult because the leaked reverse-side content is typically blurred and much fainter than the front-side content; the resulting registration errors significantly degrade performance and may even cause these methods to fail. The performance of this group of methods also relies on accurate modeling of the show-through optical process, which is difficult to obtain, again resulting in performance degradation.
As of today, the performance of the existing methods has not been adequate, and the research problem of how to better overcome the show-through interference is still open. Our work is motivated by a new insight: the advantages of applying white backing to the document during scanning are complementary to those of applying black backing. Inspired by this insight, we have developed a method that fuses the two images to combine the advantages of both. The experiment results show that our proposed approach offers significantly better performance than the state-of-the-art methods in comparison.
The main novel contributions of this work include (1) being the first to use a learned fusion mapping to fuse a white-backed scan image with a black-backed scan image of the document to realize show-through-free scanning; (2) proposing a learning approach that uses a multilayer perceptron to learn the fusion mapping from manually corrected scan images; and (3) proposing to use the pixel value histogram of a reverse-side-printed area as well as the pixel value histogram of a duplex-printed area as a measure of show-through severity, facilitating objective comparison of the methods under consideration.
The remainder of this paper is organized as follows. The next section presents the proposed method, the subsequent section presents and discusses the experiment results, and the last section draws conclusions.

The proposed show-through elimination method
Inspired by the observation that applying black backing to a document for scanning offers advantages complementary to applying white backing, we have developed a method that fuses the two scan images using a multilayer perceptron to achieve show-through-free scanning of two-sided documents. The proposed method comprises three steps, described in the following subsections.

Capturing white backed scan image and black backed scan image
The first step of the proposed method is to scan the document backed by the scanner's backing surface, which is usually white (otherwise, by a sheet of white paper), to obtain a white-backed scan image I w , and then to scan it again backed by a sheet of black paper to obtain a black-backed scan image I b .

Registering black backed scan image to white backed scan image
The second step is to register the black-backed scan image I b to the white-backed scan image I w so that the two can be fused. To this end, we first detect key points in images I b and I w using the Scale Invariant Feature Transform (SIFT) [10,11], which has been reported to be powerful and robust. Other detectors such as SURF [12] and SUSAN [13] may also be used. We then match the detected key points of the two images according to the similarity of their descriptor vectors. Thereafter, we use RANSAC [14,15] to determine the parameters (a, b, θ) of the rigid transform C (Eq 1) from the I b plane to the I w plane. We then apply the transform C and use bilinear interpolation to obtain the registered image I b~ . In the interpolation, ⌊·⌋ denotes the integer floor operator, and C x is an abbreviated notation for the x component of C(x, y).

Fusing the two scan images using trained multilayer perceptron
The third step is to fuse image I w with the registered image I b~ using the learned fusion mapping Θ(x, y) as discussed in the next section. For a monochrome image, the fusion is accomplished by Eq 5, in which μ is the scaling factor as discussed in Section 2.4. For color images, the R, G, and B channels of the image are individually fused using Eq 5.
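The registration and fusion steps can be illustrated with a minimal sketch in plain Python. It is not the authors' C++ implementation, and several details are assumptions made for illustration: the rigid transform is taken to be a rotation by θ followed by a translation by (a, b); it is applied as a backward map from each pixel position of the I w grid to a sampling position in I b; samples falling outside I b receive a hypothetical `fill` value; and the fused pixel is taken as the network output divided by μ, inferred from the training targets μ·p c. The function names and the stand-in `mapping` argument are hypothetical; in practice the trained perceptron would play the role of Θ.

```python
import math

MU = 0.8 / 256  # scaling factor quoted in the text for 8-bit pixels


def rigid_transform(x, y, a, b, theta):
    """Rotate (x, y) by theta and translate by (a, b) (assumed form of C)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y + a, s * x + c * y + b


def bilinear_sample(img, u, v, fill=0.0):
    """Sample img (a list of rows) at real-valued column u and row v."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(u), math.floor(v)  # integer floor, as in the text
    if x0 < 0 or y0 < 0 or x0 > w - 1 or y0 > h - 1:
        return fill  # outside the source image
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)  # clamp at the border
    fx, fy = u - x0, v - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def register(img_b, a, b, theta):
    """Resample img_b onto its own pixel grid through the rigid transform."""
    h, w = len(img_b), len(img_b[0])
    return [[bilinear_sample(img_b, *rigid_transform(x, y, a, b, theta))
             for x in range(w)] for y in range(h)]


def fuse(img_w, img_b_reg, mapping, mu=MU):
    """Fuse one channel pixel by pixel: scale inputs by mu, undo it on output."""
    return [[mapping(mu * pw, mu * pb) / mu for pw, pb in zip(row_w, row_b)]
            for row_w, row_b in zip(img_w, img_b_reg)]
```

For a color image, `fuse` would be called once per R, G, and B channel, as the text describes. The parameters (a, b, θ) are assumed to come from a separate key-point matching and RANSAC stage that this sketch does not cover.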

Learning a fusion mapping with multilayer perceptron
We employ a multilayer perceptron and the error back-propagation algorithm [16,17] to learn the fusion mapping Θ(x, y) used in Step 3 of the proposed method (as discussed in Section 2.3).
We choose the multilayer perceptron and the back-propagation algorithm because of their proven strong ability to learn complex mappings. Our network consists of 3 layers of sigmoidal neurons with a sigmoidal activation function y(v i ), where y(v i ) is the output of the i-th neuron and v i is the weighted sum of its input synapses. The network has 2 inputs and 1 output, and the hidden layer contains 10 neurons. The training samples are obtained from manually corrected scan images together with their corresponding white-backed and black-backed scan images. For one manually corrected image I c , its corresponding white-backed scan image I w , and registered black-backed scan image I b~ , we obtain one training sample from each pixel triple of the image triple (I c , I w , I b~ ). For each training sample (p c , p w , p b~ ), the input to the network is (μ·p w , μ·p b~ ) and the expected output is μ·p c . The scaling factor μ fits the pixel value range to the input and output ranges of the perceptron, and is set to 0.8/256 for conventional digital images with 8 bits per channel.
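The network and training samples described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation, which uses a neural network library: the activation is assumed to be the logistic function 1/(1 + e^(-v)), training is plain per-sample gradient descent on the squared error, and the learning rate, weight initialization range, epoch count, and all class and function names are hypothetical choices.

```python
import math
import random

MU = 0.8 / 256  # scaling factor from the text for 8-bit-per-channel pixels


def sigmoid(v):
    """Logistic activation (an assumed choice of sigmoidal function)."""
    return 1.0 / (1.0 + math.exp(-v))


class TinyMLP:
    """2 inputs -> 10 sigmoidal hidden neurons -> 1 sigmoidal output."""

    def __init__(self, n_hidden=10, seed=0):
        rng = random.Random(seed)
        # Each hidden neuron holds [weight for input 0, weight for input 1, bias].
        self.hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)]
                       for _ in range(n_hidden)]
        # Output neuron: one weight per hidden neuron, then a bias as last entry.
        self.out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x0, x1):
        h = [sigmoid(w[0] * x0 + w[1] * x1 + w[2]) for w in self.hidden]
        y = sigmoid(sum(wo * hj for wo, hj in zip(self.out, h)) + self.out[-1])
        return h, y

    def train_step(self, x0, x1, target, lr=0.5):
        """One back-propagation update; returns the pre-update squared error."""
        h, y = self.forward(x0, x1)
        delta_out = (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            # Hidden delta uses the output weight before it is updated.
            delta_h = delta_out * self.out[j] * hj * (1.0 - hj)
            self.out[j] -= lr * delta_out * hj
            w = self.hidden[j]
            w[0] -= lr * delta_h * x0
            w[1] -= lr * delta_h * x1
            w[2] -= lr * delta_h
        self.out[-1] -= lr * delta_out
        return 0.5 * (y - target) ** 2


def make_samples(img_c, img_w, img_b_reg, mu=MU):
    """One (input0, input1, target) sample per pixel triple, scaled by mu."""
    return [(mu * pw, mu * pb, mu * pc)
            for row_c, row_w, row_b in zip(img_c, img_w, img_b_reg)
            for pc, pw, pb in zip(row_c, row_w, row_b)]


def train(net, samples, epochs=200, lr=0.5):
    """Run per-sample updates and return the mean loss of each epoch."""
    losses = []
    for _ in range(epochs):
        total = sum(net.train_step(x0, x1, t, lr) for x0, x1, t in samples)
        losses.append(total / len(samples))
    return losses
```

After training, `lambda x0, x1: net.forward(x0, x1)[1]` gives a callable that stands in for the fusion mapping Θ. With μ = 0.8/256, the targets stay within [0, 0.8), inside the (0, 1) output range of the logistic function.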

Experiments and discussions
We use an HP Scanjet G4050 at 600 dpi resolution and duplex-printed book pages in the experiments to evaluate the performance of the proposed method and to compare it with two state-of-the-art methods [1,9]. The proposed method is implemented in C++ with Open Source Computer Vision [18] and Fast Neural Network Library [19], and runs on an ordinary desktop PC with an Intel i5-4460T CPU and 8 GB of memory.
In addition to visual inspection and comparison of the resulting images obtained by the three methods, we propose to use the pixel value histogram of a reverse-side-printed area (an image area that contains reverse-side content but no front-side content) as well as the pixel value histogram of a duplex-printed area (an image area that contains both front-side and reverse-side printing) to quantitatively indicate the severity of the show-through and enable objective comparison. The pixel value histogram of a reverse-side-printed area has one highly concentrated peak at a high pixel value location if the area contains no show-through; the histogram becomes less concentrated and spreads further into the lower pixel value region as the area contains more leaked reverse-side content. We use the standard deviation of the histogram to quantify the severity of the show-through: a larger standard deviation indicates more dispersion of the peak, which corresponds to more severe show-through.
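The severity measure described above can be sketched as follows. The sketch assumes that "standard deviation of the histogram" means the standard deviation of the pixel values counted by the histogram, which is one reading of the text. The function names and the rectangle parameters are hypothetical.

```python
import math


def region_pixels(img, top, left, height, width):
    """Collect the pixel values inside a rectangular area of a 2-D image."""
    return [p for row in img[top:top + height] for p in row[left:left + width]]


def histogram(pixels, bins=256):
    """Count how often each integer pixel value in 0..bins-1 occurs."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1
    return counts


def histogram_std(counts):
    """Standard deviation of the pixel value distribution given by counts."""
    total = sum(counts)
    mean = sum(value * c for value, c in enumerate(counts)) / total
    var = sum(c * (value - mean) ** 2 for value, c in enumerate(counts)) / total
    return math.sqrt(var)
```

A uniformly bright area gives a standard deviation of zero. An area in which some pixels are darkened by leaked reverse-side content gives a larger value, which is the behavior the measure relies on.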
The fusion mapping Θ(x, y) is learned by the multilayer perceptron from a manually corrected scan image together with the original white-backed and black-backed scan images of one book page. The training process needs to be carried out only once and takes about 10 seconds to complete.
We use 2 duplex-printed book pages to evaluate and compare the three methods, and include the white-backed scan images and the black-backed scan images of the front side, and white-backed scan images of the reverse side of the two duplex-printed book pages in S1 through S6 Figs. We now present and analyze the experiment results of the proposed method and two state-of-the-art methods in comparison.

Method [1]
A resulting image obtained by the method proposed in [1] is displayed in Fig 4. We can see that the black backing proposed by [1] reduces the show-through of back-side printing compared with the white-backed scan image displayed in Fig 5, thanks to the reduction of the light reflected back from the reverse side. However, it reveals the material texture of the paper substrate, which appears as an undesirable bluish image texture.
We now use the pixel value histogram (the blue line in Fig 6) of a reverse-side-printed area (marked with a red rectangle in Fig 4) to evaluate its performance. Fig 6 shows that the histogram is not ideally concentrated at the high pixel value location; in fact, the peak appears less concentrated than that for the raw white-backed scan image (the dashed black line in Fig 6). This can be explained by the bluish image texture introduced by [1], which increases the dispersion of the peak by more than the reduction of show-through reverse-side content decreases it.
To quantify the dispersion of the peak, we use the standard deviation of the histogram. The standard deviation values for [1] are listed in Table 1. The table shows that the standard deviation values for [1] are larger than those for the raw white-backed scan image. This agrees with our visual inspection of the histogram graphs.
In addition to the pixel value histogram of the reverse-side-printed area, we also use the pixel value histogram (the blue line in Fig 7) of a duplex-printed area (marked with a blue rectangle in Fig 4) to analyze its performance. We can see from Fig 7(A) that this histogram has an additional smaller peak to the left of the larger peak. The smaller peak results from the front-side printed text in the area, and the larger peak results from areas containing neither front-side nor reverse-side content. The larger peak also occurs in the pixel value histogram of the reverse-side-printed area, while the smaller peak does not. We can see from Fig 7(B) that the peak for [1] is not ideally concentrated at the high pixel value location, and in fact it appears less concentrated than that for the raw white-backed scan image. This is consistent with the results obtained using the pixel value histogram of the reverse-side-printed area.

Method [9]
A resulting image obtained by the method proposed in [9] is displayed in Fig 8. We can see that this method considerably reduces show-through and, unlike [1], does not reveal the undesirable material texture of the paper substrate, which is a significant advantage over [1]. However, residual show-through of reverse-side printing is still visible in the resulting image.
We now use the pixel value histogram (the red line in Fig 6) of a reverse-side-printed area (marked with a red rectangle in Fig 8) to analyze its performance. We can see from Fig 6 that the histogram for [9] is more concentrated than that for [1] and that for the raw white-backed scan image but still has some spread over the lower pixel value region (the left side tail), reflecting the visible residue of the leaked-out reverse-side printed text.
We use the standard deviation of the histogram to quantitatively indicate the dispersion of the peak. The standard deviation values for [9] are listed in Table 1. We can find from the table that they are consistently smaller than those for the raw white-backed scan image and those for the resulting image by [1]. This agrees with our visual inspection of the histogram graphs.
In addition to the pixel value histogram of the reverse-side-printed area, we also use the pixel value histogram (the red line in Fig 7) of a duplex-printed area (marked with a blue rectangle in Fig 8) to evaluate the performance. We can see from Fig 7(B) that the peak is more concentrated than that for [1] and that for the raw white-backed scan image but still has some spread over the lower pixel value region (the left side tail). This is consistent with the results obtained using the pixel value histogram of the reverse-side-printed area.
In the histogram plots, the blue line is for [1], the red line for [9], the green line for our method, and the dashed black line for the raw white-backed scan image.

Proposed method
A resulting image obtained by our method is displayed in Fig 9. We can see that it contains significantly less remaining leaked reverse-side printing than the images produced by [1] and [9], indicating that our method is more effective than [1] and [9] in removing show-through. Another advantage of our method is that it does not produce unwanted side effects such as revealing the material texture of the paper substrate.
We now use the pixel value histogram (the green line in Fig 6) of a reverse-side-printed area (marked with a red rectangle in Fig 9) to evaluate its performance. We can see from Fig 6 that the histogram for our method is significantly more concentrated than those for [1] and [9], reflecting its superior show-through elimination ability and the advantage of causing no side effects.
We use the standard deviation of the histogram to quantitatively indicate the dispersion of the peak. The standard deviation values for our method are listed in Table 1. We can find from the table that they are consistently smaller than those for [1] and [9]. This agrees with our visual inspection of the histogram graphs.
In addition to the pixel value histogram of reverse-side-printed area, we also use the pixel value histogram (the green line in Fig 7) of a duplex-printed area (marked with a blue rectangle in Fig 9) to evaluate its performance. We can see from Fig 7(B) that the peak for our method is more concentrated than that for [1] and [9]. This is consistent with the results using the pixel value histogram of reverse-side-printed area.

Conclusions
We have presented in this paper a novel method for scanning duplex-printed documents without incurring the undesirable show-through interference. The main novelty of the proposed method is to achieve show-through-free scanning by fusing a white-backed scan image with a black-backed scan image of the document. The fusion is accomplished using a multilayer perceptron that learns the fusion mapping from manually corrected scan images. We have also proposed to use the pixel value histogram of a reverse-side-printed area as well as the pixel value histogram of a duplex-printed area to indicate the severity of the show-through, enabling objective comparison of the methods under consideration. Experiment results show that the proposed method offers substantially stronger show-through suppression than the two state-of-the-art methods in comparison.