The TRICLOBS Dynamic Multi-Band Image Data Set for the Development and Evaluation of Image Fusion Methods

The fusion and enhancement of multiband nighttime imagery for surveillance and navigation has been the subject of extensive research for over two decades. Despite the ongoing efforts in this area there is still only a small number of static multiband test images available for the development and evaluation of new image fusion and enhancement methods. Moreover, dynamic multiband imagery is also currently lacking. To fill this gap we present the TRICLOBS dynamic multi-band image data set containing sixteen registered visual (0.4–0.7μm), near-infrared (NIR, 0.7–1.0μm) and long-wave infrared (LWIR, 8–14μm) motion sequences. They represent different military and civilian surveillance scenarios registered in three different scenes. Scenes include (military and civilian) people that are stationary, walking or running, or carrying various objects. Vehicles, foliage, and buildings or other man-made structures are also included in the scenes. This data set is primarily intended for the development and evaluation of image fusion, enhancement and color mapping algorithms for short-range surveillance applications. The imagery was collected during several field trials with our newly developed TRICLOBS (TRI-band Color Low-light OBServation) all-day all-weather surveillance system. This system registers a scene in the Visual, NIR and LWIR part of the electromagnetic spectrum using three optically aligned sensors (two digital image intensifiers and an uncooled long-wave infrared microbolometer). The three sensor signals are mapped to three individual RGB color channels, digitized, and stored as uncompressed RGB (false) color frames. The TRICLOBS data set enables the development and evaluation of (both static and dynamic) image fusion, enhancement and color mapping algorithms. To allow the development of realistic color remapping procedures, the data set also contains color photographs of each of the three scenes. 
The color statistics derived from these photographs can be used to define color mappings that give the multi-band imagery a realistic color appearance.


Introduction
The significance of image fusion
Night vision cameras are a vital source of information for a wide range of critical military and law enforcement applications such as surveillance, reconnaissance, intelligence gathering, and security [1,2]. Currently, most night-time imaging systems are either low-light-level cameras, which amplify reflected visible (0.4-0.7μm) to near-infrared (NIR, 0.7-1.0μm) light, or thermal long-wave infrared (LWIR, 8-14μm) cameras, which convert thermal energy into a visible image. Because these systems operate in different spectral bands they typically represent different aspects of a scene. For instance, after a period of extensive cooling (e.g., after a long period of rain or early in the morning) the background (i.e., vegetation or soil areas, buildings and other man-made objects) of an outdoor scene may be represented in full detail in the visible bands, but may be much less detailed in the infrared bands due to the low thermal contrast in the scene. On the other hand, objects like vehicles or human beings, which often have an appreciable temperature contrast with their surroundings, will typically be shown with high contrast in the infrared bands. They may be (nearly) invisible (camouflaged) in the visible bands when their luminance and/or color approaches that of their surroundings. In such cases the thermal component of a fused visible/thermal image may help to detect and localize targets in the context provided by the visible component [2] and contribute to situational awareness [3]. In addition, appropriate color mappings may serve to give multi-band night-vision imagery an intuitive color appearance, which may in turn lead to faster and more accurate scene recognition [4,5]. Navigation and surveillance are examples of observational tasks that are likely to benefit from realistically rendered fused multi-band imagery.

The state of the art in multiband nighttime image fusion
The increasing availability of sensors operating at low light levels and in multiple spectral bands has spurred the development of image fusion and enhancement algorithms for surveillance and navigation applications. The general aim is to provide imagery that is rich in information content (more informative than the individual bands), easy to interpret (ergonomic in a cognitive sense), and robust against degradation of environmental conditions and/or sensor performance. To this end many different image fusion techniques have been proposed and new studies appear regularly (for a recent review see [49]). Most methods apply fusion to combine context information from the Visual band with the LWIR band [50][51][52][53][54][55][56][57][58][59][60][61][62][63]. Some fusion methods employ the NIR channel to enhance contrast in the Visual image, exploiting the ability of NIR radiation to penetrate atmospheric haze [64][65][66]. Other fusion methods use statistical information [67][68][69], color lookup tables [70,71], histogram matching techniques [72], or color transforms [73] to give fused multiband imagery a realistic color appearance by transferring the color characteristics of associated visible color imagery. All methods implicitly assume the availability of spatially registered multiband imagery. Most of the aforementioned fusion methods [50-54, 56-62, 69, 73] were developed with the limited set of static registered multiband (Visual, NIR, LWIR) imagery which we provided earlier [74,75]. None of the methods discussed have been applied to dynamic multiband image sequences, most likely because this type of imagery is currently not publicly available.

The need for registered multi-band imagery
Despite the ongoing interest in the fusion of multi-band (specifically visual, NIR and LWIR) images for surveillance applications and the steady stream of publications in this area, there is only a very small number of static registered multi-band test images (and a total lack of dynamic image sequences) publicly available for the development and evaluation of image fusion algorithms (e.g., [76]). Moreover, there is no central repository from which these images can be obtained. (The site www.ImageFusion.org, which had provided some multispectral imagery, closed several years ago, although recent publications still refer to it.) To the best of our knowledge there are currently only two dedicated image data sets for the development of image fusion techniques that are available from a public repository. The first is the TNO Image Fusion Data Set [74]. It contains only a limited set of static multispectral (intensified visual, NIR, and LWIR) nighttime imagery of military-relevant scenarios. The second is the Kayak image fusion sequence [75]. It provides registered dynamic intensified visual, midwave IR (3-5μm), and LWIR imagery depicting three kayaks approaching a shore against a cluttered maritime background. The only other publicly available database of which we are aware contains aligned dynamic Visual and LWIR imagery representing driving scenarios in urban environments. It is intended for the development of visual place recognition algorithms, and is not suited for the development of image fusion algorithms [77].
We provide the TRICLOBS image data set (described in this paper) to fill this data gap. This data set contains registered three-band dynamic imagery of different surveillance scenarios showing various types of human activity. This imagery can be used for the development and evaluation of (both static and dynamic) image fusion, enhancement and color mapping algorithms. The content outline for the rest of this paper is as follows. First, we describe the TRICLOBS camera system that was used to collect the imagery and the locations and scenarios that were registered. Next, we describe the structure and content of the TRICLOBS data set. Then, we present some sample applications that use the TRICLOBS imagery as input. Finally, we end with some concluding remarks and note some limitations of the data set.

Materials and Methods
The TRICLOBS multi-band camera system
We recently developed the TRICLOBS (TRI-band Color Low-light OBServation) all-day all-weather surveillance and navigation system, which contains three optically aligned cameras that are sensitive to the Visual, NIR and LWIR parts of the electromagnetic spectrum [78]. The system combines (fuses) the three camera signals in real-time in a false-color RGB signal. It has the capability to perform color remapping using either pre-recorded lookup tables [79] or by deriving the color mapping in real-time from synthetic imagery or Google Earth data [80].
As a result the system provides dynamic imagery with a realistic color appearance in conditions of low visibility (low light levels, smoke). A recent evaluation study using static imagery has shown that human scene inspection and recognition performance with color remapped TRICLOBS imagery resembles the performance with standard color photographs [5].
The TRICLOBS can also be extended with a synthetic 3D scene generation system in combination with an additional image fusion and image processing module. The result is INVIS, an Integrated Night VIsion surveillance and observation System [81]. For an online demonstration of the capabilities of INVIS, see [82,83].
This section describes the TRICLOBS system and the image registration procedure. In addition, we briefly explain the color remapping procedure that was used to create the false-color movies that illustrate the contents of this data set. More details of the hardware and software of the TRICLOBS system [78] and the color mapping procedure [79] are presented elsewhere.
The thermal camera is a XenICs Gobi 384 uncooled a-Si infrared microbolometer (www.xenics.com). It has a 384 × 288 pixel focal-plane array, and a spectral sensitivity range of 8-14μm. This is the range of most interest for outdoor surveillance applications. The camera is equipped with an Ophir supIR 18mm F/1 lens (www.ophiropt.com) providing a 29.9˚ × 22.6˚ FOV. The Gobi 384 has 16-bit Ethernet and CameraLink interfaces running at 44 frames/s.
Two beam splitters are deployed to direct the appropriate band of the incoming radiation to each of the three individual sensors (Fig 1). The incoming radiation is first split into a (thermal) long-wave infrared (LWIR) part and a Visual+NIR part by a heat reflecting (hot) mirror. The hot mirror is a custom-made Melles Griot dichroic beam splitter consisting of Schott N-BK7 Borosilicate Crown glass with an Indium Tin Oxide coating. It has a reflection coefficient of R > 85%. The LWIR part of the spectrum is reflected into the lens of the thermal camera, while the Visual+NIR light is transmitted to a combination of two digital image intensifiers that are mounted at an angle of 90 degrees. Next, an NIR reflecting mirror (45 deg angle of incidence, Borofloat glass, type Edmund Optics B43-958, 101 × 127 × 3.3 mm; see www.edmundoptics.com) separates the incoming light by transmitting the Visual part (0.4-0.7μm) and reflecting the NIR part (0.7-0.9μm), so that one image intensifier registers only the Visual part and the other only the NIR part of the incoming radiation. The sensor geometry is such that the optical axes of all cameras are aligned.
The sensors and the mirrors are mounted on a common metal frame. The whole configuration is portable and contained in a dust and water resistant housing (Fig 2a and 2b) that could be mounted easily onto a mobile platform (Fig 2c). A Germanium window covers the aperture of the thermal camera. The sensor suite delivers both analog video and digital signal outputs.

GPS receivers
An internal U-blox EVK-5P Positioning Engine (www.u-blox.com) provides a position and orientation (i.e., sensor location and viewing direction) signal through the high-speed 7-port USB 2.0 hub. The position error is less than 3 m and the orientation error is less than 5 degrees. In local area operations when high accuracy is required, an external Trimble SPS751 GPS receiver set (www.trimble.com) is connected to the system to achieve high position accuracy (< 1 cm) through real time kinematic (RTK) GPS signal correction. The position information provided by the U-blox Positioning Engine can be used to load color look-up tables (see the section on color remapping) that are optimal for the environment in which the TRICLOBS system is being deployed [80][81][82][83].

Electronic compasses
An internal Silicon Labs F350-COMPASS-RD multi-axis electronic compass (www.silabs.com) provides the azimuth and tilt angle of the optical axis of the sensor suite with an accuracy of a few degrees (Fig 2b). When the viewing direction needs to be known with higher accuracy, an external Xsens 3D inertial measurement unit (IMU) motion sensor with an accelerometer, magnetometer and gyroscope (www.xsens.com) is connected to the system to measure yaw, roll and pitch with an accuracy better than 0.1˚. The viewing direction provided by the electronic compass can for instance be used to render a view from a synthetic geometric 3D scene model that corresponds to the viewing direction and viewpoint of the TRICLOBS camera system [80][81][82][83].

Computer
A Dell Precision M2400 Intel Core Duo P8600 2.4GHz laptop with a solid state hard disk is used to store, colorize, and visualize the sensor signals and to generate and display the synthetic scene view. The current implementation achieves real-time (~25 Hz) visualization, signal enhancement and data registration.

Displays
Two 6.4" TFT video displays, embedded in the system casing, enable simultaneous monitoring of two of the three video signals (either Visual/NIR, Visual/LWIR, or NIR/LWIR; Fig 2a). The laptop display (14 inch, 1440x900 pixels) is used to view the final fused, colored and enhanced images.

Data transfer and storage
The Photonis ICUs are connected to a high-speed 7-port USB 2.0 hub. This enables the user to interface with the ICUs and to adjust their settings, or to download and install preferred settings.
A Pleora iPORT PT1000-ANL-2/6 frame grabber (www.pleora.com) digitizes the analog video output signals of (1) both ICUs and (2) the Gobi 384. Digitization is performed at a rate of 25 frames/s with a resolution of 640x480 pixels and 10 bits per pixel.
The three sensors are not frame synchronized. Instead, the images of each of the three sensors are stored with a time stamp (in ms). A three-band image sequence is then constructed by combining each Visual frame with the NIR and LWIR frames with the nearest time stamps in a single RGB frame. This procedure results in an average temporal offset between the Visual and NIR bands of 7.5±1 ms and an average temporal offset between the Visual and LWIR bands of 9±1 ms. Thus, the temporal offset between the individual bands is well below one frame period (40 ms at 25 frames/s).
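The nearest-time-stamp pairing described above can be illustrated with a short sketch. This is not the TRICLOBS software itself: the function names, the tie-breaking rule, and the example time stamps are hypothetical, and the sketch only assumes that each band provides a sorted list of time stamps in ms.

```python
from bisect import bisect_left


def nearest_index(sorted_stamps, t):
    """Return the index of the time stamp in sorted_stamps closest to t."""
    pos = bisect_left(sorted_stamps, t)
    if pos == 0:
        return 0
    if pos == len(sorted_stamps):
        return len(sorted_stamps) - 1
    before, after = sorted_stamps[pos - 1], sorted_stamps[pos]
    # Assumption: ties are resolved in favor of the earlier frame.
    return pos - 1 if t - before <= after - t else pos


def match_bands(visual_ms, nir_ms, lwir_ms):
    """For each Visual frame, pick the indices of the nearest NIR and LWIR frames."""
    return [
        (i, nearest_index(nir_ms, t), nearest_index(lwir_ms, t))
        for i, t in enumerate(visual_ms)
    ]


# Hypothetical time stamps in ms (25 frames/s gives a 40 ms frame period).
visual = [0, 40, 80, 120]
nir = [8, 47, 88, 127]
lwir = [-9, 31, 70, 111, 151]
triplets = match_bands(visual, nir, lwir)
```

Each tuple in `triplets` holds the frame indices (Visual, NIR, LWIR) that would be written to the R, G and B channels of one output frame.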
The Pleora transmits these signals to a Netgear Gigabit Ethernet switch. The 16-bit TCP/IP Ethernet interface of the XenICs Gobi 384 is also directly connected to the Netgear Gigabit Ethernet switch.
Three Pinnacle Video Transfer Units (www.pinnaclesys.com/PVT) store (a) the analog video signals of all three cameras, and (b) the audio signals of two (optional) external microphones, either on 3 internal 320 Gb hard disks, or on USB memory sticks. The microphones can be positioned on the front and back of the camera suite. The microphone on the front can then be used to register relevant audio information from the registered scene, and the second microphone can be used to record spoken annotations.

Image registration
The Visual (0.4-0.7μm) and NIR (0.7-0.9μm) images provided by the two ICU digital image intensifiers have a size of 640 × 480 pixels and represent a FOV of 30.7˚× 24.8˚. The LWIR (8-14μm) image provided by the XenICs Gobi 384 thermal camera has a size of 384 × 288 pixels and represents a FOV of 29.9˚× 22.6˚. As a result of the optical alignment of the camera systems, the FOV of the LWIR image corresponds to the central part of the FOV of the Visual and NIR images. The size of this common FOV area is 621 × 461 pixels in the Visual and NIR images. To enable the fusion of the LWIR image with the other two channels, the LWIR image (384 × 288 pixels) is bi-linearly interpolated and up-sampled (by a factor of about 1.6) to 621 × 461 pixels. The Visual and NIR images are cropped to their central part of 621 × 461 pixels so that only the common FOV area remains. Finally, all images are rescaled to 640 × 480 pixels. This is also the size of the three-band color images provided in the database presented in this study. As a result, an individual pixel represents about 2.8 × 2.8 min of arc of the visual field.
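The registration steps above (up-sample the LWIR image, crop the Visual and NIR images to the common FOV, rescale everything to 640 × 480) can be sketched with NumPy. This is only an illustration under stated assumptions: the function names are hypothetical, the crop is assumed to be exactly centered, and the bilinear interpolation uses an align-corners sampling grid, which may differ from the interpolation actually used on the TRICLOBS system.

```python
import numpy as np


def resize_bilinear(img, out_h, out_w):
    """Resize a 2-D array by bilinear interpolation (align-corners grid, an assumption)."""
    in_h, in_w = img.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    img = img.astype(float)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


def center_crop(img, out_h, out_w):
    """Keep the central out_h x out_w part of a 2-D array."""
    h, w = img.shape
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return img[top:top + out_h, left:left + out_w]


def register(visual, nir, lwir, common=(461, 621), final=(480, 640)):
    """Return an (H, W, 3) stack of Visual, NIR and LWIR bands on a common grid."""
    lwir_up = resize_bilinear(lwir, *common)  # 384 x 288 -> 621 x 461
    bands = [center_crop(visual, *common), center_crop(nir, *common), lwir_up]
    return np.stack([resize_bilinear(b, *final) for b in bands], axis=-1)


# Angular size of one output pixel: the common FOV equals the LWIR FOV.
arcmin_per_pixel = (29.9 * 60 / 640, 22.6 * 60 / 480)

# Synthetic stand-ins for the three sensor images (rows x columns).
stack = register(np.full((480, 640), 7.0), np.full((480, 640), 3.0), np.full((288, 384), 5.0))
```

The two values in `arcmin_per_pixel` both come out close to 2.8, in line with the pixel size quoted above.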

Color remapping
The TRICLOBS system has the option to deploy a recently developed color remapping technique [79]. This mapping assumes a fixed relation between false-color tuples and natural color triplets for bands near the visual spectrum. This allows its implementation as a simple color table swapping operation. For bands that are not correlated with the visual spectrum (such as LWIR) this assumption evidently does not apply. In that case, the color remapping could be used to improve the detectability of targets through contrast enhancement and color highlighting [84].
Color remapping can be achieved by mapping the multi-band sensor signal to an indexed false-color image and swapping its color table with that of a regular daylight color image of a similar scene (see Fig 3). Different (e.g., urban, rural, maritime, or desert) environments may require specific color tables. However, in practice we found that an entire environment is well represented by a single color table, as long as the environmental characteristics don't change too drastically [85]. Thus, only a limited number of color tables is required in practice. These tables need to be constructed only once, before the system is deployed.
For a given environment, the lookup color table transformation can be derived as follows. First, take a multispectral image of a scene that is typical for the intended operating theater and transform this image to an indexed color representation. Second, take an indexed representation of a regular color photograph of a similar scene. Then there are two options [86]. Either transfer the first order statistics of the lookup table of the color photograph to the lookup table of the false-color multispectral image [68], or establish a direct mapping between corresponding entries in both tables [79].
When matching multispectral and daylight color images of the same scene are available, a realistic color mapping can be obtained by establishing a direct relation between the values of corresponding pixels in both images [79]. When there is only a daylight color image available of an environment that is similar to the one in which the multispectral sensor suite will be deployed, a mapping can still be established by transferring the color statistics of the daylight image to the multispectral image [68]. Although the first approach yields more specific colors, both approaches produce intuitively correct and stable color representations. Note that the statistical approach can even be used with imagery from sources like artificial terrain databases or Google Earth [80] (for demonstrations see [82] and [83]). The specificity of the lookup-table color remapping has the additional advantage that it allows details of interest (e.g., camouflaged targets) in a given scene to be selectively enhanced and emphasized [79,87,88].
For the sake of completeness we will briefly describe our color transformation here, using the example shown in Fig 3. (A full description of the method is presented elsewhere [79].) Fig 3a depicts the full color daytime reference image, which is a color photograph taken with a standard digital camera. Fig 3b and 3c show a visible and near-infrared image of the same scene. Fig 3f shows the result of applying daytime colors to the two-band night-time sensor image using our color mapping technique. The color transfer method works as follows. First, the three-band sensor image is transformed to a false-color RGB image by taking the individual Visual (Fig 3b), NIR (Fig 3c), and LWIR bands as input to the R, G and B color channels, respectively. The result is a false-color RGB image (Fig 3d). In this example we assume that the LWIR signal is absent (black) and we consider only the Visual and NIR (R and G) channels, because these two channels are typically used to produce naturalistic colors while the LWIR (B) channel usually serves to enhance target distinctness. Note that the actual order of the initial mapping of the sensor channels to the RGB channels is irrelevant. Mapping the sensor bands to a false-color RGB image allows us to use standard image conversion techniques, such as indexing [89]. In the next step the resulting false-color (RGB) image (Fig 3d) is converted to an indexed image. Each pixel in such an image contains a single index. The index refers to an RGB-value in a color look-up table. The number of entries can be chosen by the user. In the present example of a sensor image consisting of two bands (R and G; Fig 3d) the color look-up table contains various combinations of R and G values. Here the B-values are ignored since only the Visual and NIR bands are considered. However, the procedure is identical when all three channels are considered.
For each index representing a given R,G combination (i.e., for a given false color) the corresponding realistic color equivalent is obtained by locating the pixels in the target image with the same index and collecting the corresponding pixels in the (realistic color) reference image (Fig 3a). First, the RGB-values are converted to perceptually de-correlated lαβ values [90]. Next, the average lαβ-vector is calculated over this ensemble of pixels. Averaging guarantees that the computed average color reflects the perceptual average color. Averaging automatically takes the distribution of the pixels into account. Colors that appear more frequently are given a greater weight. For instance, let us assume that we would like to derive the realistic color associated with color index i. In that case we locate all pixels in the (indexed) false-color multi-band target image with color index i. We then collect all corresponding pixels (i.e., pixels with the same image coordinates) in the reference daytime color image, convert these to lαβ, and calculate the average lαβ-value of this set. Next, we transform the resulting average lαβ-value back to RGB. Finally, we assign this RGB-value to index i of the new color look-up table. These steps are successively carried out for all color indices. This process yields a new color look-up table containing the realistic colors associated with the various multi-band combinations in the false-color (RGB) look-up table. Replacing the RGB-color look-up table (left side of Fig 3e) by the realistic color look-up table (right side of Fig 3e) yields an image with a realistic color appearance, in which the colors are optimized for this particular sample set (Fig 3f).
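The look-up table construction described above can be sketched in NumPy as follows. This is a minimal illustration, not the implementation of [79]. Several parts are assumptions: the indexing step is replaced by a uniform quantization of the R and G channels, the lαβ transform uses the commonly cited RGB to LMS and log-LMS to lαβ matrices (the transform actually used is the one specified in [90]), indices that never occur are simply left black, and all function names are hypothetical.

```python
import numpy as np

# Assumed conversion matrices; the transform used in the paper is defined in [90].
RGB2LMS = np.array([[0.3811, 0.5783, 0.0402],
                    [0.1967, 0.7244, 0.0782],
                    [0.0241, 0.1288, 0.8444]])
LMS2LAB = np.diag([1 / np.sqrt(3), 1 / np.sqrt(6), 1 / np.sqrt(2)]) @ np.array(
    [[1, 1, 1], [1, 1, -2], [1, -1, 0]])


def rgb_to_lab(rgb):
    """Convert (N, 3) RGB values in [0, 1] to l-alpha-beta values."""
    lms = np.clip(rgb, 1e-6, None) @ RGB2LMS.T  # clip to avoid log10(0)
    return np.log10(lms) @ LMS2LAB.T


def lab_to_rgb(lab):
    """Invert rgb_to_lab, clipping the result to [0, 1]."""
    lms = 10.0 ** (lab @ np.linalg.inv(LMS2LAB).T)
    return np.clip(lms @ np.linalg.inv(RGB2LMS).T, 0.0, 1.0)


def index_false_color(false_rgb, levels=16):
    """Assumed indexing: quantize R (Visual) and G (NIR) of a uint8 image to one index."""
    q = (false_rgb[..., :2].astype(int) * levels) // 256
    return q[..., 0] * levels + q[..., 1]


def build_color_table(index_img, reference_rgb, levels=16):
    """Average the reference colors per index in l-alpha-beta and convert back to RGB."""
    idx = index_img.ravel()
    lab = rgb_to_lab(reference_rgb.reshape(-1, 3) / 255.0)
    n = levels * levels
    counts = np.bincount(idx, minlength=n)
    sums = np.stack(
        [np.bincount(idx, weights=lab[:, c], minlength=n) for c in range(3)], axis=1)
    table = np.zeros((n, 3), dtype=np.uint8)  # unused indices stay black (assumption)
    used = counts > 0
    mean_lab = sums[used] / counts[used, None]
    table[used] = np.round(lab_to_rgb(mean_lab) * 255).astype(np.uint8)
    return table


def remap(index_img, table):
    """Swap the color table: replace every index by its realistic color."""
    return table[index_img]
```

A table built once from a matching pair of false-color and daytime reference images can then be applied to every frame of a sequence with `remap(index_false_color(frame), table)`.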

Registration sites and conditions
The imagery in the TRICLOBS data set was collected at three different locations and shows three different scenes.
The first part of the imagery (TRI_A) was collected at Marnehuizen, The Netherlands (53.386311 deg North latitude, 6.262761 deg East longitude) (see https://nl.wikipedia.org/wiki/Marnehuizen). Marnehuizen is a Dutch mock-up village built to train soldiers and police for operations in urban environments. Marnehuizen consists of houses, sheds, shops, a bank, a school, a town hall, a gas station and a village square. Street furniture like lights, telephone booths, shrubs and trees, street signs, fences, parking lots, and car wrecks give the village a realistic appearance.
The second part of the imagery (TRI_B) was registered at the training grounds of the Royal Netherlands Army Camouflage School at Reek, The Netherlands (51.729450 deg North latitude, 5.705908 deg East longitude). This site also contains a few realistic houses with some street furniture for training purposes.
The third part of the imagery (TRI_C) was collected at the site of the TNO laboratories in Soesterberg, The Netherlands (52.115469 deg North latitude, 5.290877 deg East longitude). The imagery was collected near a side entrance of the main building, where the walls were covered with foliage and a path leads to glass doors.
An 'additional_data' folder in the TRICLOBS data set contains Excel and CSV files that present information on the environmental (atmospheric and lighting) conditions during the actual image registration periods.
Table 1 gives a brief description of the different scenarios represented in the TRICLOBS dynamic image data set. The scenes that are registered are representative of typical short-range (military or civilian) surveillance scenarios. They include people, vehicles, foliage and buildings or other man-made structures. The people are either stationary, walking or running and dressed in military or in civilian clothing. They perform a range of different (sometimes suspicious) activities, such as carrying different objects (box, gun, axe, stick), loitering, hiding in the foliage, inspecting and entering buildings, fighting, walking around and driving vehicles. Their appearance may vary from completely unobstructed to partially or fully occluded during the course of a scene. The imagery was collected for a range of different lighting conditions, ranging from just before (Scenario C) and just after (Scenario B) sunset to full darkness (Scenario A). The range was chosen so as to ensure a variation in the information content in the different spectral bands. Most scenarios present outdoor human activities, while some also show activities inside buildings (Scenarios B1-3, C3-4). Some scenes (Scenarios A3, C2) contain smoke that obscures the Visual and NIR channels. Note that scenes in which people are behind smoke (opaque for Visual and NIR) or glass (opaque for LWIR) yield imagery with complementary information content because the different spectral bands represent different details in the scene. The scenarios are particularly useful for the evaluation of image fusion algorithms.

Table 1. Scenarios in the TRICLOBS dynamic image data set (scenario, number of frames, description).
[...]: The camera pans over a scene with houses, trees and semi-shrubs and an abandoned car wreck.
A3, 1744 frames: The scene represents a path between two houses. Smoke from a grenade that is thrown into the scene fills the space between the two houses. An armed soldier appears from behind the house on the left, crawls over the ground and enters the house on the right. The soldier leaves the house and walks away.
A4, 1573 frames: The camera pans back and forth over a scene with a row of houses and a bank. There is a roll of barbed wire on the ground in front of the bank. A person runs from right to left and disappears behind the house next to the bank. An armed soldier leaves the bank.
B4, 2574 frames: Two persons, one carrying a stick and the other one carrying a small axe, pass behind a glass shelter, enter the scene from behind the shelter, start a fight using their axe and stick, and walk away.
C1, 6133 frames: The scene shows the facades of two houses with a lawn in the foreground. In front of the house on the left there is a hedge. A small hot object lies on the ground at the right end of the hedge. A person enters the scene from the left, disappears behind the hedge, returns with a box, drops the box over the hot object on the ground, and leaves the scene on the left. Somewhat later the person returns, removes the box from the hot object, drops it behind the hedge, and leaves the scene again. Later, armed soldiers and a civilian walk through the scene from left to right, passing by the hot object on the ground.
[...], 10365 frames: Same scene as in C1. Two persons with a jeep repeatedly drive along to deliver goods: first two times to the house on the right, then twice to the house on the left.
C6, 6657 frames: Same scene as in C1. A person enters the scene from the left, disappears behind the hedge, returns with a box, drops the box over the hot object on the ground, and leaves the scene on the left. Civilians carrying a range of different objects (e.g., a briefcase, a rake) and soldiers (some carrying guns) walk through the scene past the hot object.
The only individuals that appear in this manuscript and in the TRICLOBS dynamic image data set are the authors of this paper. These individuals have given written informed consent (as outlined in the PLOS consent form) to publish these case details.
The main folders TRI_A, TRI_B, and TRI_C contain 4, 4, and 8 subfolders, respectively. Each subfolder contains TRICLOBS imagery of the different scenarios registered at each of the three different locations. In addition, each of the folders TRI_A, TRI_B, and TRI_C also contains a folder with the extension 'photographs', which contains full-color photographs of the registration site. These photographs can be used to derive color mappings that give the TRICLOBS multi-band imagery a realistic color appearance.
The subfolders for the individual scenarios each contain a folder with the extension 'frames' and a movie in MP4 format.
The 'frames' folders contain sequentially numbered false-color RGB images in 24-bit BMP format and with a size of 640 × 480 pixels (e.g., Fig 4). Each of these RGB images is composed of three corresponding (i.e., nearly simultaneously grabbed) 8-bit frames from the Visual (R), NIR (G), and LWIR (B) channels, respectively, of the TRICLOBS system. Since all frames are sequentially numbered the user may use any video capture and processing tool to view them as a video stream. An example of an efficient tool for this purpose is VirtualDub, which is licensed under the GNU General Public License and is freely available from http://www.virtualdub.org. With VirtualDub the images in a given frames folder can simply be inspected as a video stream by using the 'File > Open video file' option and selecting the first image in the frames folder. VirtualDub will then automatically open the entire range of sequentially numbered frames and a slider below the image display allows easy navigation through the motion sequence.
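Because each frame stores the Visual, NIR and LWIR bands in its R, G and B channels, recovering the individual bands amounts to splitting the color channels. The sketch below illustrates this with NumPy on a synthetic array standing in for one frame; the function name is hypothetical. A real BMP frame would first have to be read into an (H, W, 3) array with an image library of the user's choice.

```python
import numpy as np


def split_bands(frame):
    """Split an (H, W, 3) false-color RGB frame into its three sensor bands."""
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) RGB array")
    return {
        "visual": frame[..., 0],  # R channel
        "nir": frame[..., 1],     # G channel
        "lwir": frame[..., 2],    # B channel
    }


# Synthetic stand-in for one 640 x 480 frame (rows x columns x channels).
frame = np.zeros((480, 640, 3), dtype=np.uint8)
frame[..., 0] = 10
frame[..., 1] = 20
frame[..., 2] = 30
bands = split_bands(frame)
```

Each entry of `bands` is a 480 × 640 array of 8-bit values that can be passed on to a fusion or color mapping algorithm.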
The MP4 movies consist of four panels: the lower three panels represent the Visual (left), NIR (middle) and LWIR (right) channels, while the upper panel shows the fused result after color remapping on the TRICLOBS system (e.g., Fig 5). These movies only serve to give the user a quick impression of the contents of the corresponding frames folders (by dynamically showing the individual channels in the lower three panels) and the effects that may be achieved by realistic color remapping of the false-color frames (upper panel). They are not intended for further processing.

Example Applications
Image fusion
[...] [91], using simple averaging to compute the lowest resolution level of the fused image representation [92].
Fig 6a shows a frame from the TRI_B1 sequence in the TRICLOBS database. This scene represents a building with a glass door and a person carrying a large box behind the door. There is foliage on both sides of the path leading to the door. Smoke rises from a chimney on the left. The person behind the glass door is quite distinct in the Visual band (Fig 6b), much less visible in the NIR band (Fig 6c) and invisible in the LWIR band (Fig 6d) because glass is opaque to thermal radiation. The smoke from the chimney on the roof of the building on the left is clearly visible in the LWIR band (thermal contrast) but not represented in the other two bands. Note that all these details are clearly represented in the grayscale fused image that is obtained by fusion of the three individual bands with a Laplacian pyramid algorithm (Fig 6e).
Fig 7a shows a frame from the TRI_B2 sequence in the TRICLOBS database. This scene is similar to the scene in Fig 6 except for the person, who now emerges from the foliage on the left side of the pathway. His light jacket and dark trousers blend well into the light foliage and dark shadows beneath the semi-shrubs with virtually no luminance contrast. Similarly, the smoke rising from the chimney on the left has no luminance contrast with the sky. As a result both details are hard to distinguish in the Visual (Fig 7b) and NIR (Fig 7c) bands. However, they are both highly visible in the LWIR band (Fig 7d) due to their temperature contrast.
Both [...] (Fig 8d: high thermal contrast), while the person behind the closed window can only be seen in the NIR band (Fig 8c) since glass is opaque to LWIR. The two soldiers walking in front of the hedge in the background are most distinct in the Visual (Fig 8b) and NIR (Fig 8c) bands.
Note that all persons are clearly visible in the grayscale fused image (Fig 8e). This image results from fusion of the three individual bands with a Laplacian pyramid algorithm.
color image. The B (LWIR) channel is not considered here since it has no relation with natural colors. The color table shown in Fig 9b was generated by relating all the RG tuples to corresponding RGB triples in a color photograph matching a TRICLOBS false-color image. This mapping gives the false-color frames a realistic color appearance, as shown in Fig 6d, 6f and 6h. In a previous study we found that this type of color-remapped multi-band image significantly enhanced human perception. With color-remapped imagery, observers correctly perceive more details of a scene (i.e., they can extract its gist) in a single glimpse than with conventional monochrome image representations [5].
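A lookup-table color remapping of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to generate Fig 9b: the function names are hypothetical, the false-color frame and the reference photograph are assumed to be registered 8-bit arrays of equal size, and each RG tuple is assigned the mean reference color of all pixels showing that tuple (the text does not specify how multiple matches are combined).

```python
import numpy as np

def build_color_table(false_color, reference):
    """Relate every (R, G) tuple in the false-color frame to a reference RGB triple.

    Both inputs are uint8 arrays of shape (H, W, 3). The B channel of the
    false-color frame is ignored. Returns a uint8 table of shape (256, 256, 3).
    """
    r = false_color[..., 0].astype(np.intp).ravel()
    g = false_color[..., 1].astype(np.intp).ravel()
    index = r * 256 + g
    sums = np.zeros((256 * 256, 3))
    np.add.at(sums, index, reference.reshape(-1, 3).astype(np.float64))
    counts = np.bincount(index, minlength=256 * 256)
    table = np.zeros((256 * 256, 3))
    seen = counts > 0
    # Assumption: average the reference colors of all pixels sharing an RG tuple.
    table[seen] = sums[seen] / counts[seen, None]
    # RG tuples that never occur in the training frame stay black.
    return table.round().astype(np.uint8).reshape(256, 256, 3)

def apply_color_table(false_color, table):
    """Replace each pixel by the table entry for its (R, G) tuple."""
    return table[false_color[..., 0], false_color[..., 1]]
```

A table built from one frame and its matching photograph can then be applied to every frame of a sequence recorded in the same scene. Tuples that do not occur in the training frame would need interpolation or a larger training set, which this sketch omits.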

Concluding Remarks
In this paper we presented the TRICLOBS image data set, which is the first publicly available data set that provides registered Visual, NIR and LWIR dynamic image sequences representing different surveillance scenarios. The imagery was collected during several nightly field trials using our newly developed TRICLOBS camera system. The resulting data set is intended for the development and evaluation of image fusion, enhancement and color mapping algorithms for military and civilian short-range surveillance scenarios. Individual frames from the data set contain registered Visual, NIR and LWIR images. These can be used to develop and test static image fusion and color remapping algorithms. Sequences of frames can be used for the development of dynamic image fusion algorithms. TRICLOBS imagery has been used successfully in previous studies to develop new color mapping schemes to give multi-band night vision imagery a realistic color appearance [73,86], to design new image fusion schemes [63], to assess the added value of color fused image representations for human observation [5], to construct an augmented reality nighttime surveillance system [80], and to evaluate a synthetic observer approach to multisensory resolution assessment [93,94].

Limitations of the data set
The TRICLOBS image data set also has some limitations.
Currently only a limited number of scenarios and scenes are included. To increase the value of the data set we defined scenarios that are generic for (military and civilian) surveillance applications, including a wide range of different objects (white hot targets, weapons, briefcase, vehicles, etc.). Also, the range of atmospheric and lighting conditions for which imagery was registered is rather limited, ranging from just before, just after, to long after sunset. In the future, we plan to extend the data set with imagery of other natural, as well as urban, environments, registered for a wide range of atmospheric conditions.
The digitization of the three TRICLOBS sensor channels was not synchronized. However, the maximal time delays were quite small (< 9 ms). As a result, registration errors may only become visible when objects move through the scene at high speed.
The two Photonis ICUs had independent automatic gain regulation, which resulted in some minor flicker effects in some conditions. The user can correct these effects by normalizing both Visual and NIR image sequences.
The images of the two Photonis ICUs show some vignetting (a reduction of the image brightness towards the edges). The user can correct this effect either by applying a digital contrast enhancement procedure or by using only the central part of the frames for image fusion applications.
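The two corrections suggested above (normalizing the Visual and NIR sequences, and using only the central part of the frames) could be sketched as follows. The text does not prescribe a particular normalization, so matching each frame's mean and standard deviation to the sequence averages is an assumption, as are the function names and the default crop fraction.

```python
import numpy as np

def normalize_sequence(frames):
    """Reduce gain flicker in a single-band sequence of shape (T, H, W).

    Assumed method: rescale every frame so that its mean and standard
    deviation equal the averages of those statistics over the sequence.
    """
    frames = np.asarray(frames, dtype=np.float64)
    means = frames.mean(axis=(1, 2), keepdims=True)
    stds = frames.std(axis=(1, 2), keepdims=True)
    stds = np.where(stds > 0, stds, 1.0)  # avoid division by zero on flat frames
    return (frames - means) / stds * stds.mean() + means.mean()

def central_crop(frame, fraction=0.8):
    """Keep the central part of a frame to avoid the vignetted borders."""
    h, w = frame.shape[:2]
    ch, cw = int(h * fraction), int(w * fraction)
    top, left = (h - ch) // 2, (w - cw) // 2
    return frame[top:top + ch, left:left + cw]
```

The normalization would be applied to the Visual and NIR channels separately, since the two image intensifiers regulate their gain independently. Because it forces all frames to share the same global statistics, it also removes genuine brightness changes in the scene, which may or may not be acceptable for a given application.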