Fig 1.
Configuration of a retinal prosthesis.
The external and internal components include a micro camera, a transmitter, an external processing unit and an implanted electrode array. First, the external camera acquires an image. Then, the external processor converts the image into a suitable pattern of electrical stimulation, which is delivered to the retina through the electrode array.
Fig 2.
Top row: example of a bathroom scene processed with the three methods used in this work: (a) Direct image, (b) Edge image and (c) SIE-OMS image. Bottom row: the same three processing methods as shown in the SPV.
Fig 3.
The stimulation of the electrode array is based on two information pathways that extract the pixel regions representing important objects (OMS) and structural edges (SIE). The regions are computed using two different types of FCN, from He et al. [55] and Fernandez-Labrador et al. [54], respectively.
Fig 4.
Scene layout from an indoor image.
Using [54], we detect the main structure of the room by extracting the structural informative edges (SIE) (right), which are those formed by the intersection of the walls, ceiling and floor of the room (middle).
Fig 5.
Above: box branch for classification and bounding box regression. Below: mask branch for predicting segmentation masks on each Region of Interest (ROI). Numbers denote spatial resolution and channels. Arrows denote either convolutions, deconvolutions, or fully connected layers. The x4 means 4 consecutive convolution layers. (Adapted from He et al. [55]).
Fig 6.
Object masks and silhouettes (OMS).
Object masks were generated with [55] and sorted by probability score to avoid occlusions between objects. The extracted information was combined into a single image that highlights the object silhouettes in white and shows the object masks in gray.
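The composition described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `silhouette` and `compose_oms`, the gray level of 128, the 4-neighbour boundary definition and the choice to draw higher-scoring masks last are all assumptions made for the example.

```python
import numpy as np

def silhouette(mask):
    """Boundary of a boolean mask: mask pixels with a 4-neighbour outside it."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1]   # neighbour above
        & padded[2:, 1:-1]    # neighbour below
        & padded[1:-1, :-2]   # neighbour to the left
        & padded[1:-1, 2:]    # neighbour to the right
    )
    return mask & ~interior

def compose_oms(masks, scores, gray=128, white=255):
    """Paint masks in ascending score order so confident objects end up on top."""
    canvas = np.zeros(masks[0].shape, dtype=np.uint8)
    for i in np.argsort(scores):          # assumed ordering: lowest score first
        m = masks[i].astype(bool)
        canvas[m] = gray                  # object mask in gray
        canvas[silhouette(m)] = white     # object silhouette in white
    return canvas

# Two overlapping toy masks (hypothetical data, not from the study).
a = np.zeros((6, 6), dtype=bool)
a[1:5, 1:5] = True
b = np.zeros((6, 6), dtype=bool)
b[0:3, 0:3] = True
oms = compose_oms([a, b], scores=[0.9, 0.5])
```

In this toy case the higher-scoring mask `a` overwrites the part of `b` that it overlaps, so each pixel ends up belonging to a single object.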
Fig 7.
SPV setup: Subjects were seated on a chair facing a computer screen at a distance of 1 m. The visual field was 20 degrees, which simulates that of the prosthetic device. Trial setup: Each gray rectangle represents the image shown on the computer monitor during the trial. Each image appeared for 10 seconds and then switched to the next image automatically. The break time between image sequences was 30 seconds. The complete experiment took approximately 15 minutes.
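As a rough check of the geometry, the on-screen extent that subtends a given visual angle follows from w = 2 d tan(θ/2). The sketch below assumes the 20 degrees refer to the extent along one axis and that the stimulus is centered on the line of sight; the helper name `stimulus_extent` is hypothetical.

```python
import math

def stimulus_extent(distance_m, field_deg):
    """On-screen extent (in meters) that subtends field_deg at distance_m."""
    return 2.0 * distance_m * math.tan(math.radians(field_deg) / 2.0)

extent = stimulus_extent(1.0, 20.0)
print(f"{extent:.3f} m")  # prints 0.353 m
```

Under these assumptions the stimulus would span about 35 cm on the screen.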
Fig 8.
Examples of stimuli used in the experiment.
Six examples of indoor environments represented with 1024 phosphenes (rows: bathroom, bedroom, dining room, kitchen, living room and office, respectively). Each column shows: a) input images, b) images processed using the Edge method, c) images processed using the Direct method and d) images processed by our SIE-OMS method, respectively.
Table 1.
Global object recognition (OR) and room identification (RI) values for each phosphenic stimuli method.
Comparison of mean responses and standard deviations, grouped by type of phosphenic image method (Edge, Direct and SIE-OMS), with the 95% confidence interval for the mean difference.
Fig 9.
Global results by phosphenic stimuli method.
Percentage of correct, incorrect and not answered responses in a single trial. Higher scores in correct responses indicate that subjects were able to identify and recognize the objects and the type of room in each test image. Higher ratios of not answered responses indicate that subjects were not able to do so. The general finding is that the SIE-OMS method improves the identification of objects and is the most effective of the three methods. This translates into an increase in the number of correct answers in the room type identification test for the SIE-OMS method. The results also show that the Edge method is the least effective, with the highest percentage of unanswered images for the two tasks. The test found a significant difference between the SIE-OMS and Direct methods (p<.001), and likewise between the SIE-OMS and Edge methods (p<.001). Where: *** = p<.001; ** = p<.01; * = p<.05; ns = p>.05. All t-tests were paired-samples, two-tailed.
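For readers unfamiliar with the notation, the sketch below shows how a paired-samples t statistic is formed and how a p-value maps to the star labels listed in the caption. The function names and the toy scores are hypothetical and are not data from the study; the two-tailed p-value itself would be read from a t distribution with n - 1 degrees of freedom, which is not computed here. Treating p = .05 exactly as "ns" is an assumption, since the caption does not cover that case.

```python
import math
import statistics

def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for two score lists."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]   # per-subject differences
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

def significance_label(p):
    """Star notation used in the captions for a two-tailed p-value."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"

# Toy per-subject scores for two methods (hypothetical, for illustration only).
t_stat, dof = paired_t([3, 5, 7, 9], [1, 2, 3, 4])
```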
Fig 10.
Object recognition results for each room-type.
Higher scores in correct responses indicate that subjects were able to recognize the objects in each room. Higher ratios of non-responses indicate that subjects were not able to recognize the objects in each room. The SIE-OMS method obtained the highest score of the three methods in all room types. The results also show that the most difficult room was the kitchen. *** = p<.001; ** = p<.01; * = p<.05; ns = p>.05. All t-tests were paired-samples, two-tailed.
Fig 11.
Room identification results for each room-type.
Higher scores in correct responses indicate that subjects were able to recognize the type of room in each test image. Higher ratios of non-responses indicate that subjects were not able to recognize the type of room in each image. The SIE-OMS method obtained the highest score of the three methods in all room types. As in object recognition, the results also show that the most difficult room was the kitchen. *** = p<.001; ** = p<.01; * = p<.05; ns = p>.05. All t-tests were paired-samples, two-tailed.
Fig 12.
Results for successful and failed images.
Examples of phosphenic images generated with the three methods. Successful images (top rows) and images that the subjects failed to identify (bottom rows) with the three approaches: Edge, Direct and SIE-OMS, respectively.
Table 2.
Confusion matrix results for room identification based only on answered images (correct and incorrect responses) using the SIE-OMS method.
Table 3.
Confusion matrix results for room identification based on the total images (correct, incorrect and no answer (NA)) using the SIE-OMS method.
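The difference between the two confusion matrices (answered images only versus all images with an NA column) can be illustrated with a short sketch. The trial encoding, the function name `confusion_matrix` and the toy responses are assumptions made for the example; the matrices here hold raw counts, which may differ from how the tables report their values.

```python
from collections import Counter

ROOMS = ["bathroom", "bedroom", "dining room", "kitchen", "living room", "office"]

def confusion_matrix(trials, include_na):
    """Rows: true room. Columns: reported room, plus 'NA' when include_na is True.

    trials is a list of (true_room, response) pairs; response is None when
    the subject gave no answer.
    """
    columns = ROOMS + (["NA"] if include_na else [])
    counts = Counter()
    for true_room, response in trials:
        if response is None:
            if not include_na:
                continue              # answered-only matrix skips the trial
            response = "NA"
        counts[(true_room, response)] += 1
    return {room: [counts[(room, col)] for col in columns] for room in ROOMS}

# Toy responses (hypothetical, not data from the study).
trials = [
    ("kitchen", "kitchen"),
    ("kitchen", "office"),
    ("kitchen", None),
    ("office", "office"),
]
answered_only = confusion_matrix(trials, include_na=False)
all_images = confusion_matrix(trials, include_na=True)
```

The answered-only matrix drops the unanswered kitchen trial, whereas the full matrix records it in the extra NA column.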