Fig 1.
An example of a typical Japanese table grape field targeted in this research.
Fig 2.
Procedure for estimating the 3D positions of a whole grape bunch.
(a) The input video is taken with an omnidirectional camera. (b) The input video is split into several parts, and 3D reconstruction is performed for each part. The results are integrated to obtain (c) the 3D bunch. The details of the beige rectangle are shown in Fig 6.
Fig 3.
Overview of stereo vision.
Best viewed in color.
Fig 4.
Unsupervised monocular depth estimation using differentiable DIBR.
Fig 5.
Coordinate systems of Ct and are shown in green and blue, respectively.
Fig 6.
Flow of estimating berry positions in grape bunches.
Fig 7.
Estimated 3D shapes of grape berries.
(a) and (b) show different bunches. The first, second, third, and fourth columns show the input image, the inverse depth estimation result (in the Jet colormap, where closer regions appear redder), the generated point cloud (diagonal view), and the generated point cloud (side view), respectively. Only regions closer than a certain threshold are converted to point clouds. Red points and numbers are manual annotations that indicate berry correspondences.
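The conversion from an inverse depth map to a point cloud of nearby regions can be illustrated with a minimal sketch. It assumes a simple pinhole camera model, which may differ from the projection actually used with the omnidirectional camera. The function name `inverse_depth_to_points`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the threshold `min_inv_depth` are hypothetical names introduced for illustration only.

```python
import numpy as np


def inverse_depth_to_points(inv_depth, fx, fy, cx, cy, min_inv_depth):
    """Back-project near pixels of an inverse depth map to 3D points.

    Assumption: pinhole model with focal lengths (fx, fy) and principal
    point (cx, cy). Pixels whose inverse depth is not above
    min_inv_depth (i.e. far regions) are discarded.
    """
    h, w = inv_depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Keep only regions closer than the threshold (larger inverse depth).
    mask = inv_depth > min_inv_depth

    # Depth is the reciprocal of inverse depth for the kept pixels.
    z = 1.0 / inv_depth[mask]
    x = (u[mask] - cx) * z / fx
    y = (v[mask] - cy) * z / fy

    # One row per point: (x, y, z) in the camera coordinate system.
    return np.stack([x, y, z], axis=1)
```

With `min_inv_depth` set to a non-negative value, pixels with zero inverse depth (infinitely far) are excluded before the reciprocal is taken, so no division by zero occurs.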
Fig 8.
Camera poses and berry positions after bundle adjustment.
Points and pyramids in black indicate the initial berry positions and camera poses, respectively. Those in blue or red indicate the results after bundle adjustment.