Revista Facultad de Ingeniería Universidad de Antioquia
Print version ISSN 0120-6230
Abstract
DIAZ-TORO, Andrés Alejandro; PAZ-PEREZ, Lina María; PINIES-RODRIGUEZ, Pedro and CAICEDO-BRAVO, Eduardo Francisco. Dense tracking, mapping and scene labeling using a depth camera. Rev.fac.ing.univ. Antioquia [online]. 2018, n.86, pp.54-69. ISSN 0120-6230. https://doi.org/10.17533/udea.redin.n86a07.
We present a system for dense tracking, 3D reconstruction, and object detection in desktop-like environments using a depth camera, the Kinect sensor. The camera is moved by hand while its pose is estimated and a dense model of the scene, with evolving color information, is constructed. Alternatively, the user can couple the object detection module (YOLO: you only look once [1]) to detect categories of objects commonly found on desktops, such as monitors, keyboards, books, cups, and laptops, and to propagate that information to the model, yielding a model whose colors are associated with object categories. The camera pose is estimated using a model-to-frame technique with a coarse-to-fine iterative closest point (ICP) algorithm, achieving a drift-free trajectory and robustness to fast camera motion and variable lighting conditions. Simultaneously, the depth maps are fused into the volumetric structure using the estimated camera poses. To visualize an explicit representation of the scene, the marching cubes algorithm is employed. The tracking, fusion, marching cubes, and object detection processes were implemented on commodity graphics hardware to improve the performance of the system. We achieve outstanding results in camera pose estimation, high-quality model color and geometry, stable color from the detection module (robustness to wrong detections), and successful management of multiple instances of the same category.
Keywords: Dense reconstruction; camera tracking; depth sensor; volumetric representation; object detection; multiple instance labeling.
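
The abstract states that depth maps are fused into a volumetric structure from the estimated camera poses, but it does not give the update rule. The sketch below illustrates one common choice, a weighted running average of truncated signed distances per voxel; whether the paper uses exactly this rule is an assumption. The function name `fuse_observation` and the parameters `trunc` and `max_weight` are hypothetical, and the input `sdf_obs` is assumed to have been computed already by projecting each voxel into the depth map with the current camera pose.

```python
import numpy as np


def fuse_observation(tsdf, weight, sdf_obs, trunc=0.03, max_weight=64.0):
    """Fuse one frame into a voxel grid with a per-voxel running average.

    tsdf, weight : arrays of equal shape holding the current model
                   (normalised distance in [-1, 1] and observation count).
    sdf_obs      : signed distance in metres from each voxel to the observed
                   surface along the camera ray; NaN where depth is invalid.
    trunc        : truncation distance in metres (assumed value).
    max_weight   : cap on the weight so the model can still adapt (assumed).
    """
    tsdf = np.asarray(tsdf, dtype=np.float64)
    weight = np.asarray(weight, dtype=np.float64)
    sdf_obs = np.asarray(sdf_obs, dtype=np.float64)

    # Skip voxels without a measurement or lying far behind the surface.
    valid = np.isfinite(sdf_obs) & (sdf_obs >= -trunc)

    # Truncate the observed distance and normalise it to [-1, 1].
    d = np.clip(np.where(valid, sdf_obs, 0.0) / trunc, -1.0, 1.0)

    # Each valid observation contributes a weight of 1.
    w_obs = valid.astype(np.float64)
    denom = np.maximum(weight + w_obs, 1.0)  # guards against 0/0
    fused = (tsdf * weight + d * w_obs) / denom

    new_tsdf = np.where(valid, fused, tsdf)
    new_weight = np.minimum(weight + w_obs, max_weight)
    return new_tsdf, new_weight


if __name__ == "__main__":
    grid = np.zeros(3)
    w = np.zeros(3)
    # First frame: one valid voxel, one missing depth, one far behind the surface.
    grid, w = fuse_observation(grid, w, np.array([0.015, np.nan, -0.1]))
    # Second frame: all three voxels lie in front of the surface.
    grid, w = fuse_observation(grid, w, np.array([0.03, 0.03, 0.03]))
    print(grid, w)
```

Averaging over many frames is what lets noisy per-frame depth converge to a smooth surface. The zero crossing of the fused field is the surface that a method such as marching cubes would then extract.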