SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again

@inproceedings{Kehl2017SSD6DMR,
  title={SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again},
  author={Wadim Kehl and Fabian Manhardt and Federico Tombari and Slobodan Ilic and Nassir Navab},
  booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={1530--1538},
  url={https://api.semanticscholar.org/CorpusID:10655945}
}
A novel method for detecting 3D model instances and estimating their 6D poses from RGB data in a single shot that competes with or surpasses current state-of-the-art methods that leverage RGB-D data on multiple challenging datasets.

Dense Color Constraints based 6D object pose estimation from RGB-D images

This work proposes Dense Color Constraints (DCC) and shows that 6D pose estimation performance can be improved effectively by using dense corresponding color constraints.

Estimating 6D Pose From Localizing Designated Surface Keypoints

The core of the approach is to designate a set of surface points on the target object model as keypoints and train a keypoint detector (KPD) to localize them; a PnP algorithm then recovers the 6D pose from the 2D-3D correspondences of the keypoints.

HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

This paper presents a dataset for 6D pose estimation that covers training from 3D models (both textured and textureless), scalability, occlusions, and changes in light conditions and object appearance, and presents a set of benchmarks to test various desired detector properties.

Single Shot 6D Object Pose Estimation

This paper introduces a novel single shot approach for 6D object pose estimation of rigid objects based on depth images, where the 3D input data is spatially discretized and pose estimation is considered as a regression task that is solved locally on the resulting volume elements.

Realtime RGB-Based 3D Object Pose Detection Using Convolutional Neural Networks

A new approach that efficiently detects the 3D pose of objects in images using a single neural network, TQ-Net, which predicts the translation vector and the quaternion and adds a normalization layer to obtain a more precise result.

DPOD: Dense 6D Pose Object Detector in RGB images

This work proposes a new method for simultaneous object detection and 6DoF pose estimation that is real-time capable, conceptually simple, and not bound to any particular detection paradigm, such as R-CNN, SSD or YOLO.

DPOD: 6D Pose Object Detector and Refiner

A novel deep learning method that estimates dense multi-class 2D-3D correspondence maps between an input image and available 3D models and demonstrates that a large number of correspondences is beneficial for obtaining high-quality 6D poses both before and after refinement.

DPCAE: denoising point cloud auto-encoder for 6D object detection

The approach to 3D orientation estimation is based on a Denoising Point Cloud Auto-encoder, which can avoid the rendering problem and eliminate clutter and occlusion, and it competes with state-of-the-art approaches with real pose-annotated images.

PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation

A deep Hough voting network is proposed to detect 3D keypoints of objects and then estimate the 6D pose parameters in a least-squares fitting manner, which is a natural extension of 2D-keypoint approaches that work successfully on RGB-based 6DoF estimation.
...

Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation

A 3D object detection method is presented that uses regressed descriptors of locally sampled RGB-D patches for 6D vote casting. It generalizes well to previously unseen input data and delivers robust detection results that compete with and surpass the state of the art while being scalable in the number of objects.

Learning 6D Object Pose Estimation Using 3D Object Coordinates

This work addresses the problem of estimating the 6D pose of specific objects from a single RGB-D image by presenting a learned, intermediate representation in the form of a dense 3D object coordinate labelling paired with a dense class labelling.

Uncertainty-Driven 6D Pose Estimation of Objects and Scenes from a Single RGB Image

A regularized, auto-context regression framework is developed that iteratively reduces uncertainty in object coordinate and object label predictions, and an efficient way to marginalize object coordinate distributions over depth is introduced to deal with missing depth information.

Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes

A framework for automatic modeling, detection, and tracking of 3D objects with a Kinect is presented, showing how to build the templates automatically from 3D models and how to estimate the 6-degrees-of-freedom pose accurately and in real time.

PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization

This work trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need for additional engineering or graph optimisation, demonstrating that convnets can be used to solve complicated out-of-image-plane regression problems.

Hashmod: A Hashing Method for Scalable 3D Object Detection

It is shown empirically that the complexity of the method is sublinear in the number of objects and that it enables detection and pose estimation of many 3D objects with high accuracy while outperforming the state of the art in terms of runtime.

Technical Demonstration on Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes

Each step of the pipeline is shown, starting with the fast reconstruction of arbitrary 3D objects, followed by the automatic learning and the robust detection and pose estimation of the reconstructed objects in real time, making the framework suitable for object manipulation, e.g., in robotics applications.

6D Object Detection and Next-Best-View Prediction in the Crowd

This work presents a complete framework for both single shot-based 6D object pose estimation and next-best-view prediction based on Hough Forests, the state of the art object pose estimator that performs classification and regression jointly, and proposes an unsupervised feature learnt from depth-invariant patches using a Sparse Autoencoder.

3D Bounding Box Estimation Using Deep Learning and Geometry

Although conceptually simple, this method outperforms more complex and computationally expensive approaches that leverage semantic segmentation, instance level segmentation and flat ground priors and produces state of the art results for 3D viewpoint estimation on the Pascal 3D+ dataset.

Detection and fine 3D pose estimation of texture-less objects in RGB-D images

Experimental evaluation shows that the proposed method yields a recognition rate comparable to the state of the art, while its complexity is sub-linear in the number of templates.