Applied AI / Industrial perception
Robotic bin-picking perception
An RGB-D industrial perception pipeline that segments visible object instances, extracts 3D geometry features from depth, and ranks candidate objects for grasp selection.
Project notes
This page is meant as a quick read: what the project did, how it was built, and what the reported results mean.
-
Pick selection needs more than a segmentation mask.
In cluttered scenes, the most visible object is not always the best one to pick. The project adds depth-derived geometry and a ranking step so candidate picks can be compared within each image.
-
YOLO masks, RGB-D geometry, and a learned ranker.
Each visible mask is paired with depth and camera intrinsics, back-projected into a point cloud, converted into object-level features, then ranked with a PyTorch MLP trained against a heuristic target.
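The back-projection step can be illustrated with a minimal sketch under the standard pinhole camera model. The function name `backproject_mask`, the intrinsics parameters `fx`, `fy`, `cx`, `cy`, and the rule that non-positive or non-finite depth counts as invalid are assumptions made for illustration, not details taken from the project code.

```python
import numpy as np

def backproject_mask(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into camera-frame 3D points.

    depth: (H, W) array of depth values (assumed metric, 0 or NaN = invalid).
    mask:  (H, W) boolean instance mask.
    Returns an (N, 3) array of [x, y, z] points, one per valid masked pixel.
    """
    # Keep only pixels that are inside the mask and have usable depth.
    valid = mask.astype(bool) & np.isfinite(depth) & (depth > 0)
    v, u = np.nonzero(valid)              # row (v) and column (u) indices
    z = depth[v, u].astype(np.float64)
    # Pinhole model: pixel offsets from the principal point, scaled by depth.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

The number of rows returned is also the valid-point count for that mask, which is one of the visibility cues a feature extractor can use.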
-
Depth, visibility, position, extent, class, and orientation.
The feature extractor uses point-cloud statistics, bounding boxes, image-position cues, valid-point counts, object class, centroid, approximate extent, and PCA orientation axes.
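As a rough illustration of how a few of these features could be computed from one object's points, the sketch below derives a centroid, an axis-aligned extent, and PCA orientation axes with NumPy. The function name `object_features`, the dictionary keys, and the choice of an axis-aligned extent are assumptions; the project's extractor may define these quantities differently.

```python
import numpy as np

def object_features(points):
    """Compute simple object-level features from an (N, 3) point array.

    Assumes N >= 2 so that the covariance is defined.
    """
    centroid = points.mean(axis=0)
    # Approximate extent: side lengths of the axis-aligned bounding box.
    extent = points.max(axis=0) - points.min(axis=0)

    # PCA: eigen-decomposition of the 3x3 covariance of centered points.
    centered = points - centroid
    cov = centered.T @ centered / (len(points) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # largest variance first

    return {
        "n_valid": len(points),
        "centroid": centroid,
        "mean_depth": centroid[2],
        "extent": extent,
        "pca_axes": eigvecs[:, order].T,       # rows are unit axes
        "pca_variances": eigvals[order],
    }
```

PCA axes are defined only up to sign, so a downstream model would typically need a sign convention or sign-invariant encoding before using them as inputs.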
-
Good agreement with the heuristic target and known-mask baseline.
The ranker was evaluated on 2,016 object-candidate rows, reaching R² = 0.9307 and Pearson r = 0.9661 against the heuristic target. The integrated YOLO-plus-ranker pipeline placed its selected object in the heuristic top three in 81.9 percent of scenes.
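For readers who want to see what these three numbers measure, here is a minimal sketch of the metrics. It assumes higher scores mean better picks and that every candidate row already carries a scene id, a heuristic target score, and a predicted score; the function names and that data layout are hypothetical, and the sketch does not cover how YOLO detections were matched to known masks.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def pearson_r(y_true, y_pred):
    """Pearson correlation between target and predicted scores."""
    return np.corrcoef(y_true, y_pred)[0, 1]

def top_k_hit_rate(scene_ids, target, predicted, k=3):
    """Fraction of scenes where the top-predicted candidate is among
    the k candidates with the highest heuristic target score."""
    hits = []
    for scene in np.unique(scene_ids):
        idx = np.nonzero(scene_ids == scene)[0]
        chosen = idx[np.argmax(predicted[idx])]           # pipeline's pick
        top_k = idx[np.argsort(target[idx])[::-1][:k]]    # heuristic top k
        hits.append(chosen in top_k)
    return float(np.mean(hits))
```

R² and Pearson r are computed over all rows at once, while the top-three rate is computed per scene, so the two kinds of number answer different questions: how closely scores match overall, and how often the final pick is a reasonable one.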
-
The repo includes code, figures, report, and evaluation scripts.
Dataset files are referenced separately and are not committed to the repo. The project code and documentation are released under the MIT License.