Performance of EfficientPose Method with Reduced BiFPN Layer for 6D Pose Estimation
DOI: https://doi.org/10.11594/nstp.2025.4741
Keywords: Smart technologies, position of objects, 3D space, EfficientPose
Abstract
Smart technologies, such as self-driving cars, autopilot aircraft, and autonomous robots, are devices with the intelligence to control their steering systems automatically. In this study, we propose an approach for predicting the pose of objects in 3D space so that autonomous smart devices can respond to the objects around them more accurately and in more diverse ways, for example by slowing down, accelerating, avoiding, approaching, changing direction, or picking up objects. One of the state-of-the-art methods for this problem is EfficientPose, a deep learning approach with an EfficientNet backbone and a BiFPN (Bidirectional Feature Pyramid Network) for feature fusion. In this study, we experiment with the EfficientPose method by reducing the number of BiFPN layers, motivated by the method's large computational cost. Reducing the number of BiFPN layers is expected to make the model more efficient (lower computational cost), although it may also reduce the method's effectiveness. The experimental results show that reducing the number of BiFPN layers decreases the effectiveness of EfficientPose by 7.81%, but improves its efficiency: the number of parameters drops by 12.36%, execution time drops by 30.63%, and FPS increases by 44.15%. These results provide useful guidance for developing more efficient 6D pose estimation methods that use EfficientPose as their base framework; targeted modifications or additions to certain components may improve effectiveness further.
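To make the layer-reduction idea concrete, the sketch below stacks a configurable number of simplified BiFPN-style fusion layers and compares parameter counts at two depths. This is a minimal illustration of the trade-off described in the abstract, not the authors' implementation: the channel width, number of pyramid levels, and fusion scheme are assumptions chosen for brevity, not the EfficientPose configuration.

```python
# Minimal sketch (not the authors' code): a stack of simplified BiFPN-style
# fusion layers whose depth is configurable, showing how fewer repeats shrink
# the parameter count. Widths, pyramid depth, and fusion are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleBiFPNLayer(nn.Module):
    """One simplified bidirectional fusion pass over a feature pyramid."""

    def __init__(self, channels: int, num_levels: int = 5):
        super().__init__()
        # One depthwise-separable conv per pyramid level, as in BiFPN.
        self.convs = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
                nn.Conv2d(channels, channels, 1),
                nn.BatchNorm2d(channels),
            )
            for _ in range(num_levels)
        )
        # Learnable fusion weights ("fast normalized fusion" in the BiFPN paper).
        self.fusion_weights = nn.Parameter(torch.ones(num_levels, 2))

    def forward(self, feats):
        out = []
        for i, f in enumerate(feats):
            # Fuse each level with its coarser neighbor, resized to match.
            coarser = feats[i + 1] if i + 1 < len(feats) else feats[i]
            coarser = F.interpolate(coarser, size=f.shape[-2:])
            w = F.relu(self.fusion_weights[i])
            w = w / (w.sum() + 1e-4)
            out.append(self.convs[i](w[0] * f + w[1] * coarser))
        return out

def count_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

# Compare a 3-layer stack against a reduced 2-layer stack.
feats = [torch.randn(1, 64, s, s) for s in (64, 32, 16, 8, 4)]
for depth in (3, 2):
    net = nn.Sequential(*[SimpleBiFPNLayer(64) for _ in range(depth)])
    net(feats)  # sanity check: the stack runs end to end
    print(f"{depth} BiFPN layers: {count_params(net):,} parameters")
```

As a rough sanity check on the abstract's efficiency figures, a 30.63% reduction in execution time implies a throughput factor of 1/(1 − 0.3063) ≈ 1.44, which is consistent with the reported 44.15% increase in FPS.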
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with these proceedings agree to the following terms:
- Authors retain copyright and grant the Nusantara Science and Technology Proceedings the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of its authorship and initial publication in these proceedings.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the proceedings' published version of the work (e.g., posting it to an institutional repository or publishing it in a book), with an acknowledgement of its initial publication in these proceedings.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their websites) prior to and during the submission process, as this can lead to productive exchanges as well as earlier and greater citation of published work (see The Effect of Open Access).