MFGaussian: multi-modal data fusion based 3D Gaussian splatting for accurate and robust scene representation


Affiliated institution:

School of Information Engineering

Authors

Yang, Ran ; Liu, Yang ; Zhong, Ruofei ; Wei, Zhanying ; Xu, Mengbing ; Liu, Shuai ; Yan, Peng

Affiliations

Capital Normal Univ, Coll Resource Environm & Tourism, Beijing 100048, Peoples R China;Minist Educ, Key Lab 3 Dimens Informat Acquisit & Applicat, Beijing 100048, Peoples R China;Beijing Quick Sensing Technol Co Ltd, Beijing, Peoples R China;China Univ Geosci Beijing, Sch Informat Engn, Beijing, Peoples R China;Jilin Gen Aviat Vocat & Tech Coll, Sch Smart Aviat, Jilin, Peoples R China

Abstract

Widely used traditional three-dimensional (3D) reconstruction techniques still struggle to adapt to diverse scenarios. Compared with traditional methods, emerging novel view synthesis technologies such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) offer more realistic and comprehensive representation capabilities. However, most related techniques still rely on traditional methods and require extensive, dense input views, which poses challenges for reconstruction in real-world scenarios. We propose MFGaussian, a 3DGS-based framework for 3D scene representation that fuses multi-modal data obtained from a mobile laser scanning (MLS) system to achieve high robustness and accuracy even with limited input views. MFGaussian employs a stepwise training approach to learn the global information and the details of the scene independently. During pre-training, a large number of virtual training views are generated by projecting colored point clouds, thereby enhancing the model's robustness; the model is then fine-tuned on the original training views. The method initializes the laser point cloud as 3D Gaussians and obtains camera parameters through multi-sensor calibration followed by spherical interpolation, yielding high-precision initial data without relying on Structure from Motion (SfM), and further ensures an accurate geometric structure through partial optimization. Furthermore, we analyze how variations in lighting brightness within the scene affect view synthesis from different perspectives and positions, and incorporate an appearance model to eliminate the resulting color ambiguity. Tested on our own dataset and the ETH3D stereo benchmark, our method demonstrates the enhanced capability and robustness of 3DGS in diverse scenarios without SfM or dense view inputs, and it outperforms several state-of-the-art methods in both quantitative and qualitative evaluations.
Our code will be open-sourced after the publication of this manuscript (https://github.com/oucliuyang/MFGaussian).
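The spherical interpolation of camera parameters mentioned in the abstract can be illustrated with a minimal spherical linear interpolation (SLERP) sketch for camera orientations. The quaternion convention (w, x, y, z), helper name, and example poses below are illustrative assumptions, not the paper's actual implementation:

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (w, x, y, z).

    Returns the rotation a fraction t of the way from q0 to q1 along the
    shortest great-circle arc on the unit quaternion sphere.
    """
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                      # flip to take the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    dot = min(dot, 1.0)
    theta = math.acos(dot)             # angle between the two orientations
    if theta < 1e-8:                   # nearly identical: linear blend is fine
        return tuple((1 - t) * a + t * b for a, b in zip(q0, q1))
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))

# Interpolate halfway between the identity and a 90-degree rotation about z:
# the result should be the 45-degree rotation about z.
q_id = (1.0, 0.0, 0.0, 0.0)
q_90z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
q_mid = slerp(q_id, q_90z, 0.5)
```

In a calibrated MLS pipeline, an analogous interpolation between timestamped poses (SLERP for rotation, linear interpolation for translation) would yield a camera pose for any image timestamp without running SfM.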

Funding

National Natural Science Foundation of China [U22A20568, 42071444]; National Key Technologies Research and Development Program of China [2022YFB3904101]

Language

English

Source

INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2025(1).

Publication date

2025-12-31

Submission date

2025-04-18

Citation

Yang, Ran; Liu, Yang; Zhong, Ruofei; Wei, Zhanying; Xu, Mengbing; Liu, Shuai; Yan, Peng. MFGaussian: multi-modal data fusion based 3D Gaussian splatting for accurate and robust scene representation [J]. INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2025(1).
