3D Urban Scene Synthesis from Multi-View Satellite Imagery
Synthesizing real-time, navigable 3D urban environments from multi-view satellite imagery using 3D Gaussian Splatting and generative refinement, with a focus on a case study in Turin.
Requirements
- M.Sc. in Machine Learning, Data Science, Computer Science, Mathematics, Telecommunications, or similar
- Good knowledge of Python
- Software development skills
- Basic knowledge of image processing
- Basic knowledge of data science, including data analysis, data processing, and deep learning
Description
Synthesizing large-scale, immersive, and geometrically accurate 3D urban scenes is a challenging task with important applications in urban planning, gaming, and robotics. While traditional 3D scanning is labor-intensive, satellite imagery offers extensive geographic coverage and can be collected automatically. However, satellite data often lacks the parallax necessary to reconstruct building facades and street-level details accurately.
Inspired by the SkyFall-GS framework, this thesis proposes a two-stage pipeline for virtual city creation. The first stage reconstructs coarse geometry from multi-view satellite imagery using 3D Gaussian Splatting (3DGS). The second stage leverages open-domain text-to-image diffusion models to hallucinate realistic appearances in occluded areas while maintaining strong satellite-to-ground 3D consistency. The research will focus on a case study in Turin, using satellite imagery of the city to create a navigable and immersive 3D environment.
The main activities of the thesis include:
- Reviewing the literature on 3D Gaussian Splatting, satellite-based 3D reconstruction, and diffusion-driven 3D refinement.
- Exploring the available imagery for the Turin case study, identifying specific Areas of Interest (AOIs) with diverse architectural features.
- Implementing an initial 3DGS reconstruction that incorporates appearance modeling.
- Developing a curriculum-based refinement strategy to progressively enhance geometric completeness and texture realism from the sky to the ground.
- Evaluating the performance against baseline 3D reconstruction methods using perceptual and pixel-level metrics.
- Analyzing and visualizing the final 3D representation to demonstrate real-time, free-flight navigation of the synthesized Turin model.
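As an illustration of the pixel-level side of the evaluation activity listed above, the following minimal sketch computes the peak signal-to-noise ratio (PSNR) between a reference view and a rendered view. PSNR is only one common example of a pixel-level metric; the metrics actually used in the thesis are not specified here, so the choice of PSNR, the function name `psnr`, and the toy pixel values are all assumptions made for illustration. Perceptual metrics are not sketched.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images.

    Hypothetical helper for illustration; not part of any specific framework.
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixel values.
    mse = sum((r - s) ** 2 for r, s in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 grayscale images, flattened row by row (made-up values).
reference_view = [52, 55, 61, 59]
rendered_view = [54, 55, 60, 57]
score = psnr(reference_view, rendered_view)
print(f"PSNR: {score:.2f} dB")
```

Higher values indicate a rendered view closer to the reference. The sketch uses only the standard library to stay dependency-free; for full-resolution images an array library would be the more practical choice.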
At the end of the thesis, the submission of a paper to a conference is planned.