- vision
- language
- time series
- multimodal
- embedded
- internship
- thesis
-
Efficient Adaptation of Vision–Language Models for Artistic Metadata Generation
This thesis specializes Vision–Language Models (VLMs) for multimodal artistic metadata generation using parameter-efficient fine-tuning and knowledge distillation.
-
LLM Fine-Tuning
This thesis fine-tunes a small LLM on a domain-specific knowledge base.
-
Deep Learning for Wildfire Spread Modeling
This thesis aims to develop a deep learning model that predicts wildfire spread from multimodal, multivariate data.
-
Beyond the Canvas: A Systematic Review of Generative AI for Image Synthesis and Editing
This thesis provides a comprehensive review of state-of-the-art generative image models, tracing their architectural evolution from GANs to diffusion models and hybrid systems, and analyzing evaluation paradigms and ethical challenges.
-
3D Urban Scene Synthesis from Multi-View Satellite Imagery
This thesis synthesizes real-time, navigable 3D urban environments from multi-view satellite imagery using 3D Gaussian Splatting and generative refinement, with Turin as a case study.