- vision
- language
- time series
- multimodal
- embedded
- internship
- thesis
-
Efficient Adaptation of Vision–Language Models for Artistic Metadata Generation
This thesis specializes Vision–Language Models (VLMs) for multimodal artistic metadata generation, using parameter-efficient fine-tuning and knowledge distillation.
-
Deep Learning for Wildfire Spread Modeling
This thesis aims to develop a deep learning model that predicts wildfire spread from multimodal, multivariate data.
-
Beyond the Canvas: A Systematic Review of Generative AI for Image Synthesis and Editing
This thesis provides a comprehensive review of state-of-the-art generative models for images, tracing the architectural evolution from GANs to diffusion models and hybrid systems, and analyzing evaluation paradigms and ethical challenges.
-
Classifying Multimodal Post Content through Multimodal Large Language Models
This thesis involves specializing Multimodal Large Language Models (MLLMs) for multimodal post content classification.
-
Adaptive Granularity Retrieval for Retrieval-Augmented Generation
This thesis explores adaptive retrieval for Retrieval-Augmented Generation, developing a system that dynamically adjusts the granularity of retrieved context (document, section, or passage) based on query intent. The goal is to improve both precision for fine-grained questions and coherence for broad, open-ended queries.
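To make the idea concrete, here is a minimal sketch of granularity selection, not the thesis's actual method. It assumes a hypothetical keyword heuristic for query intent (the cue-word sets, `choose_granularity`, and `retrieve` are illustrative names) and a toy token-overlap score in place of a real retriever; an actual system would likely learn the intent signal and use dense or sparse retrieval.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    granularity: str  # "document", "section", or "passage"
    text: str


# Hypothetical cue words standing in for a learned intent classifier.
BROAD_CUES = {"overview", "summarize", "summary", "explain", "compare"}
NARROW_CUES = {"when", "who", "where", "which"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def choose_granularity(query: str) -> str:
    """Map a query to a retrieval level using the cue-word heuristic."""
    tokens = set(tokenize(query))
    if tokens & BROAD_CUES:
        return "document"   # broad, open-ended query: favor coherence
    if tokens & NARROW_CUES:
        return "passage"    # fine-grained question: favor precision
    return "section"        # default middle ground


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of distinct tokens shared with the query."""
    return len(set(tokenize(query)) & set(tokenize(text)))


def retrieve(query: str, index: list[Chunk], k: int = 1) -> tuple[str, list[Chunk]]:
    """Pick a granularity for the query, then rank only chunks at that level."""
    level = choose_granularity(query)
    candidates = [c for c in index if c.granularity == level]
    ranked = sorted(candidates, key=lambda c: overlap(query, c.text), reverse=True)
    return level, ranked[:k]
```

The same corpus is indexed at all three levels, and the selected level only filters which chunks compete, so swapping the heuristic for a trained classifier would not change the retrieval code.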