Department of Computer Science - Bachelor's Thesis Details

First name:	Alex
Surname:	Haščík
Title:	Commonsense Causal Reasoning in Multimodal Language Models
Supervisor:	Mgr. Marek Šuppa
Year:	2025
Keywords:	large language models, commonsense causal reasoning, multimodal models
Abstract:	In this thesis, we describe the main ideas behind large language models and the benchmarks used to evaluate commonsense causal reasoning in multimodal language models. We focus on VCOPA, a benchmark that extends the original text-based COPA task into a visual form. VCOPA tests a model's ability to choose the more plausible cause or effect in a scenario shown through images. While VCOPA is useful for evaluating causal reasoning in visual contexts, it has several limitations. Some questions contain visual bias or inconsistencies between images, which can directly affect a model's evaluation. To address these issues, we recreate the VCOPA dataset. The new version aims to reduce visual bias and improve consistency within individual questions. We test several state-of-the-art multimodal models from Google and OpenAI on both the original and the recreated VCOPA tasks. The models achieve high overall accuracy on both versions of the dataset. However, we identify a small subset of questions where even the best-performing models consistently fail to identify the correct causal dependencies, suggesting limitations in their understanding of certain causal relationships.

Bachelor's thesis files:

hascik_bakalarska_praca.pdf

priloha-hascik.html

Defense presentation files:

hascik_obhajoby.pdf