Scientific Publications

VERGE in VBS 2026

This paper describes VERGE, an interactive video content retrieval system that supports searches over images extracted from videos. The system retains its core retrieval modalities and fusion techniques in an improved form, while introducing new modalities, including optical character recognition, video question answering, underwater object detection, image quality-based retrieval, and surgical scene understanding.

[WACV 2026] AuViRe: Audio-visual Speech Representation Reconstruction for Deepfake Temporal Localization (with Model Checkpoints)

AI4TRUST partners published a paper that presents a novel approach to the temporal localization of deepfakes by leveraging Audio-Visual Speech Representation Reconstruction (AuViRe). Specifically, the approach reconstructs speech representations from one modality (e.g., lip movements) based on the other (e.g., audio waveform).

An LLM Framework for Long-form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5

AI4TRUST partners published a paper that presents their approach to the tasks of Known Item Search (KIS) and Video Question Answering (Video QA), which combines state-of-the-art LLMs with cross-modal video retrieval methods.

Face the Facts! Evaluating RAG-based Fact-checking Pipelines in Realistic Settings

In May 2025, AI4TRUST partner FBK submitted a paper that benchmarks, under more realistic scenarios, RAG-based methods for generating verdicts (i.e., short texts discussing the veracity of a claim), evaluating them on stylistically complex claims and heterogeneous, yet reliable, knowledge bases.

Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples

AI4TRUST partner CERTH submitted a paper to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops (WACVW 2025), held in Tucson, Arizona in February 2025. The paper introduces the idea of taking input images that a detector classified as deepfakes and using adversarially-generated samples of them to form perturbation masks, which serve to infer the importance of different input features and produce visual explanations.

VERGE in VBS 2025

AI4TRUST partners submitted a paper presenting VERGE, an interactive system designed for searching and browsing video content. VERGE allows users to explore collections of images extracted from videos, offering search options such as free-text queries, concept-based searches, color matching, detection of faces and people, and searches based on visual and semantic similarity. The paper was presented at the 31st International Conference on MultiMedia Modeling (MMM2025) in Nara, Japan in January 2025.

ITI-CERTH participation in ActEV and AVS Tracks of TRECVID 2024

AI4TRUST partners submitted a paper to the TRECVID 2024 Workshop in November 2024, giving an overview of the runs for the Ad-hoc Video Search (AVS) and Activities in Extended Video (ActEV) tasks.

Disturbing Image Detection Using LMM-Elicited Emotion Embeddings

AI4TRUST partners submitted a paper to the IEEE Int. Conf. on Image Processing Workshops (ICIPW2024), held in Abu Dhabi in October 2024. The paper addresses the task of Disturbing Image Detection (DID) by exploiting knowledge encoded in Large Multimodal Models (LMMs).

Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch

AI4TRUST partners submitted a paper to the 2024 IEEE 20th International Conference on Intelligent Computer Communication and Processing (ICCP 2024), Cluj-Napoca, Romania, in October 2024.

Translating speech with just images

AI4TRUST partners presented a paper at Interspeech 2024, Kos, Greece in September 2024. Visually grounded speech models link speech to images; the paper additionally links images to text via an existing image captioning system, and as a result gains the ability to map speech audio directly to text.

Hybrid-Diarization System with Overlap Post-Processing for the DISPLACE 2024 Challenge

AI4TRUST partners presented a paper at Interspeech 2024, Kos, Greece in September 2024, on the team’s collaborative participation in Track 1 (Speaker Diarization) of the Diarization of Speaker and Language in Conversational Environments (DISPLACE) Challenge 2024.
