A Prompt-based Vision-language Model for Offensive Meme Detection

Author	Xiaoyu Guo
Co-Author(s)	Jing Ma; Xufeng Zhao; Yu Bai; Yongwei Chi
Abstract	Internet memes have become a prevalent means of sharing public opinion on social media by mixing text and images. Research on the automated analysis of memes has gained attention in recent years, including, among other tasks, the detection of offensive memes. In this paper, we propose a novel model, the prompt-based vision-language model (PVM), for detecting offensive memes. PVM is a multi-modal model that combines deep learning with prompt learning, which enhances the text features. We use cloze questions as a prefix to the text to unlock the potential of language models, and fuse the text features with image features.
Keywords	Internet memes, multi-modal, prompt learning, disinformation detection, vision-language model

		Article #: DSBFI23-76

Proceedings of 2nd ISSAT International Conference on Data Science in Business, Finance and Industry
January 8-10, 2023 - Da Nang, Vietnam

	International Society of Science and Applied Technologies