What is it about?

The paper introduces a fusion network that integrates both text and image analysis for the task of hate speech detection. This multimodal approach allows for a more nuanced understanding of the content by leveraging information from both modalities.

Why is it important?

- Multimodal Approach to Hate Speech Detection: By combining text and image analysis in a single fusion network, the model draws on information from both modalities for a more nuanced understanding of the content.
- Use of State-of-the-Art Pre-Trained Models: The study uses pre-trained language models such as DistilBERT and MPNet for text representation, and ResNet and EfficientNetV2 for image processing, so the model benefits from the feature extraction capabilities of these models.
- Evaluation on a Specialized Dataset: Extensive experiments are conducted on MMHS150K, a manually annotated dataset designed specifically for multimodal hate speech analysis. The results demonstrate competitive performance compared to existing methods.
- Comparison of Context-Free and Contextual Language Models: The paper compares context-free word embedding models such as GloVe and fastText with contextual language models such as DistilBERT and MPNet, and provides insights into how contextual embeddings improve hate speech detection performance.
- Fusion Network Architecture: The proposed architecture combines text and image features into a fused tensor, which is then processed by a downstream model such as a multilayer perceptron (MLP) or a convolutional neural network (CNN) to extract more complex and enriched features.
- Public Availability of Resources: The resources and code for the proposed model are publicly available on GitHub, facilitating further research and development in multimodal hate speech detection.
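
The fusion step described in the list above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the fused tensor is a plain concatenation of the two feature vectors (the summary does not say how fusion is done), uses random arrays in place of real encoder outputs, and the names `fuse`, `init_mlp`, `mlp_forward`, the hidden width, and the two-class output are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder widths: 768 matches DistilBERT's hidden size and 2048 matches
# ResNet-50's pooled features, but no real encoder is run in this sketch.
TEXT_DIM = 768
IMAGE_DIM = 2048
HIDDEN_DIM = 256   # hypothetical hidden width
NUM_CLASSES = 2    # hypothetical: hate / not hate


def fuse(text_feats, image_feats):
    """Assumed fusion: concatenate text and image features per example."""
    return np.concatenate([text_feats, image_feats], axis=1)


def init_mlp(in_dim, hidden_dim, out_dim):
    """Randomly initialise a one-hidden-layer MLP (untrained)."""
    return {
        "w1": rng.normal(0.0, 0.02, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.02, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp_forward(params, x):
    """ReLU hidden layer followed by a softmax over classes."""
    hidden = np.maximum(0.0, x @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Random stand-ins for a batch of 4 encoded text/image pairs.
batch = 4
text_feats = rng.normal(size=(batch, TEXT_DIM))
image_feats = rng.normal(size=(batch, IMAGE_DIM))

fused = fuse(text_feats, image_feats)
params = init_mlp(fused.shape[1], HIDDEN_DIM, NUM_CLASSES)
probs = mlp_forward(params, fused)

print(fused.shape, probs.shape)  # (4, 2816) (4, 2)
```

In a real pipeline the random arrays would be replaced by features from the pre-trained text and image encoders, and the MLP weights would be trained on labelled examples; a CNN could take the place of the MLP as the downstream model.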

Read the Original

This page is a summary of: Fusion Network for Multimodal Hate Speech Detection, February 2024, ACM (Association for Computing Machinery),
DOI: 10.1145/3654522.3654562.
