What is it about?

By quantizing a vision-language model (VLM) and deriving an artificial feature layer from it, an efficient predictor can be created, in this case one that detects hateful memes for the AISG Challenge. An implicit ensembling method yields a more exact representation of numbers for any LLM and, when applied to an LLM's output, allows fine-tuning on this artificial layer. This enables fine-tuning on top of any existing LLM or VLM, open-source or commercial, using almost no computing power.
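
The number-representation idea can be illustrated with a minimal Python sketch. It assumes the model exposes its top token candidates with log-probabilities at the position where it answers with a number, and it reads the implicit ensembling as a probability-weighted average over the numeric candidates. That reading, the function names, and the toy data are illustrative assumptions, not the paper's implementation. A small logistic-regression head stands in for the cheap fine-tuning step on the artificial features.

```python
import math


def expected_score(candidates):
    """Probability-weighted mean over the numeric token candidates.

    `candidates` maps token strings to log-probabilities (hypothetical
    format). Non-numeric tokens are ignored and the remaining
    probabilities are renormalized.
    """
    numeric = {}
    for token, logprob in candidates.items():
        text = token.strip()
        if text.isascii() and text.isdigit():
            value = int(text)
            numeric[value] = numeric.get(value, 0.0) + math.exp(logprob)
    total = sum(numeric.values())
    if total == 0.0:
        raise ValueError("no numeric token among the candidates")
    return sum(value * p for value, p in numeric.items()) / total


def predict(weights, bias, features):
    """Probability of the positive class for one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a tiny logistic-regression head with plain SGD."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Toy usage: one artificial feature per meme, scaled to [0, 1].
# The candidate lists and labels below are made up for illustration.
answers = [
    {"1": math.log(0.7), "2": math.log(0.3)},
    {"2": math.log(0.6), "3": math.log(0.4)},
    {"8": math.log(0.5), "9": math.log(0.5)},
    {"9": math.log(0.8), "7": math.log(0.2)},
]
rows = [[expected_score(a) / 10.0] for a in answers]
labels = [0, 0, 1, 1]
weights, bias = train_logistic(rows, labels)
```

Only the small head is trained here; the underlying model is queried but never updated, which is why the cost stays low.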

Why is it important?

Hateful memes are a persistent threat to society, especially now that such memes can be generated automatically and with the aim of destabilizing particular groups. It is therefore important to address the problem with more powerful tools built on state-of-the-art technology. This matters especially for countries with highly diverse cultures and languages, which are often not the main focus of deep learning research. On a technical level, using and training LLMs and VLMs comes with massive computational costs and semi-random behavior. The solution shown in this paper allows a more precise representation of numbers in LLMs and enables extremely cheap fine-tuning of any existing VLM or LLM that exposes its token candidates, which includes most if not all models.

Read the Original

This page is a summary of: OSPC: Artificial VLM Features for Hateful Meme Detection, May 2024, ACM (Association for Computing Machinery),
DOI: 10.1145/3589335.3665996.

Contributors

The following have contributed to this page