NLP

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

This is the technical report for Gemini 1.5, developed by Google.

Identifying High Quality Training Data for Misinformation Detection

This paper analyzes different methods for collecting training data to train a machine learning classifier for misinformation detection.

Investigating Scientific Misinformation Using Different Modes of Learning

This paper presents an initial analysis of scientific misinformation from research papers.

Detecting and Understanding of Information Pollution on Social Media

Social media and the web have become primary sources for obtaining information and news. Given the speed and spread of information on social media, effects of poor-quality information, especially with respect to health-related information, can be …

DeMis: Data-efficient Misinformation Detection using Reinforcement Learning

We propose a novel reinforcement learning framework for misinformation detection on Twitter. We release code, data, and pre-trained models.

PoliBERTweet: A Pre-trained Language Model for Analyzing Political Content on Twitter

We propose pre-trained language models for political Twitter data. We evaluate all models and report results. We release both data and pre-trained models.

Language Models

Inferring #MeToo Experience Tweets Using Classic and Neural Models

Traditional and Context-specific Spam Detection in Low Resource Settings

We propose a novel taxonomy for false information on social media and a new concept of context-specific spam. We release both data and models.

Misinformation Detection Datasets

More resources are coming soon