Investigating Scientific Misinformation Using Different Modes of Learning


This paper presents an initial analysis of scientific misinformation in three research areas: Computer Science, Environmental Science, and Medicine. We investigate keywords in the titles and abstracts of retracted publications, which we treat as a proxy for publications containing misinformation. Using the Altmetric Attention Score as a signal of publication popularity, we group articles into low-popularity and high-popularity subsets. We apply three modes of learning (unsupervised, semi-supervised, and supervised) to identify the main themes of these publications and compare the results between the two popularity subsets. We find that the terms identified by the different methods overlap but are not identical. However, the methods cover similar general topics using different words, highlighting the difficulty of identifying keyword “markers” for popular, poor-quality scientific information.
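
The popularity split and term comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: the median threshold, the raw term counts, the stopword list, the Jaccard overlap measure, and the toy records are all assumptions made for illustration, and the function names are hypothetical.

```python
import re
from collections import Counter
from statistics import median

# Tiny illustrative stopword list (an assumption, not the paper's).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def split_by_popularity(records, threshold=None):
    """Split (text, score) pairs into low- and high-popularity texts.

    The median is used as the default cut-off; this is an assumed choice.
    """
    if threshold is None:
        threshold = median(score for _, score in records)
    low = [text for text, score in records if score <= threshold]
    high = [text for text, score in records if score > threshold]
    return low, high


def top_terms(texts, k=5):
    """Return the set of the k most frequent terms across the texts."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return {term for term, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented toy records: (title text, attention score). Not real data.
records = [
    ("Deep learning for image classification", 1),
    ("Climate effects on crop yield", 2),
    ("Vaccine safety and immune response in children", 50),
    ("Deep learning predicts climate risk", 120),
]

low, high = split_by_popularity(records)
overlap = jaccard(top_terms(low), top_terms(high))
print(f"low={len(low)} high={len(high)} term overlap={overlap:.2f}")
```

A frequency count stands in here for the unsupervised, semi-supervised, and supervised methods named in the abstract; any of them could replace `top_terms` as long as it returns a set of terms per popularity subset.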

In Proceedings of the Workshop on Scientific Document Understanding at AAAI (SDU@AAAI)
Kornraphop Kawintiranon

My research interests include AI/ML, NLP and Data Science.