Automatic Fact Checking for Biomedical Information in Social Media and Scientific Literature - Fundamentals of Natural Language Processing

Automatic Fact Checking for Biomedical Information in Social Media and Scientific Literature (FIBISS)

Most research on methods and models for automatic fact checking, which distinguish misinformation and disinformation from correct information, focuses on the news domain. News items, including those shared on social media, are checked for their truthfulness. Such methods have not yet been developed for the biomedical domain. Challenges include the richness of (established) sources of information, the complexity of the information, and the differences between the language of experts and that of medical laypeople. In this project, we develop information extraction systems for layperson and expert language, map the extracted information onto each other, and finally check its truthfulness against established sources. The project therefore combines methods from transfer learning, information extraction, and fact checking for the biomedical domain, especially in social media.

FIBISS started in 2021 (at the Universität Stuttgart) and will finish in 2024.

The project is funded by the German Research Foundation (DFG).

Publications

Velutharambath, Aswathy/Klinger, Roman (2023): UNIDECOR: A Unified Deception Corpus for Cross-Corpus Deception Detection. In: Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis. Toronto: Association for Computational Linguistics. pp. 39–51.

Wührl, Amelie/Grimminger, Lara/Klinger, Roman (2023): An Entity-based Claim Extraction Pipeline for Real-world Fact-checking. In: Proceedings of the Sixth Fact Extraction and VERification Workshop (FEVER). Dubrovnik: Association for Computational Linguistics. pp. 29–37.

Mohr, Isabelle/Wührl, Amelie/Klinger, Roman (2022): CoVERT: A Corpus of Fact-checked Biomedical COVID-19 Tweets. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference. Marseille: European Language Resources Association. pp. 244–257.

Wührl, Amelie/Klinger, Roman (2022a): Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR). In: Proceedings of the Thirteenth Language Resources and Evaluation Conference. Marseille: European Language Resources Association. pp. 4439–4450.

Wührl, Amelie/Klinger, Roman (2022b): Entity-based Claim Representation Improves Fact-Checking of Medical Content in Tweets. In: Proceedings of the 9th Workshop on Argument Mining. Gyeongju: International Conference on Computational Linguistics. pp. 187–198.

Grimminger, Lara/Klinger, Roman (2021): Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection. In: Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Association for Computational Linguistics. pp. 171–180.

Wührl, Amelie/Klinger, Roman (2021): Claim Detection in Biomedical Twitter Posts. In: Proceedings of the 20th Workshop on Biomedical Language Processing. Association for Computational Linguistics. pp. 131–142.

Thorne, Camilo/Klinger, Roman (2018): On the Semantic Similarity of Disease Mentions in MEDLINE and Twitter. In: Natural Language Processing and Information Systems: 23rd International Conference on Applications of Natural Language to Information Systems, NLDB 2018, Paris, France, June 13-15, 2018, Proceedings. Cham: Springer International Publishing. pp. 324–332.

Thorne, Camilo/Klinger, Roman (2017): Towards Confidence Estimation for Typed Protein-Protein Relation Extraction. In: Proceedings of the Biomedical NLP Workshop associated with RANLP 2017. Association for Computational Linguistics. pp. 55–63.