Tricking fake news detectors with malicious user comments
Fake news detectors, used by social media platforms like Twitter and Facebook to warn users about misleading posts, have traditionally flagged online articles as false based on a story's headline or content. More recent approaches, however, take into account additional signals, such as network features and user interactions, alongside the story's content in order to increase their accuracy.
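To make the idea concrete, here is a minimal, hypothetical sketch (not the code of any real platform or of the Penn State team) of a detector that blends a content-only credibility score with a signal derived from user comments. The word lists, weighting, and function names are all illustrative assumptions.

```python
# Hypothetical toy detector: blends a content-based score with a signal
# extracted from user comments. Illustrative only -- real detectors use
# learned neural models, not keyword lists.

DOUBT_WORDS = {"fake", "hoax", "debunked", "false"}
SUPPORT_WORDS = {"true", "confirmed", "accurate", "verified"}

def comment_signal(comments):
    """Average stance of comments: +1 per supportive word, -1 per doubtful word."""
    score = 0.0
    for c in comments:
        words = set(c.lower().split())
        score += len(words & SUPPORT_WORDS) - len(words & DOUBT_WORDS)
    return score / max(len(comments), 1)

def detect(content_score, comments, weight=0.5):
    """Blend a content-only credibility score in [-1, 1] with comment stance.

    Returns "real" or "fake". The 50/50 weight is an arbitrary assumption.
    """
    blended = (1 - weight) * content_score + weight * comment_signal(comments)
    return "real" if blended >= 0 else "fake"
```

Because the verdict depends partly on `comment_signal`, anyone who can post comments can nudge the blended score, which is exactly the attack surface the researchers study.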
However, new research by a team at the Penn State College of Information Sciences and Technology shows how these fake news detectors can be manipulated through user comments to flag true news as false and false news as true. This attack allows adversaries to influence the detector's assessment of a story even if they are not the story's original author.
“Our model does not require adversaries to modify the target article's title or content,” said Thai Le, lead author of the paper and a doctoral student in the College of IST. “Instead, adversaries can easily use random social media accounts to post malicious comments, either to demote a real story as fake news or to promote a fake story as real news.”
That is, instead of deceiving the detector by attacking a story's content or source, commenters can attack the detector itself.
The researchers developed a framework called Malcom to generate, optimize, and add malicious comments that are readable and relevant to the article in order to fool the detector. They then evaluated the quality of the artificially generated comments by testing whether people could distinguish them from comments written by real users. Finally, they tested Malcom's performance against several popular fake news detectors.
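At a high level, the attack loop can be sketched as follows. This is an illustrative simplification only, not the actual Malcom implementation: Malcom generates comments with a trained language model, whereas here the candidate comments, the keyword-based stand-in detector, and all thresholds are assumptions made for the example.

```python
# Illustrative attack-loop sketch (NOT the real Malcom system): propose
# candidate comments and keep one whose addition flips a comment-aware
# detector's verdict on a target article.

DOUBT_WORDS = {"fake", "hoax", "debunked"}

def toy_detector(content_score, comments):
    """Stand-in for a real detector: a content score in [-1, 1], penalized
    for each doubtful word appearing in the comments."""
    penalty = sum(len(set(c.lower().split()) & DOUBT_WORDS) for c in comments)
    return "real" if content_score - 0.4 * penalty >= 0 else "fake"

def attack(content_score, existing_comments, candidates, target="fake"):
    """Greedy search: return the first candidate comment whose addition makes
    the detector output the attacker's target label, or None if none works."""
    for cand in candidates:
        if toy_detector(content_score, existing_comments + [cand]) == target:
            return cand
    return None

# Example: a true story (positive content score) demoted to "fake"
# by a single planted comment.
planted = attack(0.3, ["great reporting"], ["sources say this is a hoax"])
```

The real framework replaces the fixed candidate list with a generator trained to produce fluent, topic-relevant comments, which is what makes the attack hard to filter out.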
Malcom outperformed existing baseline models, fooling five of the leading neural-network-based fake news detectors more than 93% of the time. To the researchers' knowledge, this is the first model to attack fake news detectors using this method.
This approach could be attractive to attackers because they do not have to follow the traditional steps of spreading fake news, which primarily require controlling the content itself. The researchers hope their work will help those developing fake news detectors build more robust models and improve methods for detecting and filtering out malicious comments, ultimately helping readers get accurate information to make informed decisions.
“Fake news is spread with the deliberate intention of widening political divides, undermining citizens' trust in public figures, and even creating confusion and doubt in communities,” the team wrote in its paper, which will be presented virtually at the 2020 IEEE International Conference on Data Mining.
Le added, “Our research shows that attackers can exploit this reliance on user engagement to fool the detection models by posting malicious comments on online articles, and it highlights the importance of fake news detection models that are robust against adversarial attacks.”
Project contributors include Dongwon Lee, Associate Professor, and Suhang Wang, Assistant Professor, both at Penn State College of Information Sciences and Technology. This work was supported by the National Science Foundation.
Story source:
Materials provided by Penn State. Originally written by Jordan Ford.