The findings show that participants generally favor deleting comments that contain insults but prefer more interactive moderation methods, such as counterspeech, for addressing misinformation. Overall, human moderators are trusted more than AI systems, and users' attitudes toward censorship and their political identity significantly shape their moderation preferences.
Article by Aline Vianne Barr
Online forums and social media platforms were once regarded as promising spaces for democratic discourse. However, the growing presence of uncivil comments – ranging from personal attacks to the spread of misinformation – has been shown to undermine political trust, democratic legitimacy, and users' well-being. To counter these effects, platforms have increasingly relied on moderation systems in which either human moderators or automated tools assess user comments against community guidelines or legal standards. Yet professional moderators and users often hold diverse opinions on how such comments should be handled, shaped by cultural, social, and individual factors. When moderation decisions do not align with users' expectations, they can erode trust and discourage engagement.
This study focuses specifically on comment moderation, one of the most common and controversial forms of content regulation. The research distinguishes between non-interactive moderation, such as deleting comments or blocking users, and interactive approaches, which involve engaging with users through warning labels, guideline reminders, or counterspeech. While the former tends to remove problematic content without explanation, the latter seeks to encourage more respectful communication and reflection.
To better understand how users perceive these strategies, the study surveyed a representative sample of 572 Austrian internet users aged 18 to 69. Participants evaluated randomly assigned scenarios depicting different combinations of moderation factors. These scenarios varied by the type of incivility (insults vs. misinformation), the source of moderation (human vs. AI), the level of transparency (explanation provided or not), and the degree of interactivity (deletion, warning label, guideline reminder, or counterspeech).
