DISARM Countermeasures Workshop Series — How can platforms counter disinformation campaigns?

DISARM Foundation
Jun 6, 2023

Victoria Smith

This year, DISARM has hosted a series of workshops exploring countermeasures to online harms. The workshop series is generously supported by Craig Newmark Philanthropies. The objective of the workshops is to gather feedback on the types of countermeasures used against online disinformation and other harms, and on how to make advice on mitigations and countermeasures accessible and practical to those who need it. The feedback from these sessions will feed into DISARM’s work to update and improve the existing ‘Blue Framework’ of countermeasures.

This workshop focused on what social media platforms can do to protect their users from the threat of online disinformation campaigns. Participants had experience working at social media companies.

Introduction

Participants thought the increased transparency requirements of the EU’s Code of Practice on Disinformation were important and useful. However, they were concerned that standardised reporting failed to recognise the sometimes substantial differences between platforms and could in fact reduce the amount of useful information they provide.

In terms of countermeasures, participants saw value in balancing small changes that affect a large user base with larger changes that affect a smaller number of users. Used in combination, these changes can potentially have a greater impact than the sum of their parts.

Finally, participants stressed the importance of developing platform policies that can be effectively enforced at scale. Changes may need to be made to the data a platform collects in order to enforce certain types of policy, and this needs to be considered to avoid publishing policies that are unenforceable.

Key Takeaways:

What should the DISARM Blue Framework do?

Variations in platform size and functionality make standardised reporting across the industry difficult. DISARM countermeasures should reflect a range of potential actions, recognising that not every countermeasure may be suitable for every platform; Participants emphasised that it was unfair to hold platforms of different sizes to the same standard. Small platforms have fewer resources than large platforms, so diverting staff away from enforcement and into reporting has a proportionally larger impact. Small platforms may have fewer users and less revenue, and the problems of disinformation may manifest in different ways. Another challenge cited was the varying ways platforms collect and store data. Participants warned that, in fulfilling specific reporting requirements, a gap can open up between the useful information a platform could share but is not required to, and the information it is required to report but may only be able to provide in a limited form.

Over time, platform policies and moderation efforts have become more nuanced. But more nuanced content moderation policies increase the workload on the staff who enforce them; DISARM should recognise that platforms are moving away from binary leave-up-or-take-down decisions towards more nuanced approaches depending on the potential threat, reach, context or actor involved. Platforms need to decide who enforces policy violations and how. The challenge is often not in developing the policy but in operationalising its enforcement at scale, and training content moderation teams is resource intensive.

There is a difference between product features and trust and safety features; DISARM should encourage teams to develop product features that consider the trust and safety implications of their design.

What should the DISARM Blue Framework consider?

Enforcement requires resources and the ability to apply the rules at scale; Social media platforms must balance developing user policies with their ability to rigorously enforce those policies.

Platforms face a range of online harms and must develop ways to prioritize the most rapid response for content that is assessed to cause the most harm; From child sexual exploitation to violent, extremist or terrorist content, pornography and spam, platforms face a wide range of content moderation decisions with finite resources. Platforms must also factor in the speed of their response, removing content that is likely to spread very quickly before it is too late. The threat posed by disinformation campaigns must therefore be weighed within this wider matrix.
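
As a rough illustration (not a description of any platform’s actual system), the sketch below shows how such a triage matrix might be operationalised: each report is scored by a placeholder harm weight multiplied by a predicted-spread signal, and the highest-scoring items are reviewed first. The harm categories, weights and function names are all illustrative assumptions.

```python
import heapq

# Illustrative only: harm weights and the virality signal are placeholder
# assumptions, not values used by any real platform.
HARM_WEIGHT = {
    "child_safety": 100,
    "terrorism": 90,
    "violent_threat": 80,
    "disinformation_campaign": 60,
    "spam": 20,
}

def enqueue_report(queue, item_id, harm_type, predicted_views_per_hour):
    """Push a report onto the triage queue.

    Higher assessed harm and faster predicted spread mean the item is
    reviewed sooner. heapq is a min-heap, so the score is negated.
    """
    score = HARM_WEIGHT.get(harm_type, 10) * max(predicted_views_per_hour, 1)
    heapq.heappush(queue, (-score, item_id))

def next_report(queue):
    """Pop the highest-priority report for human or automated review."""
    _, item_id = heapq.heappop(queue)
    return item_id

queue = []
enqueue_report(queue, "post_a", "spam", 1_000)                      # score 20,000
enqueue_report(queue, "post_b", "disinformation_campaign", 2_000)   # score 120,000
print(next_report(queue))  # post_b: higher assessed harm puts it ahead of the spam report
```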

Platforms must be prepared to handle threats relating to diverse geographic areas and themes; From elections to natural disasters, some threats can be better predicted than others. Platforms need to be able to react quickly to emerging situations and to assess content across different languages, cultures and contexts.

Machine learning systems used for content moderation can degrade quickly because humans behave unpredictably; Machine learning systems degrade very quickly when applied to human behaviour, because humans do not always behave predictably, consistently or rationally. These systems therefore require regular retraining. As these models scale up, they can be difficult to operationalise, and error rates can climb. One example given was developing custom solutions to remove commercial spam: once the solution is applied and the actor senses their content is being restricted, they will change their behaviour or probe the system to see what it targets and where its vulnerabilities are.
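
As a minimal sketch of how that degradation might be detected in practice, the example below compares a classifier’s decisions against human review outcomes on the same sample of content and triggers retraining when precision or recall drops below a floor. The class name, thresholds and reviewed-sample setup are illustrative assumptions, not a description of any platform’s pipeline.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class DriftMonitor:
    """Illustrative drift check for a deployed spam/abuse classifier."""
    precision_floor: float = 0.90  # assumed threshold, not a real platform's value
    recall_floor: float = 0.80

    def should_retrain(self, model_flags: Sequence[bool],
                       reviewer_flags: Sequence[bool]) -> bool:
        """Compare model decisions with human review outcomes on the same items."""
        tp = sum(m and r for m, r in zip(model_flags, reviewer_flags))
        fp = sum(m and not r for m, r in zip(model_flags, reviewer_flags))
        fn = sum(not m and r for m, r in zip(model_flags, reviewer_flags))
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 1.0
        # Adversarial spam typically shows up first as falling recall (new
        # evasions slip through) or falling precision (the model over-triggers
        # after a behaviour shift), so either breach triggers retraining.
        return precision < self.precision_floor or recall < self.recall_floor

monitor = DriftMonitor()
# model_flags: what the classifier removed; reviewer_flags: what humans judged to be spam
if monitor.should_retrain(model_flags=[True, True, False, False],
                          reviewer_flags=[True, False, True, False]):
    print("Retrain the spam model on fresh labelled data")
```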

No single solution is perfect, but incremental improvements can be achieved when a range of solutions are used together; Some solutions have a significant effect on a very small number of users, others might have a small effect on many users. Platforms must balance ways to mitigate the most extreme harms, while also making incremental improvements for all users. Giving people more control over their own experience, improving abuse reporting and management mechanisms, or making product changes that improve security can all contribute to an ecosystem that improves the user experience.

Crowdsourced knowledge like Twitter’s Community Notes is less effective in highly contested spaces; Community Notes work when people from opposite sides of the argument find areas of broad consensus. Highly polarized debates are less likely to result in a shared consensus and therefore the most polarizing content will not end up with a Community Note.
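
To make the consensus requirement concrete, here is a deliberately simplified sketch of a cross-viewpoint agreement rule. The production Community Notes ranking models the rating data directly (via matrix factorisation) rather than assigning raters to explicit clusters, so the cluster labels, thresholds and function name below are illustrative assumptions only.

```python
from collections import Counter

def note_is_shown(ratings, min_raters_per_side=5, helpful_share=0.7):
    """Simplified 'bridging' rule: surface a note only if raters from
    *both* viewpoint clusters independently find it helpful.

    ratings: iterable of (viewpoint_cluster, is_helpful) pairs, where the
    cluster label is assumed to come from raters' past rating behaviour.
    """
    helpful, total = Counter(), Counter()
    for cluster, is_helpful in ratings:
        total[cluster] += 1
        helpful[cluster] += int(is_helpful)
    if len(total) < 2:
        return False  # no cross-viewpoint signal at all
    return all(
        total[c] >= min_raters_per_side and helpful[c] / total[c] >= helpful_share
        for c in total
    )

# On a polarised topic, each side tends to rate the other side's notes as
# unhelpful, so the cross-cluster threshold is rarely met and no note appears.
polarised = [("left", True)] * 10 + [("right", False)] * 10
consensual = [("left", True)] * 9 + [("left", False)] + \
             [("right", True)] * 8 + [("right", False)] * 2
print(note_is_shown(polarised))   # False
print(note_is_shown(consensual))  # True
```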

The policies a platform can enforce are subject to the data the platform collects; For example, enforcing a policy against misgendering a transgender person requires the platform to collect data on gender. Policies can be developed, but can be unenforceable at scale if they fail to take into consideration the limitations of a platform’s ability to collect and analyze data.

Platforms can more effectively counter online harms by building trust and safety into their sales and product, rather than seeing content moderation as a downstream cost centre; From incentivising the sales team to conduct due diligence on potential advertising customers, to conducting trust and safety assessments on new product features, platforms must improve how they build trust and safety into their design.

Other areas of discussion:

  • Small platforms may have more time to respond than large platforms; Some small platforms have found that they tend to see certain types of content only after that content has already received a lot of attention on larger platforms. This can give them more time to prepare their response.
  • Transparency and reporting obligations, such as those required by the EU’s Code of Practice on Disinformation, can divert resources from trust and safety teams; Participants saw reporting in line with the Code of Practice on Disinformation as a positive step towards greater transparency. However, they warned against confusing reporting with action: in most cases, the team members responsible for drafting the reports are drawn from content moderation or site integrity teams, so resources are potentially diverted from enforcement or other more impactful work. Resource constraints are particularly acute at the moment given recent staff layoffs at social media and tech companies.

