Content moderation is arguably one of the trickiest problems on the internet.
Every day, billions of pictures, videos, and text posts are uploaded to the web. Most of it is good stuff. But peppered in with all the cat memes, TikTok dances, and DIY tutorial videos, there’s also a lot of material that platforms don’t want to host, like hate speech, misinformation, and depictions of graphic violence or sexual abuse. How do you promote the good stuff and prevent the bad stuff from spreading?
Generally speaking, there are two approaches. The first is human-based curation. Many online platforms use moderators, who sift through new submissions and flag anything that runs afoul of the platform’s community standards. But this method has limits. Human moderation doesn’t scale well for large communities. Even big teams of human moderators are too slow to keep up with the avalanche of content posted online every day. A team of 10,000 people working round the clock, for example, could only vet a tiny fraction of the four million YouTube videos uploaded every day.
For this reason, many of the web’s largest platforms rely primarily on algorithms to curate what users see in their feeds and only call on human moderators when something needs closer inspection. This is a much more cost-effective way to parse through billions of user posts—but it also has its own set of shortcomings. Research has shown that many curatorial algorithms on popular social media sites tend to favor content that’s divisive, inflammatory, or provocative in some way, simply because those kinds of posts get the most engagement from users. So if you’ve noticed that the web feels more negative and adversarial lately, that may be why.
But what if it wasn’t like this? What if we could have the best of both worlds and combine the creativity and complexity of human curation with the speed, scale, and unrelenting watchfulness of algorithms?
That’s the concept behind Cura: a clever new content moderation tool developed by a group of machine learning researchers at Stanford University.
A New Breed of Auto-Moderator
Here’s how it works: Imagine a large social media service with a group of designated human curators. Past a certain volume of posts and community size, they’d normally struggle to keep up. Instead of adding more moderators, the network can deploy Cura to help parse everything.
Cura is essentially an AI-powered content moderation bot designed to mimic a human curator’s behavior. Once deployed, the system observes all the actions of its human partner and learns to predict what they would approve and disapprove of. It then adopts those preferences as its own and acts on the curator’s behalf to moderate content. “Cura’s goal is to facilitate online communities whose content reflects the curators’ opinion on what to share, and can do so at scale without curator review on every submission,” the creators explain.
This approach is notably different from how most algorithmically assisted content moderation is performed. Instead of treating likes, upvotes, reposts, and comments as the key indicators of a post’s value (as many popular social media platforms do), Cura simply tries to replicate the opinions and idiosyncrasies of human tastemakers. To achieve this, the system models not only the curator’s preferences but also the likes and dislikes of the platform’s users. If a few users, for example, have consistently liked a certain curator’s picks before, Cura assumes that those users’ choices are also an indicator of what the curator would like. Therefore, in the future, when those users upvote a post, the AI can reasonably assume its curator would upvote that post too, and automatically boost it for the rest of the community.
This way, human curators don’t have to review each submission and only need to contribute sporadically to ensure the algorithms don’t skew too much away from their opinions. It’s basically autopilot that moderators can tune to their preferences.
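The idea can be sketched in a few lines of code. The toy model below is my own illustration, not Cura’s actual implementation: the class name, thresholds, and approval rule are all assumptions chosen for clarity. A user whose past votes have agreed with the curator’s decisions often enough becomes a trusted “proxy,” and a proxy’s upvote on a new post counts as a signal to approve it automatically.

```python
# Toy sketch of preference-mirroring curation (not Cura's actual code).
# A user becomes a "proxy" for the curator once their past votes have
# agreed with the curator's decisions often enough; posts upvoted by
# a proxy are then approved on the curator's behalf.

from collections import defaultdict

AGREEMENT_THRESHOLD = 0.8   # illustrative cutoff, not from the paper
MIN_SHARED_VOTES = 5        # require some history before trusting a user

class ProxyCurator:
    def __init__(self):
        # per-user tallies of agreements and total shared votes
        self.agree = defaultdict(int)
        self.total = defaultdict(int)

    def record(self, user, user_vote, curator_vote):
        """Update a user's agreement history on posts the curator also voted on."""
        self.total[user] += 1
        if user_vote == curator_vote:
            self.agree[user] += 1

    def is_proxy(self, user):
        """A user is trusted once their agreement rate clears the threshold."""
        if self.total[user] < MIN_SHARED_VOTES:
            return False
        return self.agree[user] / self.total[user] >= AGREEMENT_THRESHOLD

    def predict_approval(self, upvoters):
        """Approve a new post if any trusted proxy upvoted it."""
        return any(self.is_proxy(u) for u in upvoters)

cura = ProxyCurator()
for _ in range(5):
    cura.record("alice", "up", "up")    # alice always matches the curator
    cura.record("bob", "up", "down")    # bob never does

print(cura.predict_approval(["alice"]))  # True: alice is a trusted proxy
print(cura.predict_approval(["bob"]))    # False: bob's votes disagree
```

The real system models preferences with machine learning rather than a hard agreement cutoff, but the design choice is the same: user votes matter only insofar as they predict what the curator would decide.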
“This approach allows communities to run at scale on user contributions while maintaining a north star of their curators’ opinions,” the researchers explain in their introductory paper.
Making the Web a Better Place
Laura Herman, an Oxford Internet Institute researcher who studies how AI-based curation affects creativity, told PCMag she believes a shift back to human curation of the web is long overdue, and finds Cura an exciting solution.
According to Herman, there’s currently a mismatch between what algorithmic curation provides and what human curation surfaces—and it’s the latter that online audiences prefer if offered an option. While human curation focuses on meaning, concepts, creativity, uniqueness, abstraction, and emotion, she says, “algorithmic curation predictably foregrounds recognizable, trending, quick-to-consume, commercialized content.”
Cura could help fix that in a big way. In tests performed on a Reddit community, the system predicted curators’ votes with 81.96% accuracy, whereas the traditional majority vote (Reddit’s ‘score’) performed at 65.9%.
Wanrong He, one of Cura’s developers, emphasizes that in addition to surfacing better content, it’s also extremely effective in reducing the spread of toxic, polarizing content that’s often favored by today’s algorithmically driven feeds. “We measure that [Cura] roughly halves the number of anti-social violations in a large community—without requiring any additional moderation effort,” he says.
Are More Algorithms Really the Answer?
As promising as Cura’s AI-assisted approach to content moderation may be, it’s definitely not a perfect solution. Even its creators admit it’s “not a good fit for every community.” Due to how it operates, the system’s AI struggles in small groups where there’s less voting data available for the model to learn from.
Another risk, says Herman, lies in having certain “elite tastemakers” determine what floats to the top. She draws parallels with the art world, which in earlier generations was at the whim of individual gallerists and museum curators, who tended to come from privileged backgrounds and inadvertently prevented more diverse perspectives from exercising influence over the direction of global culture.
Mike McCue, CEO of the news discovery app Flipboard and proponent of human-based curation, contends that no matter how much automated feeds might improve, we should always leave room for human judgment to decide “what’s cool, tastes good, is insightful or inspiring,” and “ensure the content meets editorial standards.”
“In our opinion that is how AI and human curation should work together, leveraging the best of both,” McCue told PCMag.
Going Beyond Proof of Concept
Whether Cura can be effective in a real-world scenario, though, remains to be seen. Thus far, the researchers behind it have only conducted small-scale pilots, and have yet to secure a large-scale partnership.
“Cura is currently an algorithmic and visualization layer on top of Reddit data,” they explain. Deploying it in the real world, on an existing platform, would “require a community to change its back-end voting or install a bot to do this on its behalf.”
If you’re interested in experimenting with AI-assisted curation or building it into your website, you don’t have to wait for permission. All of Cura’s code and documentation are available on GitHub, and are free for anyone to use.