Gamers help highlight disparities in algorithm data

Is The Witcher immersive? Is The Sims a role-playing game?

Gamers from around the world may have differing opinions, but this diversity of thought makes for better algorithms that help audiences everywhere pick the right games, according to new research from Cornell, Xbox and Microsoft Research.

With the help of more than 5,000 gamers, researchers show that predictive models, fed on massive datasets labeled by gamers from different countries, offer better personalized gaming recommendations than those labeled by gamers from a single country.

The team’s findings and corresponding guidelines have broad application beyond gaming for researchers and practitioners who seek more globally applicable data labeling and, in turn, more accurate predictive artificial intelligence (AI) models.

“We show that, in fact, you can do just as well, if not better, by diversifying the underlying data that goes into predictive models,” said Allison Koenecke, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science.

Koenecke is the senior author of “Auditing Cross-Cultural Consistency of Human-Annotated Labels for Recommendation Systems,” which was presented in June at the Association for Computing Machinery Fairness, Accountability, and Transparency (ACM FAccT) conference.

Massive datasets inform the predictive models behind recommendation systems. A model’s accuracy depends on its underlying data, especially the proper labeling of each individual piece within that massive trove. Researchers and practitioners are increasingly turning to crowdsourced workers to do this labeling for them, but crowdsourced workforces tend to be homogeneous.

During this data-labeling phase, cultural bias can creep in and, ultimately, skew a predictive model intended to serve global audiences, Koenecke said.

“For the datasets used in algorithmic processes, someone still has to come up with either some rules or just some general idea of what it means for a data point to be labeled in some way,” Koenecke said. “That’s where this human aspect comes in, because humans do have to be the decision makers at some point in this process.”

The team surveyed 5,174 Xbox gamers from around the world to help label gaming titles. They were asked to apply labels like “cozy,” “fantasy,” or “pacifist” to games they had played, and to consider different factors, such as whether a title is low or high in complexity and how difficult its controls are.

Some game labels, such as “zen” (used to describe peaceful, calming games), were applied consistently across countries; others, such as “replayable,” were applied inconsistently. Using computational methods, the team found that both cultural differences among gamers and translational and linguistic quirks of certain labels contributed to these labeling differences across countries.
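One simple way to picture a cross-country consistency check is sketched below. It is not the method used in the paper: it assumes each survey response records whether a gamer applied a label to a game, computes the share of gamers in each country who applied it, and then measures how much those shares vary between countries. The data, the country codes, the name `game_a`, and the function names are all invented for illustration.

```python
from collections import defaultdict
from statistics import pstdev

# Hypothetical toy survey rows: (country, game, label, applied).
# These values are invented for illustration; they are not from the study.
RESPONSES = [
    ("US", "game_a", "zen", True), ("US", "game_a", "zen", True),
    ("JP", "game_a", "zen", True), ("JP", "game_a", "zen", True),
    ("DE", "game_a", "zen", True), ("DE", "game_a", "zen", True),
    ("US", "game_a", "replayable", True), ("US", "game_a", "replayable", True),
    ("JP", "game_a", "replayable", False), ("JP", "game_a", "replayable", False),
    ("DE", "game_a", "replayable", True), ("DE", "game_a", "replayable", False),
]


def label_rates(responses):
    """Share of respondents per (game, label, country) who applied the label."""
    counts = defaultdict(lambda: [0, 0])  # key -> [times applied, total responses]
    for country, game, label, applied in responses:
        cell = counts[(game, label, country)]
        cell[0] += int(applied)
        cell[1] += 1
    return {key: applied / total for key, (applied, total) in counts.items()}


def cross_country_spread(responses):
    """Population standard deviation of the per-country rates, per (game, label)."""
    rates_by_label = defaultdict(list)
    for (game, label, _country), rate in label_rates(responses).items():
        rates_by_label[(game, label)].append(rate)
    return {key: pstdev(rates) for key, rates in rates_by_label.items()}


spread = cross_country_spread(RESPONSES)
# A spread near zero means countries agree; a larger spread flags a label to audit.
for (game, label), value in sorted(spread.items(), key=lambda item: item[1]):
    print(f"{game} / {label}: spread = {value:.3f}")
```

In this toy data every country applies “zen” at the same rate, so its spread is zero, while “replayable” varies by country and gets a larger spread. A measure like this only flags inconsistency; it cannot by itself separate cultural differences from translation effects.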

The researchers then built two models that could predict how gamers from each country would label a certain game—one was fed survey data from globally representative gamers, and the second used survey data from only U.S. gamers. They found that the model trained on labels from diverse global populations improved predictions by 8% for gamers everywhere when compared to the other model trained on labels from just American gamers.
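The shape of that comparison (two training sets, one shared evaluation set) can be sketched as follows. This is a deliberately trivial stand-in, not the researchers' models: the "model" is just the pooled share of positive labels, and every number, country code, and function name is a made-up assumption. It does not reproduce the 8% figure, and a baseline this simple cannot show the reported improvement for U.S. gamers.

```python
from statistics import mean

# Hypothetical binary labels (1 = label applied) for one game and one label,
# grouped by the labeler's country. Invented for illustration only.
TRAIN = {
    "US": [1, 1, 1, 0],
    "JP": [0, 0, 1, 0],
    "DE": [1, 0, 1, 0],
}

# Hypothetical held-out share of gamers in each country who applied the label.
HELD_OUT_RATE = {"US": 0.8, "JP": 0.3, "DE": 0.5}


def fit_rate(train, countries):
    """'Train' a trivial model: the pooled share of positive labels."""
    pooled = [y for country in countries for y in train[country]]
    return mean(pooled)


def mean_abs_error(prediction, held_out):
    """Average absolute gap between one prediction and each country's rate."""
    return mean([abs(prediction - rate) for rate in held_out.values()])


us_only = fit_rate(TRAIN, ["US"])        # trained on U.S. labels only
global_ = fit_rate(TRAIN, list(TRAIN))   # trained on labels from all countries

us_mae = mean_abs_error(us_only, HELD_OUT_RATE)
global_mae = mean_abs_error(global_, HELD_OUT_RATE)
print(f"US-only model error: {us_mae:.3f}")
print(f"Global model error:  {global_mae:.3f}")
```

The point of the sketch is the protocol rather than the numbers: both models are scored against the same per-country targets, so any difference in error comes from the training labels alone.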

“We see improvement for everyone—even for gamers from the U.S.—when the training data is shifted from being entirely U.S.-centric to being more globally representative,” Koenecke said.

Alongside these findings, the researchers developed a framework that guides fellow researchers and practitioners in auditing their underlying data labels for global inclusivity.

“Companies tend to use homogeneous data labelers to do their data labeling, and if you’re trying to build a global product, you’ll run into issues,” Koenecke said. “With our framework, any academic researcher or practitioner could audit their own underlying data to see if they might be running into issues of representation via their data labels or choices.”

More information:
Rock Yuren Pang et al., Auditing Cross-Cultural Consistency of Human-Annotated Labels for Recommendation Systems, 2023 ACM Conference on Fairness, Accountability, and Transparency (2023). DOI: 10.1145/3593013.3594098

Provided by
Cornell University

Citation:
Gamers help highlight disparities in algorithm data (2023, September 29)
