Incident 21: Tougher Turing Test Exposes Chatbots’ Stupidity (migrated to Issue)
Incident Stats
CSETv1 Taxonomy Classifications
GMF Taxonomy Classifications
Known AI Goal
Question Answering
Known AI Technology
Language Modeling, Distributional Learning
Potential AI Technology
Transformer
Potential AI Technical Failure
Generalization Failure, Dataset Imbalance, Underfitting, Context Misidentification
CSETv0 Taxonomy Classifications
Full Description
The 2016 Winograd Schema Challenge highlighted shortcomings in the ability of AI systems to understand context. The Challenge presents ambiguous sentences and asks AI systems to resolve the ambiguity. The two winning entries were correct 48% of the time, while random chance was correct 45% of the time. The most successful systems were presented by Quan Liu of the University of Science and Technology of China (partnering with the University of Toronto and the National Research Council of Canada) and by Nicos Isaak of the Open University of Cyprus. Notably, Google and Facebook did not participate.
Short Description
In the 2016 Winograd Schema Challenge, even the most successful AI systems entered were correct only 3 percentage points more often than random chance (48% versus 45%).
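The gap between the winning entries and chance can be stated in two ways, and the difference matters when reading the figures. The sketch below is only an illustration of that arithmetic, using the 48% and 45% values reported in this record; the function names are hypothetical and are not part of any Challenge tooling.

```python
def percentage_point_gap(system_pct: int, chance_pct: int) -> int:
    """Absolute gap between two accuracies, in percentage points."""
    return system_pct - chance_pct


def relative_gain_percent(system_pct: int, chance_pct: int) -> float:
    """Gap expressed as a percentage of the chance baseline."""
    return (system_pct - chance_pct) / chance_pct * 100


# Figures taken from this record: winning entries 48%, random chance 45%.
WINNER_PCT = 48
CHANCE_PCT = 45

if __name__ == "__main__":
    gap = percentage_point_gap(WINNER_PCT, CHANCE_PCT)
    rel = relative_gain_percent(WINNER_PCT, CHANCE_PCT)
    print(f"Absolute gap: {gap} percentage points")
    print(f"Relative gain over chance: {rel:.1f}%")
```

The absolute gap is 3 percentage points, which is the sense in which the record describes the result; relative to the 45% baseline the same gap is roughly 6.7%.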
Severity
Unclear/unknown
AI System Description
Artificially intelligent systems meant to understand ambiguous English sentences.
Sector of Deployment
Professional, scientific and technical activities
Relevant AI functions
Perception, Cognition, Action
Location
New York, NY
Named Entities
Winograd Schema Challenge, University of Science and Technology of China, Quan Liu, University of Toronto, National Research Council of Canada, Nicos Isaak, Open University of Cyprus
Technology Purveyor
Quan Liu, Nicos Isaak
Beginning Date
2016-01-01T00:00:00.000Z
Ending Date
2016-01-01T00:00:00.000Z
Near Miss
Unclear/unknown
Intent
Unclear
Lives Lost
No
The following former incidents have been converted to "issues" following an update to the incident definition and ingestion criteria.
21: Tougher Turing Test Exposes Chatbots’ Stupidity
Description: The 2016 Winograd Schema Challenge highli…
Variants
Similar Incidents
Inappropriate Gmail Smart Reply Suggestions
TayBot
Gender Biases in Google Translate