Incident 61: Overfit Kaggle Models Discouraged Data Science Competitors

Description: In the “The Nature Conservancy Fisheries Monitoring” competition on the data science competition website Kaggle, a number of competitors overfit their image classifier models to a poorly representative validation data set.

Alleged: Individual Kaggle Competitors developed and deployed an AI system, which harmed Individual Kaggle Competitors.

Incident Stats

Incident ID
61
Report Count
1
Incident Date
2017-05-01
Editors
Sean McGregor

CSETv0 Taxonomy Classifications

Taxonomy Details

Full Description

On the data science competition website Kaggle, a number of competitors in the “The Nature Conservancy Fisheries Monitoring” competition overfit their image classifier models to a poorly representative validation data set. The resulting intermediate competition rankings were misleading and discouraged other data scientists from competing. Outside of the competition environment, the error would likely have gone undetected.
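
The mechanism described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a reconstruction of any competitor's pipeline: it models a related effect (picking the top scorer on a small validation sample) rather than the specific data mismatch in this competition. The function names, the sample sizes, and the 0.5 "true skill" are all hypothetical values chosen for illustration.

```python
import random


def accuracy(rng, n_samples, true_skill=0.5):
    """Fraction of n_samples that a model with the given true skill gets right."""
    correct = sum(rng.random() < true_skill for _ in range(n_samples))
    return correct / n_samples


def simulate(n_models=200, n_val=50, n_holdout=10_000, seed=0):
    """Pick the best of many equally skilled models on a small validation set.

    All numbers here are illustrative assumptions, not competition data.
    """
    rng = random.Random(seed)
    # Every candidate has the same true accuracy (0.5), so any spread
    # in validation scores is pure sampling noise.
    val_scores = [accuracy(rng, n_val) for _ in range(n_models)]
    best_val = max(val_scores)
    # Re-score the apparent winner on a much larger fresh sample.
    holdout = accuracy(rng, n_holdout)
    return best_val, holdout


best_val, holdout = simulate()
print(f"best validation accuracy: {best_val:.3f}")
print(f"same model on fresh data: {holdout:.3f}")
```

The top validation score looks strong even though no model is better than chance, which is why rankings based on a small or unrepresentative validation set can mislead.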

Short Description

In the “The Nature Conservancy Fisheries Monitoring” competition on the data science competition website Kaggle, a number of competitors overfit their image classifier models to a poorly representative validation data set.

Severity

Negligible

Harm Distribution Basis

Religion

AI System Description

Image classifier models designed by individual competitors on Kaggle.

System Developer

Individual Kaggle Competitors

Sector of Deployment

Public administration and defence

Relevant AI functions

Perception

AI Techniques

supervised learning, machine learning, DNN, VGG, open-source

AI Applications

Feature detection, Image classification, Decision support

Location

Global

Named Entities

Kaggle, The Nature Conservancy

Technology Purveyor

Kaggle Competitors

Beginning Date

2016-11-14T08:00:00.000Z

Ending Date

2017-04-12T07:00:00.000Z

Near Miss

Near miss

Intent

Accident

Lives Lost

No

Data Inputs

Images captured on fishing boats

What I’ve learned from Kaggle’s fisheries competition
medium.com · 2017

Gidi Shperber, May 1, 2017

TLDR:

Me and my Kaggle partner, have recently participated in “The Nature Conservancy Fisheries Monitoring” (hereby: “fisheries…

Variants

A "variant" is an incident that shares the same causative factors, produces similar harms, and involves the same intelligent systems as a known AI incident. Rather than index variants as entirely separate incidents, we list variations of incidents under the first similar incident submitted to the database. Unlike other submission types to the incident database, variants are not required to have reporting in evidence external to the Incident Database. Learn more from the research paper.
