Ended on 9th Nov'20 03:30 PM (Coordinated Universal Time)

Data Sprint #13: Cyberbullying

Identify cyberbullying comments

338
Hard
Challenge Starts: 06 Nov 03:30 pm

Registration Ends: 09 Nov 03:30 pm

Challenge Ends: 09 Nov 03:30 pm

Context

What is cyberbullying?

Cyberbullying, also known as cyberharassment or online bullying, is a form of bullying or harassment that takes place over electronic media or the internet.

It has become increasingly common as the digital sphere has expanded and technology has advanced.

Image source: UNICEF

Cyberbullying is when someone bullies or harasses others on the internet and in other digital spaces, particularly on social media sites. Harmful bullying behavior can include posting rumors, threats, sexual remarks, a victim's personal information, or pejorative labels (i.e., hate speech) with the intention of causing embarrassment or humiliation. Bullying or harassment can be identified by repeated behavior and an intent to harm. Victims of cyberbullying may experience lower self-esteem, increased suicidal ideation, and a variety of negative emotional responses, such as feeling scared, frustrated, angry, or depressed.


Problem Statement

The internet receives thousands of new posts and comments every day from all over the globe. It is practically impossible for platforms (websites, forums, social media sites, etc.) to moderate these comments manually in order to identify cyberbullying and take appropriate action.


Objective

Your objective is to build a machine learning model that identifies comments that constitute cyberbullying.


Evaluation Criteria

Submissions are evaluated using the F1 score.
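
As a rough illustration of the metric, the sketch below computes F1 as the harmonic mean of precision and recall for a binary task. The page does not say which F1 variant (binary, macro, or weighted) is used, so treating label 1 as "cyberbullying" and scoring only that class is an assumption. The function name `f1_score_binary` and the sample labels are made up for the example.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only (1 = cyberbullying, 0 = not); not competition data.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
print(round(score, 4))
```

Because F1 ignores true negatives, a model that labels every comment as harmless scores 0 even if most comments really are harmless, which is why the metric suits imbalanced problems like this one.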

How do we do it? 

Once we release the data, anyone can download it, build a model, and make a submission. We give competitors a set of data (training data) with both the independent and dependent variables.

We also release another set of data (test dataset) with just the independent variables, and we hide the dependent variable that corresponds with this set. You submit the predicted values of the dependent variable for this set and we compare them against the actual values.

The predictions are evaluated based on the evaluation metric defined in the datathon.
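
The workflow described above (train on labeled data, predict on unlabeled test data, submit the predictions) might look roughly like the sketch below. Everything specific in it is an assumption: the column names `id`, `text`, and `label`, the toy rows, and the submission layout are hypothetical, since this page does not describe the actual files. The small Naive Bayes classifier is only a stand-in baseline, not the organizers' method.

```python
import csv
import io
import math
from collections import Counter

# Toy stand-ins for the released files (hypothetical columns and rows).
train_rows = [
    {"text": "you are so stupid and ugly", "label": 1},
    {"text": "nobody likes you loser", "label": 1},
    {"text": "great game last night", "label": 0},
    {"text": "thanks for sharing this", "label": 0},
]
test_rows = [
    {"id": 1, "text": "you are a loser"},
    {"id": 2, "text": "thanks for the great game"},
]


def tokenize(text):
    # Lowercase and split on whitespace; real text needs better cleaning.
    return text.lower().split()


def train_naive_bayes(rows):
    # Count words per class and documents per class.
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for row in rows:
        class_counts[row["label"]] += 1
        word_counts[row["label"]].update(tokenize(row["text"]))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, text):
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        # Log prior plus log likelihoods with add-one smoothing.
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words unseen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(train_rows)
predictions = [(row["id"], predict(model, row["text"])) for row in test_rows]

# Write the predictions in a hypothetical two-column submission layout.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "label"])
writer.writerows(predictions)
submission_csv = buffer.getvalue()
print(submission_csv)
```

The organizers would then compare the submitted `label` column against the hidden dependent variable for the same rows and compute the evaluation metric.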


 
