Network Activity Anomaly Detection, by Consulting and Analytics Club, IIT Guwahati

The following code performs these steps:
Data Loading: Loads training, test, and sample submission datasets using pandas.
Data Preprocessing:
  - Identifies numerical and categorical columns.
  - Creates preprocessing pipelines for both numerical (imputation and scaling) and categorical (imputation and one-hot encoding) data.
Model Training and Validation:
  - Splits the training data into training and validation sets.
  - Trains a RandomForestClassifier on the training split.
  - Validates the model using accuracy score and classification report.
Final Training and Prediction:
  - Retrains the model on the entire training data.
  - Makes predictions on the test data.
Submission: Saves the predictions to 'Submission.csv'.
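
The steps above can be sketched roughly as follows with pandas and scikit-learn. This is a minimal illustration, not the club's actual notebook. The function names (build_pipeline, validate_and_predict), the imputation strategies (median and most frequent), the 80/20 split, the random seed, and the input file names and column names in the closing comments are all assumptions. Only RandomForestClassifier, the accuracy score, the classification report, and 'Submission.csv' come from the description.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(X: pd.DataFrame) -> Pipeline:
    """Preprocessing plus classifier for the column types found in X."""
    # Identify numerical and categorical columns by dtype.
    numeric_cols = X.select_dtypes(include="number").columns.tolist()
    categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

    # Numerical columns: imputation, then scaling (median is an assumption).
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical columns: imputation, then one-hot encoding.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestClassifier(n_estimators=100, random_state=42)),
    ])


def validate_and_predict(train: pd.DataFrame, test: pd.DataFrame, target: str):
    """Validate on a hold-out split, retrain on all rows, predict on test."""
    X, y = train.drop(columns=[target]), train[target]

    # Hold out 20% of the training data for validation (split size assumed).
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = build_pipeline(X)
    model.fit(X_tr, y_tr)

    val_pred = model.predict(X_val)
    accuracy = accuracy_score(y_val, val_pred)
    report = classification_report(y_val, val_pred, zero_division=0)

    # Retrain on the entire training data before predicting on the test set.
    model.fit(X, y)
    return accuracy, report, model.predict(test[X.columns])


# Hypothetical usage; the input file names and column names are assumptions:
#   train = pd.read_csv("train.csv")
#   test = pd.read_csv("test.csv")
#   acc, report, preds = validate_and_predict(train, test, target="label")
#   pd.DataFrame({"prediction": preds}).to_csv("Submission.csv", index=False)
```

Wrapping the preprocessing and the classifier in a single Pipeline means the imputers, scaler, and encoder are fitted only on the rows passed to fit, so the validation split does not leak into the preprocessing statistics.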

Machine Learning Pipeline
