Machine Learning Pipeline

The following code performs these steps:

Data Loading: Loads the training, test, and sample submission datasets using pandas.

Data Preprocessing: Identifies the numerical and categorical columns, then builds a preprocessing pipeline for each: imputation and scaling for numerical data, imputation and one-hot encoding for categorical data.

Model Training and Validation: Splits the training data into training and validation sets, trains a RandomForestClassifier on the training split, and validates it with an accuracy score and a classification report.

Final Training and Prediction: Retrains the model on the entire training data and makes predictions on the test data.

Submission: Saves the predictions to 'Submission.csv'.
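The steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original snippet: the column names `id` and `target`, the imputation strategies, the 80/20 split, and the forest settings are all hypothetical choices, since the description does not specify them. The sample submission file is left out because its layout is not described.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(X):
    """Build a preprocessing + RandomForest pipeline from the column dtypes of X."""
    # Identify numerical and categorical columns by dtype.
    numerical_cols = X.select_dtypes(include="number").columns.tolist()
    categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

    # Numerical data: imputation, then scaling (median strategy is an assumption).
    numerical_pipe = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    # Categorical data: imputation, then one-hot encoding
    # (most_frequent strategy is an assumption).
    categorical_pipe = Pipeline([
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocessor = ColumnTransformer([
        ("num", numerical_pipe, numerical_cols),
        ("cat", categorical_pipe, categorical_cols),
    ])
    # n_estimators and random_state are illustrative values.
    return Pipeline([
        ("preprocessor", preprocessor),
        ("classifier", RandomForestClassifier(n_estimators=100, random_state=42)),
    ])


def run(train_df, test_df, target="target", id_col="id", out_path="Submission.csv"):
    """Validate on a hold-out split, retrain on all data, and save predictions."""
    # 'target' and 'id' are hypothetical column names.
    X = train_df.drop(columns=[target, id_col])
    y = train_df[target]

    # Hold out 20% of the training data for validation (split size is an assumption).
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = build_pipeline(X)
    model.fit(X_train, y_train)
    val_pred = model.predict(X_val)
    val_accuracy = accuracy_score(y_val, val_pred)
    print("Validation accuracy:", val_accuracy)
    print(classification_report(y_val, val_pred, zero_division=0))

    # Retrain on the entire training data, then predict on the test data.
    model.fit(X, y)
    submission = pd.DataFrame({
        id_col: test_df[id_col],
        target: model.predict(test_df[X.columns]),
    })
    submission.to_csv(out_path, index=False)
    return val_accuracy, submission
```

With real data, the two frames would come from `pd.read_csv` (for example `run(pd.read_csv("train.csv"), pd.read_csv("test.csv"))`, where both file names are placeholders). Keeping the preprocessing inside the same `Pipeline` as the classifier means the imputers, scaler, and encoder are fitted only on the data passed to `fit`, so the validation split does not leak into them.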

7/5/2024

Tags: #python #machine-learning