Machine Learning Pipeline
The following code performs:
Data Loading: Loads training, test, and sample submission datasets using pandas.
Data Preprocessing: Identifies numerical and categorical columns. Creates preprocessing pipelines for both numerical (imputation and scaling) and categorical (imputation and one-hot encoding) data.
Model Training and Validation: Splits the training data into training and validation sets. Trains a RandomForestClassifier on the training split. Validates the model using accuracy score and classification report.
Final Training and Prediction: Retrains the model on the entire training data. Makes predictions on the test data.
Submission: Saves the predictions to 'Submission.csv'.
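The steps above can be sketched roughly as follows, assuming scikit-learn for the preprocessing and model. The dataset file names and column names are not given here, so this sketch builds a small synthetic dataset instead of loading CSV files. The column names (`id`, `age`, `income`, `city`, `target`), the helper names (`make_demo_frames`, `build_model`, `run_pipeline`), the imputation strategies, and the 80/20 split are all illustrative assumptions, not the original code.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_demo_frames(n_train=200, n_test=50, seed=0):
    """Stand-in for pd.read_csv(...): synthetic train/test frames (hypothetical columns)."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    df = pd.DataFrame({
        "id": np.arange(n),
        "age": rng.integers(18, 70, n).astype(float),
        "income": rng.normal(50_000, 15_000, n),
        "city": rng.choice(["A", "B", "C"], n).astype(object),
    })
    df["target"] = (df["age"] + rng.normal(0, 5, n) > 45).astype(int)
    # Inject missing values so the imputers have something to do.
    df.loc[df.index[::10], "age"] = np.nan
    df.loc[df.index[::15], "city"] = np.nan
    train = df.iloc[:n_train].reset_index(drop=True)
    test = df.iloc[n_train:].drop(columns="target").reset_index(drop=True)
    return train, test


def build_model(X):
    """Preprocessing (impute + scale / impute + one-hot) followed by a random forest."""
    num_cols = X.select_dtypes(include="number").columns.tolist()
    cat_cols = X.select_dtypes(exclude="number").columns.tolist()
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed strategy
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # assumed strategy
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, num_cols),
        ("cat", categorical, cat_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestClassifier(n_estimators=100, random_state=42)),
    ])


def run_pipeline(train, test, target="target", id_col="id", out_path=None):
    X = train.drop(columns=[target, id_col])
    y = train[target]

    # Hold out 20% of the training data for validation (assumed split size).
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    model = build_model(X)
    model.fit(X_tr, y_tr)
    val_pred = model.predict(X_val)
    acc = accuracy_score(y_val, val_pred)
    print(f"Validation accuracy: {acc:.3f}")
    print(classification_report(y_val, val_pred))

    # Retrain on the full training data, then predict on the test set.
    model.fit(X, y)
    test_pred = model.predict(test.drop(columns=[id_col]))
    submission = pd.DataFrame({id_col: test[id_col], target: test_pred})
    if out_path is not None:
        submission.to_csv(out_path, index=False)
    return acc, submission


train_df, test_df = make_demo_frames()
val_accuracy, submission = run_pipeline(train_df, test_df)
```

To write the file described above, pass `out_path="Submission.csv"` to `run_pipeline`. With real data, `make_demo_frames` would be replaced by `pd.read_csv` calls on the actual training and test files, and the submission columns would need to match the sample submission file.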
Tags:
#python
#machine-learning