Hist Gradient Boosting Classifier

The HistGradientBoostingClassifier model was used to predict the values in the "predictors" column of a dataset consisting of four categorical columns ('Food preference', 'Smoker?', 'Living in?', 'Any hereditary condition?') and eleven numerical columns ('Specific ailments', 'Age', 'BMI', 'Follow Diet', 'Physical activity', 'Regular sleeping hours', 'Alcohol consumption', 'Social interaction', 'Taking supplements', 'Mental health management', 'Illness count last year').

To handle missing values, eight of the numerical columns ('Follow Diet', 'Physical activity', 'Regular sleeping hours', 'Alcohol consumption', 'Social interaction', 'Taking supplements', 'Mental health management', 'Illness count last year') were imputed with the column median using a simple imputer, while missing entries in the remaining columns were filled with the placeholder value 'missing'.

The HistGradientBoostingClassifier is a histogram-based gradient boosting algorithm: it bins continuous feature values into a small number of discrete buckets before searching for splits, which makes training faster and more memory-efficient than exact gradient boosting. This makes it well suited to large datasets with a mix of numerical and categorical features.

After training, the model's performance was evaluated using classification metrics such as accuracy, precision, recall, and F1 score, which indicate how well the classifier predicts the values in the "predictors" column. The predictions were then saved to a CSV file named "submission.csv", which contains the predicted "predictors" value for each row of the input data.

7/14/2023

Tags:  

#machine-learning 

#classification