ClinBAY / pbc_complication_model

RandomForestClassifier

pbc_complication_model / RandomForestClassifier_Pipeline_explanation.txt

michalisG

adding model

861a9a7 3 months ago

	The Pipeline uses SimpleImputer to fill in the missing values of the data set before the data is passed to the model.
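
	As a minimal sketch of the imputation step, the example below runs scikit-learn's SimpleImputer on a small made-up array. The data and the strategy="mean" setting are assumptions for illustration; this file does not say which strategy the Pipeline uses.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy numeric data with missing entries (hypothetical, not the PBC data set).
X = np.array([[1.0, 10.0],
              [np.nan, 20.0],
              [3.0, np.nan]])

# strategy="mean" is an assumption; the Pipeline's actual strategy is not stated.
imputer = SimpleImputer(strategy="mean")
X_imputed = imputer.fit_transform(X)

# Each NaN is replaced by the mean of the observed values in its column:
# column 0 gets (1 + 3) / 2 = 2.0, column 1 gets (10 + 20) / 2 = 15.0.
print(X_imputed)
```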

	The Pipeline uses one-hot encoding to encode the categorical values of the data set before they are passed to the model. Most models require numerical input, and one-hot encoding converts each category into its own 0/1 indicator column.
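
	The encoding step can be sketched with scikit-learn's OneHotEncoder. The column values below are made up, and handle_unknown="ignore" is an assumed setting, not one confirmed by this file.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column with two categories.
X = np.array([["F"], ["M"], ["F"]])

# handle_unknown="ignore" is an assumption: unseen categories at predict time
# are encoded as all zeros instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")

# The encoder returns a sparse matrix by default; convert it for display.
X_encoded = encoder.fit_transform(X).toarray()

print(encoder.categories_)  # learned categories, sorted: F, then M
print(X_encoded)            # one indicator column per category
```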

	Many machine learning algorithms perform better or converge faster when features are on a relatively similar scale and/or close to normally distributed. This Pipeline uses StandardScaler, which standardizes each numerical column so that it has mean 0 and standard deviation 1, i.e., each value is transformed by subtracting the column mean and dividing by the column standard deviation. Standardization changes the location and scale of a column but not the shape of its distribution.
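
	To make the scaling arithmetic concrete, the sketch below compares scikit-learn's StandardScaler with the same calculation done by hand on a made-up column. The input values are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical single numeric column.
X = np.array([[2.0], [4.0], [6.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# The same transformation by hand: z = (x - mean) / std.
# StandardScaler uses the population standard deviation (ddof=0),
# which is also the NumPy default.
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled.ravel())
print(np.allclose(X_scaled, manual))
```

	After scaling, the column has mean 0 and standard deviation 1, but its values keep their relative spacing.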

	This Pipeline ends with a RandomForestClassifier model. This model was chosen because the user selected the "Accuracy" option and the machine learning problem is a classification problem.
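
	One way the steps described in this file could be assembled is sketched below. The synthetic data, the column assignment, the step names and all settings are assumptions; the actual Pipeline object in this repository may be organized differently. To keep the example free of extra dependencies, the categorical column is stored as numeric codes.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data (hypothetical): two numeric columns and one categorical
# column stored as the codes 0, 1, 2.
rng = np.random.default_rng(0)
n = 60
numeric = rng.normal(size=(n, 2))
category = rng.integers(0, 3, size=(n, 1)).astype(float)
y = (numeric[:, 0] > 0).astype(int)  # label depends on the first column
numeric[::10, 1] = np.nan            # introduce some missing values
X = np.hstack([numeric, category])

# Numeric columns: impute, then scale. Categorical column: one-hot encode.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # assumed strategy
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, [0, 1]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), [2]),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])
model.fit(X, y)
predictions = model.predict(X)
print(model.score(X, y))  # accuracy on the training data
```

	Wrapping the preprocessing and the classifier in one Pipeline means the same imputation, encoding and scaling learned during training are applied automatically at prediction time.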

	Grid Search hyper-parameter tuning was used in this Pipeline because the parameter list contained 9 or fewer entries, which is small enough for an exhaustive Grid Search to be run.
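
	An exhaustive grid search of this kind can be sketched with scikit-learn's GridSearchCV. The parameter grid, the synthetic data and cv=3 are assumptions; the grid actually searched for this model is not listed in this file. scoring="accuracy" is used here to mirror the "Accuracy" option mentioned above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic classification data (hypothetical).
X, y = make_classification(n_samples=120, n_features=6, random_state=0)

# Hypothetical grid: 2 x 2 = 4 combinations, each evaluated exhaustively.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,  # assumed number of cross-validation folds
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # mean cross-validated accuracy of the best combination
```

	Because every combination is fitted once per fold, the cost grows with the product of the list lengths, which is why an exhaustive search is only practical for small grids.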

	Columns that have been removed from the training features:
	target (this is the target column)