Example prompt for creating a good classification model
User: This is the training dataset: train_df = pd.read_csv('/content/train.csv')
This is the test dataset, which has no target column (y): test_df = pd.read_csv('/content/test.csv')
Create a model for predicting activity from accelerometer data.
- Use tag_identification, x, y, z to predict activity
- Train the classification model with AutoGluon (TabularPredictor), label='activity', time_budget=600
- Show the evaluation result
- Use the model to predict on test_df
- Export the predictions to CSV with the columns id, activity
These are the columns in train_df:
tag_identification : identification number of the wearable device
x : accelerometer data on the x axis
y : accelerometer data on the y axis
z : accelerometer data on the z axis
activity : the activity being performed {walking,falling,'lying down',lying,'sitting down',sitting,'standing up from lying','on all fours','sitting on the ground','standing up from sitting', etc.}
Assistant: To accomplish your request, follow these steps.
First, let's load the data and check the format:
import pandas as pd
from autogluon.tabular import TabularPredictor
train_df = pd.read_csv('/content/train.csv')
test_df = pd.read_csv('/content/test.csv')
print(train_df.head())
print(test_df.head())
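Printing the first rows shows the format, but a programmatic check can catch a missing column before the training run starts. The following is a minimal sketch, assuming the column names given in the prompt; check_columns is a hypothetical helper, not part of pandas or AutoGluon:

```python
FEATURES = ["tag_identification", "x", "y", "z"]  # feature names taken from the prompt
LABEL = "activity"


def check_columns(train_columns, test_columns, features=FEATURES, label=LABEL):
    """Return a list of readable problems; an empty list means the schema looks fine."""
    problems = []
    train_cols, test_cols = set(train_columns), set(test_columns)
    for col in features:
        if col not in train_cols:
            problems.append(f"train is missing feature column {col!r}")
        if col not in test_cols:
            problems.append(f"test is missing feature column {col!r}")
    if label not in train_cols:
        problems.append(f"train is missing label column {label!r}")
    if label in test_cols:
        problems.append(f"test unexpectedly contains label column {label!r}")
    return problems


# With the DataFrames loaded above, this could be called as:
#   problems = check_columns(train_df.columns, test_df.columns)
#   assert not problems, problems
```

Extra columns, such as a separate row identifier in the test file, are ignored by this check.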
Then we will prepare the predictor:
label = 'activity'
time_budget = 600 # Time in seconds
predictor = TabularPredictor(label=label).fit(
    train_df,
    time_limit=time_budget,
)
After training, we can evaluate our model. The call below scores it on train_df itself, so the numbers will be optimistic:
performance = predictor.evaluate_predictions(
y_true=train_df[label],
y_pred=predictor.predict(train_df),
auxiliary_metrics=True
)
print(performance)
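Scoring on rows the model has already seen tends to overstate accuracy. One common alternative is to hold out part of train_df before calling fit and evaluate on that part instead. The following is a minimal sketch of a stratified split using only the standard library; stratified_holdout_indices is a hypothetical helper, and the 20% holdout fraction and fixed seed are assumptions:

```python
import random
from collections import defaultdict


def stratified_holdout_indices(labels, holdout_frac=0.2, seed=0):
    """Split row positions into (train, holdout) lists with a similar class mix in each."""
    by_class = defaultdict(list)
    for position, label in enumerate(labels):
        by_class[label].append(position)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, holdout_idx = [], []
    for positions in by_class.values():
        rng.shuffle(positions)
        if len(positions) > 1:
            # Hold out at least one row of every class that has more than one row.
            n_holdout = max(1, round(len(positions) * holdout_frac))
        else:
            # A class with a single row stays entirely in the training part.
            n_holdout = 0
        holdout_idx.extend(positions[:n_holdout])
        train_idx.extend(positions[n_holdout:])
    return sorted(train_idx), sorted(holdout_idx)


# Possible use with the DataFrame and settings from this answer:
#   fit_idx, val_idx = stratified_holdout_indices(train_df[label].tolist())
#   predictor = TabularPredictor(label=label).fit(train_df.iloc[fit_idx], time_limit=time_budget)
#   print(predictor.evaluate(train_df.iloc[val_idx]))
```

The commented lines show one way to plug the indices into the training and evaluation steps; the score on the held-out rows is usually a better guide to test performance than the score on the training rows.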
Next, we use the model to predict on test_df:
predictions = predictor.predict(test_df)
Finally, we export predictions to a CSV file:
output_df = pd.DataFrame({'id': test_df['tag_identification'], 'activity': predictions})
output_df.to_csv('submission.csv', index=False)
Please note that this code assumes that 'tag_identification' is the ID column in your test set. If it is not, replace 'tag_identification' with the correct ID column name. Also make sure that the label 'activity' is present in your training set and absent from your test set, as you mentioned.
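If you are unsure which column to use, the choice can be made explicit in code. The following is a minimal sketch, assuming the test file may carry a column literally named 'id' (a guess based on the requested output columns, not something the prompt confirms); choose_id_column is a hypothetical helper:

```python
def choose_id_column(test_columns, preferred="id", fallback="tag_identification"):
    """Return the column to use as the submission id: `preferred` if present, else `fallback`."""
    columns = list(test_columns)
    if preferred in columns:
        return preferred
    if fallback in columns:
        return fallback
    raise KeyError(f"neither {preferred!r} nor {fallback!r} found in {columns}")


# Possible use with the export step from this answer:
#   id_col = choose_id_column(test_df.columns)
#   output_df = pd.DataFrame({'id': test_df[id_col], 'activity': predictions})
```

Raising an error when neither column exists avoids silently writing a submission file with the wrong identifiers.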
Remember to install AutoGluon if you haven't already, by running !pip install autogluon.tabular. The package needs substantial computational resources, so make sure your environment can handle it. The time_budget value, passed to fit as time_limit, caps the training time in seconds; a larger budget may produce a better model.