This is the base model of Algae-550M.
This model was trained on a 35GB dataset in bf16 precision for 1.8 epochs. It performs well at question answering, scoring up to 45.2 on TruthfulQA (mc2), compared with 40.6 for GPT-2. Other metrics are in line with models of comparable training data and parameter count.
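For readers unfamiliar with the mc2 metric, the sketch below shows how such a score is commonly computed: for each question, the model's probability mass on the true reference answers is divided by the mass on all reference answers, and the result is averaged over questions and reported as a percentage. This is not the evaluation code used for this model; the function names `mc2_score` and `mean_mc2` are hypothetical, and the log-likelihoods in the usage example are made-up illustrative values, not measurements.

```python
import math


def mc2_score(log_likelihoods, is_true):
    """Return the normalized probability mass on true answers for one question.

    log_likelihoods: model log-likelihood of each reference answer.
    is_true: parallel list of booleans marking which answers are true.
    """
    if len(log_likelihoods) != len(is_true):
        raise ValueError("log_likelihoods and is_true must have equal length")
    if not log_likelihoods:
        raise ValueError("at least one answer is required")
    # Convert log-likelihoods to (unnormalized) probabilities.
    probs = [math.exp(ll) for ll in log_likelihoods]
    total = sum(probs)
    true_mass = sum(p for p, t in zip(probs, is_true) if t)
    return true_mass / total


def mean_mc2(questions):
    """Average per-question mc2 over a dataset, scaled to a percentage."""
    scores = [mc2_score(lls, labels) for lls, labels in questions]
    return 100.0 * sum(scores) / len(scores)
```

A usage example with illustrative values: `mc2_score([math.log(0.3), math.log(0.1)], [True, False])` puts three quarters of the mass on the true answer, so a model that scored every question this way would receive an mc2 of about 75.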
This model was trained on open-source datasets, and all work was completed solely by the author. The author is currently a high school student without formal systematic training, so questions and suggestions are welcome.
Note that the version released here is not necessarily the one that performed best in testing; it is a version with better overall language comprehension.