airesearch/WangchanLion7B

wannaphong committed on Dec 27, 2023

Commit 441a68e (1 parent: ecc5ee8)

Update README.md

Files changed (1)

README.md +12 -2

@@ -3,6 +3,17 @@ license: apache-2.0
 language:
 - th
 - en
 ---
 # Model Card for WangChanLion 7B - The Multilingual Instruction-Following Model
@@ -96,5 +107,4 @@ We performed human and machine evaluations on XQuAD zero-shot and one-shot setti
 # What WangchanLion offers:
 - Transparent pretrained model: The development of SEA-LION is community-driven, with different ASEAN collaborators contributing pretraining datasets. The SEA-LION developers ensure that all datasets are safe and can be utilized without commercial restrictions. This transparency extends to the provision of pretraining code, ensuring anyone can replicate SEA-LION using the provided datasets.
 - Transparent finetuning data: In the spirit of open science, we make the finetuning data for WangchanLion accessible to all. This commitment to openness empowers the community by providing complete visibility into the instruction finetuning data that shapes WangchanLion.
-- Transparent finetuning code: The finetuning code for WangchanLion is readily available for distribution. By sharing our methods and processes, we invite others to learn from, build upon, and innovate alongside us.

 language:
 - th
 - en
+datasets:
+- laion/OIG
+- databricks/databricks-dolly-15k
+- thaisum
+- scb_mt_enth_2020
+- garage-bAInd/Open-Platypus
+- iapp_wiki_qa_squad
+- pythainlp/han-instruct-dataset-v1.0
+- cognitivecomputations/dolphin
+- Hello-SimpleAI/HC3
+- Muennighoff/xP3x
 ---
 # Model Card for WangChanLion 7B - The Multilingual Instruction-Following Model
 # What WangchanLion offers:
 - Transparent pretrained model: The development of SEA-LION is community-driven, with different ASEAN collaborators contributing pretraining datasets. The SEA-LION developers ensure that all datasets are safe and can be utilized without commercial restrictions. This transparency extends to the provision of pretraining code, ensuring anyone can replicate SEA-LION using the provided datasets.
 - Transparent finetuning data: In the spirit of open science, we make the finetuning data for WangchanLion accessible to all. This commitment to openness empowers the community by providing complete visibility into the instruction finetuning data that shapes WangchanLion.
+- Transparent finetuning code: The finetuning code for WangchanLion is readily available for distribution. By sharing our methods and processes, we invite others to learn from, build upon, and innovate alongside us.
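
The `datasets:` entries added by this commit live in the README's YAML front matter, so they can be read back programmatically. The following is a minimal sketch, not the Hub's own parser: `read_front_matter_list` is a hypothetical helper, it handles only the flat `key:` plus `- item` list shape shown in the diff, and the sample README assumes the front matter opens with a `---` line, which the hunk does not show.

```python
def read_front_matter_list(readme_text: str, key: str) -> list[str]:
    """Return the items of a top-level `key:` list in the leading '---' block.

    Hypothetical helper: supports only flat lists written as `- item` lines
    directly under `key:`, as in the diff above. It is not a YAML parser.
    """
    lines = readme_text.splitlines()
    # Front matter must start on the first line (assumed, not shown in the hunk).
    if not lines or lines[0].strip() != "---":
        return []
    items = []
    in_key = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the front matter
            break
        if in_key and stripped.startswith("- "):
            items.append(stripped[2:].strip())
        else:
            # Any other line either opens the requested key or closes it.
            in_key = stripped == f"{key}:"
    return items


# Sample README reconstructed from the "+" side of the diff.
README = """---
license: apache-2.0
language:
- th
- en
datasets:
- laion/OIG
- databricks/databricks-dolly-15k
- thaisum
- scb_mt_enth_2020
- garage-bAInd/Open-Platypus
- iapp_wiki_qa_squad
- pythainlp/han-instruct-dataset-v1.0
- cognitivecomputations/dolphin
- Hello-SimpleAI/HC3
- Muennighoff/xP3x
---
# Model Card for WangChanLion 7B - The Multilingual Instruction-Following Model
"""

datasets = read_front_matter_list(README, "datasets")
languages = read_front_matter_list(README, "language")
print(len(datasets), "datasets declared:", datasets)
print("languages:", languages)
```

For real model cards, a full YAML parser is the safer choice, since front matter can contain nested mappings and quoted values that this sketch ignores.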