---
license: apache-2.0
language:
- en
tags:
- sft
pipeline_tag: text-generation
widget:
- text: You are a helpful assistant model trained by LAION called AkiHi, how are you?
- text: What's the Earth total population
- text: Write a story about future of AI development
---

# Pythia 3B SFT model revision 1

# Model Details

## Model Description

The model was supervised fine-tuned only on data from the [Open Assistant](https://open-assistant.io/) crowdsourcing platform.

- **Developed by:** Open Assistant
- **Model type:** Pythia
- **Language(s) (NLP):** English
- **License:** Apache-2.0

## Model Sources [optional]

- **Repository:** [Open Assistant](https://github.com/LAION-AI/Open-Assistant)

# Uses

## Direct Use

See the widget examples on the right.

# Bias, Risks, and Limitations

See the [Pythia model card](https://huggingface.co/EleutherAI/pythia-12b#out-of-scope-use) for out-of-scope uses and limitations.

## Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "theblackcat102/pythia-3b-deduped-sft-r1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load in half precision on the GPU, in inference mode.
model = AutoModelForCausalLM.from_pretrained(model_name).half().eval().cuda()

# Illustrative sampling settings; these are assumptions, tune them for your use case.
max_new_tokens = 128
top_k = 50
temperature = 0.7

input_text = "What's the earth population?"
inputs = tokenizer(input_text, return_tensors="pt", padding=True).to(0)
outputs = model.generate(
    **inputs,
    early_stopping=True,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    top_k=top_k,
    temperature=temperature,
    pad_token_id=tokenizer.eos_token_id,
    # dialogue_collator.py line 36
)
# truncate_before_pattern is only honored by tokenizers that implement it;
# other tokenizers may ignore it.
output = tokenizer.decode(outputs[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"])
print(output)
```

# Training Details

## Training Data

## Training Procedure

```
deepspeed trainer_sft.py --configs defaults pythia-1-4b-ost --deepspeed
```

This model was trained for 200 iterations.
After 200 iterations, accuracy started to drop and loss started to increase, which is a sign of overfitting.

### Training Hyperparameters

```
defaults:
  learning_rate: 1e-5
  gradient_checkpointing: false
  gradient_accumulation_steps: 32
  per_device_train_batch_size: 2
  per_device_eval_batch_size: 2
  weight_decay: 0.00
  warmup_steps: 600
  eval_steps: 250
  save_steps: 250
  max_length: 512
  num_train_epochs: 2
  logging_steps: 10
  max_grad_norm: 2.0
  save_total_limit: 4
  fp16: true
  eval_accumulation_steps:
  freeze_layer:
  datasets:
    - oa_private:
        data_path: .cache
        split: sft
        val_split: 0.01
        fraction: 1
        file: 2023-02-26_oasst_default.jsonl
  cache_dir: .cache
  loss_fn: CrossEntropyLoss
  eval_size:
  log_dir: "base"
  quantization: false
  seq2seqmodel: false
  poly_eps: 1.0
  fuse_gelu: false
  log_wandb: true
  samples_mixing: true # uses collator that mixes samples in the batch to create a single sample with possible multiple tasks within
  verbose: false

pythia-1-4b-ost:
  learning_rate: 1e-6
  model_name: EleutherAI/pythia-1.4b-deduped
  weight_decay: 0.01
  max_length: 1024
  warmup_steps: 100
  gradient_checkpointing: false
  gradient_accumulation_steps: 12
  per_device_train_batch_size: 5
  per_device_eval_batch_size: 6
  eval_steps: 100
  save_steps: 100
  num_train_epochs: 50
  save_total_limit: 4
```

# Evaluation

## Testing Data, Factors & Metrics

### Testing Data

[More Information Needed]

### Factors

[More Information Needed]

### Metrics

[More Information Needed]

## Results

### Summary

# Model Examination [optional]

[More Information Needed]

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

# Technical Specifications [optional]

## Model Architecture and Objective

[More Information Needed]

## Compute Infrastructure

[More Information Needed]

### Hardware

[More Information Needed]

### Software

[More Information Needed]

# Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

# Glossary [optional]

[More Information Needed]

# Acknowledgements

- [LAION](https://laion.ai/) & EleutherAI
- [Stability.ai](https://stability.ai/): this project wouldn't be possible without their compute resources
- [Teams and contributors at Open Assistant](https://github.com/LAION-AI/Open-Assistant/graphs/contributors): who put their time into this project after their day jobs
- [Huggingface](https://huggingface.co/): for the storage and Spaces here

# Model Card Authors [optional]

[More Information Needed]

# Model Card Contact

[More Information Needed]
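
# Appendix: Reading the Training Configuration

The training command passes two config sections, `defaults` and `pythia-1-4b-ost`. The sketch below shows one plausible way to read them together, assuming sections listed later override earlier ones key by key. That merge order is an assumption here, not something this card confirms. `merge_configs` is a hypothetical helper, and only a few keys from the hyperparameter listing are reproduced. It also derives the number of sequences per optimizer step on a single device as `per_device_train_batch_size * gradient_accumulation_steps`. The GPU count is not stated in this card, so no global batch size is computed.

```python
# Hypothetical helper: later sections override earlier ones (assumed merge order
# for `--configs defaults pythia-1-4b-ost`).
def merge_configs(*sections):
    merged = {}
    for section in sections:
        merged.update(section)
    return merged


# A subset of keys copied from the hyperparameter listing in this card.
defaults = {
    "learning_rate": 1e-5,
    "max_length": 512,
    "warmup_steps": 600,
    "gradient_accumulation_steps": 32,
    "per_device_train_batch_size": 2,
}
pythia_1_4b_ost = {
    "learning_rate": 1e-6,
    "max_length": 1024,
    "warmup_steps": 100,
    "gradient_accumulation_steps": 12,
    "per_device_train_batch_size": 5,
}

config = merge_configs(defaults, pythia_1_4b_ost)

# Sequences contributing to one optimizer step on a single device.
# Multiply by the number of GPUs (unknown here) for the global figure.
sequences_per_step_per_device = (
    config["per_device_train_batch_size"] * config["gradient_accumulation_steps"]
)

print(config["learning_rate"], config["max_length"], sequences_per_step_per_device)
```

Under these assumptions the effective values come from the `pythia-1-4b-ost` section wherever it defines a key, and from `defaults` otherwise.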