Update README.md
README.md
CHANGED
@@ -42,10 +42,6 @@ On top of that, there is one more trick to consider: the overhead from de-quanti
 As a result, the larger batch size you can fit, the more efficient you will train.
 
 
-### Where can I train for free?
-
-You can train fine in Colab, but if you get a K80, it's probably best to switch to other free GPU providers: [kaggle](https://towardsdatascience.com/amazon-sagemaker-studio-lab-a-great-alternative-to-google-colab-7194de6ef69a), [aws sagemaker](https://towardsdatascience.com/amazon-sagemaker-studio-lab-a-great-alternative-to-google-colab-7194de6ef69a) or [paperspace](https://docs.paperspace.com/gradient/more/instance-types/free-instances). For instance, this is the same notebook [running in kaggle](https://www.kaggle.com/justheuristic/dmazur-converted) using a more powerful P100 instance.
-
 
 ### Can I use this technique with other models?
 
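An aside on the context line above: "the larger batch size you can fit, the more efficient you will train" follows from the fact that de-quantizing the 8-bit weights costs the same per forward pass no matter how many samples share that pass. A minimal sketch of the amortization, using a hypothetical stand-in for an 8-bit linear layer (the shapes, names, and timing loop are illustrative and not code from this repo):

```python
# Illustrative stand-in: de-quantize an int8 weight matrix, then matmul.
# The de-quantization term is independent of batch size, so its cost per
# sample shrinks as the batch grows.
import time
import torch

in_features, out_features = 4096, 4096
w_int8 = torch.randint(-127, 127, (out_features, in_features), dtype=torch.int8)
scale = torch.rand(out_features, 1)

def eight_bit_linear(x: torch.Tensor) -> torch.Tensor:
    w = w_int8.float() * scale   # de-quantize: same cost for any batch size
    return x @ w.t()             # matmul: cost scales with the batch size

for batch_size in (1, 8, 64):
    x = torch.randn(batch_size, in_features)
    start = time.perf_counter()
    for _ in range(10):
        eight_bit_linear(x)
    per_sample = (time.perf_counter() - start) / (10 * batch_size)
    print(f"batch={batch_size:3d}  time per sample: {per_sample:.6f}s")
```

On a GPU the per-sample time keeps dropping with batch size until the matmul saturates the device, which is the point of the original remark.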
@@ -54,7 +50,7 @@ The model was converted using [this notebook](https://nbviewer.org/urls/huggingf
 ### How to use
 
 ```sh
-wget https://huggingface.co/mrm8488/bertin-gpt-j-6B-ES-8bit/resolve/main/utils.py -O Utils.py
+wget https://huggingface.co/mrm8488/bertin-gpt-j-6B-ES-v1-8bit/resolve/main/utils.py -O Utils.py
 pip install transformers
 pip install bitsandbytes-cuda111==0.26.0
 ```
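On the "Can I use this technique with other models?" question: the conversion notebook linked in the hunk header above is the authoritative recipe. The underlying idea is to store each linear weight matrix as int8 plus a scale and de-quantize it on the fly; the repo leans on bitsandbytes for this, and the row-wise absmax version below is only a simplified, hypothetical sketch of that idea:

```python
# Simplified sketch of 8-bit weight quantization (not the exact scheme used by
# bitsandbytes, which relies on blockwise dynamic quantization).
import torch

def quantize_rowwise(weight: torch.Tensor):
    """Quantize an fp32/fp16 weight matrix to int8 with one scale per row."""
    scale = weight.abs().max(dim=1, keepdim=True).values / 127.0
    q = torch.clamp((weight / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize_rowwise(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(8, 16)
q, scale = quantize_rowwise(w)
print("max abs error:", (w - dequantize_rowwise(q, scale)).abs().max().item())
```

Keeping a separate scale per row (or per block) keeps the quantization error small, which is the premise behind applying the same recipe to other decoder-only models.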
@@ -65,7 +61,7 @@ import torch
 
 from Utils import GPTJBlock, GPTJForCausalLM
 
-device =
+device = "cuda" if torch.cuda.is_available() else "cpu"
 
 transformers.models.gptj.modeling_gptj.GPTJBlock = GPTJBlock # monkey-patch GPT-J
 
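A note on the monkey-patch kept in this hunk: GPT-J builds its layer stack by looking up `GPTJBlock` on `transformers.models.gptj.modeling_gptj` at construction time, so the patch must be applied before `from_pretrained` is called. A minimal sketch of the required order:

```python
import transformers
from Utils import GPTJBlock  # 8-bit block implementation fetched with wget above

# Patch first: whatever class this module attribute points to when
# from_pretrained() runs is what gets instantiated for every GPT-J layer.
transformers.models.gptj.modeling_gptj.GPTJBlock = GPTJBlock
# ...only then call GPTJForCausalLM.from_pretrained(...), as in the full snippet below.
```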
@@ -76,9 +72,9 @@ model = GPTJForCausalLM.from_pretrained(ckpt, pad_token_id=tokenizer.eos_token_i
 
 
 prompt = tokenizer("El sentido de la vida es", return_tensors='pt')
-
+feats = {key: value.to(device) for key, value in prompt.items()}
 
-out = model.generate(**
+out = model.generate(**feats, max_length=64, do_sample=True)
 
 print(tokenizer.decode(out[0]))
 ```
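For reference, the complete Python snippet as it reads after this change, assembled from the fragments visible in the hunks above. The lines the diff does not show (`import transformers`, the `ckpt` assignment, tokenizer loading, and moving the model to `device`) are assumptions based on the standard transformers API and are flagged in the comments:

```python
import transformers
import torch

from Utils import GPTJBlock, GPTJForCausalLM  # fetched with the wget command above

device = "cuda" if torch.cuda.is_available() else "cpu"

# Patch before loading, so the 8-bit block class is used when the model is built.
transformers.models.gptj.modeling_gptj.GPTJBlock = GPTJBlock  # monkey-patch GPT-J

# Assumed: the checkpoint id matches the repo in the wget URL above.
ckpt = "mrm8488/bertin-gpt-j-6B-ES-v1-8bit"

# Assumed: tokenizer loading and the .to(device) move are not shown in the diff.
tokenizer = transformers.AutoTokenizer.from_pretrained(ckpt)
model = GPTJForCausalLM.from_pretrained(ckpt, pad_token_id=tokenizer.eos_token_id).to(device)

prompt = tokenizer("El sentido de la vida es", return_tensors='pt')
feats = {key: value.to(device) for key, value in prompt.items()}

out = model.generate(**feats, max_length=64, do_sample=True)

print(tokenizer.decode(out[0]))
```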