mrm8488 committed
Commit 61dbb88
1 Parent(s): c2ebb01

Update README.md

Files changed (1): README.md (+4 -8)
README.md CHANGED
@@ -42,10 +42,6 @@ On top of that, there is one more trick to consider: the overhead from de-quanti
  As a result, the larger batch size you can fit, the more efficient you will train.


- ### Where can I train for free?
-
- You can train fine in colab, but if you get a K80, it's probably best to switch to other free gpu providers: [kaggle](https://towardsdatascience.com/amazon-sagemaker-studio-lab-a-great-alternative-to-google-colab-7194de6ef69a), [aws sagemaker](https://towardsdatascience.com/amazon-sagemaker-studio-lab-a-great-alternative-to-google-colab-7194de6ef69a) or [paperspace](https://docs.paperspace.com/gradient/more/instance-types/free-instances). For instance, this is the same notebook [running in kaggle](https://www.kaggle.com/justheuristic/dmazur-converted) using a more powerful P100 instance.
-

  ### Can I use this technique with other models?

@@ -54,7 +50,7 @@ The model was converted using [this notebook](https://nbviewer.org/urls/huggingf
  ### How to use

  ```sh
- wget https://huggingface.co/mrm8488/bertin-gpt-j-6B-ES-8bit/resolve/main/utils.py -O Utils.py
+ wget https://huggingface.co/mrm8488/bertin-gpt-j-6B-ES-v1-8bit/resolve/main/utils.py -O Utils.py
  pip install transformers
  pip install bitsandbytes-cuda111==0.26.0
  ```
@@ -65,7 +61,7 @@ import torch

  from Utils import GPTJBlock, GPTJForCausalLM

- device = 'cuda' if torch.cuda.is_available() else 'cpu'
+ device = "cuda" if torch.cuda.is_available() else "cpu"

  transformers.models.gptj.modeling_gptj.GPTJBlock = GPTJBlock # monkey-patch GPT-J

@@ -76,9 +72,9 @@ model = GPTJForCausalLM.from_pretrained(ckpt, pad_token_id=tokenizer.eos_token_i


  prompt = tokenizer("El sentido de la vida es", return_tensors='pt')
- prompt = {key: value.to(device) for key, value in prompt.items()}
+ feats = {key: value.to(device) for key, value in prompt.items()}

- out = model.generate(**prompt, max_length=64, do_sample=True)
+ out = model.generate(**feats, max_length=64, do_sample=True)

  print(tokenizer.decode(out[0]))
  ```
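
For reference, here is a minimal end-to-end sketch of the "How to use" snippet as it reads after this change (the rename to `feats` keeps the tokenizer's original output from being overwritten). The checkpoint name, the `AutoTokenizer` load, and the `.to(device)` call on the model are assumptions standing in for lines elided by the hunk context; everything else follows the diff.

```py
# Minimal sketch of the updated usage code; assumptions are noted inline.
import transformers
import torch

from Utils import GPTJBlock, GPTJForCausalLM  # Utils.py fetched by the wget command above

device = "cuda" if torch.cuda.is_available() else "cpu"

# Monkey-patch GPT-J so the 8-bit block implementation is used when the model is loaded.
transformers.models.gptj.modeling_gptj.GPTJBlock = GPTJBlock

ckpt = "mrm8488/bertin-gpt-j-6B-ES-v1-8bit"  # assumption: matches the wget URL above

tokenizer = transformers.AutoTokenizer.from_pretrained(ckpt)  # assumption: elided by the hunk context
model = GPTJForCausalLM.from_pretrained(ckpt, pad_token_id=tokenizer.eos_token_id).to(device)  # .to(device) is an assumption

prompt = tokenizer("El sentido de la vida es", return_tensors="pt")
feats = {key: value.to(device) for key, value in prompt.items()}  # move input tensors to the model's device

out = model.generate(**feats, max_length=64, do_sample=True)
print(tokenizer.decode(out[0]))
```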