ebisuke committed on
Commit dbf525e • 1 Parent(s): b952029

applied very weak PPO

README.md CHANGED
@@ -25,8 +25,8 @@ __This model is under development, so as the dataset is updated it will successively
 
 Wrap the user's input in "`相手は言いました。「（内容）」\n`" ("The other person said: 「(content)」").
 The model generates the text that follows "`あなたは言いました。「`" ("You said: 「").
-Generation may continue past that point, so truncate the output at the "`」`" character as needed.
-
+Generation may continue past that point, so truncate the output at the "`」`" character as needed.
+Note that long inputs cause the model to drop its speaking style.
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
@@ -53,6 +53,6 @@ print(output)
 ```
 
 ## Plan
-- Try RLHF and similar techniques.
+- Try RLHF and similar techniques. → 23/05/30: trial run on a very small dataset
 - Considering whether to align the prompt format with that of existing chat models.
-- Because I prefer a model that does not follow instructions very well and does not know much, there are no plans to switch to an instruct model.
+- Because I prefer a model that does not follow instructions very well and does not know much, there are no plans to switch to an instruction model.
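Reviewer note: the prompt convention described in the README change can be sketched in plain Python. This is a minimal illustration and not the README's own code, whose lines 33 to 52 are not shown in this diff. The helper names `build_prompt` and `truncate_reply` are hypothetical. The sketch also assumes that the model prefix follows the wrapped user line directly, and that the closing `」` itself is dropped from the reply.

```python
# Sketch of the prompt convention from the README diff.
# build_prompt and truncate_reply are hypothetical helper names.

USER_TEMPLATE = "相手は言いました。「{content}」\n"  # wraps the user's input
MODEL_PREFIX = "あなたは言いました。「"              # the model continues from here
CLOSING_QUOTE = "」"                                 # marks the end of the reply


def build_prompt(user_text: str) -> str:
    """Wrap the user's input and append the prefix the model completes."""
    return USER_TEMPLATE.format(content=user_text) + MODEL_PREFIX


def truncate_reply(generated: str) -> str:
    """Keep only the text before the first closing quote, if there is one."""
    return generated.split(CLOSING_QUOTE, 1)[0]


prompt = build_prompt("こんにちは")
# Assumption: `generated` holds only the newly generated text, not the prompt.
reply = truncate_reply("こんにちは!」\n相手は言いました。「")
```

Here `reply` keeps only the first utterance, so any further turns the model invents after the closing quote are discarded.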
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "rinna/japanese-gpt-neox-3.6b",
+  "_name_or_path": "./model",
   "architectures": [
     "GPTNeoXForCausalLM"
   ],
@@ -18,7 +18,7 @@
   "rotary_emb_base": 10000,
   "rotary_pct": 1.0,
   "tie_word_embeddings": false,
-  "torch_dtype": "float32",
+  "torch_dtype": "bfloat16",
   "transformers_version": "4.29.2",
   "use_cache": false,
   "use_parallel_residual": false,
generation_config.json CHANGED
@@ -2,5 +2,6 @@
   "_from_model_config": true,
   "bos_token_id": 2,
   "eos_token_id": 3,
-  "transformers_version": "4.29.2"
+  "transformers_version": "4.29.2",
+  "use_cache": false
 }
pytorch_model-00001-of-00002.bin DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5e772cd62824a354b83ef5b9adaaf15c5318eb295a1a492f26fa71e3131dc629
-size 10084370522
pytorch_model-00002-of-00002.bin → pytorch_model.bin RENAMED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fb50e2da577968b14a719d3427a9e8b4c6903ecb9e75d3cae8fa7d31e3c04d33
-size 4495809824
+oid sha256:c671e9bffda9561d587ebea6c7db1e525f448c878f7161dc6afe4e5fa38a8f0e
+size 7365693557
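Reviewer note: the size change in the weight files is consistent with the `torch_dtype` switch from `float32` to `bfloat16` in config.json, which halves the bytes per floating-point parameter. The arithmetic below uses only the sizes from the Git LFS pointers in this commit. It assumes the two old shards together made up the previous float32 checkpoint, and it does not try to explain why the ratio is not exactly 2.

```python
# Sizes in bytes, copied from the Git LFS pointers in this commit.
old_shards = [10084370522, 4495809824]  # assumed: the two float32 shards
new_file = 7365693557                   # pytorch_model.bin after the change

old_total = sum(old_shards)
ratio = old_total / new_file

print(f"old total: {old_total / 1e9:.2f} GB")
print(f"new file:  {new_file / 1e9:.2f} GB")
print(f"ratio:     {ratio:.2f}")
```

The ratio comes out at about 1.98, close to the factor of 2 expected when 4-byte values are replaced by 2-byte values.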
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7894a57ab5a12479f5b66c77c83da2d822478150971a3c88cd3e6d57d86dfe7d
+oid sha256:31f3c7a9f38deef824e2176f0bf4bf4c7366b8d80dc4ab54a0226cbdc004f351
 size 3899