trollek committed
Commit 1d204a6
1 Parent(s): d203bbf

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -30,6 +30,11 @@ library_name: transformers
 tags:
 - code
 - art
+---
+#### ❗ This model gives up when the input reaches a critical mass of about tree fiddy thousand tokens
+
+I have dun goofed and not tested the [base model](https://huggingface.co/h2oai/h2o-danube-1.8b-chat) enough (and possibly goofed in other ways too), but I'm already training the new one based on [h2oai/h2o-danube2-1.8b-chat](https://huggingface.co/h2oai/h2o-danube2-1.8b-chat). Perhaps S² attn or RoPE scaling will work and make a hella big context window possible? We'll see.
+
 ---
 
 This is [NinjaMouse](https://huggingface.co/trollek/NinjaMouse-2.4B-32L-danube) extended even further. Instead of Cosmopedia I used different coding datasets.
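
The RoPE scaling mentioned in the added note refers to stretching a model's rotary position embeddings so it can attend beyond its trained context length. Below is a minimal sketch of how that is commonly configured at load time with transformers; the linear scaling type, the factor of 2.0, and the generation settings are illustrative assumptions, not the author's actual setup, and `rope_scaling` support for Mistral-architecture models depends on the transformers version in use:

```python
# Illustrative sketch: linear RoPE (position-interpolation) scaling applied
# at load time. The factor is an assumption; roughly, factor=2.0 doubles the
# usable context window, though output quality usually needs fine-tuning at
# the longer length to recover.
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "h2oai/h2o-danube2-1.8b-chat"  # the base model named in the note

config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {"type": "linear", "factor": 2.0}  # assumed settings

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype=torch.bfloat16,
)

prompt = "Explain RoPE scaling in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

S² attention, the other option the note mentions, is presumably the shifted sparse attention scheme from LongLoRA, which targets the same goal (a longer usable context) during fine-tuning rather than at inference time.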