Corianas
/

Microllama_Char_100k_step

Corianas committed on Apr 7

Commit

48cf0a7

•

1 Parent(s): 95c64fd

Update README.md

Files changed (1)

README.md +32 -0

@@ -1,3 +1,35 @@
 ---
 license: cc-by-nc-sa-4.0
 ---

 ---
 license: cc-by-nc-sa-4.0
+datasets:
+- roneneldan/TinyStories
 ---
+This is a character-level model (English a-z, 0-9 and so on) trained following Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c) on both TinyStories and a similar internal dataset of my own.
+To read and output uppercase letters, the model uses a Shift-key modifier placed before the letter that should become uppercase; it has never been trained on actual uppercase letters.
+The modifier is ↨, and these are the functions I use to convert plain text to the modified format and back.
+```
+def add_caseifer(text):
+    # Prefix each uppercase letter with ↨ and lowercase it.
+    # Using list comprehension for more efficient concatenation
+    return ''.join(['↨' + char.lower() if char.isupper() else char for char in text])
+
+def remove_caseifer(text):
+    # Uppercase the character that follows each ↨ and drop the ↨ itself.
+    new_text = ""
+    i = 0
+    while i < len(text):
+        if text[i] == "↨":
+            if i+1 < len(text):
+                new_text += text[i+1].upper()
+                i += 1
+            else:
+                pass  # trailing ↨ with nothing after it: drop it
+        else:
+            new_text += text[i]
+        i += 1
+    return new_text
+```
+For test strings to use in chat, try something like:
+```
+↨hello, my name is ↨clara and ↨i like
+```
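
As a rough illustration of how the two helpers fit together, the sketch below restates them in self-contained form and round-trips a prompt: plain text is encoded before it goes to the model, and model output is decoded afterwards. The `prompt` string and variable names are illustrative only, and no model call is shown. It assumes the input text does not already contain ↨.

```python
SHIFT = "↨"  # the Shift-key modifier described in the README


def add_caseifer(text):
    # Prefix each uppercase letter with the modifier and lowercase it.
    return "".join(SHIFT + ch.lower() if ch.isupper() else ch for ch in text)


def remove_caseifer(text):
    # Uppercase the character after each modifier and drop the modifier.
    new_text = ""
    i = 0
    while i < len(text):
        if text[i] == SHIFT:
            if i + 1 < len(text):
                new_text += text[i + 1].upper()
                i += 1
            # a trailing modifier with nothing after it is dropped
        else:
            new_text += text[i]
        i += 1
    return new_text


# Illustrative prompt; encode it before sending it to the model.
prompt = "Hello, my name is Clara and I like"
encoded = add_caseifer(prompt)
print(encoded)

# Decode model output (here just the encoded prompt) back to normal text.
decoded = remove_caseifer(encoded)
print(decoded)
```

The encoded form of this prompt is the same test string given above, and decoding it restores the original capitalization.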