Corianas committed on
Commit
48cf0a7
1 Parent(s): 95c64fd

Update README.md

  1. README.md +32 -0
README.md CHANGED
@@ -1,3 +1,35 @@
  ---
  license: cc-by-nc-sa-4.0
+ datasets:
+ - roneneldan/TinyStories
  ---
+ This is a character-level model (English a-z, 0-9, and so on) trained following Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c) on both TinyStories and a similar internal dataset I made.
+
+ So that it can read and output uppercase letters, this model uses a Shift-key modifier placed before a letter to mark it as uppercase; it has never been trained on actual uppercase letters.
+
+ This modifier is ↨, and here are the functions I use to convert from plain text to the modified format and back.
+ ```
+ def add_caseifer(text):
+     # Using list comprehension for more efficient concatenation:
+     # each uppercase character becomes ↨ followed by its lowercase form
+     return ''.join(['↨' + char.lower() if char.isupper() else char for char in text])
+
+ def remove_caseifer(text):
+     new_text = ""
+     i = 0
+     while i < len(text):
+         if text[i] == "↨":
+             if i+1 < len(text):
+                 # uppercase the character after the marker and consume it
+                 new_text += text[i+1].upper()
+                 i += 1
+             else:
+                 pass  # trailing ↨ with nothing after it: skip this index
+         else:
+             new_text += text[i]
+         i += 1
+     return new_text
+ ```
+
+ As such, for test strings to use in chat, try something like:
+ ```
+ ↨hello, my name is ↨clara and ↨i like
+ ```
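
To illustrate how the two helpers in this README fit together, here is a minimal round-trip sketch. It restates the functions in a compact form (the `SHIFT` constant and the iterator-based `remove_caseifer` are my own rewording, not the author's code) and assumes plain English text, where lowercasing and uppercasing a character are exact inverses.

```python
SHIFT = "↨"  # the modifier character named in the README


def add_caseifer(text: str) -> str:
    """Lowercase each uppercase character and prefix it with the shift marker."""
    return "".join(SHIFT + ch.lower() if ch.isupper() else ch for ch in text)


def remove_caseifer(text: str) -> str:
    """Uppercase the character after each shift marker and drop the marker."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == SHIFT:
            # Consume the next character; a trailing marker yields "" and is dropped.
            out.append(next(chars, "").upper())
        else:
            out.append(ch)
    return "".join(out)


# Encode an ordinary prompt before sending it to the model ...
prompt = add_caseifer("Hello, my name is Clara and I like")
print(prompt)

# ... and decode the model's text back to normal casing afterwards.
restored = remove_caseifer(prompt)
print(restored)
```

Encoding the prompt this way should reproduce the test string shown in the README, and decoding it should give back the original sentence. Characters whose lowercase form is more than one code point would not survive the round trip, which is why the sketch is limited to plain English input.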