Glitches, Missing Data in tokenizer.json file of all-MiniLM-L6-v2

#2
by MartialTerran - opened

There seem to be glitches (missing or doubled data inside the quotes) in your tokenizer.json file:
"լ": 1226,
"կ": 1227,
"հ": 1228,
"մ": 1229,
"յ": 1230,
"ն": 1231,
"ո": 1232,
"պ": 1233,
"ս": 1234,
"վ": 1235,
"տ": 1236,
"ր": 1237,
"ւ": 1238,
"ք": 1239, <------ GLITCH? Missing data in quotes?
"־": 1240, <------ GLITCH? Missing data in quotes?
"א": 1241,
"ב": 1242,
"ג": 1243,
"ד": 1244,
"ה": 1245,
"ו": 1246,
"ז": 1247,
"ח": 1248,
"ט": 1249,
"י": 1250,
"ך": 1251,
"כ": 1252,
"ל": 1253,
"ם": 1254,
"מ": 1255,
"ן": 1256,
"נ": 1257,
"ס": 1258,
"ע": 1259,
"ף": 1260,
"פ": 1261,
"ץ": 1262,
"צ": 1263,
"ק": 1264,
"ר": 1265,
"ש": 1266,
"ת": 1267, <------ GLITCH? Double data in quotes?
"،": 1268, <------ GLITCH? Missing data in quotes?
"ء": 1269, <------ GLITCH? Missing data in quotes?
"ا": 1270,
"ب": 1271,
"ة": 1272,
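
One way to check whether the flagged keys are really empty or doubled is to print the Unicode code points behind each key instead of relying on how the page or editor draws them. The snippet below is only a sketch: `describe_keys` is a hypothetical helper, and the escapes in `flagged` are my reading of the five characters as they appear in the listing above, not values copied from the file itself.

```python
import unicodedata


def describe_keys(vocab):
    """Return (token id, code points, Unicode names) per vocab key, ordered by id."""
    rows = []
    for token, token_id in sorted(vocab.items(), key=lambda kv: kv[1]):
        code_points = [f"U+{ord(ch):04X}" for ch in token]
        names = [unicodedata.name(ch, "<unnamed>") for ch in token]
        rows.append((token_id, code_points, names))
    return rows


# The five flagged entries, written as escapes so that rendering cannot hide them.
# Assumption: these escapes match the characters shown in the post.
flagged = {
    "\u0584": 1239,
    "\u05be": 1240,
    "\u05ea": 1267,
    "\u060c": 1268,
    "\u0621": 1269,
}

# To inspect the real file instead (assuming the vocab sits under model.vocab):
#   import json
#   with open("tokenizer.json", encoding="utf-8") as f:
#       flagged = json.load(f)["model"]["vocab"]

for token_id, code_points, names in describe_keys(flagged):
    print(token_id, code_points, names)
```

If every key reports exactly one code point, the quotes are not empty and nothing is doubled. The odd appearance may then come from how the viewer lays out right-to-left or narrow characters next to ASCII quotes and commas, though that would need confirming against the actual file.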

MartialTerran changed discussion status to closed
