---
license: mit
---

posterior_KaTeMaTa_llama_llama.model - This is a SentencePiece (SP) format tokenizer obtained by merging the Kannada, Telugu, Malayalam, Tamil and Llama-2 tokenizers.

posterior_dr_llama_15_32k_balanced.model, posterior_dr_llama_15_32k_balanced.vocab - These are the SP format tokenizer model and vocabulary files obtained by training the SP tokenizer on data from the four languages.
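
The files described above can be inspected with a few lines of Python. The following is a minimal sketch, not an official usage guide. It assumes the `sentencepiece` Python package is installed, that the files sit in the current directory, and that the `.vocab` file follows the usual SentencePiece layout of one tab-separated piece and score per line. The helper names `read_sp_vocab` and `load_tokenizer` are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def read_sp_vocab(vocab_path):
    """Parse a SentencePiece .vocab file into a list of (piece, score) pairs.

    Assumes the usual layout: one "<piece>\t<score>" entry per line.
    """
    entries = []
    # newline="\n" keeps any other line-break characters inside a piece intact.
    with open(vocab_path, encoding="utf-8", newline="\n") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split from the right so a piece containing a tab stays whole.
            piece, score = line.rsplit("\t", 1)
            entries.append((piece, float(score)))
    return entries


def load_tokenizer(model_path):
    """Load a SentencePiece .model file (hypothetical helper)."""
    # Imported lazily so read_sp_vocab works without the package installed.
    import sentencepiece as spm

    return spm.SentencePieceProcessor(model_file=str(model_path))


if __name__ == "__main__":
    # File names are taken from this README; the local paths are an assumption.
    model = Path("posterior_KaTeMaTa_llama_llama.model")
    vocab = Path("posterior_dr_llama_15_32k_balanced.vocab")

    if model.exists():
        sp = load_tokenizer(model)
        print("vocabulary size:", sp.get_piece_size())
        # out_type=str returns the subword pieces instead of integer ids.
        print(sp.encode("ನಮಸ್ಕಾರ", out_type=str))

    if vocab.exists():
        print("first entries:", read_sp_vocab(vocab)[:5])
```

Encoding the same sentence with both `.model` files and comparing the number of pieces is a quick way to see how the merged and the trained tokenizers differ on a given language.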