pymupdf
git+https://github.com/huggingface/transformers.git
datasets
sentencepiece
unidecode
transformers
torch