pipeline_tag: text-generation
---

# ST3: Simple Transformer 3

## Model description
ST3 (Simple Transformer 3) is a lightweight transformer-based model derived from …

- **Parameters:** 4 million FP32 parameters.
- **Batch size:** 32.
- **Training environment:** 1 epoch on a Kaggle P100 GPU.
- **Tokenizer:** Custom WordPiece tokenizer "ST3" that generates tokens with "##" as a prefix for subword units.
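
(For a rough sense of scale, a back-of-the-envelope figure rather than one from the model card: 4 million FP32 parameters at 4 bytes each is about 16 MB of weights.)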

## Intended use
ST3 is not a highly powerful or fully functional model compared to larger transformer models, but it can be used for:

This model has not been fine-tuned or evaluated with performance metrics, as it is not designed for state-of-the-art tasks.

### Usage
To use the ST3 model, you can follow this example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the ST3 tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("BueormLLC/ST3")
model = AutoModelForCausalLM.from_pretrained("BueormLLC/ST3")

def clean_wordpiece_tokens(text):
    # Strip the "##" continuation markers that the WordPiece tokenizer
    # leaves on subword units, merging them back into whole words.
    return text.replace(" ##", "").replace("##", "")

input_text = "Esto es un ejemplo"  # Spanish sample prompt ("This is an example")
inputs = tokenizer(input_text, return_tensors="pt")

# Pass the attention mask along with the input IDs so generate() does not
# have to guess which positions are padding.
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=2048,
    num_return_sequences=1,
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
cleaned_text = clean_wordpiece_tokens(generated_text)

print(cleaned_text)
```
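
The model card does not prescribe decoding settings, and `generate` defaults to greedy decoding, which can get repetitive on a small model. A sampling variant is sketched below; the flags are standard `transformers` generation arguments, and the specific values are illustrative assumptions, not recommendations from the authors.

```python
# Sampling variant (illustrative values, not settings from the model card).
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=256,      # shorter budget for quick experiments
    do_sample=True,      # sample instead of greedy decoding
    top_k=50,            # keep the 50 most likely tokens at each step
    top_p=0.95,          # nucleus sampling cutoff
    temperature=0.8,     # soften the distribution slightly
)
```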

### Explanation
The ST3 tokenizer uses the WordPiece algorithm, which generates tokens prefixed with "##" to indicate subword units. The provided `clean_wordpiece_tokens` function removes these prefixes, allowing for cleaner output text.
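
As a quick illustration, here is a minimal sketch of that cleanup on a made-up string (hypothetical tokenizer output, not actual model output):

```python
# A hypothetical raw decode containing WordPiece continuation markers.
raw = "un ejem ##plo sen ##cillo"

# Same logic as clean_wordpiece_tokens above: " ##" glues a subword onto
# the previous word, and any remaining bare "##" prefix is dropped.
print(raw.replace(" ##", "").replace("##", ""))  # -> "un ejemplo sencillo"
```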

## Limitations
- **Performance:** ST3 lacks the power of larger models and may not perform well on complex language tasks.
- **No evaluation:** The model hasn’t been benchmarked with metrics.

If you find this model useful and would like to support further development, please …

---

*Contributions to this project are always welcome!*