qhduan committed
Commit 0b33c64
1 parent: 619286f

Update README.md

Files changed (1): README.md (+41 -0)
README.md CHANGED
@@ -3,3 +3,44 @@ license: apache-2.0
  widget:
  - text: "<|endoftext|>\ndef load_excel(path):\n return pd.read_excel(path)\n# docstring\n\"\"\""
  ---
+
+ ## Basic info
+
+ This model is based on [Salesforce/codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono).
+
+ It was fine-tuned on the [codeparrot/github-code-clean](https://huggingface.co/datasets/codeparrot/github-code-clean) dataset.
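+
+ The prompt format (also used in the widget example above) is the source code wrapped as: `<|endoftext|>`, the code, a `# docstring` marker, and an opening `"""`. A minimal sketch of building such a prompt for your own function follows; the `build_prompt` helper is illustrative and not part of this repository.
+
+ ```python
+ def build_prompt(source_code: str) -> str:
+     # Prompt layout expected by the model: end-of-text token, the code,
+     # then a "# docstring" marker and an opening triple quote.
+     return f'<|endoftext|>\n{source_code}\n\n# docstring\n"""'
+
+ print(build_prompt('def add(a, b):\n    return a + b'))
+ ```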
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_type = 'kdf/python-docstring-generation'
+ tokenizer = AutoTokenizer.from_pretrained(model_type)
+ model = AutoModelForCausalLM.from_pretrained(model_type)
+
+ # Prompt: the source code, followed by a "# docstring" marker and an opening triple quote
+ inputs = tokenizer('''<|endoftext|>
+ def load_excel(path):
+     return pd.read_excel(path)
+
+ # docstring
+ """''', return_tensors='pt')
+
+ doc_max_length = 128
+
+ generated_ids = model.generate(
+     **inputs,
+     max_length=inputs.input_ids.shape[1] + doc_max_length,
+     do_sample=False,  # greedy decoding; top_p is ignored unless do_sample=True
+     return_dict_in_generate=True,
+     top_p=0.9,
+     num_return_sequences=1,
+     output_scores=True,
+     pad_token_id=50256,
+     eos_token_id=50256  # <|endoftext|>
+ )
+
+ ret = tokenizer.decode(generated_ids.sequences[0], skip_special_tokens=False)
+ print(ret)
+ ```
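+
+ The decoded output contains the prompt followed by the generated docstring. Below is a minimal post-processing sketch for pulling just the docstring text out of `ret`; the `extract_docstring` helper and its stopping rules are illustrative assumptions, not part of the model's API.
+
+ ```python
+ def extract_docstring(generated_text: str) -> str:
+     # The prompt ends with a "# docstring" marker and an opening triple quote,
+     # so keep only the text generated after that point ...
+     body = generated_text.split('# docstring\n"""', 1)[-1]
+     # ... and cut it off at the closing triple quote or the <|endoftext|>
+     # token, whichever the model emitted.
+     for stop in ('"""', '<|endoftext|>'):
+         idx = body.find(stop)
+         if idx != -1:
+             body = body[:idx]
+     return body.strip()
+
+ print(extract_docstring(ret))
+ ```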