Spaces: dgjx/llama-3-sqlcoder-8b

dgjx committed on Sep 23, 2024

Commit 04ce9c2 (verified) · 1 Parent(s): 63d9b3c

Update app.py

Files changed (1)

app.py CHANGED

@@ -36,16 +36,12 @@ The following SQL query best answers the question `{user_question}`:
         eos_token_id=tokenizer.eos_token_id,
         pad_token_id=tokenizer.eos_token_id,
         max_new_tokens=400,
-        do_sample=False,
         num_beams=1,
     )
     outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
     torch.cuda.empty_cache()
     torch.cuda.synchronize()
-    # empty cache so that you do generate more results w/o memory crashing
-    # particularly important on Colab – memory management is much more straightforward
-    # when running on an inference service
     return sqlparse.format(outputs[0].split("[SQL]")[-1], reindent=True)
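Note on the unchanged return line: it keeps only the text after the last `[SQL]` marker in the decoded output before handing it to `sqlparse.format`. A minimal sketch of that extraction step in isolation follows. The helper name `extract_sql` and the trailing `.strip()` are assumptions added for illustration and are not part of app.py.

```python
def extract_sql(decoded: str, marker: str = "[SQL]") -> str:
    """Return the text after the last occurrence of `marker`.

    Hypothetical helper mirroring `outputs[0].split("[SQL]")[-1]` from the diff.
    The .strip() call is an addition for readability, not in the original.
    """
    # str.split returns the whole string as a single element when the marker
    # is absent, so [-1] then yields the unmodified input.
    return decoded.split(marker)[-1].strip()


if __name__ == "__main__":
    sample = "Generate a SQL query. [SQL] SELECT name FROM users;"
    print(extract_sql(sample))
```

If the marker never appears in the decoded text, the full decoded string (prompt included) is passed on to the formatter, so a caller may want to check for the marker first.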
