Unable to run with private model

#3
by satheeshkatipomu - opened

Hi,

I tried to run chat-ui with a private Hugging Face model by providing HF_TOKEN, but it fails during generation. Below is the error:

2023-12-22T10:58:24.045199Z  INFO compat_generate{default_return_full_text=false}:generate_stream{parameters=GenerateParameters { best_of: None, temperature: Some(0.35), repetition_penalty: Some(1.2), top_k: Some(50), top_p: Some(0.95), typical_p: None, do_sample: false, max_new_tokens: Some(1024), return_full_text: Some(false), stop: [], truncate: Some(1000), watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None } total_time="249.264095ms" validation_time="59.501µs" queue_time="94.391µs" inference_time="249.110293ms" time_per_token="249.110293ms" seed="Some(15872725608211790663)"}: text_generation_router::server: router/src/server.rs:457: Success
11:58:24 11|index  | Error: Generation failed
11:58:24 11|index  |     at generateFromDefaultEndpoint (file:///app/build/server/chunks/_server.ts-e21a8bfe.js:48:9)
11:58:24 11|index  |     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
11:58:24 11|index  |     at async summarize (file:///app/build/server/chunks/_server.ts-e21a8bfe.js:299:10)
11:58:24 11|index  |     at async file:///app/build/server/chunks/_server.ts-e21a8bfe.js:460:26
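For reference, the chat-ui side is configured along these lines (a minimal sketch; the model name, endpoint URL, and token are placeholders, and the exact MODELS/endpoints schema may differ between chat-ui versions):

```
# .env.local (sketch; values below are placeholders, not my actual setup)
MONGODB_URL=mongodb://localhost:27017
HF_TOKEN=hf_xxxxxxxxxxxxxxxx

MODELS=`[
  {
    "name": "my-org/my-private-model",
    "endpoints": [{ "type": "tgi", "url": "http://127.0.0.1:8080" }]
  }
]`
```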

I tried running the Space with local Docker, but I could not determine whether text-generation-inference is actually running.
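Is querying the TGI router directly the right way to verify this? Something like the following (assuming TGI is exposed on port 8080; adjust host and port to the actual container setup):

```
# Basic liveness check: returns HTTP 200 when the server is up
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8080/health

# Shows the loaded model id and runtime info
curl -s http://127.0.0.1:8080/info

# Minimal generation request, bypassing chat-ui entirely
curl -s http://127.0.0.1:8080/generate \
  -X POST -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"max_new_tokens": 10}}'
```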
Kindly help!
