arxiv:2412.06676

I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token

Published on Dec 9, 2024 · Submitted by roicohen9 on Dec 12, 2024

Abstract

Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. Despite recent advances, these models are still prone to what are commonly known as hallucinations, causing them to emit unwanted and factually incorrect text. In this work, we propose a novel calibration method that can be used to combat hallucinations. We add a special [IDK] ("I don't know") token to the model's vocabulary and introduce an objective function that shifts probability mass to the [IDK] token for incorrect predictions. This approach allows the model to express uncertainty in its output explicitly. We evaluate our proposed method across multiple model architectures and factual downstream tasks. We find that models trained with our method are able to express uncertainty in places where they would previously make mistakes while suffering only a small loss of encoded knowledge. We further perform extensive ablation studies of multiple variations of our approach and provide a detailed analysis of the precision-recall tradeoff of our method.
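As a rough illustration of the idea described above, here is a minimal sketch of an objective that shifts target probability mass to a dedicated [IDK] token at positions where the model's current top-1 prediction is wrong. This is an assumed PyTorch formulation, not the paper's exact objective; the function name `idk_loss` and the mixing weight `lam` are hypothetical.

```python
import torch
import torch.nn.functional as F


def idk_loss(logits, targets, idk_token_id, lam=0.3):
    """Soft cross-entropy that moves `lam` of the target mass to the [IDK]
    token at positions where the model's current top-1 prediction is wrong.
    (Illustrative sketch only, not the paper's exact formulation.)

    logits:  (batch, seq_len, vocab_size) next-token logits
    targets: (batch, seq_len) gold token ids
    """
    vocab_size = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)

    # Standard one-hot target distribution.
    target_dist = F.one_hot(targets, vocab_size).float()

    # 1.0 where the model's argmax disagrees with the gold token, else 0.0.
    wrong = (logits.argmax(dim=-1) != targets).unsqueeze(-1).float()

    # Distribution that puts all mass on the [IDK] token.
    idk_dist = torch.zeros_like(target_dist)
    idk_dist[..., idk_token_id] = 1.0

    # Mixture: unchanged where correct, (1 - lam) gold + lam [IDK] where wrong.
    mixed = (1.0 - lam * wrong) * target_dist + (lam * wrong) * idk_dist

    # Cross-entropy against the soft target distribution.
    return -(mixed * log_probs).sum(dim=-1).mean()
```

With `lam = 0` this reduces to ordinary cross-entropy; larger values move more target mass to [IDK] at positions the model currently gets wrong, which is the general behavior the abstract describes.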

Community

Paper author Paper submitter

LLMs have yet to fully internalize the concept of uncertainty. Can we pretrain them to handle it better?

In our new paper "I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token" we propose a new training objective for uncertainty and show how to use it to improve factual accuracy; a minimal setup sketch follows below.

Check it out!
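For readers who want to experiment, here is one way to register an [IDK] token with a Hugging Face tokenizer and model before continued training with an objective like the one sketched above. This is an assumed setup, not the paper's code; `gpt2` is only an example base model (the paper evaluates several architectures).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example base model; swap in whichever architecture you are training.
model_name = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Register [IDK] as an additional special token and grow the embedding matrix
# so the new token gets its own (randomly initialized) embedding row.
tokenizer.add_special_tokens({"additional_special_tokens": ["[IDK]"]})
model.resize_token_embeddings(len(tokenizer))

idk_token_id = tokenizer.convert_tokens_to_ids("[IDK]")
```

The resulting `idk_token_id` is what a calibration objective such as the sketch above would use when shifting probability mass for incorrect predictions.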

