Mistral 7B Arc Easy Contamination based on "Proving Test Set Contamination in Black Box Language Models"

#14

What are you reporting:

  • Evaluation dataset(s) found in a pre-trained model. (e.g. FLAN T5 has been trained on ANLI)

Contaminated Evaluation Dataset(s):

- ibragim-bad/arc_easy. Note: the extent of contamination is not specified, hence assumed to be 100%.

Contaminated Model:

- Mistral 7B

Approach:

  • Model-based approach

Description of your method, 3-4 sentences. Evidence of data contamination:

They perform a statistical test on the log probabilities of the model, comparing the log probability of the dataset under its original (canonical) ordering to the log probability under random permutations of the same examples. Specifically, they use a sharded version of the test, checking whether the log probability under the canonical ordering is higher than the average log probability under random permutations within each shard.
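The core idea can be sketched as follows. This is a simplified, non-sharded version of the test; `log_prob` is a hypothetical stand-in for the model's log-likelihood of a serialized dataset, not the paper's actual implementation.

```python
import random
import statistics

def permutation_test(examples, log_prob, n_perm=1000, seed=0):
    """Sketch of a contamination permutation test: compare the log
    probability of the dataset in its canonical order against the log
    probability under random reorderings of the same examples.
    `log_prob` is a hypothetical callable mapping an ordered list of
    examples to a model log-likelihood."""
    rng = random.Random(seed)
    canonical = log_prob(examples)
    perm_scores = []
    for _ in range(n_perm):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        perm_scores.append(log_prob(shuffled))
    # One-sided p-value: fraction of permutations scoring at least as
    # high as the canonical ordering (with add-one smoothing).
    p = (1 + sum(s >= canonical for s in perm_scores)) / (1 + n_perm)
    return canonical, statistics.mean(perm_scores), p
```

A small p-value indicates the model assigns unusually high likelihood to the canonical ordering, which is evidence the ordered dataset was seen during training; the paper's sharded variant aggregates such comparisons across disjoint shards for a valid test at scale.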

Citation:

Is there a paper that reports the data contamination or describes the method used to detect data contamination? Yes

url: https://arxiv.org/abs/2310.17623

@article{oren2023proving,
 title={Proving test set contamination in black box language models},
 author={Oren, Yonatan and Meister, Nicole and Chatterji, Niladri and Ladhak, Faisal and Hashimoto, Tatsunori B},
 journal={arXiv preprint arXiv:2310.17623},
 year={2023}
}

Important! If you wish to be listed as an author in the final report, please complete this information for all the authors of this Pull Request.

Full name: Ameya Prabhu
Institution: Tübingen AI Center, University of Tübingen
Email: ameya@prabhu.be

Workshop on Data Contamination org

Hi @AmeyaPrabhu !

Thank you for your contribution. I have changed the name of the model to match the ID in HF.

Merging to main.

Best,
Oscar

OSainz changed pull request status to merged
