---
license: apache-2.0
language:
- en
---

# FILM-7B

<p align="center">
   💻 <a href="https://github.com/microsoft/FILM/" target="_blank">[GitHub Repo]</a> • 📃 <a href="https://arxiv.org/abs/2404.16811" target="_blank">[Paper]</a> • ⚓ <a href="https://huggingface.co/datasets/In2Training/VaLProbing-32K" target="_blank">[VaLProbing-32K]</a>
</p>

**FILM-7B is a 32K-context LLM that overcomes the lost-in-the-middle problem.**
It is trained from Mistral-7B-Instruct-v0.2 by applying Information-Intensive (In2) Training.
FILM-7B achieves near-perfect performance on probing tasks and SOTA-level performance on real-world long-context tasks among ~7B-size LLMs, without compromising short-context performance.

## Model Usage

The system template for FILM-7B:
```text
'''[INST] Below is a context and an instruction. Based on the information provided in the context, write a response for the instruction.

### Context:
{YOUR LONG CONTEXT}

### Instruction:
{YOUR QUESTION & INSTRUCTION} [/INST]
'''
```
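As a minimal sketch, the template above can be filled in programmatically before the prompt is passed to the model. The names `FILM_TEMPLATE` and `build_film_prompt` are hypothetical helpers introduced here for illustration and are not part of the FILM repository. The sketch also assumes that the trailing newline after `[/INST]` in the template is intentional and that surrounding whitespace in the inputs can be stripped.

```python
# Hypothetical helper: mirrors the FILM-7B template shown above.
# The trailing "\n" after [/INST] follows the template as written;
# whether the model needs it is an assumption.
FILM_TEMPLATE = (
    "[INST] Below is a context and an instruction. "
    "Based on the information provided in the context, "
    "write a response for the instruction.\n\n"
    "### Context:\n{context}\n\n"
    "### Instruction:\n{instruction} [/INST]\n"
)


def build_film_prompt(context: str, instruction: str) -> str:
    """Fill the template with a long context and a question or instruction."""
    return FILM_TEMPLATE.format(
        context=context.strip(),
        instruction=instruction.strip(),
    )


prompt = build_film_prompt(
    context="The Eiffel Tower was completed in 1889.",
    instruction="When was the Eiffel Tower completed?",
)
print(prompt)
```

The resulting string can then be tokenized and sent to the model with whichever inference stack you use. Keeping the template in one constant makes it easier to stay consistent with the format used for evaluation.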

## Probing Results

To reproduce the results on our VaL Probing tasks, see the guidance in [https://github.com/microsoft/FILM/tree/main/VaLProbing](https://github.com/microsoft/FILM/tree/main/VaLProbing).

<p align="center">
    <img src="./figures/probing_results_new.png" width="800">
    <br>
</p>

## Real-World Long-Context Tasks

To reproduce the results on real-world long-context tasks, see the guidance in [https://github.com/microsoft/FILM/tree/main/real_world_long](https://github.com/microsoft/FILM/tree/main/real_world_long).

<p align="center">
    <img src="./figures/real_world_long.png" width="800">
    <br>
</p>

## Short-Context Tasks

To reproduce the results on short-context tasks, see the guidance in [https://github.com/microsoft/FILM/tree/main/short_tasks](https://github.com/microsoft/FILM/tree/main/short_tasks).

<p align="center">
    <img src="./figures/short.png" width="800">
    <br>
</p>

## 📝 Citation
```
@misc{an2024make,
      title={Make Your LLM Fully Utilize the Context}, 
      author={Shengnan An and Zexiong Ma and Zeqi Lin and Nanning Zheng and Jian-Guang Lou},
      year={2024},
      eprint={2404.16811},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

Disclaimer: This model is strictly for research purposes, and not an official product or service from Microsoft.