Text Generation
Transformers
PyTorch
English
llama
text-generation-inference
Inference Endpoints

Model Card for SciTulu 7B

SciTulu is a collection of instruction-following language models targeting scientific literature understanding use cases. Starting from the Tulu v2 7B model, SciTulu is trained on a mix of science-specific demonstrations from the SciRIFF dataset, together with general-domain instructions from the Tulu v2 SFT mix. SciTulu 7B achives a 28.1% average improvement over Tulu v2 7B on nine held-out scientific literature understanding tasks. More information can be found in our preprint: SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature.

Training and evaluation code for SciTulu is available in our GitHub repository: https://github.com/allenai/SciRIFF.

See the Tulu model card for more information on potential risks, biases, and limitations.

Downloads last month
86
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Datasets used to train allenai/scitulu-7b

Collection including allenai/scitulu-7b