tingyuansen committed on
Commit
b57f0cb
1 Parent(s): c3eb13e

Update README.md

Files changed (1)
  1. README.md +75 -41
README.md CHANGED
@@ -1,38 +1,25 @@
  ---
  language:
  - en
  tags:
- - physics
  - astronomy
  - astrophysics
  - cosmology
- license:
- - llama3.1
  base_model:
  - meta-llama/Meta-Llama-3.1-8B
- library_name: transformers
  ---

  # AstroSage-Llama-3.1-8B

- <INSERT PAPER LINK HERE>
-
- AstroSage-Llama-3.1-8B is a domain-specialized natural-language AI assistant
- tailored for research in astronomy, astrophysics, and cosmology. Trained on the
- complete collection of astronomy-related arXiv papers from 2007-2024 along with
- millions of synthetically-generated question-answer pairs and other
- astronomical literature, AstroSage-Llama-3.1-8B demonstrates excellent
- proficiency on a wide range of questions. AstroSage-Llama-3.1-8B scores 80.9%
- on the AstroMLab-1 benchmark, greatly outperforming all models---proprietary
- and open-weight---in the 8-billion parameter class, and performing on par with
- GPT-4o. This achievement demonstrates the potential of domain specialization in
- AI, suggesting that focused training can yield capabilities exceeding those of
- much larger, general-purpose models. AstroSage-Llama-3.1-8B is freely
- available, enabling widespread access to advanced AI capabilities for
- astronomical education and research.

  ## Model Details
- - **Model Type**: Domain-specialized LLM
  - **Base Model**: Meta-Llama-3.1-8B
  - **Parameters**: 8 billion
  - **Training Focus**: Astronomy, Astrophysics, Cosmology, and Astronomical Instrumentation
@@ -42,29 +29,72 @@ astronomical education and research.
  2. Supervised Fine-tuning (SFT) on QA pairs and instruction sets
  3. Model merging with Meta-Llama-3.1-8B-Instruct (75% CPT+SFT / 25% Meta-Instruct)

- ## Performance
- - **AstroMLab-1 Benchmark**: 80.9% accuracy
-   - Outperforms all 8B parameter models
-   - Comparable to GPT-4o (80.4%)
-   - ~1000x more cost-effective than proprietary models
-   - 8 percentage-point improvement over base Llama-3.1-8b model on Astronomy Q&A benchmark
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/643f1ddce2ea47d170103537/9cXdGN7xf4vDag0_bgJcL.png)
-
- - **General Capabilities**: Maintains strong performance on standard benchmarks
-   - IF-EVAL: 41.4%
-   - BBH: 52.9%
-   - MATH: 8.4%
-   - GPQA: 31.2%
-   - MUSR: 38.9%
-   - MMLU-PRO: 34.6%

  ## Training Data
  - **Continued Pre-training**:
    - ~250,000 arXiv preprints (2007-2024) from astro-ph and gr-qc
    - Astronomy-related Wikipedia articles
    - Selected astronomy textbooks
    - Total: 3.3 billion tokens, 19.9 GB plaintext
  - **Supervised Fine-tuning**:
    - 8.8 million curated QA pairs
    - Filtered Infinity-Instruct-7M dataset
@@ -87,16 +117,20 @@ astronomical education and research.
  - Performance primarily validated on multiple-choice questions
  - Primarily trained for use in English

- ## Ethical Considerations
- - Should not be used as sole source for critical research decisions
- - Output should be verified against primary sources
- - May reflect biases present in astronomical literature
-
  ## Technical Specifications
  - Architecture: Based on Meta-Llama 3.1
  - Training Infrastructure: ORNL OLCF Frontier
  - Hosting: Hugging Face Hub (AstroMLab/AstroSage-8B)

  ## Citation and Contact
- - Contract: Corresponding author Tijmen de Haan, email: tijmen dot dehaan at gmail dot com and AstroMLab astromachinelearninglab at gmail dot com
- - Please cite the AstroMLab 3 paper when referencing to this model.
@@ -1,38 +1,25 @@
  ---
  language:
  - en
+ pipeline_tag: text-generation
  tags:
+ - llama-3.1
  - astronomy
  - astrophysics
  - cosmology
+ - arxiv
+ inference: false
  base_model:
  - meta-llama/Meta-Llama-3.1-8B
  ---

  # AstroSage-Llama-3.1-8B

+ AstroSage-Llama-3.1-8B is a domain-specialized natural-language AI assistant tailored for research in astronomy, astrophysics, and cosmology. Trained on the complete collection of astronomy-related arXiv papers from 2007-2024, along with millions of synthetically generated question-answer pairs and other astronomical literature, AstroSage-Llama-3.1-8B demonstrates excellent proficiency on a wide range of astronomy questions. Its benchmark performance (see below) illustrates the potential of domain specialization in AI, suggesting that focused training can yield capabilities exceeding those of much larger, general-purpose models.

  ## Model Details
+
+ - **Base Architecture**: Meta-Llama-3.1-8B
  - **Base Model**: Meta-Llama-3.1-8B
  - **Parameters**: 8 billion
  - **Training Focus**: Astronomy, Astrophysics, Cosmology, and Astronomical Instrumentation

@@ -42,29 +29,72 @@
  2. Supervised Fine-tuning (SFT) on QA pairs and instruction sets
  3. Model merging with Meta-Llama-3.1-8B-Instruct (75% CPT+SFT / 25% Meta-Instruct; a sketch of this step is shown below)
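
The merging step above can be pictured as a weighted average of the two models' parameters. The snippet below is a minimal illustrative sketch under that assumption; the local checkpoint path and output directory are hypothetical placeholders, and it is not necessarily the exact procedure or tooling used by the authors.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical path to the checkpoint obtained after continued pre-training + SFT
domain_model = AutoModelForCausalLM.from_pretrained("path/to/cpt-sft-checkpoint", torch_dtype=torch.bfloat16)
instruct_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16)

# Linear merge: 75% domain-specialized (CPT+SFT) weights, 25% Meta-Llama-3.1-8B-Instruct weights
instruct_state = instruct_model.state_dict()
merged_state = {
    name: 0.75 * param + 0.25 * instruct_state[name]
    for name, param in domain_model.state_dict().items()
}

domain_model.load_state_dict(merged_state)
domain_model.save_pretrained("astrosage-llama-3.1-8b-merged")  # hypothetical output directory
```
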
+ ## Using the Model
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained("AstroMLab/AstroSage-8b", device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained("AstroMLab/AstroSage-8b")
+
+ # Function to generate a response
+ def generate_response(prompt):
+     inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+     outputs = model.generate(
+         **inputs,
+         max_new_tokens=128,
+         do_sample=True,
+         pad_token_id=tokenizer.eos_token_id,
+     )
+     response = outputs[0][inputs['input_ids'].shape[-1]:]
+     decoded = tokenizer.decode(response, skip_special_tokens=True)
+
+     return decoded
+
+ # Example usage
+ prompt = """
+ You are an expert in general astrophysics. Your task is to answer the following question:
+ What are the main components of a galaxy?
+ """
+ response = generate_response(prompt)
+ print(response)
+ ```
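
Because the released weights include a 25% merge with Meta-Llama-3.1-8B-Instruct, chat-style prompting may also work well. The variant below is an optional, illustrative sketch that assumes a Llama-3.1 chat template is bundled with the tokenizer; it is not an officially documented usage pattern for this model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AstroMLab/AstroSage-8b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("AstroMLab/AstroSage-8b")

# Build a chat-formatted prompt (assumes the tokenizer ships with a chat template)
messages = [
    {"role": "system", "content": "You are an expert in general astrophysics."},
    {"role": "user", "content": "What are the main components of a galaxy?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
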
+ ## Model Improvements and Performance
+
+ AstroSage-Llama-3.1-8B delivers a substantial improvement over its base model and other models in its size class on the AstroMLab-1 astronomy benchmark:
+
+ | Model | AstroMLab-1 score (%) |
+ |-------|-----------------------|
+ | **AstroSage-Llama-3.1-8B** | **80.9** |
+ | GPT-4o | 80.4 |
+ | LLaMA-3-8B | 72.9 |
+ | Gemma-2-9B | 71.5 |
+ | Qwen-2.5-7B | 70.4 |
+ | Yi-1.5-9B | 68.4 |
+ | InternLM-2.5-7B | 64.5 |
+ | Mistral-7B-v0.3 | 63.9 |
+ | ChatGLM3-6B | 50.4 |
+
+ The model demonstrates:
+ - Outperformance of all 8B-parameter models
+ - Performance comparable to GPT-4o (80.4%)
+ - ~1000x greater cost-effectiveness than proprietary models
+ - An 8 percentage-point improvement over the base Llama-3.1-8B model
+
  ## Training Data
+
  - **Continued Pre-training**:
    - ~250,000 arXiv preprints (2007-2024) from astro-ph and gr-qc
    - Astronomy-related Wikipedia articles
    - Selected astronomy textbooks
    - Total: 3.3 billion tokens, 19.9 GB plaintext
+
  - **Supervised Fine-tuning**:
    - 8.8 million curated QA pairs
    - Filtered Infinity-Instruct-7M dataset
@@ -87,16 +117,20 @@
  - Performance primarily validated on multiple-choice questions
  - Primarily trained for use in English

  ## Technical Specifications
  - Architecture: Based on Meta-Llama 3.1
  - Training Infrastructure: ORNL OLCF Frontier
  - Hosting: Hugging Face Hub (AstroMLab/AstroSage-8B); see the download snippet below

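As a companion to the hosting note above, the repository can be fetched with the standard Hugging Face Hub client. This is a minimal illustrative sketch: only the repository id is taken from this card, and the files land in the default local cache.

```python
from huggingface_hub import snapshot_download

# Download the full AstroSage-8B repository into the local Hugging Face cache
local_path = snapshot_download(repo_id="AstroMLab/AstroSage-8B")
print(f"Model files downloaded to: {local_path}")
```
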
+ ## Ethical Considerations
+
+ While this model is designed for scientific use:
+ - It should not be used as the sole source for critical research decisions
+ - Its output should be verified against primary sources
+ - It may reflect biases present in the astronomical literature
+
  ## Citation and Contact
+
+ - Corresponding author: Tijmen de Haan (tijmen dot dehaan at gmail dot com)
+ - AstroMLab: astromachinelearninglab at gmail dot com
+ - Please cite the AstroMLab 3 paper when referencing this model.