Experiments with a new architecture that enables latent-space reasoning
Aman Gupta PRO
amang1802
Recent Activity
Updated a collection 1 day ago: ThinkTransformer experiments
Updated a model 1 day ago: amang1802/think_fineweb-edu_chkpts_exp2
Published a model 1 day ago: amang1802/think_fineweb-edu_chkpts_exp2
Collections: 7
Models: 18

amang1802/think_fineweb-edu_chkpts_exp2
amang1802/smol-math-400M • Text Generation • 38
amang1802/llama-3.1-70B-wildeweb-sample • 3
amang1802/llama-3.1-70B-cpttest_mode2_qna_fulltext • 1
amang1802/llama-3.1-8B-cpttest_mode2_qna_fulltext • 4
amang1802/llama-3.1-70B-cpttest_mode1_fulltext • 6
amang1802/llama-3.1-8B-cpttest_mode1_fulltext • 6
amang1802/llama_162M_fineweb100BT • Text Generation • 64
amang1802/llama_162M_fineweb10BT • Text Generation • 158
amang1802/Llama3.2-1B-summary-length-exp7.1 • Text Generation • 9
Datasets: 35

amang1802/math-vibe-new • 5 • 24
amang1802/math-vibe-gsm-similar • 5 • 37
amang1802/liar2-doubts • 32 • 43
amang1802/wildeweb-sample-salad_5K • 5k • 50
amang1802/wildeweb-sample-realtoxicity-challenge • 770 • 42
amang1802/wildeweb-safety-vibe-check • 5 • 34
amang1802/wildeweb_sample • 38.3k • 43
amang1802/wildeweb_cls_1M • 1M • 50
amang1802/synthetic_data_qna_fulltext_conditioned_L3.3_70B • 10.2k • 52
amang1802/cpt_gen_content_topic_conditioned_L3.1_8B_cpt_qna_epoch19 • 5.12k • 50