Heaplax/ARMAP-RM-LoRA
Tags: Reinforcement Learning, Transformers, Inference Endpoints
arXiv: 2502.12130
License: apache-2.0
ARMAP-RM-LoRA/RM-alfworld/checkpoint-460/adapter_model/README.md
Last commit: Heaplax, "Upload folder using huggingface_hub" (29c609c, verified), 19 days ago
---
library_name: peft
---
## Training procedure

### Framework versions

- PEFT 0.4.0
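
The card names the library (PEFT) but not how to fetch or apply the adapter, so the following is only a minimal sketch under stated assumptions. It assumes the folder `RM-alfworld/checkpoint-460/adapter_model` holds a standard PEFT adapter, and that you already have the matching base model loaded; the base model is not named in this README, so `base_model` is a placeholder the caller must supply. The helper names (`adapter_patterns`, `download_adapter`, `attach_adapter`) are hypothetical and exist only for illustration.

```python
"""Sketch: fetch only the adapter folder this README lives in, then attach it."""
import os
from typing import List, Optional

REPO_ID = "Heaplax/ARMAP-RM-LoRA"
ADAPTER_SUBFOLDER = "RM-alfworld/checkpoint-460/adapter_model"


def adapter_patterns(subfolder: str = ADAPTER_SUBFOLDER) -> List[str]:
    """Return glob patterns that select every file under the adapter folder."""
    return [f"{subfolder.strip('/')}/*"]


def download_adapter(local_dir: Optional[str] = None) -> str:
    """Download the adapter folder and return its local path.

    Needs `huggingface_hub` installed and network access.
    """
    from huggingface_hub import snapshot_download  # imported lazily

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=adapter_patterns(),
        local_dir=local_dir,
    )
    return os.path.join(root, ADAPTER_SUBFOLDER)


def attach_adapter(base_model, adapter_dir: str):
    """Wrap an already-loaded base model with the adapter weights.

    Assumption: `base_model` is the model the adapter was trained on.
    The README does not say which model that is.
    """
    from peft import PeftModel  # the card lists PEFT 0.4.0

    return PeftModel.from_pretrained(base_model, adapter_dir)
```

A typical call would be `attach_adapter(base_model, download_adapter())`. Restricting the download with `allow_patterns` avoids pulling other checkpoints that may live in the same repository.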