---
license: mit
datasets:
- Qingyun/lmmrotate-sft-data
language:
- en
base_model:
- microsoft/Florence-2-large
pipeline_tag: image-text-to-text
tags:
- aerial
- geoscience
- remotesensing
---
# LMMRotate 🎮: A Simple Aerial Detection Baseline of Multimodal Language Models

Qingyun Li, Yushi Chen, Xinya Shu, Dong Chen, Xin He, Yi Yu, Xue Yang

If you find our work helpful, please consider giving us a ⭐!
- ArXiv Paper: https://arxiv.org/abs/2501.09720
- GitHub Repo: https://github.com/Li-Qingyun/mllm-mmrotate
- HuggingFace Page: https://huggingface.co/collections/Qingyun/lmmrotate-6780cabaf49c4e705023b8df

This repo hosts the checkpoint of Florence-2-large trained on DOTA-v1.0 with LMMRotate. More checkpoints for aerial detection with LMMRotate from [our paper](https://arxiv.org/abs/2501.09720) can be found in [this repo](https://huggingface.co/Qingyun/Florence-2-models-lmmrotate).

LMMRotate is a technical practice for fine-tuning large multimodal language models for oriented object detection, as done in MMRotate, and is the official implementation of the paper: A Simple Aerial Detection Baseline of Multimodal Language Models.
## Downloading Guide
You can download the files with your web browser from [the file page](https://huggingface.co/datasets/Qingyun/Florence-2-models-lmmrotate/tree/main).
We recommend downloading in a terminal with huggingface-cli (`pip install --upgrade huggingface_hub`). Refer to [the documentation](https://huggingface.co/docs/huggingface_hub/guides/download) for more usage options.
```
# Set Huggingface Mirror for Chinese users (if required):
export HF_ENDPOINT=https://hf-mirror.com
# Download the checkpoint repository into checkpoint/:
huggingface-cli download Qingyun/Florence-2-models-lmmrotate --repo-type model --local-dir checkpoint/
# If an error (such as a network error) interrupts the download, re-run the same command; the latest huggingface_hub will resume the download.
```
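The same download can also be done from Python. The following is a minimal sketch, assuming `huggingface_hub` is installed; it uses the library's `snapshot_download` function with the same repository ID and target directory as the command above. The helper names `build_download_kwargs` and `download_checkpoint` are hypothetical and exist only for illustration.

```python
def build_download_kwargs(repo_id, local_dir, allow_patterns=None):
    """Collect keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: it only assembles a dict and performs no network access.
    """
    kwargs = {
        "repo_id": repo_id,
        "repo_type": "model",  # mirrors `--repo-type model` in the CLI command
        "local_dir": local_dir,
    }
    if allow_patterns:
        # Optional glob filters that restrict which files are fetched.
        kwargs["allow_patterns"] = list(allow_patterns)
    return kwargs


def download_checkpoint(repo_id="Qingyun/Florence-2-models-lmmrotate",
                        local_dir="checkpoint/",
                        allow_patterns=None):
    """Download the repository snapshot and return the local folder path."""
    # Imported here so that an HF_ENDPOINT value set in os.environ before this
    # call is already in place when huggingface_hub is first imported.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        **build_download_kwargs(repo_id, local_dir, allow_patterns)
    )
```

Calling `download_checkpoint()` fetches the whole repository into `checkpoint/`. Passing `allow_patterns` (for example `["*.json"]`, an illustrative pattern only) limits the download to matching files.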
## Detection Performance
![](https://github.com/user-attachments/assets/f61edcd2-1dee-4bdb-8a1e-c8dd1cf163a1)
## Cite
LMMRotate paper:
```
@article{li2025lmmrotate,
title={A Simple Aerial Detection Baseline of Multimodal Language Models},
author={Li, Qingyun and Chen, Yushi and Shu, Xinya and Chen, Dong and He, Xin and Yu, Yi and Yang, Xue},
journal={arXiv preprint arXiv:2501.09720},
year={2025}
}
```