tsunghanwu/SESAME

	---
	license: mit
	---

	SESAME Model Card

	Model details
	Model type: SESAME is an open-source multimodal model trained by fine-tuning LLaVA on instruction-based image grounding (segmentation) data. It combines an auto-regressive language model with a segmentation model.

	Paper or resources for more information: https://see-say-segment.github.io/

	Where to send questions or comments about the model: https://github.com/see-say-segment/sesame/issues

	Intended use
	Primary intended uses: The primary use of SESAME is research on large multimodal models and chatbots.

	Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

	Training dataset: (FP-/R-)RefCOCO(+/g) + LLaVA 150K VQA data
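
The shorthand `(FP-/R-)RefCOCO(+/g)` packs several dataset names into one expression. The sketch below spells them out, assuming the first parenthesis is an optional prefix (`FP-` or `R-`) and the second an optional suffix (`+` or `g`). The helper name `expand_refcoco_names` is hypothetical and not part of the SESAME codebase; check the project page for the authoritative dataset list.

```python
from itertools import product

# Assumption: "(FP-/R-)" is an optional prefix and "(+/g)" an optional suffix,
# so each group also includes the empty string.
PREFIXES = ("", "FP-", "R-")
SUFFIXES = ("", "+", "g")


def expand_refcoco_names(base="RefCOCO"):
    """Return every name covered by the shorthand, grouped by prefix."""
    return [
        f"{prefix}{base}{suffix}"
        for prefix, suffix in product(PREFIXES, SUFFIXES)
    ]


if __name__ == "__main__":
    for name in expand_refcoco_names():
        print(name)
```

Under this reading the shorthand covers nine RefCOCO-family variants, used alongside the LLaVA 150K VQA data.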