Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper • 2412.10360 • Published • 130
transformers once [#30530](https://github.com/huggingface/transformers/pull/30530) is merged. Huge shoutout to @nielsr and @danelcsb for bringing this to HF!