Spatially aware vision-language model trained on COCO for visual question answering.
- Type: AI Model
- Key Features: Produces answers describing spatial relationships between objects in an image.
- Technical Categories: Computer Vision, Natural Language Processing, Vision-Language Models
- Sectors: Robotics, Surveillance, Spatial Understanding Applications
- Research Areas: Visual Question Answering, Spatial Reasoning
- Type of License: Apache-2.0
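
To make the "spatial relationships between objects" feature concrete, here is a minimal sketch of the kind of answer such a model is meant to produce. It derives a relation from two bounding boxes in COCO layout (`[x, y, width, height]`, origin at the top-left of the image). The `Box` class, the `spatial_relation` and `answer` functions, and the center-comparison heuristic are all hypothetical and chosen for illustration. They are not the model's API and do not describe how it was trained.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A labeled box in COCO layout: top-left corner (x, y) plus width and height, in pixels."""
    label: str
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self) -> tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def spatial_relation(a: Box, b: Box) -> str:
    """Describe where `a` sits relative to `b` by comparing box centers (assumed heuristic)."""
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    # Use the axis with the larger offset; image y grows downward, so smaller y means "above".
    if abs(dx) >= abs(dy):
        return "to the left of" if dx < 0 else "to the right of"
    return "above" if dy < 0 else "below"


def answer(a: Box, b: Box) -> str:
    """Format the relation as a short natural-language answer."""
    return f"The {a.label} is {spatial_relation(a, b)} the {b.label}."


# Made-up example boxes, not taken from any dataset.
cup = Box("cup", 40, 120, 30, 40)
laptop = Box("laptop", 150, 100, 200, 120)
print(answer(cup, laptop))  # The cup is to the left of the laptop.
```

A learned model would infer such relations from pixels and the question text rather than from annotated boxes. The heuristic above only illustrates the expected form of the output.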