Technology Radar

LLaVA

Published: Apr 03, 2024

NOT ON THE CURRENT EDITION

This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.

Apr 2024

Assess

LLaVA (Large Language and Vision Assistant) is an open-source multimodal model that connects a vision encoder and a large language model (LLM) for general-purpose visual and language understanding. LLaVA's strong instruction-following ability makes it a highly competitive contender among multimodal AI models. The latest version, LLaVA-NeXT, provides improved responses. Among open-source models for language and vision assistance, LLaVA is a promising option when compared with GPT-4 Vision. Our teams have been experimenting with it for visual question answering.
