Technology Radar

On-device LLM inference

Published: Oct 23, 2024

NOT ON THE CURRENT EDITION

This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.

Oct 2024

Assess

Large language models (LLMs) can now run in web browsers and on end-user devices such as smartphones and laptops, which lets AI applications execute on the device itself. This enables secure handling of sensitive data without transferring it to the cloud, very low latency for tasks such as edge computing and real-time image or video processing, lower costs because computation happens locally, and continued functionality even without a stable internet connection. This is an area of ongoing research and development. In past editions we mentioned MLX, an open-source framework for efficient machine learning on Apple silicon. Other emerging tools include Transformers.js and Chatty. Transformers.js lets us run Transformers in the browser using ONNX Runtime, supporting models converted from frameworks such as PyTorch, TensorFlow and JAX. Chatty leverages WebGPU to run LLMs natively and privately in the browser, offering a feature-rich in-browser AI experience.
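Because browser-based tools of this kind rely on WebGPU, an application would typically check for it before attempting on-device inference. The following is a minimal sketch of such a check, not taken from Chatty or Transformers.js. The names `NavigatorLike`, `GpuLike`, `exposesWebGpu` and `canRunOnDevice` are hypothetical and introduced only for illustration. The only real API assumed is the standard WebGPU entry point `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Minimal structural types (hypothetical) so the sketch compiles
// without the @webgpu/types package.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Synchronous check: does this environment expose the WebGPU entry point?
function exposesWebGpu(nav: NavigatorLike): boolean {
  return typeof nav.gpu?.requestAdapter === "function";
}

// Asynchronous check: is a GPU adapter actually available?
// Any failure is treated as "cannot run on-device" rather than rethrown.
async function canRunOnDevice(nav: NavigatorLike): Promise<boolean> {
  if (!exposesWebGpu(nav)) {
    return false;
  }
  try {
    const adapter = await nav.gpu!.requestAdapter();
    return adapter != null;
  } catch {
    return false;
  }
}
```

In a browser, the global `navigator` object would be passed in (cast to `NavigatorLike` if the WebGPU typings are not installed). What the application does when the check fails, for example disabling the feature or falling back to a hosted model, is a design decision outside this sketch.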
