Volume 34 | April 2026

Platforms


Adopt

No blips

Trial

  • 42. AG-UI Protocol

    AG-UI is an open protocol and library designed to standardize communication between rich user interfaces and back-end AI agents. Historically, building agentic UIs required bespoke plumbing for bidirectional, stateful collaboration. AG-UI addresses this by providing a consistent, event-driven architecture — supporting transports such as server-sent events (SSE) and WebSockets — for streaming reasoning steps, synchronizing state and rendering dynamic UI components.

    However, the architectural landscape for agent interfaces is shifting rapidly. AG-UI intentionally sits outside MCP, functioning as an interface layer between the frontend and the agent backend. We’re now seeing a different approach emerge, where newer MCP-based applications package HTML and UI widgets directly within MCP servers or skills.

    Because UI components can now be embedded and served alongside the tools themselves — a pattern related to emerging adjacent standards such as MCP-UI — the need for a separate UI protocol layer such as AG-UI is being questioned. While AG-UI remains a solid choice for decoupling front-end UX from back-end orchestration, teams should assess its role in light of the growing trend toward consolidating tool logic and UI within the MCP ecosystem.
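    To make the event-driven model concrete, here is a minimal sketch of how an agent backend might frame AG-UI-style events for an SSE transport. The event names mirror AG-UI's published event types, but the framing helper and the payloads are our own illustration, not part of the library:

```python
import json

def sse_event(event_type: str, payload: dict) -> str:
    # Frame one server-sent event carrying an agent lifecycle message.
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"

# A typical run streams lifecycle, text-delta and state-sync events:
stream = [
    sse_event("RUN_STARTED", {"runId": "run-1"}),
    sse_event("TEXT_MESSAGE_CONTENT", {"messageId": "m1", "delta": "Looking up your order..."}),
    sse_event("STATE_DELTA", {"delta": [{"op": "replace", "path": "/orderStatus", "value": "shipped"}]}),
    sse_event("RUN_FINISHED", {"runId": "run-1"}),
]
body = "".join(stream)  # what an SSE endpoint would write to the response
```

    The same event vocabulary can be carried over WebSockets; only the framing changes.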

  • 43. Apache APISIX

    Apache APISIX is an open-source, high-performance, cloud-native gateway that addresses the limitations of legacy Nginx-based solutions. Built on Nginx and LuaJIT via OpenResty, it uses etcd for configuration storage to eliminate reload-induced latency, making it well-suited for dynamic microservices and serverless architectures. Its primary strength is its fully dynamic, pluggable architecture, which allows teams to customize traffic management, security and observability through APIs and a multi-language plugin ecosystem, including WASM. By supporting the Kubernetes Gateway API, Apache APISIX can be used as a Kubernetes gateway and is a strong candidate for replacing legacy Nginx ingress controllers. Some of our teams are adopting Apache APISIX and find its performance and feature set compelling.
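    As an illustration of the fully dynamic, plugin-driven model, a route pushed through the Admin API might look like the following; the upstream node and rate-limit values are invented for this sketch:

```json
{
  "uri": "/orders/*",
  "upstream": {
    "type": "roundrobin",
    "nodes": { "orders-svc:8080": 1 }
  },
  "plugins": {
    "limit-count": { "count": 100, "time_window": 60, "rejected_code": 429 },
    "prometheus": {}
  }
}
```

    Because configuration lives in etcd, changes like this take effect without reloading the gateway.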

  • 44. AWS Bedrock AgentCore

    AWS Bedrock AgentCore is an agentic platform for building, running and operating agents at scale, securely and without the overhead of infrastructure management; it is comparable to GCP Vertex AI Agent Builder and Azure AI Foundry Agent Service. While it’s tempting to adopt the platform as a monolithic black box, we’ve seen more success with a nuanced, decoupled architecture: using the AgentCore runtime for production concerns such as session isolation, security and observability, while retaining orchestration logic in external frameworks like LangGraph. This separation of concerns allows teams to benefit from managed infrastructure while maintaining the flexibility to adapt as the LLM landscape evolves. By focusing on the runtime first, organizations can incrementally move agentic workloads into production without ceding control of their core logic to a vendor-specific orchestration layer.

  • 45. Graphiti

    We’re moving Graphiti to Trial as this open-source temporal knowledge graph engine from Zep has demonstrated its production viability for addressing the LLM memory problem. While flat vector stores in RAG pipelines fail to track how facts change over time, Graphiti ingests data as discrete episodes and maintains bi-temporal validity windows on graph edges, so outdated facts are invalidated rather than overwritten. Unlike batch-oriented GraphRAG, it updates the graph incrementally and delivers sub-second retrieval via hybrid retrieval combining semantic search, BM25 and graph traversal, without query-time LLM calls. Two factors drove this move: peer-reviewed benchmarks reporting 18.5% accuracy improvements and 90% latency reductions, and the release of a first-class MCP server enabling Model Context Protocol–compliant agents to attach persistent temporal memory with minimal integration effort. Strong community adoption further signals production readiness. We’re using Graphiti to build context-aware agents with stateful, temporally aware knowledge graphs and recommend evaluating it for agentic applications. Neo4j is the primary backend, with FalkorDB as a lighter alternative. Teams should also account for per-write LLM extraction costs and pin dependencies given its pre-1.0 release status.
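    The bi-temporal idea behind this design can be sketched in a few lines of plain Python. This toy is purely illustrative and is not Graphiti's API: when a new fact contradicts an existing one, the old edge's validity window is closed rather than the edge being deleted.

```python
from datetime import datetime, timezone

def _now():
    return datetime.now(timezone.utc)

class TemporalGraph:
    """Toy bi-temporal edge store; illustrates the concept, not Graphiti itself."""
    def __init__(self):
        self.edges = []  # each edge carries a validity window

    def add_fact(self, subject, predicate, obj, at=None):
        at = at or _now()
        # Close the validity window of any currently-valid contradicting edge.
        for e in self.edges:
            if e["subject"] == subject and e["predicate"] == predicate and e["invalid_at"] is None:
                e["invalid_at"] = at  # invalidated, not overwritten: history survives
        self.edges.append({"subject": subject, "predicate": predicate, "object": obj,
                           "valid_from": at, "invalid_at": None})

    def current(self, subject, predicate):
        # Return the object of the edge whose validity window is still open.
        for e in self.edges:
            if e["subject"] == subject and e["predicate"] == predicate and e["invalid_at"] is None:
                return e["object"]
        return None
```

    Graphiti applies this idea at the graph-edge level, with LLM-based extraction deciding which prior facts a new episode invalidates.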

  • 46. Langfuse

    Langfuse is an open-source LLM engineering platform covering observability, prompt management, evaluations and dataset management. The project has matured significantly since we last assessed it. The v3 architecture introduces ClickHouse, Redis and S3 as back-end components, making it more scalable but also more complex to self-host.

    Both the Python and TypeScript SDKs are now built natively on OpenTelemetry, making Langfuse a natural fit for teams that already use OTEL-based observability. New capabilities such as the experiment runner SDK and structured output support for prompt experiments move Langfuse beyond pure tracing into systematic evaluation workflows. This makes it worth considering in an increasingly crowded space that includes Arize Phoenix, Helicone and LangSmith.

    Teams building primarily on Pydantic AI may also consider Pydantic Logfire, which takes a broader approach as a full-stack OTEL observability platform rather than an LLM-specific tooling suite. Langfuse is a credible choice for teams that need integrated tracing, evaluations and prompt management in one self-hostable platform. However, teams should evaluate whether the infrastructure commitment is justified for their scale and whether a narrower tool like Helicone may suffice if the primary need is model-layer cost and latency visibility.

  • 47. Port

    Port is a commercial internal developer portal designed to improve developer experience by centralizing software assets, automating workflows and enforcing engineering standards, giving platform teams a single source of truth for self-service workflows. We're seeing it matter more as organizations look to standardize engineering workflows while exposing templates, APIs, automations and agents in a form developers can actually use, including directly in the IDE through Port's API and MCP layers rather than only through a standalone portal.

    In our experience, Port works well for organizations that want productized portal capabilities without investing heavily in platform engineering. In client engagements, it has supported thousands of developers while enabling relatively small platform teams to deliver effective self-service quickly. We think Port is worth assessing for organizations that need internal developer portal capabilities quickly and can accept the constraints of a commercial platform and vendor dependence.

  • 48. Replit

    Replit is a cloud-native collaborative development platform that provides instant dev environments, real-time coding and integrated AI assistance right in the browser. It combines an editor, runtime, deployment and AI coding workflows into one unified platform, allowing developers to start coding immediately without any local setup. We've found this AI-based collaborative IDE particularly helpful for reducing onboarding friction, making it a great fit for prototyping together as a team. We also find it very effective for training sessions, knowledge sharing and bootcamps. While some might see Replit as a place for AI-assisted hobby projects, we think it stands out because the environment is powerful enough to compete with traditional local IDEs, making iteration and collaboration much easier.

  • 49. SigNoz

    SigNoz is an open-source, OpenTelemetry-native observability platform that provides unified support for logs, metrics and traces. It addresses the APM and instrumentation needs of modern microservices and distributed architectures while avoiding vendor lock-in. By leveraging ClickHouse as its underlying columnar database, SigNoz provides scalable, high-performance and cost-effective storage with fast querying, positioning it as a strong self-hosted alternative to platforms such as Datadog. It supports flexible querying through PromQL and ClickHouse SQL, along with alerting across multiple notification channels. In practice, we’ve seen SigNoz reduce infrastructure resource consumption and overall observability costs without compromising performance. While a managed cloud service is available, ready-to-use Docker images and Helm charts make it a practical choice for organizations that prefer to retain control over their data and infrastructure.
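    Because SigNoz is OTEL-native, an already-instrumented service typically needs nothing beyond the standard OpenTelemetry environment variables pointed at the SigNoz collector; the host and service names below are placeholders:

```
OTEL_EXPORTER_OTLP_ENDPOINT=http://signoz-otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_SERVICE_NAME=checkout-service
OTEL_RESOURCE_ATTRIBUTES=deployment.environment=staging
```

    This makes migrating from another OTEL-compatible backend largely a matter of changing the endpoint.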

Assess

  • 50. Agent Trace

    Agent Trace is an open specification proposed by Cursor that aims to standardize AI code attribution. As adoption of coding agents increases, understanding who has modified code now extends beyond human developers to include AI-generated changes. We're seeing early interest from teams that need better traceability around these changes. Existing tools such as git blame can show that a line of code has been modified, but they fail to capture whether that change was made by a human, an AI or both. Agent Trace takes a vendor-neutral approach to defining how code changes are traced and is unopinionated about how those traces are stored. It’s compatible with multiple version control systems, including Git, Mercurial and Jujutsu. The specification defines contributor types such as human, AI, mixed and unknown, along with a trace record describing the origin of each contribution. There are early signals of adoption, with support from tools such as Cline and OpenCode as well as implementations like Git AI. Teams adopting coding agents should assess tooling that implements the Agent Trace specification to improve code attribution.
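    In outline, a trace record pairs a change with its contributor type. The field names below are illustrative rather than quoted from the specification, which should be consulted for the exact schema:

```json
{
  "contributor": { "type": "ai", "tool": "cline", "model": "claude-sonnet" },
  "change": { "vcs": "git", "revision": "abc1234", "files": ["src/billing.ts"] }
}
```

    Because the spec is storage-agnostic, implementations may keep records in commit trailers, notes or sidecar files.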

  • 51. ClickStack

    ClickStack is an OpenTelemetry-compatible, open-source observability platform that unifies logs, traces, metrics and sessions in a single high-performance data store built on ClickHouse. As infrastructure grows and observability costs increase, many teams struggle with fragmented telemetry toolchains and expensive vendor platforms. ClickStack addresses this challenge by leveraging ClickHouse’s columnar storage to enable sub-second, high-cardinality queries across large volumes of telemetry data, offering a simpler and more cost-effective foundation for observability.

  • 52. Coder

    Coder presents a good alternative to pixel-streamed development environments by separating where code runs from how developers interact with it. Instead of streaming full desktop interfaces, developers connect to remote environments using local IDEs such as VS Code or a browser, resulting in a more responsive experience without compromising usability.

    In this model, code executes on remote, scalable infrastructure while environments are defined and managed as code. This enables teams to standardize development setups and simplify the onboarding of new developers. It also makes it easier to provide controlled access to internal systems and streamline access to pre-approved AI coding agents.

    We see Coder as a middle ground between local development and fully virtualized desktops: it provides centralized control and governance without the usability limitations of pixel-streamed VDIs. This makes it a good option for organizations that require remote or controlled execution environments, particularly where higher compute or secure access is needed. As with similar approaches, teams should evaluate the operational overhead and security responsibilities that come with managing these environments.

  • 53. Databricks Agent Bricks

    As agent-based approaches become more mainstream, we’re seeing data platforms evolve to support these workloads natively rather than as a bolt-on. Databricks Agent Bricks provides prebuilt, auto-optimizing components for common AI patterns such as knowledge assistants and data analysts. It follows a declarative approach: developers define the goal and underlying data, while the framework handles the execution and optimization. By simplifying LLMOps and reducing the effort required for data curation, teams can focus more on business outcomes than on boilerplate. For example, our teams have used it alongside custom agents to evaluate and build complex RAG solutions for preclinical R&D. If you’re already invested in the Databricks ecosystem and exploring agent-based approaches for common use cases such as chatbots and document extraction, consider assessing Agent Bricks.

  • 54. DuckLake

    DuckLake is an integrated data lake and catalog format that simplifies the lakehouse architecture by using standard SQL databases for catalog and metadata management. While traditional open table formats like Iceberg or Delta Lake rely on complex, file-based metadata structures, DuckLake stores metadata in a catalog database (for example, SQLite, PostgreSQL or DuckDB) while persisting data as Parquet files on local disk or S3-compatible object storage. This hybrid approach improves query planning latency and transactional reliability during concurrent updates. DuckDB serves as the query engine via its ducklake extension, providing a familiar SQL interface for standard DDL and DML operations. DuckLake also retains lakehouse characteristics, such as partitioning, while omitting indexes and primary or foreign keys. With support for time travel, schema evolution and ACID compliance, DuckLake offers a low-complexity option for teams seeking a standalone analytical stack. Although still early in maturity, DuckLake is a promising, lightweight alternative to traditional lakehouse architectures. It avoids the operational overhead associated with Spark or Trino-based ecosystems, making it a good fit for streamlined data environments.
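    The ducklake extension makes this feel like ordinary SQL. A minimal session might look like the following; the catalog path, data path and table are illustrative, and the exact syntax should be checked against the current DuckDB docs:

```sql
INSTALL ducklake;
LOAD ducklake;

-- Metadata lives in a SQL catalog; data lands as Parquet under DATA_PATH.
ATTACH 'ducklake:metadata.ducklake' AS lake (DATA_PATH 'lake_data/');

CREATE TABLE lake.events (id INTEGER, payload VARCHAR, ts TIMESTAMP);
INSERT INTO lake.events VALUES (1, 'created', now());

-- Time travel: query the table as of an earlier snapshot.
SELECT * FROM lake.events AT (VERSION => 1);
```

    Swapping SQLite for PostgreSQL in the catalog is what unlocks multi-client concurrent writes.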

  • 55. FalkorDB

    FalkorDB is a Redis-based graph database that supports Cypher and suits teams that want graph capabilities without adopting a heavyweight graph platform. We see it as a practical option for organizations building relationship-rich AI and application workloads where low operational friction matters, and where a server-based graph service is preferable to embedded storage. We’re placing it in Assess because the architecture is promising and the developer model is approachable, but teams should validate production behavior around scaling, operational tooling and long-term ecosystem maturity before committing to broad adoption.
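    Because FalkorDB speaks the Redis protocol, a quick experiment needs nothing more than redis-cli against a running instance; the graph name and data below are invented for this sketch:

```
127.0.0.1:6379> GRAPH.QUERY social "CREATE (:Person {name:'Ada'})-[:KNOWS]->(:Person {name:'Grace'})"
127.0.0.1:6379> GRAPH.QUERY social "MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name"
```

    This low operational footprint is much of the appeal: if you already run Redis, you already know how to deploy, secure and back it up.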

  • 56. Google Dialogflow CX

    Google Dialogflow CX is Google Cloud's managed conversational AI platform, combining a graph-based state machine built from Flows and Pages with Vertex AI Gemini–powered generative capabilities. We previously tracked its predecessor, Dialogflow, in the Radar. CX represents a significant redesign that gained traction after Google integrated Vertex AI Gemini models in 2024, introducing Generative Playbooks for instruction-driven agents and Data Store RAG for grounding responses in indexed content. We used it to build a natural language data discovery agent, choosing it over a custom SDK approach for its low-code environment and Generative Playbooks. We configured these with few-shot prompting to translate natural language queries into SQL. Teams on Google Cloud building natural language interfaces over structured internal data will find Dialogflow CX accelerates delivery compared to a custom agent stack. However, the platform has no free tier; its deep Google Cloud dependency introduces significant vendor lock-in, and teams should plan for context engineering effort.

  • 57. MCP Apps

    MCP Apps is the first official extension to the Model Context Protocol, letting MCP servers return interactive HTML interfaces as dashboards, forms and visualizations that render directly in the conversation. Co-developed by Anthropic, OpenAI and open-source contributors, the extension standardizes a ui:// resource scheme where tools declare UI templates rendered in sandboxed iframes with graceful degradation to text when the host lacks UI support. Unlike AG-UI, which operates as a separate library layer, MCP Apps packages UI directly inside MCP servers. The bidirectional design lets models observe user actions while the interface handles live data and direct manipulation in ways plain text cannot. Clients including Claude, ChatGPT, VS Code and Goose already ship support. Teams exploring richer agent interactions should assess whether the added complexity over plain text responses is warranted for their use case.
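    In outline, a server declares an HTML template as a ui:// resource and a tool references it. The shape below is a simplified sketch to show the relationship, not a verbatim excerpt from the extension spec:

```json
{
  "resources": [
    { "uri": "ui://orders/dashboard", "mimeType": "text/html", "name": "Orders dashboard" }
  ],
  "tools": [
    {
      "name": "show_orders",
      "description": "Render recent orders as an interactive dashboard",
      "_meta": { "ui": { "template": "ui://orders/dashboard" } }
    }
  ]
}
```

    Hosts without UI support simply ignore the template and fall back to the tool's text output.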

  • 58. Monarch

    Monarch is an open-source distributed programming framework that brings the simplicity of single-machine PyTorch workloads to large GPU clusters. It provides a Python API for spawning remote processes and actors, grouping them into collections called meshes that support broadcast messaging. It also offers fault tolerance through supervision trees, where failures propagate up a hierarchy to enable clean error handling and fine-grained recovery. Additional features include support for point-to-point RDMA transfers for efficient GPU and CPU memory movement and a distributed tensor abstraction that allows actors to work with tensors sharded across processes while maintaining an imperative programming model. Monarch is built on a high-performance Rust backend. Although still in the early stages of development, its abstraction — making distributed tensors behave like local ones — is powerful and can greatly reduce the complexity of large-scale distributed AI training.

  • 59. Neutree

    Neutree is an open-source platform for managing and serving LLMs on private infrastructure, positioning itself as a model-as-a-service layer for enterprise AI. It provides a unified control plane for model lifecycle management, inference serving and compute scheduling across heterogeneous hardware such as NVIDIA, AMD and Intel accelerators. As organizations move away from hosted APIs toward self-hosted, governed deployments, Neutree addresses a clear gap: operating LLM workloads with enterprise-grade capabilities such as multi-tenancy, access control, usage accounting and infrastructure abstraction. By separating model serving from application logic, it enables teams to deploy, scale and route models across environments — including bare metal, VMs and containers — without tightly coupling to a specific cloud provider. However, Neutree is still relatively new, and teams should approach adoption with caution. Its ecosystem, operational maturity and integration capabilities are still evolving compared to more established ML platforms. While promising, it’s best suited for teams willing to invest in evaluating and shaping emerging enterprise AI infrastructure.

  • 60. OptScale

    OptScale is an open-source, multi-cloud FinOps platform with support for AI/ML-heavy workloads where GPU and experimentation costs can spike quickly. It ingests billing and usage data from cloud APIs, combining cost visibility, optimization recommendations, budget tracking and anomaly detection in one system with policy-based alerts aligned to teams or business structures.

    Compared with OpenCost, OptScale covers broader non-Kubernetes FinOps use cases while still providing Kubernetes-level analysis. It also offers more control and less vendor lock-in than enterprise suites such as IBM Cloudability, CloudZero, CloudHealth, IBM Kubecost and Flexera One. The trade-off is higher operational overhead, with concerns around deployment complexity, connector edge cases and container image security hygiene. Teams should treat OptScale as a platform capability investment rather than a plug-and-play product.

  • 61. Rhesis

    Rhesis is an open-source testing platform for LLM and agentic applications that lets teams define expected behavior in natural language, generate adversarial test scenarios and evaluate outcomes through both a UI and an SDK or API. It’s becoming more relevant as traditional testing approaches assume deterministic behavior, while AI systems fail in more subtle ways, including jailbreaks, multi-turn interactions, policy violations and context-dependent edge cases. In our evaluation, Rhesis is a useful platform for teams that need more than simple prompt evaluations. Features such as the conversation simulator, adversarial testing, OpenTelemetry-based tracing and self-hosting via Docker make it a practical way to bring product, domain and engineering teams into a shared testing workflow. The main benefit is improved pre-production validation for non-deterministic systems. However, teams should consider common trade-offs in this space, including evaluation cost, the limits of LLM-as-judge metrics and the need for well-defined requirements before the platform delivers value. We think Rhesis is worth assessing for teams building LLM applications or agentic systems that require collaborative, repeatable testing beyond basic prompt checks.

  • 62. RunPod

    As organizations increasingly experiment with training and fine-tuning LLMs, hyperscalers such as AWS and Google Cloud can introduce high costs and limited hardware availability. RunPod provides a cost-effective alternative for compute-intensive AI workloads. Operating as a globally distributed GPU marketplace, it offers on-demand access to a wide range of hardware, from enterprise-grade H100 clusters to consumer-grade RTX 4090s, often at significantly lower cost than traditional cloud providers. For teams needing flexible, budget-friendly infrastructure to develop, train or deploy AI models without long-term commitments or vendor lock-in, RunPod is a practical option worth evaluating.

  • 63. Sprites

    Sprites is a stateful sandbox environment from Fly.io designed for running AI coding agents in isolation. Where most agent sandboxes are ephemeral, spinning up for a task and disappearing, Sprites provides persistent Linux environments with unlimited checkpoint and restore capabilities. This allows developers to snapshot the entire environment state — including installed dependencies, run-time configuration and file system changes — and roll back when an agent goes off track. This goes beyond what Git alone can recover, capturing system state that version control does not track. As our teams increasingly adopt sandboxed execution for coding agents as a sensible default, Sprites represents one end of the spectrum: a non-ephemeral, stateful approach that trades the simplicity of throwaway containers for richer recovery options. Teams evaluating agent sandboxing should consider Sprites alongside ephemeral alternatives such as Dev Containers based on their needs and workflow.
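    The checkpoint-and-restore workflow is easy to picture with a toy model. This sketch is conceptual only and is not Sprites' actual API; it shows why snapshotting whole-environment state recovers more than version control can:

```python
import copy

class Sandbox:
    """Toy model of a stateful agent sandbox; not Sprites' API."""
    def __init__(self):
        # Stand-ins for the file system, installed packages and runtime config.
        self.state = {"files": {}, "packages": set(), "env": {}}
        self._checkpoints = {}

    def checkpoint(self, name: str) -> None:
        # Snapshot the entire environment state, not just tracked files.
        self._checkpoints[name] = copy.deepcopy(self.state)

    def restore(self, name: str) -> None:
        # Roll the whole environment back when the agent goes off track.
        self.state = copy.deepcopy(self._checkpoints[name])

box = Sandbox()
box.state["files"]["app.py"] = "print('ok')"
box.state["packages"].add("requests")
box.checkpoint("before-agent")
box.state["files"]["app.py"] = "broken"      # agent misbehaves
box.state["packages"].add("unwanted-dep")    # and installs something new
box.restore("before-agent")                  # both changes are rolled back
```

    Note that git would only recover app.py here; the installed package is system state that version control never saw.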

  • 64. torchforge

    torchforge is a PyTorch-native reinforcement learning library designed for large-scale post-training of language models. It provides a higher-level abstraction that decouples algorithmic logic from infrastructure concerns, orchestrating components such as Monarch for coordination, vLLM for inference and torchtitan for distributed training. This approach allows researchers to express complex reinforcement learning workflows using pseudocode-like APIs, while scaling workloads across thousands of GPUs without managing low-level concerns such as resource synchronization, scheduling or fault tolerance. By separating the “what” (algorithm design) from the “how” (distributed execution), torchforge simplifies experimentation and iteration in large-scale alignment systems. We see this as a useful step toward making advanced post-training techniques more accessible, although teams should evaluate its maturity and fit within their existing ML infrastructure.

  • 65. torchtitan

    torchtitan is a PyTorch-native platform for large-scale pre-training of generative AI models, providing a clean and modular reference implementation for high-performance distributed training. It brings together advanced distributed primitives into a cohesive system, supporting 4D parallelism: data, tensor, pipeline and context parallelism. As training models at the scale of Llama 3.1 405B demands significant scale and efficiency, torchtitan offers a practical foundation for building and operating large training workloads. Its modular design makes it easier for teams to experiment with and evolve parallelism strategies while maintaining production readiness. We see torchtitan as a useful step toward standardizing large-scale model training in the PyTorch ecosystem, particularly for teams building their own pre-training infrastructure.

Caution

No blips

Unable to find something you expected to see?


Each edition of the Radar features blips reflecting what we came across during the previous six months. We might have covered what you are looking for on a previous Radar already. We sometimes cull things just because there are too many to talk about. A blip might also be missing because the Radar reflects our experience; it is not based on a comprehensive market analysis.
