Volume 33 | November 2025

Platforms


Adopt

  • 32. Arm in the cloud

    Arm compute instances in the cloud have become increasingly popular in recent years for their cost and energy efficiency compared to traditional x86-based instances. Major cloud providers — including AWS, Azure and GCP — now offer robust Arm options. These instances are especially attractive for large-scale or cost-sensitive workloads. Many of our teams have successfully migrated workloads such as microservices, open-source databases and even high-performance computing to Arm with minimal code changes and only minor build-script adjustments. New cloud-based applications and systems increasingly default to Arm in the cloud. Based on our experience, we recommend Arm compute instances for most workloads unless specific architecture dependencies exist. Modern tooling, such as multi-arch Docker images, further simplifies building and deploying across both Arm and x86 environments.
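
    A hedged example of that tooling in action: the Buildx invocation below (image name hypothetical) builds and pushes a single multi-arch image, so a runtime on either an x86 or an Arm instance pulls the matching variant automatically.

    ```sh
    # Build one image manifest covering both x86 and Arm; requires a
    # Buildx builder configured for both platforms (e.g., via QEMU)
    docker buildx build \
      --platform linux/amd64,linux/arm64 \
      -t registry.example.com/myservice:1.0 \
      --push .
    ```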

Trial

  • 33. Apache Paimon

    Apache Paimon is an open-source data lake format designed to enable lakehouse architecture. It integrates seamlessly with processing engines like Flink and Spark, supporting both streaming and batch operations. A key advantage of Paimon's architecture lies in its fusion of a standard data lake format with an LSM (log-structured merge-tree) structure. This combination addresses the traditional challenges of high-performance updates and low-latency reads in data lakes. Paimon supports primary key tables for high-throughput, real-time updates and includes a customizable merge engine for deduplication, partial updates and aggregations. This design enables efficient streaming data ingestion and management of mutable state directly within the lake. Paimon also provides mature data lake capabilities such as scalable metadata, ACID transactions, time travel, schema evolution and optimized data layouts through compression and Z-ordering. We recommend evaluating Paimon for projects that need a unified storage layer capable of efficiently handling large-scale append-only data and complex, real-time streaming updates.
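
    As a hedged sketch (warehouse path, table and columns hypothetical), a primary-key table with a partial-update merge engine can be declared from Flink SQL roughly like this:

    ```sql
    -- Register a Paimon catalog backed by object storage (path hypothetical)
    CREATE CATALOG paimon_catalog WITH (
      'type' = 'paimon',
      'warehouse' = 's3://my-bucket/warehouse'
    );
    USE CATALOG paimon_catalog;

    -- Primary-key table; the merge engine controls how updates combine
    CREATE TABLE orders (
      order_id BIGINT,
      status   STRING,
      amount   DECIMAL(10, 2),
      PRIMARY KEY (order_id) NOT ENFORCED
    ) WITH (
      'merge-engine' = 'partial-update'
    );
    ```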

  • 34. Datadog LLM Observability

    Datadog LLM Observability provides end-to-end tracing, monitoring and diagnostics for large language models and agentic application workflows. It maps each prompt, tool call and intermediate step into spans and traces; tracks latency, token usage, errors and quality metrics; and integrates with Datadog’s broader APM and observability suite.

    Organizations already using Datadog — and familiar with its cost structure — may find the LLM observability feature a straightforward way to gain visibility into AI workloads, assuming those workloads can be instrumented. However, configuring and using LLM instrumentation requires care and a solid understanding of both the workloads and their implementation. We recommend data engineers and operations staff collaborate closely when deploying it. See also our advice on avoiding standalone data engineering teams.
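
    For Python workloads, instrumentation typically goes through the ddtrace library. The sketch below is illustrative only (the app name and workflow are hypothetical); consult Datadog's documentation for the exact setup your workload needs.

    ```python
    from ddtrace.llmobs import LLMObs
    from ddtrace.llmobs.decorators import workflow

    # Enable LLM Observability; the ml_app name is hypothetical and
    # credentials are normally supplied via the environment or agent.
    LLMObs.enable(ml_app="support-bot")

    @workflow
    def answer_ticket(question: str) -> str:
        # Calls to instrumented LLM clients made here would appear
        # as child spans of this workflow span; stubbed to stay runnable.
        return f"echo: {question}"

    print(answer_ticket("Where is my order?"))
    ```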

  • 35. Delta Sharing

    Delta Sharing is an open standard and protocol for secure, cross-platform data sharing, developed by Databricks and the Linux Foundation. It’s cloud-agnostic, enabling organizations to share live data across cloud providers and on-prem locations without copying or replicating the data — preserving data freshness and eliminating duplication costs. We've seen an e-commerce company successfully use Delta Sharing to replace a fragmented partner data-sharing system with a centralized, real-time and secure platform, significantly improving collaboration. The protocol uses a simple REST API to issue short-lived pre-signed URLs, allowing recipients to retrieve large datasets using tools such as pandas, Spark or Power BI. It supports sharing data tables, views, AI models and notebooks. While it provides strong centralized governance and auditing, users should remain mindful of cloud egress costs, which can become a significant operational risk if unmanaged.
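
    On the recipient side, consuming a share takes little more than a profile file issued by the provider. A minimal sketch with the Python connector, where the share, schema and table names are hypothetical:

    ```python
    import delta_sharing

    # The .share profile file from the provider holds the endpoint and
    # bearer token; the table coordinates below are hypothetical.
    table_url = "config.share#retail_share.sales.daily_orders"

    # Data is fetched via short-lived pre-signed URLs into pandas
    df = delta_sharing.load_as_pandas(table_url)
    print(df.head())
    ```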

  • 36. Dovetail

    Dovetail addresses the persistent challenge of managing fragmented qualitative research data. It provides a centralized repository for user interviews, transcripts and insights, turning raw data into a structured, analyzable asset. We’ve found it invaluable in product discovery workflows, particularly for creating an evidence trail that links customer quotes and synthesized themes directly to product hypotheses and estimated ROI. In doing so, Dovetail strengthens the role of qualitative data in product decision-making.

  • 37. Langdock

    Langdock is a platform for organizations to develop and run generative AI agents and workflows for internal operations. It provides a unified environment with internal chat assistants, an API layer for connecting to multiple LLMs and tools for building agentic workflows that integrate with systems such as Slack, Confluence and Google Drive. The platform emphasizes data sovereignty, offering on-premise and EU-hosted options with enterprise compliance standards.

    Organizations deploying Langdock should still pay close attention to data governance, and use techniques such as toxic flow analysis to avoid the lethal trifecta. Adopters should also consider the platform’s maturity, evaluate the specific integrations they require and plan for any custom development that may be necessary.

  • 38. LangSmith

    LangSmith is a hosted platform from the LangChain team that provides observability, tracing and evaluation for LLM applications. It captures detailed traces of chains, tools and prompts, enabling teams to debug and measure model behavior, track performance regressions and manage evaluation data sets. LangSmith is a proprietary SaaS with limited support for non-LangChain workflows, making it most appealing to teams already invested in that ecosystem. Its integrated support for prompt evaluation and experimentation is notably more polished than open-source alternatives like Langfuse.
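
    For code outside LangChain's abstractions, the SDK's traceable decorator offers a lightweight way in. A minimal sketch (function name hypothetical), assuming LANGSMITH_TRACING and LANGSMITH_API_KEY are set in the environment:

    ```python
    from langsmith import traceable

    # Each call to this function is recorded as a trace in LangSmith,
    # capturing inputs, outputs and latency.
    @traceable(name="summarize")  # name is hypothetical
    def summarize(text: str) -> str:
        # An LLM call would normally go here; stubbed to stay runnable.
        return text[:100]

    summarize("LangSmith records the inputs and outputs of this call.")
    ```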

  • 39. Model Context Protocol (MCP)

    The Model Context Protocol (MCP) is an open standard that defines how LLM applications and agents integrate with external data sources and tools, significantly improving the quality of AI-generated outputs. MCP focuses on context and tool access, distinguishing it from the Agent2Agent (A2A) protocol, which governs inter-agent communication. It specifies servers (for data and tools such as databases, wikis and services) and clients (agents, applications and coding assistants). Since our last blip, MCP adoption has surged, with major companies such as JetBrains (IntelliJ) and Apple joining the ecosystem, alongside emerging frameworks like FastMCP. A preview MCP Registry standard now supports public and proprietary tool discovery. However, MCP's rapid evolution has also introduced architectural gaps, drawing criticism for overlooking established RPC best practices. For production applications, teams should look beyond the hype and apply additional scrutiny by mitigating toxic flows using tools like MCP-Scan and closely monitoring the draft authorization module for security.
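
    To make the server side concrete, here's a minimal hedged sketch using FastMCP (server and tool names hypothetical) that exposes a single tool over the stdio transport most MCP clients expect:

    ```python
    from fastmcp import FastMCP

    mcp = FastMCP("wiki-tools")  # server name is hypothetical

    @mcp.tool()
    def search_wiki(query: str) -> str:
        """Search the internal wiki and return the best-matching excerpt."""
        # A real implementation would call a wiki API; stubbed here.
        return f"No results for {query!r} (stub)"

    if __name__ == "__main__":
        mcp.run()  # defaults to the stdio transport
    ```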

  • 40. n8n

    n8n is a fair-code–licensed workflow automation platform, similar to Zapier or Make (formerly Integromat), but built for developers who want a self-hosted, extensible and code-controllable option. It offers a lower-code, visual approach to workflow creation than Apache Airflow, while still supporting custom code in JavaScript or Python.

    Its primary use case is integrating multiple services into automated workflows, but it can also connect LLMs with configurable data sources, memory and tools. Many of our teams use n8n to rapidly prototype agentic workflows triggered by chat applications or webhooks, often leveraging its import and export capabilities to generate workflows with AI assistance. As always, we advise caution when using low-code platforms in production. However, n8n's self-hosting and code-defined workflows can mitigate some of those risks.

  • 41. OpenThread

    OpenThread is an open-source implementation of the Thread networking protocol developed by Google. It supports all key features of the Thread specification — including networking layers such as IPv6, 6LoWPAN and LR-WPAN — as well as mesh network capabilities that allow a device to function as both a node and a border router. OpenThread runs on a wide range of hardware platforms, leveraging a flexible abstraction layer and integration hooks that enable vendors to incorporate their own radio and cryptographic capabilities. This mature protocol is widely used in commercial products and, in our experience, has proven reliable for building diverse IoT solutions — from battery-operated, low-power devices to large-scale mesh sensor networks.
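
    To give a feel for the developer experience, forming a new Thread network from the interactive OpenThread CLI takes only a handful of commands (output abbreviated):

    ```
    > dataset init new
    Done
    > dataset commit active
    Done
    > ifconfig up
    Done
    > thread start
    Done
    > state
    leader
    ```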

Assess

  • 42. AG-UI Protocol

    AG-UI is an open protocol and library designed to standardize communication between rich user interfaces and agents. Focused on direct user-facing agents, it uses middleware and client integrations to generalize across any frontend and backend. The protocol defines a consistent way for back-end agents to communicate with front-end applications, enabling real-time, stateful collaboration between AI and human users. It supports multiple transport protocols, including SSE and WebSockets, and provides standardized event types to represent different states of agent execution. Built-in support is available for popular agentic frameworks such as LangGraph and Pydantic AI, with community integrations for others.

  • 43. Agent-to-Agent (A2A) Protocol

    Agent2Agent (A2A) is a protocol that defines a standard for communication and interaction among agents in complex, multi-agent workflows. It uses Agent Cards to describe the key elements of inter-agent communication, including skill discovery and the specification of transport and security schemes. A2A complements the Model Context Protocol (MCP) by focusing on agent-to-agent communication without exposing internal details such as an agent's state, memory or tools.

    The protocol promotes best practices such as an asynchronous-first approach for long-running tasks, streaming responses for incremental updates and secure transport with HTTPS, authentication and authorization. SDKs are available in Python, JavaScript, Java and C# to facilitate rapid adoption. Although relatively new, A2A enables teams to build domain-specific agents that can collaborate to form complex workflows, making it a strong option for such scenarios.
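
    As a hedged illustration of discovery, an Agent Card is a JSON document that a host serves from a well-known URL. The fields and values below sketch the shape of the spec rather than quoting it:

    ```json
    {
      "name": "claims-agent",
      "description": "Answers questions about insurance claims",
      "url": "https://agents.example.com/claims",
      "version": "1.0.0",
      "capabilities": { "streaming": true },
      "skills": [
        {
          "id": "claim-status",
          "name": "Claim status lookup",
          "description": "Returns the current state of a claim by ID"
        }
      ]
    }
    ```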

  • 44. Amazon S3 Vectors

    Amazon S3 Vectors extends the S3 object store with native vector capabilities, offering built-in vector storage and similarity search functionality. It integrates seamlessly with the AWS ecosystem, including Amazon Bedrock and OpenSearch, and provides additional features such as metadata filtering and governance via IAM. While still in preview and subject to restrictions and limitations, we find its value proposition compelling. This cost-effective, accessible approach to vector storage could enable a range of applications that involve large data volumes and where low latency is not the primary concern.

  • 45. Ardoq

    Ardoq is an enterprise architecture (EA) platform that enables organizations to build, manage and scale their architecture knowledge bases so they can plan more effectively for the future. Unlike traditional static documentation, which is prone to drift and siloing, Ardoq's data-driven approach pulls information from existing systems to create a dynamic knowledge graph that stays up to date as the landscape evolves. One feature we've found particularly useful is Ardoq Scenarios, which allows you to visually model and define what-if future states using a branching and merging approach similar to Git. Organizations pursuing architectural transformation should assess dedicated EA platforms like Ardoq for their potential to streamline and accelerate this process.

  • 46. CloudNativePG

    CloudNativePG is a Kubernetes Operator that simplifies hosting and managing highly available PostgreSQL clusters in Kubernetes. Running a stateful service like PostgreSQL on Kubernetes can be complex, requiring deep knowledge of both Kubernetes and PostgreSQL replication. CloudNativePG abstracts much of this complexity by treating the entire PostgreSQL cluster as a single, configurable declarative resource. It provides seamless primary/standby architecture using native streaming replication and includes high-availability features out of the box, including self-healing capabilities, automated failover that promotes the most aligned replica and automatic recreation of failed replicas. If you're looking to host PostgreSQL on Kubernetes, CloudNativePG is a solid place to start.
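
    A minimal sketch of that declarative resource (name and size hypothetical): three instances give one primary plus two streaming replicas, with failover handled by the operator.

    ```yaml
    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: orders-db        # cluster name is hypothetical
    spec:
      instances: 3           # one primary, two standby replicas
      storage:
        size: 20Gi
    ```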

  • 47. Coder

    Coder is a platform for quickly provisioning standardized coding environments, following the development environments in the cloud practice we've described before. Compared with similar tools such as Gitpod (now rebranded as Ona) and GitHub Codespaces, Coder offers greater control over workstation customization through Terraform. It hosts workstations on your own infrastructure, whether in the cloud or in a data center, rather than on a vendor's servers. This approach provides more flexibility, including the ability to run AI coding agents and access internal organizational systems. However, this flexibility comes with tradeoffs: more effort to set up and maintain workstation templates and greater responsibility for managing data security risks in agentic workflows.

  • 48. Graft

    Graft is a transactional storage engine that enables strongly consistent and efficient data synchronization across edge and distributed environments. It achieves this by using lazy replication to sync data only on demand, partial replication to minimize bandwidth consumption and serializable snapshot isolation to guarantee data integrity. We’ve mentioned Electric in the Radar for a similar use case, but we see Graft as unique in turning object storage into a transactional system that supports consistent page-level updates without imposing a data format. This makes it well-suited to powering local-first mobile applications, managing complex cross-platform synchronization and serving as the backbone for stateless replicas in serverless or embedded systems.

  • 49. groundcover

    groundcover is a cloud-native observability platform that unifies logs, traces, metrics and Kubernetes events in a single pane of glass. It leverages eBPF to capture granular observability data with zero code instrumentation — that is, without inserting agents or SDKs into application code. groundcover’s eBPF sensor runs on a dedicated node in each monitored cluster, operating independently from the applications it observes. Key features include deep kernel-level visibility, a bring-your-own-cloud (BYOC) architecture for data privacy and a data volume–agnostic pricing model that keeps costs predictable.

  • 50. Karmada

    Karmada ("Kubernetes Armada") is a platform for orchestrating workloads across multiple Kubernetes clusters, clouds and data centers. Many teams currently deploy across clusters using GitOps tools like Flux or ArgoCD combined with custom scripts, so a purpose-built solution is welcome. Karmada leverages Kubernetes-native APIs, requiring no changes to applications already built for cloud-native environments. It offers advanced scheduling capabilities for multi-cloud management, high availability, failure recovery and traffic scheduling.

    Karmada is still relatively new, so it's important to assess the maturity of the features your team depends on. As a CNCF project, however, it has strong momentum, and several of our teams are already using it successfully. Note that certain areas — such as networking, state and storage management across clusters — are outside Karmada’s scope. Most teams will still need a service mesh like Istio or Linkerd for traffic handling and should plan how to manage stateful workloads and distributed data.
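
    As a brief sketch, distributing an existing Deployment across member clusters is expressed declaratively with a PropagationPolicy (resource and cluster names hypothetical):

    ```yaml
    apiVersion: policy.karmada.io/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: nginx-propagation      # names here are hypothetical
    spec:
      resourceSelectors:
        - apiVersion: apps/v1
          kind: Deployment
          name: nginx
      placement:
        clusterAffinity:
          clusterNames:
            - cluster-eu
            - cluster-us
    ```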

  • 51. OpenFeature

    As businesses scale, feature flag management often becomes increasingly complex; teams need an abstraction layer that goes beyond the simplest possible feature toggle. OpenFeature provides this layer through a vendor-agnostic, community-driven API specification that standardizes how feature flags are defined and consumed, decoupling application code from the management solution. This flexibility allows teams to switch providers easily — from basic setups using environment variables or in-memory configurations up to mature platforms like ConfigCat or LaunchDarkly. However, one critical caution remains: teams must manage different categories of flags separately and with discipline to avoid flag proliferation, application complexity and excessive testing overhead.
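
    A brief sketch with the Python SDK shows the decoupling: application code talks only to the OpenFeature API, and the provider behind it (an in-memory one here; the flag name is hypothetical) can be swapped for ConfigCat, LaunchDarkly or anything else.

    ```python
    from openfeature import api
    from openfeature.provider.in_memory_provider import InMemoryFlag, InMemoryProvider

    # Provider wiring happens once at the application's edge;
    # switching vendors means changing only this line.
    api.set_provider(InMemoryProvider({
        "new-checkout": InMemoryFlag("on", {"on": True, "off": False}),
    }))

    client = api.get_client()
    if client.get_boolean_value("new-checkout", False):
        print("serving the new checkout flow")
    ```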

  • 52. Oxide

    Building and operating private infrastructure is complex. That’s one of the main reasons public cloud is the default for most organizations. However, for those that need it, Oxide offers an alternative to assembling and integrating hardware and software from scratch. It provides prebuilt racks with compute, networking and storage, running fully integrated system software. Teams can manage resources through Oxide’s IaaS APIs using Terraform and other automation tools — what Oxide calls on-premises elastic infrastructure.

    Dell and VMware’s VxRail, Nutanix and HPE SimpliVity also provide hyper-converged infrastructure (HCI) solutions, but what distinguishes Oxide is its purpose-built approach. It designs the entire stack — from circuit boards and power supplies to firmware — instead of assembling components from different vendors. Oxide has also developed and open-sourced Hubris, a lightweight, memory-protected, message-passing kernel written in Rust for embedded systems, along with other Rust-based infrastructure projects. We also appreciate that Oxide sells their equipment and software without license fees.

  • 53. Restate

    Restate is a durable execution platform designed to address complex distributed system challenges when building stateful, fault-tolerant applications. It logs every step via execution journaling, ensuring fault-tolerance, reliable recovery and exactly-once communication across services. The platform’s key architectural advantage lies in separating application logic into three durable service types: Basic Services for stateless functions; Virtual Objects to model concurrent, stateful entities; and Workflows to orchestrate complex, multi-step processes. We’ve been carefully assessing Restate in a large insurance system and are quite happy with its performance so far.
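
    To make the journaling model concrete, here's an illustrative sketch in the spirit of Restate's Python SDK; service and handler names are hypothetical, and exact signatures should be checked against the SDK docs.

    ```python
    import restate
    from restate import Service, Context

    payment = Service("payment")  # a stateless Basic Service (name hypothetical)

    @payment.handler()
    async def charge(ctx: Context, order_id: str) -> str:
        # ctx.run journals the side effect: if the process crashes and
        # retries, the recorded result is replayed rather than re-executed.
        receipt = await ctx.run("charge-card", lambda: f"receipt-{order_id}")
        return receipt

    app = restate.app([payment])  # an ASGI app, served by e.g. hypercorn
    ```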

  • 54. SkyPilot

    SkyPilot is an open-source platform for running and scaling AI workloads on-premises or in the cloud. Developed by the Sky Computing Lab at UC Berkeley, SkyPilot acts as an intelligent broker, automatically finding and provisioning the cheapest, most available GPUs across major clouds and Kubernetes clusters, often cutting compute costs. For infrastructure teams, it simplifies running AI on Kubernetes by offering Slurm-like ease of use, cloud-native robustness, direct SSH access to pods and features such as gang scheduling and multi-cluster support for seamless scaling of training or inference workloads.
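
    A task is defined in a short YAML file (accelerator choice and commands hypothetical) that SkyPilot schedules onto whichever cloud or cluster offers the cheapest matching capacity:

    ```yaml
    # train.yaml, launched with: sky launch -c my-run train.yaml
    resources:
      accelerators: A100:1   # hypothetical; SkyPilot finds the cheapest match

    setup: |
      pip install -r requirements.txt

    run: |
      python train.py
    ```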

  • 55. StarRocks

    StarRocks is an analytical database that redefines real-time business intelligence by combining the speed of traditional OLAP systems with the flexibility of a modern data lakehouse. It achieves sub-second query latency at massive scale through a SIMD-optimized execution engine, columnar storage and a sophisticated cost-based optimizer. This high-performance architecture allows users to run complex analytics directly on open data formats such as Apache Iceberg, without pre-computation or data copying. While there are many platforms in this space, we see StarRocks as a strong candidate for cost-effective solutions that require both extreme concurrency and consistent, up-to-the-second data freshness.
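
    As a hedged sketch of that lakehouse pattern (catalog and metastore settings hypothetical), an external catalog makes Iceberg tables queryable in place:

    ```sql
    -- Register an Iceberg catalog; connection settings are hypothetical
    CREATE EXTERNAL CATALOG iceberg_lake
    PROPERTIES (
      "type" = "iceberg",
      "iceberg.catalog.type" = "hive",
      "hive.metastore.uris" = "thrift://metastore:9083"
    );

    -- Query Iceberg data directly, without pre-computation or copying
    SELECT region, SUM(amount) AS total
    FROM iceberg_lake.sales.orders
    GROUP BY region;
    ```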

  • 56. Uncloud

    Uncloud is a lightweight container orchestration and clustering tool that enables developers to take Docker Compose applications to production, offering a simple, cloud-like experience without the operational overhead of Kubernetes. It achieves cross-machine scaling and zero-downtime deployments by automatically configuring a secure WireGuard mesh network for communication and using the Caddy reverse proxy to provide automatic HTTPS and load balancing. Uncloud’s main architectural advantage is its fully decentralized design, which eliminates the need for a central control plane and ensures cluster operations remain functional even if individual machines go offline. With Uncloud, you can freely mix and match cloud VMs and bare-metal servers into a unified and cost-effective computing environment.

Hold

No blips

Can't find the information you're looking for?

 

The blips in each volume of the Technology Radar reflect what we learned during the previous six months, so the item you're searching for may have appeared in an earlier volume. Because there is always more we want to cover than space allows, we sometimes cull entries that haven't changed for a long time. The Radar is based on our own experience rather than a comprehensive market analysis, so you may not find the technology you care about most.
