Volume 33 | November 2025

Platforms


Adopt

  • 32. Arm in the cloud

    Arm compute instances in the cloud have become increasingly popular in recent years for their cost and energy efficiency compared to traditional x86-based instances. Major cloud providers — including AWS, Azure and GCP — now offer robust Arm options. These instances are especially attractive for large-scale or cost-sensitive workloads. Many of our teams have successfully migrated workloads such as microservices, open-source databases and even high-performance computing to Arm with minimal code changes and only minor build-script adjustments. New cloud-based applications and systems increasingly default to Arm in the cloud. Based on our experience, we recommend Arm compute instances for most workloads unless specific architecture dependencies exist. Modern tooling, such as multi-arch Docker images, further simplifies building and deploying across both Arm and x86 environments.
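
    A hedged example of that tooling in action: the Buildx invocation below (image name hypothetical) builds and pushes a single multi-arch image, so a runtime on either an x86 or an Arm instance pulls the matching variant automatically.

    ```sh
    # Build one image manifest covering both x86 and Arm; requires a
    # Buildx builder configured for both platforms (e.g., via QEMU)
    docker buildx build \
      --platform linux/amd64,linux/arm64 \
      -t registry.example.com/myservice:1.0 \
      --push .
    ```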

Trial

  • 33. Apache Paimon

    Apache Paimon is an open-source data lake format designed to enable lakehouse architecture. It integrates seamlessly with processing engines like Flink and Spark, supporting both streaming and batch operations. A key advantage of Paimon's architecture lies in its fusion of a standard data lake format with an LSM (log-structured merge-tree) structure. This combination addresses the traditional challenges of high-performance updates and low-latency reads in data lakes. Paimon supports primary key tables for high-throughput, real-time updates and includes a customizable merge engine for deduplication, partial updates and aggregations. This design enables efficient streaming data ingestion and management of mutable state directly within the lake. Paimon also provides mature data lake capabilities such as scalable metadata, ACID transactions, time travel, schema evolution and optimized data layouts through compression and Z-ordering. We recommend evaluating Paimon for projects that need a unified storage layer capable of efficiently handling large-scale append-only data and complex, real-time streaming updates.
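
    As a hedged sketch (warehouse path, table and columns hypothetical), a primary-key table with a partial-update merge engine can be declared from Flink SQL roughly like this:

    ```sql
    -- Register a Paimon catalog backed by object storage (path hypothetical)
    CREATE CATALOG paimon_catalog WITH (
      'type' = 'paimon',
      'warehouse' = 's3://my-bucket/warehouse'
    );
    USE CATALOG paimon_catalog;

    -- Primary-key table; the merge engine controls how updates combine
    CREATE TABLE orders (
      order_id BIGINT,
      status   STRING,
      amount   DECIMAL(10, 2),
      PRIMARY KEY (order_id) NOT ENFORCED
    ) WITH (
      'merge-engine' = 'partial-update'
    );
    ```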

  • 34. Datadog LLM Observability

    Datadog LLM Observability provides end-to-end tracing, monitoring and diagnostics for large language models and agentic application workflows. It maps each prompt, tool call and intermediate step into spans and traces; tracks latency, token usage, errors and quality metrics; and integrates with Datadog’s broader APM and observability suite.

    Organizations already using Datadog — and familiar with its cost structure — may find the LLM observability feature a straightforward way to gain visibility into AI workloads, assuming those workloads can be instrumented. However, configuring and using LLM instrumentation requires care and a solid understanding of both the workloads and their implementation. We recommend data engineers and operations staff collaborate closely when deploying it. See also our advice on avoiding standalone data engineering teams.
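
    For Python workloads, instrumentation typically goes through the ddtrace library. The sketch below is illustrative only (the app name and workflow are hypothetical); consult Datadog's documentation for the exact setup your workload needs.

    ```python
    from ddtrace.llmobs import LLMObs
    from ddtrace.llmobs.decorators import workflow

    # Enable LLM Observability; the ml_app name is hypothetical and
    # credentials are normally supplied via the environment or agent.
    LLMObs.enable(ml_app="support-bot")

    @workflow
    def answer_ticket(question: str) -> str:
        # Calls to instrumented LLM clients made here would appear
        # as child spans of this workflow span; stubbed to stay runnable.
        return f"echo: {question}"

    print(answer_ticket("Where is my order?"))
    ```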

  • 35. Delta Sharing

    Delta Sharing is an open standard and protocol for secure, cross-platform data sharing, developed by Databricks and the Linux Foundation. It’s cloud-agnostic, enabling organizations to share live data across cloud providers and on-prem locations without copying or replicating the data — preserving data freshness and eliminating duplication costs. We've seen an e-commerce company successfully use Delta Sharing to replace a fragmented partner data-sharing system with a centralized, real-time and secure platform, significantly improving collaboration. The protocol uses a simple REST API to issue short-lived pre-signed URLs, allowing recipients to retrieve large datasets using tools such as pandas, Spark or Power BI. It supports sharing data tables, views, AI models and notebooks. While it provides strong centralized governance and auditing, users should remain mindful of cloud egress costs, which can become a significant operational risk if unmanaged.
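
    On the recipient side, consuming a share takes little more than a profile file issued by the provider. A minimal sketch with the Python connector, where the share, schema and table names are hypothetical:

    ```python
    import delta_sharing

    # The .share profile file from the provider holds the endpoint and
    # bearer token; the table coordinates below are hypothetical.
    table_url = "config.share#retail_share.sales.daily_orders"

    # Data is fetched via short-lived pre-signed URLs into pandas
    df = delta_sharing.load_as_pandas(table_url)
    print(df.head())
    ```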

  • 36. Dovetail

    Dovetail addresses the persistent challenge of managing fragmented qualitative research data. It provides a centralized repository for user interviews, transcripts and insights, turning raw data into a structured, analyzable asset. We’ve found it invaluable in product discovery workflows, particularly for creating an evidence trail that links customer quotes and synthesized themes directly to product hypotheses and estimated ROI. In doing so, Dovetail strengthens the role of qualitative data in product decision-making.

  • 37. Langdock

    Langdock is a platform for organizations to develop and run generative AI agents and workflows for internal operations. It provides a unified environment with internal chat assistants, an API layer for connecting to multiple LLMs and tools for building agentic workflows that integrate with systems such as Slack, Confluence and Google Drive. The platform emphasizes data sovereignty, offering on-premise and EU-hosted options with enterprise compliance standards.

    Organizations deploying Langdock should still pay close attention to data governance, and use techniques such as toxic flow analysis to avoid the lethal trifecta. Adopters should also consider the platform’s maturity, evaluate the specific integrations they require and plan for any custom development that may be necessary.

  • 38. LangSmith

    LangSmith is a hosted platform from the LangChain team that provides observability, tracing and evaluation for LLM applications. It captures detailed traces of chains, tools and prompts, enabling teams to debug and measure model behavior, track performance regressions and manage evaluation data sets. LangSmith is a proprietary SaaS with limited support for non-LangChain workflows, making it most appealing to teams already invested in that ecosystem. Its integrated support for prompt evaluation and experimentation is notably more polished than open-source alternatives like Langfuse.
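
    For code outside LangChain's abstractions, the SDK's traceable decorator offers a lightweight way in. A minimal sketch (function name hypothetical), assuming LANGSMITH_TRACING and LANGSMITH_API_KEY are set in the environment:

    ```python
    from langsmith import traceable

    # Each call to this function is recorded as a trace in LangSmith,
    # capturing inputs, outputs and latency.
    @traceable(name="summarize")  # name is hypothetical
    def summarize(text: str) -> str:
        # An LLM call would normally go here; stubbed to stay runnable.
        return text[:100]

    summarize("LangSmith records the inputs and outputs of this call.")
    ```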

  • 39. Model Context Protocol (MCP)

    The Model Context Protocol (MCP) is an open standard that defines how LLM applications and agents integrate with external data sources and tools, significantly improving the quality of AI-generated outputs. MCP focuses on context and tool access, distinguishing it from the Agent2Agent (A2A) protocol, which governs inter-agent communication. It specifies servers (for data and tools such as databases, wikis and services) and clients (agents, applications and coding assistants). Since our last blip, MCP adoption has surged, with major companies such as JetBrains (IntelliJ) and Apple joining the ecosystem, alongside emerging frameworks like FastMCP. A preview MCP Registry standard now supports public and proprietary tool discovery. However, MCP's rapid evolution has also introduced architectural gaps, drawing criticism for overlooking established RPC best practices. For production applications, teams should look beyond the hype and apply additional scrutiny by mitigating toxic flows using tools like MCP-Scan and closely monitoring the draft authorization module for security.
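
    To make the server side concrete, here's a minimal hedged sketch using FastMCP (server and tool names hypothetical) that exposes a single tool over the stdio transport most MCP clients expect:

    ```python
    from fastmcp import FastMCP

    mcp = FastMCP("wiki-tools")  # server name is hypothetical

    @mcp.tool()
    def search_wiki(query: str) -> str:
        """Search the internal wiki and return the best-matching excerpt."""
        # A real implementation would call a wiki API; stubbed here.
        return f"No results for {query!r} (stub)"

    if __name__ == "__main__":
        mcp.run()  # defaults to the stdio transport
    ```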

  • 40. n8n

    n8n is a fair-code–licensed workflow automation platform, similar to Zapier or Make (formerly Integromat), but built for developers who want a self-hosted, extensible and code-controllable option. It offers a lower-code, visual approach to workflow creation than Apache Airflow, while still supporting custom code in JavaScript or Python.

    Its primary use case is integrating multiple services into automated workflows, but it can also connect LLMs with configurable data sources, memory and tools. Many of our teams use n8n to rapidly prototype agentic workflows triggered by chat applications or webhooks, often leveraging its import and export capabilities to generate workflows with AI assistance. As always, we advise caution when using low-code platforms in production. However, n8n's self-hosting and code-defined workflows can mitigate some of those risks.

  • 41. OpenThread

    OpenThread is an open-source implementation of the Thread networking protocol developed by Google. It supports all key features of the Thread specification — including networking layers such as IPv6, 6LoWPAN and LR-WPAN — as well as mesh network capabilities that allow a device to function as both a node and a border router. OpenThread runs on a wide range of hardware platforms, leveraging a flexible abstraction layer and integration hooks that enable vendors to incorporate their own radio and cryptographic capabilities. This mature protocol is widely used in commercial products and, in our experience, has proven reliable for building diverse IoT solutions — from battery-operated, low-power devices to large-scale mesh sensor networks.
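
    To give a feel for the developer experience, forming a new Thread network from the interactive OpenThread CLI takes only a handful of commands (output abbreviated):

    ```
    > dataset init new
    Done
    > dataset commit active
    Done
    > ifconfig up
    Done
    > thread start
    Done
    > state
    leader
    ```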

Assess

  • 42. AG-UI Protocol

    AG-UI is an open protocol and library designed to standardize communication between rich user interfaces and agents. Focused on direct user-facing agents, it uses middleware and client integrations to generalize across any frontend and backend. The protocol defines a consistent way for back-end agents to communicate with front-end applications, enabling real-time, stateful collaboration between AI and human users. It supports multiple transport protocols, including SSE and WebSockets, and provides standardized event types to represent different states of agent execution. Built-in support is available for popular agentic frameworks such as LangGraph and Pydantic AI, with community integrations for others.

  • 43. Agent-to-Agent (A2A) Protocol

    Agent2Agent (A2A) is a protocol that defines a standard for communication and interaction among agents in complex, multi-agent workflows. It uses Agent Cards to describe the key elements of inter-agent communication, including skill discovery and the specification of transport and security schemes. A2A complements the Model Context Protocol (MCP) by focusing on agent-to-agent communication without exposing internal details such as an agent's state, memory or tools.

    The protocol promotes best practices such as an asynchronous-first approach for long-running tasks, streaming responses for incremental updates and secure transport with HTTPS, authentication and authorization. SDKs are available in Python, JavaScript, Java and C# to facilitate rapid adoption. Although relatively new, A2A enables teams to build domain-specific agents that can collaborate to form complex workflows, making it a strong option for such scenarios.
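
    As a hedged illustration of discovery, an Agent Card is a JSON document that a host serves from a well-known URL. The fields and values below sketch the shape of the spec rather than quoting it:

    ```json
    {
      "name": "claims-agent",
      "description": "Answers questions about insurance claims",
      "url": "https://agents.example.com/claims",
      "version": "1.0.0",
      "capabilities": { "streaming": true },
      "skills": [
        {
          "id": "claim-status",
          "name": "Claim status lookup",
          "description": "Returns the current state of a claim by ID"
        }
      ]
    }
    ```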

  • 44. Amazon S3 Vectors

    Amazon S3 Vectors extends the S3 object store with native vector capabilities, offering built-in vector storage and similarity search functionality. It integrates seamlessly with the AWS ecosystem, including Amazon Bedrock and OpenSearch, and provides additional features such as metadata filtering and governance via IAM. While still in preview and subject to restrictions and limitations, we find its value proposition compelling. This cost-effective, accessible approach to vector storage could enable a range of applications that involve large data volumes and where low latency is not the primary concern.

  • 45. Ardoq

    Ardoq is an enterprise architecture (EA) platform that enables organizations to build, manage and scale their architecture knowledge bases so they can plan more effectively for the future. Unlike traditional static documentation, which is prone to drift and siloing, Ardoq's data-driven approach pulls information from existing systems to create a dynamic knowledge graph that stays up to date as the landscape evolves. One feature we've found particularly useful is Ardoq Scenarios, which allows you to visually model and define what-if future states using a branching and merging approach similar to Git. Organizations pursuing architectural transformation should assess dedicated EA platforms like Ardoq for their potential to streamline and accelerate this process.

  • 46. CloudNativePG

    CloudNativePG is a Kubernetes Operator that simplifies hosting and managing highly available PostgreSQL clusters in Kubernetes. Running a stateful service like PostgreSQL on Kubernetes can be complex, requiring deep knowledge of both Kubernetes and PostgreSQL replication. CloudNativePG abstracts much of this complexity by treating the entire PostgreSQL cluster as a single, configurable declarative resource. It provides seamless primary/standby architecture using native streaming replication and includes high-availability features out of the box, including self-healing capabilities, automated failover that promotes the most aligned replica and automatic recreation of failed replicas. If you're looking to host PostgreSQL on Kubernetes, CloudNativePG is a solid place to start.
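
    A minimal sketch of that declarative resource (name and size hypothetical): three instances give one primary plus two streaming replicas, with failover handled by the operator.

    ```yaml
    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: orders-db        # cluster name is hypothetical
    spec:
      instances: 3           # one primary, two standby replicas
      storage:
        size: 20Gi
    ```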

  • 47. Coder

    Coder is a platform for quickly provisioning standardized coding environments, following the development environments in the cloud practice we've described before. Compared with similar tools such as Gitpod (now rebranded as Ona) and GitHub Codespaces, Coder offers greater control over workstation customization through Terraform. It hosts workstations on your own infrastructure, whether in the cloud or in a data center, rather than on a vendor's servers. This approach provides more flexibility, including the ability to run AI coding agents and access internal organizational systems. However, this flexibility comes with tradeoffs: more effort to set up and maintain workstation templates and greater responsibility for managing data security risks in agentic workflows.

  • 48. Graft

    Graft is a transactional storage engine that enables strongly consistent and efficient data synchronization across edge and distributed environments. It achieves this by using lazy replication to sync data only on demand, partial replication to minimize bandwidth consumption and serializable snapshot isolation to guarantee data integrity. We’ve mentioned Electric in the Radar for a similar use case, but we see Graft as unique in turning object storage into a transactional system that supports consistent page-level updates without imposing a data format. This makes it well-suited to powering local-first mobile applications, managing complex cross-platform synchronization and serving as the backbone for stateless replicas in serverless or embedded systems.

  • 49. groundcover

    groundcover is a cloud-native observability platform that unifies logs, traces, metrics and Kubernetes events in a single pane of glass. It leverages eBPF to capture granular observability data with zero code instrumentation — that is, without inserting agents or SDKs into application code. groundcover’s eBPF sensor runs on a dedicated node in each monitored cluster, operating independently from the applications it observes. Key features include deep kernel-level visibility, a bring-your-own-cloud (BYOC) architecture for data privacy and a data volume–agnostic pricing model that keeps costs predictable.

  • 50. Karmada

    Karmada ("Kubernetes Armada") is a platform for orchestrating workloads across multiple Kubernetes clusters, clouds and data centers. Many teams currently deploy across clusters using GitOps tools like Flux or ArgoCD combined with custom scripts, so a purpose-built solution is welcome. Karmada leverages Kubernetes-native APIs, requiring no changes to applications already built for cloud-native environments. It offers advanced scheduling capabilities for multi-cloud management, high availability, failure recovery and traffic scheduling.

    Karmada is still relatively new, so it's important to assess the maturity of the features your team depends on. As a CNCF project, however, it has strong momentum, and several of our teams are already using it successfully. Note that certain areas — such as networking, state and storage management across clusters — are outside Karmada’s scope. Most teams will still need a service mesh like Istio or Linkerd for traffic handling and should plan how to manage stateful workloads and distributed data.
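
    As a brief sketch, distributing an existing Deployment across member clusters is expressed declaratively with a PropagationPolicy (resource and cluster names hypothetical):

    ```yaml
    apiVersion: policy.karmada.io/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: nginx-propagation      # names here are hypothetical
    spec:
      resourceSelectors:
        - apiVersion: apps/v1
          kind: Deployment
          name: nginx
      placement:
        clusterAffinity:
          clusterNames:
            - cluster-eu
            - cluster-us
    ```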

  • 51. OpenFeature

    As businesses scale, feature flag management often becomes increasingly complex; teams need an abstraction layer that goes beyond the simplest possible feature toggle. OpenFeature provides this layer through a vendor-agnostic, community-driven API specification that standardizes how feature flags are defined and consumed, decoupling application code from the management solution. This flexibility allows teams to switch providers easily — from basic setups using environment variables or in-memory configurations up to mature platforms like ConfigCat or LaunchDarkly. However, one critical caution remains: teams must manage different categories of flags separately and with discipline to avoid flag proliferation, application complexity and excessive testing overhead.
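
    A brief sketch with the Python SDK shows the decoupling: application code talks only to the OpenFeature API, and the provider behind it (an in-memory one here; the flag name is hypothetical) can be swapped for ConfigCat, LaunchDarkly or anything else.

    ```python
    from openfeature import api
    from openfeature.provider.in_memory_provider import InMemoryFlag, InMemoryProvider

    # Provider wiring happens once at the application's edge;
    # switching vendors means changing only this line.
    api.set_provider(InMemoryProvider({
        "new-checkout": InMemoryFlag("on", {"on": True, "off": False}),
    }))

    client = api.get_client()
    if client.get_boolean_value("new-checkout", False):
        print("serving the new checkout flow")
    ```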

  • 52. Oxide

    Building and operating private infrastructure is complex. That’s one of the main reasons public cloud is the default for most organizations. However, for those that need it, Oxide offers an alternative to assembling and integrating hardware and software from scratch. It provides prebuilt racks with compute, networking and storage, running fully integrated system software. Teams can manage resources through Oxide’s IaaS APIs using Terraform and other automation tools — what Oxide calls on-premises elastic infrastructure.

    Dell and VMware’s VxRail, Nutanix and HPE SimpliVity also provide hyper-converged infrastructure (HCI) solutions, but what distinguishes Oxide is its purpose-built approach. It designs the entire stack — from circuit boards and power supplies to firmware — instead of assembling components from different vendors. Oxide has also developed and open-sourced Hubris, a lightweight, memory-protected, message-passing kernel written in Rust for embedded systems, along with other Rust-based infrastructure projects. We also appreciate that Oxide sells their equipment and software without license fees.

  • 53. Restate

    Restate is a durable execution platform designed to address complex distributed system challenges when building stateful, fault-tolerant applications. It logs every step via execution journaling, ensuring fault-tolerance, reliable recovery and exactly-once communication across services. The platform’s key architectural advantage lies in separating application logic into three durable service types: Basic Services for stateless functions; Virtual Objects to model concurrent, stateful entities; and Workflows to orchestrate complex, multi-step processes. We’ve been carefully assessing Restate in a large insurance system and are quite happy with its performance so far.
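
    To make the journaling model concrete, here's an illustrative sketch in the spirit of Restate's Python SDK; service and handler names are hypothetical, and exact signatures should be checked against the SDK docs.

    ```python
    import restate
    from restate import Service, Context

    payment = Service("payment")  # a stateless Basic Service (name hypothetical)

    @payment.handler()
    async def charge(ctx: Context, order_id: str) -> str:
        # ctx.run journals the side effect: if the process crashes and
        # retries, the recorded result is replayed rather than re-executed.
        receipt = await ctx.run("charge-card", lambda: f"receipt-{order_id}")
        return receipt

    app = restate.app([payment])  # an ASGI app, served by e.g. hypercorn
    ```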

  • 54. SkyPilot

    SkyPilot is an open-source platform for running and scaling AI workloads on-premises or in the cloud. Developed by the Sky Computing Lab at UC Berkeley, SkyPilot acts as an intelligent broker, automatically finding and provisioning the cheapest, most available GPUs across major clouds and Kubernetes clusters, often cutting compute costs. For infrastructure teams, it simplifies running AI on Kubernetes by offering Slurm-like ease of use, cloud-native robustness, direct SSH access to pods and features such as gang scheduling and multi-cluster support for seamless scaling of training or inference workloads.
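
    A task is defined in a short YAML file (accelerator choice and commands hypothetical) that SkyPilot schedules onto whichever cloud or cluster offers the cheapest matching capacity:

    ```yaml
    # train.yaml, launched with: sky launch -c my-run train.yaml
    resources:
      accelerators: A100:1   # hypothetical; SkyPilot finds the cheapest match

    setup: |
      pip install -r requirements.txt

    run: |
      python train.py
    ```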

  • 55. StarRocks

    StarRocks is an analytical database that redefines real-time business intelligence by combining the speed of traditional OLAP systems with the flexibility of a modern data lakehouse. It achieves sub-second query latency at massive scale through a SIMD-optimized execution engine, columnar storage and a sophisticated cost-based optimizer. This high-performance architecture allows users to run complex analytics directly on open data formats such as Apache Iceberg, without pre-computation or data copying. While there are many platforms in this space, we see StarRocks as a strong candidate for cost-effective solutions that require both extreme concurrency and consistent, up-to-the-second data freshness.
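
    As a hedged sketch of that lakehouse pattern (catalog and metastore settings hypothetical), an external catalog makes Iceberg tables queryable in place:

    ```sql
    -- Register an Iceberg catalog; connection settings are hypothetical
    CREATE EXTERNAL CATALOG iceberg_lake
    PROPERTIES (
      "type" = "iceberg",
      "iceberg.catalog.type" = "hive",
      "hive.metastore.uris" = "thrift://metastore:9083"
    );

    -- Query Iceberg data directly, without pre-computation or copying
    SELECT region, SUM(amount) AS total
    FROM iceberg_lake.sales.orders
    GROUP BY region;
    ```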

  • 56. Uncloud

    Uncloud is a lightweight container orchestration and clustering tool that enables developers to take Docker Compose applications to production, offering a simple, cloud-like experience without the operational overhead of Kubernetes. It achieves cross-machine scaling and zero-downtime deployments by automatically configuring a secure WireGuard mesh network for communication and using the Caddy reverse proxy to provide automatic HTTPS and load balancing. Uncloud’s main architectural advantage is its fully decentralized design, which eliminates the need for a central control plane and ensures cluster operations remain functional even if individual machines go offline. With Uncloud, you can freely mix and match cloud VMs and bare-metal servers into a unified and cost-effective computing environment.

Hold

No blips

Can't find the information you're looking for?

 

The blips in each volume of the Technology Radar reflect what we learned during the previous six months, so the item you're searching for may have appeared in an earlier volume. Because there is always more we want to cover than space allows, we sometimes cull entries that haven't changed for a long time. The Radar is based on our own experience rather than a comprehensive market analysis, so you may not find the technology you care about most.
