Volume 32 | April 2025


Platforms

Adopt

  • 23. GitLab CI/CD

    GitLab CI/CD has grown into a highly integrated system within GitLab, covering everything from code integration and testing to deployment and monitoring. It supports complex workflows, including multi-stage pipelines, caching, parallel execution and auto-scaling runners, making it well-suited to large projects with complex pipeline requirements. We particularly want to highlight its built-in security and compliance tools, such as SAST and DAST analysis, which make it a strong fit for scenarios with high compliance requirements. It also integrates seamlessly with Kubernetes to support cloud-native workflows, and provides real-time logs, test reports and traceability for enhanced observability.

  • 24. Trino

    Trino is an open-source, distributed SQL query engine designed for interactive analytic queries over big data. Optimized for both on-premises and cloud environments, it supports querying data where it lives, including relational databases and various proprietary data stores via connectors. Trino can also query data stored in file formats such as Parquet as well as open table formats like Apache Iceberg. Its built-in query federation capability allows data from multiple sources to be queried as one logical table, which makes it a great fit for analytical workloads that need to aggregate data from many sources. Trino is a key component of many popular stacks, including AWS Athena, Starburst and other proprietary data platforms. Our teams have used it successfully across a variety of use cases, and when it comes to querying data sets across multiple sources for analytics, Trino has been a reliable choice.
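
    To make query federation concrete, here is a minimal sketch using the trino Python client; the catalog and table names (postgresql, hive) are placeholders for connectors you would configure yourself.

    ```python
    # Minimal sketch of a federated query with the trino Python client;
    # the catalogs (postgresql, hive) are placeholders for configured connectors.
    from trino.dbapi import connect

    conn = connect(host="trino.example.com", port=8080, user="analyst")
    cur = conn.cursor()
    cur.execute("""
        SELECT o.order_id, c.segment
        FROM postgresql.public.orders AS o
        JOIN hive.lake.customers AS c ON o.customer_id = c.id
        LIMIT 10
    """)
    print(cur.fetchall())
    ```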

Trial

  • 25. ABsmartly

    ABsmartly is an advanced A/B testing and experimentation platform designed for fast, trustworthy decision-making. Its core highlight is the Group Sequential Testing (GST) engine, which can deliver test results up to 80% faster than traditional A/B testing tools. The platform offers real-time reporting, deep data segmentation and seamless full-stack integration through an API-first approach, supporting experiments across web, mobile, microservices and machine-learning models.

    ABsmartly addresses key challenges in scalable, data-driven experimentation, enabling faster iteration and more agile product development. Its zero-lag execution, powerful deep-segmentation capabilities and support for multi-platform experiments make it especially valuable for organizations looking to scale their experimentation culture and prioritize data-driven innovation. With significantly shortened test cycles and automated results analysis, ABsmartly has helped us optimize features and user experiences more efficiently than traditional A/B testing platforms.
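
    To give a sense of why sequential testing reaches decisions sooner, here is a toy sketch of the underlying idea, not ABsmartly's actual engine: results are checked at planned interim looks against a stricter significance boundary, so an experiment can stop as soon as the evidence is strong enough. The data and the Pocock boundary constant (for three looks at an overall alpha of 0.05) are illustrative.

    ```python
    # Toy sketch of group sequential testing; this shows the general idea,
    # not ABsmartly's engine. Data and boundary are illustrative.
    import math

    POCOCK_Z = 2.289  # approx. two-sided Pocock boundary, 3 looks, alpha = 0.05

    # Cumulative (conversions_A, n_A, conversions_B, n_B) at each interim look.
    interim_data = [
        (110, 1000, 128, 1000),
        (232, 2000, 271, 2000),
        (348, 3000, 419, 3000),
    ]

    def z_stat(conv_a, n_a, conv_b, n_b):
        p_a, p_b = conv_a / n_a, conv_b / n_b
        pooled = (conv_a + conv_b) / (n_a + n_b)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        return (p_b - p_a) / se

    for look, (ca, na, cb, nb) in enumerate(interim_data, start=1):
        z = z_stat(ca, na, cb, nb)
        if abs(z) > POCOCK_Z:
            print(f"Stop early at look {look}: z = {z:.2f}")  # decision made sooner
            break
    else:
        print("No early stop; run to the final analysis")
    ```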

  • 26. Dapr

    Dapr has evolved considerably since we last featured it in the Technology Radar. Its many new features include job scheduling, virtual actors and more sophisticated retry policies and observability components. Its list of building blocks keeps growing, with additions such as jobs and cryptography. Our teams have also noted its increasing focus on secure defaults, with support for mTLS and distroless images. Overall, we've been happy with Dapr and look forward to its future evolution.
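
    To make the building blocks concrete, a minimal sketch using the official Dapr Python SDK against a locally running sidecar; the component names (statestore, pubsub) are placeholders for whatever your Dapr configuration defines.

    ```python
    # Sketch using the official Dapr Python SDK against a local sidecar;
    # component names ("statestore", "pubsub") come from your Dapr configuration.
    from dapr.clients import DaprClient

    with DaprClient() as client:
        client.save_state(store_name="statestore", key="order-1", value="pending")
        client.publish_event(
            pubsub_name="pubsub",
            topic_name="orders",
            data='{"id": "order-1", "status": "pending"}',
            data_content_type="application/json",
        )
    ```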

  • 27. Grafana Alloy

    Formerly known as Grafana Agent, Grafana Alloy is an open-source OpenTelemetry Collector. Alloy is designed to be an all-in-one collector for all telemetry data, including logs, metrics and traces. It supports commonly used telemetry formats such as OpenTelemetry, Prometheus and Datadog. With the recent deprecation of Promtail, Alloy is emerging as the go-to tool for telemetry collection, especially for collecting log data when using the Grafana observability stack.

  • 28. Grafana Loki

    Grafana Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. Loki indexes only metadata about your logs, treating it as a set of labels for each log stream, while the log data itself is stored in a block storage solution such as S3, GCS or Azure Blob Storage. The upshot is that Loki has lower operational complexity than its competitors, along with reduced storage costs. As you'd expect, it integrates tightly with Grafana and Grafana Alloy, although other log collection mechanisms are also supported.

    Loki 3.0 introduced native OpenTelemetry support, making ingestion and integration with OpenTelemetry systems as simple as configuring an endpoint. It also offers advanced multi-tenancy features, such as tenant isolation via shuffle-sharding, which keeps a misbehaving tenant (for example, one running heavy queries or experiencing an outage) from affecting others in the cluster. If you haven't been following the latest developments in the Grafana ecosystem, now is a good time to look: it's evolving rapidly.
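
    For reference, a minimal sketch of pushing a log line straight to Loki's HTTP push API; in most setups an agent such as Grafana Alloy does this for you, and the labels shown are illustrative.

    ```python
    # Minimal sketch: push one log line to Loki's HTTP push API.
    # In most setups an agent such as Grafana Alloy does this for you.
    import json
    import time

    import requests

    payload = {
        "streams": [{
            "stream": {"service": "checkout", "env": "dev"},    # indexed labels
            "values": [[str(time.time_ns()), "order created"]], # (ns timestamp, line)
        }]
    }
    requests.post(
        "http://localhost:3100/loki/api/v1/push",
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
        timeout=5,
    )
    ```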

  • 29. Grafana Tempo

    Grafana Tempo is a highly scalable, distributed tracing backend that supports open standards such as OpenTelemetry. Designed for cost-efficiency, Tempo relies on object storage for long-term trace retention and supports trace querying, span-based metrics generation and correlation with logs and metrics. By default, Tempo uses a columnar block format based on Apache Parquet, which improves query performance and enables downstream tools to access trace data. Queries are executed via TraceQL and the Tempo CLI. Grafana Alloy can also be configured to collect and forward traces to Tempo. Our teams have self-hosted Tempo on GKE, using MinIO for object storage, together with OpenTelemetry collectors and Grafana for trace visualization.
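
    A minimal sketch of exporting traces to Tempo from Python with the OpenTelemetry SDK, assuming Tempo's OTLP gRPC endpoint is reachable at tempo:4317 (its default port):

    ```python
    # Sketch: export OpenTelemetry traces to Tempo's OTLP gRPC endpoint.
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint="tempo:4317", insecure=True))
    )
    trace.set_tracer_provider(provider)

    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("process-order"):
        pass  # business logic; the span is exported to Tempo when it ends
    ```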

  • 30. Railway

    Heroku used to be an excellent choice for many developers who wanted to launch and deploy their applications quickly. In recent years we've also seen the rise of more modern, lightweight and easy-to-use platforms such as Vercel, although they mainly target front-end application deployment. An alternative in the full-stack deployment space is Railway, a cloud PaaS platform that streamlines everything from GitHub/Docker deployment to observability in production.

    Railway supports most mainstream programming frameworks, databases and containerized deployments. As a long-term hosting platform for an application, it may warrant a careful cost comparison against other platforms. So far, our teams have had a good experience with Railway's deployment and observability. It operates smoothly and fits well with the continuous deployment practices we advocate.

  • 31. Unblocked

    Unblocked is an off-the-shelf AI team assistant. By integrating with codebases, enterprise document platforms, project management tools and communication tools, Unblocked helps answer questions about complex business and technical concepts, architectural design and implementation, and operational processes. This is particularly useful when working with large or legacy systems. While using Unblocked, we observed that teams valued fast access to contextual information related to code and user stories more than generating code or user stories. For those generation tasks, especially coding, software engineering agents are a better fit.

  • 32. Weights & Biases

    Weights & Biases continues to evolve, having added more LLM-focused features since it last appeared in the Technology Radar. The company has extended Traces and introduced Weave, a full platform that goes beyond tracing LLM systems. Weave lets you create system evaluations, define custom metrics, use LLMs as judges for tasks such as summarization and save data sets that capture different behaviors for analysis. This helps optimize LLM components and track performance at both local and global levels. The platform also supports iterative development and efficient debugging, which is especially important for agentic systems whose errors are hard to detect. It also enables the collection of valuable human feedback, which can later be used to fine-tune models and further improve their performance and reliability.
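
    For a flavor of Weave's tracing, a minimal sketch using the weave Python library; the project name and the stubbed function are illustrative.

    ```python
    # Minimal sketch of tracing a function with Weave; names are illustrative.
    import weave

    weave.init("my-llm-project")

    @weave.op()  # records inputs, outputs and latency for every call
    def summarize(text: str) -> str:
        return text[:100]  # stub standing in for an LLM call

    summarize("A long document about quarterly results ...")
    ```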

Assess

  • 33. Arize Phoenix

    With the growing popularity of LLM and agentic applications, LLM observability is becoming increasingly important. Previously, we’ve recommended platforms such as Langfuse and Weights & Biases (W&B). Arize Phoenix is another emerging platform in this space, and our team has had a positive experience using it. It offers standard features like LLM tracing, evaluation and prompt management, with seamless integration into leading LLM providers and frameworks. This makes it easy to gather insights on LLM output, latency and token usage with minimal configuration. So far, our experience is limited to the open-source tool, but the broader Arize platform offers more comprehensive capabilities, and we look forward to exploring it in the future.
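
    Getting traces into Phoenix is largely a matter of registering its OpenTelemetry integration. A minimal sketch, assuming a locally launched Phoenix instance and the openinference instrumentation package for OpenAI:

    ```python
    # Sketch: run Phoenix locally and auto-instrument OpenAI SDK calls.
    import phoenix as px
    from phoenix.otel import register
    from openinference.instrumentation.openai import OpenAIInstrumentor

    px.launch_app()  # local Phoenix UI
    tracer_provider = register(project_name="my-llm-app")
    OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
    # From here on, OpenAI calls are traced: prompts, latency and token usage.
    ```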

  • 34. Chainloop

    Chainloop is an open-source supply chain security platform that helps security teams enforce compliance while allowing development teams to seamlessly integrate security compliance into CI/CD pipelines. It consists of a control plane, which acts as the single source of truth for security policies, and a CLI, which runs attestations within CI/CD workflows to ensure compliance. Security teams define workflow contracts specifying which artifacts — such as SBOMs and vulnerability reports — must be collected, where to store them and how to evaluate compliance. Chainloop uses Rego, OPA's policy language, to validate attestations — for example, ensuring a CycloneDX SBOM meets version requirements. During workflow execution, security artifacts like SBOMs are attached to an attestation and pushed to the control plane for enforcement and auditing. This approach ensures compliance can be enforced consistently and at scale while minimizing friction in development workflows. The result is an SLSA level-three–compliant single source of truth for metadata, artifacts and attestations.

  • 35. Deepseek R1

    DeepSeek-R1 is DeepSeek's first generation of reasoning models. Through a progression of non-reasoning models, the engineers at DeepSeek designed and used methods to maximize hardware utilization. These include Multi-Head Latent Attention (MLA), Mixture of Experts (MoE) gating, 8-bit floating-point (FP8) training and low-level PTX programming. Their high-performance computing co-design approach enables DeepSeek-R1 to rival state-of-the-art models at significantly reduced cost for training and inference.

    DeepSeek-R1-Zero is notable for another innovation: the engineers were able to elicit reasoning capabilities from a non-reasoning model using simple reinforcement learning without any supervised fine-tuning. All DeepSeek models are open-weight, which means they are freely available, though training code and data remain proprietary. The repository includes six dense models distilled from DeepSeek-R1, based on Llama and Qwen, with DeepSeek-R1-Distill-Qwen-32B outperforming OpenAI-o1-mini on various benchmarks.
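
    Because DeepSeek exposes an OpenAI-compatible API, trying R1 takes only a few lines. A sketch, assuming the deepseek-reasoner model name and the reasoning_content response field from DeepSeek's API documentation at the time of writing:

    ```python
    # Sketch: call DeepSeek-R1 through the OpenAI-compatible API.
    from openai import OpenAI

    client = OpenAI(api_key="<DEEPSEEK_API_KEY>", base_url="https://api.deepseek.com")
    response = client.chat.completions.create(
        model="deepseek-reasoner",  # DeepSeek-R1
        messages=[{"role": "user", "content": "How many primes are there below 100?"}],
    )
    print(response.choices[0].message.reasoning_content)  # chain of thought
    print(response.choices[0].message.content)            # final answer
    ```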

  • 36. Deno

    Created by Ryan Dahl, the inventor of Node.js, Deno was designed to address what he saw as mistakes in Node.js. It features a stricter sandboxing system, built-in dependency management and native TypeScript support — a key draw for its user base. Many of us prefer Deno for TypeScript projects, as it feels like a true TypeScript runtime and toolchain, rather than an add-on to Node.js.

    Since its inclusion in the Radar in 2019, Deno has made significant advancements. The Deno 2 release introduces backward compatibility with Node.js and npm libraries, long-term support (LTS) releases and other improvements. Previously, one of the biggest barriers to adoption was the need to rewrite Node.js applications. These updates reduce migration friction while expanding dependency options for supporting tools and systems. Given the massive Node.js and npm ecosystem, these changes should drive further adoption.

    Additionally, Deno’s Standard Library has stabilized, helping combat the proliferation of low-value npm packages across the ecosystem. Its tooling and Standard Library make TypeScript or JavaScript more appealing for server-side development. However, we caution against choosing a platform solely to avoid polyglot programming.

  • 37. Graphiti

    Graphiti builds dynamic, temporally-aware knowledge graphs that capture evolving facts and relationships. Our teams use GraphRAG to uncover data relationships, which enhances retrieval and response accuracy. As data sets constantly evolve, Graphiti maintains temporal metadata on graph edges to record relationship lifecycles. It ingests both structured and unstructured data as discrete episodes and supports queries using a fusion of time-based, full-text, semantic and graph algorithms. For LLM-based applications — whether RAG or agentic — Graphiti enables long-term recall and state-based reasoning.
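
    A rough sketch of Graphiti's ingestion-and-query flow with the graphiti-core library; the Neo4j connection values are placeholders, and the exact API may differ between versions.

    ```python
    # Rough sketch of episode ingestion and hybrid search with graphiti-core;
    # the exact API may differ between versions.
    import asyncio
    from datetime import datetime, timezone

    from graphiti_core import Graphiti

    async def main():
        graphiti = Graphiti("bolt://localhost:7687", "neo4j", "<password>")
        await graphiti.add_episode(
            name="crm-event-42",
            episode_body="Customer Acme upgraded to the enterprise plan.",
            source_description="CRM event stream",
            reference_time=datetime.now(timezone.utc),
        )
        print(await graphiti.search("Which plan is Acme on?"))

    asyncio.run(main())
    ```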

  • 38. Helicone

    Similar to Langfuse, Weights & Biases and Arize Phoenix, Helicone is a managed LLMOps platform designed to meet the growing enterprise demand for LLM cost management, ROI evaluation and risk mitigation. Open-source and developer-focused, Helicone supports production-ready AI applications, offering prompt experimentation, monitoring, debugging and optimization across the entire LLM lifecycle. It enables real-time analysis of costs, utilization, performance and agentic stack traces across various LLM providers. While it simplifies LLM operations management, the platform is still emerging and may require some expertise to fully leverage its advanced features. Our team has been using it with good experience so far.
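
    Helicone typically sits as a proxy in front of your LLM provider, so integration can be as small as changing the base URL and adding an auth header. A sketch following the pattern in Helicone's docs:

    ```python
    # Sketch: route OpenAI traffic through Helicone's gateway for observability.
    from openai import OpenAI

    client = OpenAI(
        api_key="<OPENAI_API_KEY>",
        base_url="https://oai.helicone.ai/v1",
        default_headers={"Helicone-Auth": "Bearer <HELICONE_API_KEY>"},
    )
    client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
    )
    # Cost, latency and usage for the call now show up in the Helicone dashboard.
    ```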

  • 39. Humanloop

    Humanloop is an emerging platform focused on making AI systems more reliable, adaptable and aligned with user needs by integrating human feedback at key decision points. It offers tools for human labeling, active learning and human-in-the-loop fine-tuning as well as LLM evaluation against business requirements. Additionally, it helps manage the cost-effective lifecycle of GenAI solutions with greater control and efficiency. Humanloop supports collaboration through a shared workspace, version-controlled prompt management and CI/CD integration to prevent regressions. It also provides observability features such as tracing, logging, alerting and guardrails to monitor and optimize AI performance. These capabilities make it particularly relevant for organizations deploying AI in regulated or high-risk domains where human oversight is critical. With its focus on responsible AI practices, Humanloop is worth evaluating for teams looking to build scalable and ethical AI systems.

  • 40. Model Context Protocol (MCP)

    One of the biggest challenges in prompting is ensuring the AI tool has access to all the context relevant to the task. Often, this context already exists within the systems we use all day: wikis, issue trackers, databases or observability systems. Seamless integration between AI tools and these information sources can significantly improve the quality of AI-generated outputs.

    The Model Context Protocol (MCP), an open standard released by Anthropic, provides a standardized framework for integrating LLM applications with external data sources and tools. It defines MCP servers and clients, where servers access the data sources and clients integrate and use this data to enhance prompts. Many coding assistants have already implemented MCP integration, allowing them to act as MCP clients. MCP servers can be run in two ways: locally, as a Python or Node process running on the user’s machine, or remotely, as a server that the MCP client connects to via SSE (though we haven't seen any usage of the remote server variant yet). Currently, MCP is primarily used in the first way, with developers cloning open-source MCP server implementations. While locally run servers offer a neat way to avoid third-party dependencies, they remain less accessible to nontechnical users and introduce challenges such as governance and update management. That said, it's easy to imagine how this standard could evolve into a more mature and user-friendly ecosystem in the future.
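
    To make the server/client split concrete, here is a minimal sketch of a locally run MCP server using the FastMCP helper from the official Python SDK; the issue-tracker tool and its stubbed data are hypothetical.

    ```python
    # Minimal sketch of a locally run MCP server (official Python SDK).
    # The issue-tracker tool and its stub data are hypothetical.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("issue-tracker")

    @mcp.tool()
    def get_ticket(ticket_id: str) -> str:
        """Return a short ticket summary for use as prompt context."""
        return f"Ticket {ticket_id}: status=open, assignee=jane"  # stub data

    if __name__ == "__main__":
        mcp.run()  # stdio transport; an MCP client starts this as a local process
    ```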

  • 41. Open WebUI

    Open WebUI is an open-source, self-hosted AI platform with a versatile feature set. It supports OpenAI-compatible APIs and integrates with providers like OpenRouter and GroqCloud, among others. It can run entirely offline by connecting to local or self-hosted models via Ollama. Open WebUI includes a built-in capability for RAG, allowing users to interact with local and web-based documents in a chat-driven experience. It offers granular RBAC controls, enabling different models and platform capabilities for different user groups. The platform is extensible through Functions — Python-based building blocks that customize and enhance its capabilities. Another key feature is model evaluation, which includes a model arena for side-by-side comparisons of LLMs on specific tasks. Open WebUI can be deployed at various scales — as a personal AI assistant, a team collaboration assistant or an enterprise-grade AI platform.
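
    Functions follow a simple Python convention. A rough sketch of a "pipe" Function is shown below; the interface is paraphrased from Open WebUI's documentation and may differ between versions.

    ```python
    # Rough sketch of an Open WebUI "pipe" Function; the interface is paraphrased
    # from the documentation and may differ between versions.
    from pydantic import BaseModel

    class Pipe:
        class Valves(BaseModel):  # admin-configurable settings
            prefix: str = "Echo"

        def __init__(self):
            self.valves = self.Valves()

        def pipe(self, body: dict) -> str:
            user_message = body["messages"][-1]["content"]
            return f"{self.valves.prefix}: {user_message}"
    ```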

  • 42. pg_mooncake

    pg_mooncake is a PostgreSQL extension that adds columnar storage and vectorized execution. Columnstore tables are stored as Iceberg or Delta Lake tables in the local file system or S3-compatible cloud storage. pg_mooncake supports loading data from file formats like Parquet, CSV and even Hugging Face datasets. It can be a good fit for heavy data analytics that typically requires columnar storage, as it removes the need to add dedicated columnar store technologies into your stack.
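
    Usage stays within plain SQL. A sketch via psycopg, with the USING columnstore clause taken from pg_mooncake's documentation; table and column names are illustrative.

    ```python
    # Sketch: create and query a columnstore table with pg_mooncake;
    # the USING columnstore clause follows the extension's documentation.
    import psycopg

    with psycopg.connect("dbname=analytics") as conn:
        conn.execute("CREATE EXTENSION IF NOT EXISTS pg_mooncake;")
        conn.execute("""
            CREATE TABLE IF NOT EXISTS events (
                user_id bigint,
                event   text,
                ts      timestamptz
            ) USING columnstore;
        """)
        conn.execute("INSERT INTO events VALUES (1, 'login', now());")
        rows = conn.execute("SELECT event, count(*) FROM events GROUP BY event;")
        print(rows.fetchall())
    ```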

  • 43. Reasoning models

    One of the most significant AI advances since the last Radar is the breakthrough and proliferation of reasoning models. Also marketed as "thinking models," these models have achieved top human-level performance in benchmarks like frontier mathematics and coding.

    Reasoning models are usually trained through reinforcement learning or supervised fine-tuning, enhancing capabilities such as step-by-step thinking (chain of thought, or CoT), exploring alternatives (tree of thought, or ToT) and self-correction. Examples include OpenAI’s o1/o3, DeepSeek R1 and Gemini 2.0 Flash Thinking. However, these models should be seen as a distinct category of LLMs rather than simply more advanced versions of existing models.

    This increased capability comes at a cost. Reasoning models require longer response times and higher token consumption, leading us to jokingly call them "Slower AI" (as if current AI wasn’t slow enough). Not all tasks justify this trade-off. For simpler tasks like text summarization, content generation or fast-response chatbots, general-purpose LLMs remain the better choice. We advise using reasoning models in STEM fields, complex problem-solving and decision-making — for example, when using LLMs as judges or improving explainability through explicit CoT outputs. At the time of writing, Claude 3.7 Sonnet, a hybrid reasoning model, had just been released, hinting at a possible fusion between traditional LLMs and reasoning models.

  • 44. Restate

    Restate is a durable execution platform, similar to Temporal, developed by the original creators of Apache Flink. Feature-wise it offers workflows as code, stateful event processing, the saga pattern and durable state machines. Written in Rust and deployed as a single binary, it uses a distributed log to record events, implemented using a virtual consensus algorithm based on Flexible Paxos; this ensures durability in the event of node failure. SDKs are available for the usual suspects: Java, Go, Rust and TypeScript. We still maintain that it's best to avoid distributed transactions in distributed systems, because of both the additional complexity and the inevitable additional operational overhead involved. However, this platform is worth assessing if you can’t avoid distributed transactions in your environment.

  • 45. Supabase

    Supabase is an open-source Firebase alternative for building scalable and secure backends. It offers a suite of integrated services, including a PostgreSQL database, authentication, instant APIs, Edge Functions, real-time subscriptions, storage and vector embeddings. Supabase aims to streamline back-end development, allowing developers to focus on building front-end experiences while leveraging the power and flexibility of open-source technologies. Unlike Firebase, Supabase is built on top of PostgreSQL. If you're working on prototyping or an MVP, Supabase is worth considering, as it will be easier to migrate to another SQL solution after the prototyping stage.
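
    A minimal sketch with the supabase Python client; the project URL, key and table are placeholders.

    ```python
    # Minimal sketch using the supabase Python client; credentials are placeholders.
    from supabase import create_client

    supabase = create_client("https://<project>.supabase.co", "<anon-or-service-key>")
    rows = supabase.table("countries").select("id, name").limit(5).execute()
    print(rows.data)
    ```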

  • 46. Synthesized

    A common challenge in software development is generating test data for development and test environments. Ideally, test data should be as production-like as possible, while ensuring no personally identifiable or sensitive information is exposed. Though this may seem straightforward, test data generation is far from simple. That's why we’re interested in Synthesized — a platform that can mask and subset existing production data or generate statistically relevant synthetic data. It integrates directly into build pipelines and offers privacy masking, providing per-attribute anonymization through irreversible data obfuscation techniques such as hashing, randomization and binning. Synthesized can also generate large volumes of synthetic data for performance testing. While it includes the obligatory GenAI features, its core functionality addresses a real and persistent challenge for development teams, making it worth exploring.

  • 47. Tonic.ai

    Tonic.ai is part of a growing trend in platforms designed to generate realistic, de-identified synthetic data for development, testing and QA environments. Similar to Synthesized, Tonic.ai is a platform with a comprehensive suite of tools addressing various data synthesis needs in contrast to the library-focused approach of Synthetic Data Vault. Tonic.ai generates both structured and unstructured data, maintaining the statistical properties of production data while ensuring privacy and compliance through differential privacy techniques. Key features include automatic detection, classification and redaction of sensitive information in unstructured data, along with on-demand database provisioning via Tonic Ephemeral. It also offers Tonic Textual, a secure data lakehouse that helps AI developers leverage unstructured data for retrieval-augmented generation (RAG) systems and LLM fine-tuning. Teams looking to accelerate engineering velocity while generating scalable, realistic data — all while adhering to stringent data privacy requirements — should consider evaluating Tonic.ai.

  • 48. turbopuffer

    turbopuffer is a serverless, multi-tenant search engine that seamlessly integrates vector and full-text search on object storage. We quite like its architecture and design choices, particularly its focus on durability, scalability and cost efficiency. By using object storage as a write-ahead log while keeping its query nodes stateless, it’s well-suited for high-scale search workloads.

    Designed for performance and accuracy, turbopuffer delivers high recall out of the box, even for complex filter-based queries. It caches cold query results on NVMe SSDs and keeps frequently accessed namespaces in memory, enabling low-latency search across billions of documents. This makes it ideal for large-scale document retrieval, vector search and retrieval-augmented generation (RAG) AI applications. However, its reliance on object storage introduces trade-offs in query latency, making it most effective for workloads that benefit from stateless, distributed compute. turbopuffer powers high-scale production systems like Cursor but is currently only available by referral or invitation.
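
    For flavor, a rough sketch of an upsert-and-query round trip with the turbopuffer Python client; the client API has evolved over time, so treat the method names and parameters below as illustrative rather than authoritative.

    ```python
    # Rough sketch with the turbopuffer Python client; method names and parameters
    # are illustrative and may differ between client versions.
    import turbopuffer as tpuf

    tpuf.api_key = "<TURBOPUFFER_API_KEY>"
    ns = tpuf.Namespace("product-docs")
    ns.upsert(
        ids=[1, 2],
        vectors=[[0.1, 0.2, 0.3], [0.2, 0.1, 0.0]],
        attributes={"title": ["intro", "pricing"]},
    )
    print(ns.query(vector=[0.1, 0.2, 0.3], top_k=2, include_attributes=["title"]))
    ```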

  • 49. VectorChord

    VectorChord is a PostgreSQL extension for vector similarity search, developed by the creators of pgvecto.rs as its successor. It’s open source, compatible with pgvector data types and designed for disk-efficient, high-performance vector search. It employs inverted file indexing (IVF) along with RaBitQ quantization to enable fast, scalable and accurate vector search while significantly reducing computation demands. Like other PostgreSQL extensions in this space, it leverages the PostgreSQL ecosystem, allowing vector search alongside standard transactional operations. Though still in its early stages, VectorChord is worth assessing for vector search workloads.
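
    Because VectorChord keeps pgvector-compatible types and operators, querying looks like standard pgvector SQL. A sketch via psycopg; the extension, index-method and operator-class names follow VectorChord's documentation as we understand it and may differ between versions (index tuning options are elided).

    ```python
    # Sketch: VectorChord index plus a pgvector-style query via psycopg.
    # Extension/index/opclass names follow VectorChord's docs as we understand
    # them and may differ between versions; index tuning options are elided.
    import psycopg

    with psycopg.connect("dbname=vectors") as conn:
        conn.execute("CREATE EXTENSION IF NOT EXISTS vchord CASCADE;")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS items "
            "(id bigserial PRIMARY KEY, embedding vector(3));"
        )
        conn.execute(
            "CREATE INDEX IF NOT EXISTS items_idx ON items "
            "USING vchordrq (embedding vector_l2_ops);"
        )
        conn.execute("INSERT INTO items (embedding) VALUES ('[0.1, 0.2, 0.3]');")
        rows = conn.execute(
            "SELECT id FROM items ORDER BY embedding <-> '[0.1, 0.2, 0.3]' LIMIT 5;"
        )
        print(rows.fetchall())
    ```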

Hold

  • 50. Tyk hybrid API management

    We've observed multiple teams encountering issues with the Tyk hybrid API management solution. While the concept of a managed control plane and self-managed data planes offers flexibility for complex infrastructure setups (such as multi-cloud and hybrid cloud), teams have experienced control plane incidents that were only discovered internally rather than by Tyk, highlighting potential observability gaps in Tyk's AWS-hosted environment. Furthermore, incident support appears slow; communicating via tickets and emails isn’t ideal in these situations. Teams have also reported issues with the maturity of Tyk's documentation, often finding it inadequate for complex scenarios and issues. Additionally, other products in the Tyk ecosystem seem immature as well; for example, the enterprise developer portal is reported not to be backward compatible and to have limited customization capabilities. Especially for Tyk’s hybrid setup, we recommend proceeding with caution and will continue to monitor its maturity.

Unable to find something you expected to see?

 

Each edition of the Technology Radar reflects our insights from the past six months, so the item you're looking for may have appeared in an earlier volume. Because there is always more we want to cover than space allows, we sometimes cull items that haven't changed for a while. The Radar is based on our subjective experience, not a comprehensive market analysis, so you may not find the technology you care about most.
