Volume 32 | April 2025

Tools

Adopt

  • 51. Renovate

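    Renovate has become the tool of choice for dependency version management on many of our teams. While Dependabot remains a safe default for GitHub-hosted repositories, we still recommend evaluating Renovate because it offers a more comprehensive and customizable solution. To get the most out of Renovate, configure it to monitor and update all dependencies, including tooling, infrastructure and private or internally hosted dependencies. To reduce the workload on developers, consider automatically merging dependency update PRs.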

  • 52. uv

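    Since the last Radar, we have gained more hands-on experience with uv, and feedback from our teams has been overwhelmingly positive. uv is a next-generation Python package and project management tool written in Rust, and its core value proposition is being "extremely fast". In benchmarks, uv outperforms other Python package managers by a wide margin, which shortens build and test cycles and significantly improves the developer experience. Beyond performance, uv offers a unified toolset that effectively replaces tools such as Poetry, pyenv and pipx. That said, our concerns about package management tools remain: a strong ecosystem, a mature community and long-term support are critical. Because uv is relatively new, moving it to the Adopt ring is a bold decision. However, many data teams are eager to move away from Python's legacy package management system, and our frontline developers consistently recommend uv as the best tool currently available.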

  • 53. Vite

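    Since Vite last appeared on the Radar, its influence has continued to grow. It is a high-performance front-end build tool with fast hot reloading. It is being adopted and recommended as the default choice by many front-end frameworks, including Vue, SvelteKit and, most recently, React, which has deprecated create-react-app. Vite also recently received significant investment, which led to the founding of VoidZero, an organization dedicated to Vite's development. This investment is expected to accelerate Vite's development and improve the project's long-term sustainability.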

Trial

  • 54. Claude Sonnet

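    Claude Sonnet is an advanced language model that excels at coding, writing, analysis and visual processing. It is available in the browser, in the terminal and in most major IDEs, and it integrates with GitHub Copilot. Benchmarks to date show versions 3.5 and 3.7 performing significantly better than earlier models. It is also good at interpreting charts and extracting text from images, and it offers features centered on developer experience, such as "Artifacts" in the browser UI for generating and interacting with dynamic content like code snippets and HTML designs.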

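    We have used version 3.5 of Claude Sonnet fairly extensively in software development and found that it significantly improves productivity across a range of projects. It performs particularly well on greenfield projects, especially in collaborative software design and architecture discussions. Although it is still too early to call any AI model a "stable" coding assistant, Claude Sonnet is the most reliable model we have worked with. At the time of writing, Claude 3.7 has also been released and shows great promise, but we have not yet tested it thoroughly in production.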

  • 55. Cline

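    Cline is an open-source VSCode extension that is currently one of the strongest contenders among supervised software engineering agents. It lets developers drive an implementation entirely from the Cline chat, and it integrates seamlessly with the IDE they already use. Key features such as Plan & Act mode, transparent token usage tracking and MCP integration help developers interact with LLMs efficiently. Cline has shown advanced capabilities in handling complex development tasks, especially when paired with Claude 3.5 Sonnet. It supports large codebases, automates headless browser testing and proactively fixes bugs. Unlike cloud-based solutions, Cline improves privacy by storing data locally. Its open-source nature ensures greater transparency and also enables community-driven improvements. Note that Cline's code context orchestration, while effective, is resource-intensive, so developers should keep an eye on token costs. Cline can also run into rate limiting, which may slow down workflows. Until this is resolved, we recommend using an API provider with better rate limits, such as OpenRouter.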

  • 56. Cursor

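    We remain impressed by Cursor, the AI-first code editor, which continues to lead in the competitive field of AI coding assistance. Its code context orchestration is very effective, and it supports a wide range of models, including the option to use a custom API key. The Cursor team often ships innovative user experience features ahead of other vendors, and the chat includes a rich set of context providers, such as git diffs, previous AI conversations, web search, library documentation and MCP integration. Like Cline and Windsurf, Cursor also stands out for its strong agentic coding mode. This mode lets developers guide the implementation directly from an AI chat interface, while the tool autonomously reads and modifies files and executes commands. We also appreciate Cursor's ability to detect linting and compilation errors in generated code and to proactively correct them.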

  • 57. D2

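    D2 is an open-source diagrams-as-code tool that helps users create and customize diagrams from text. It introduces the D2 diagram scripting language, whose simple, declarative syntax prioritizes readability over compactness. D2 ships with default themes and uses the same layout engine as Mermaid. Our teams appreciate its lightweight syntax, which is particularly well suited to software documentation and architecture diagrams.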

  • 58. Databricks Delta Live Tables

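    Delta Live Tables (DLT) continues to prove its value in simplifying and streamlining data pipeline management, supporting both real-time streaming and batch processing through a declarative approach. By automating complex data engineering tasks such as manual checkpoint management, DLT reduces operational overhead while ensuring robust end-to-end systems. Its ability to simplify pipeline orchestration with almost no manual intervention greatly improves reliability and flexibility. In addition, features like materialized views provide incremental updates and performance optimization for specific use cases.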

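    However, teams must understand the nuances of DLT to take full advantage of it and avoid potential pitfalls. As an opinionated abstraction, DLT manages its own tables and allows only one pipeline at a time to insert data into them. Streaming tables are append-only, which requires careful consideration during design. In addition, deleting a DLT pipeline also deletes the underlying tables and data, which can create operational problems.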

  • 59. JSON Crack

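    JSON Crack is a Visual Studio Code extension that renders textual data as interactive graphs. Despite its name, it supports several formats, including YAML, TOML and XML. Unlike Mermaid and D2, which use text as a means of producing a specific visual diagram, JSON Crack is a tool for inspecting data that happens to be stored in a text format. Its layout algorithm works well, and it can selectively hide branches and nodes, which makes it a great choice for exploring data sets. There is also a companion web-based tool, but we remain cautious about relying on online services for formatting or parsing code. Note that JSON Crack limits the number of nodes it will display: files with more than a few hundred nodes direct the user to the commercial version of the tool.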

  • 60. MailSlurp

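    Testing workflows that involve email is often complex and time-consuming. Development teams have to build custom email API clients for automation and also set up temporary inboxes for manual testing scenarios, such as user testing or internal product training before a major release. These challenges become more pronounced when developing customer onboarding products. Our experience with MailSlurp has been very positive. It is a mail server and SMS API service that provides REST APIs for creating inboxes and phone numbers, and it supports validating emails and messages directly in code. Its no-code dashboard is also useful for preparing manual tests. Features such as custom domains, webhooks, auto-reply and forwarding are worth exploring for more complex scenarios.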

  • 61. Metabase

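    Metabase is an open-source analytics and business intelligence tool that lets users visualize and analyze data from a variety of data sources, including relational and NoSQL databases. It helps users create visualizations and reports, organize them into dashboards and easily share insights. It also provides an SDK for embedding interactive dashboards in web applications, matching the application's theme and style, which makes it developer-friendly. With both officially supported and community-supported data connectors, Metabase is highly flexible across different data environments. Our teams have found it useful as a lightweight BI tool for managing interactive dashboards and reports in their applications.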

  • 62. NeMo Guardrails

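    NeMo Guardrails is an easy-to-use open-source toolkit from NVIDIA that helps developers implement guardrails for large language models used in conversational applications. Since we last mentioned it in the Radar, NeMo has seen significantly wider adoption across our teams and continues to improve. Recent updates to NeMo Guardrails have focused on expanding integrations and strengthening security, data management and control, in line with the project's core goal.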

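    NeMo's documentation has been significantly overhauled to improve usability, and several new integrations have been added, including AutoAlign and Patronus Lynx, along with support for Colang 2.0. Key upgrades include enhanced content safety and security features, as well as a recent release that supports streaming LLM content through output rails for better performance. We have also seen added support for Prompt Security. In addition, NVIDIA has released three new microservices, for content safety, topic control and jailbreak detection, all of which are integrated with NeMo Guardrails.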

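    Based on its growing feature set and increasing use in production, we are moving NeMo Guardrails to Trial. We recommend reviewing the latest release notes for a complete overview of the changes since our last mention.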

  • 63. Nyx

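    Nyx is a versatile semantic release tool that supports a wide range of software engineering projects. It is programming-language agnostic and works with all major continuous integration and source control platforms, which makes it highly adaptable. Although many teams use semantic versioning with trunk-based development, Nyx also supports workflows such as Gitflow, OneFlow and GitHub Flow. One major advantage of Nyx in production is its automatic changelog generation, with built-in support for the Conventional Commits specification.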

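    As noted in previous editions of the Radar, we are cautious about development patterns that rely on long-lived branches, such as Gitflow and GitOps, because they introduce challenges that even a tool as capable as Nyx cannot fully resolve. We strongly recommend trying Nyx in your CI/CD workflows, particularly with trunk-based development, where we have seen it applied successfully many times.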
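
    To make the link between Conventional Commits and semantic versioning concrete, here is a minimal Python sketch of how a release tool could derive the next version from a batch of commit messages. It assumes the mapping described in the Conventional Commits specification (fix gives a patch bump, feat a minor bump, a breaking change a major bump). The names `bump_for` and `next_version` are hypothetical, and this is not Nyx's implementation or configuration.

```python
import re

# Conventional Commits header: type, optional "(scope)", optional "!", then ": description".
HEADER = re.compile(r"^(?P<type>[a-zA-Z]+)(\([^)]*\))?(?P<bang>!)?: .+")

# Bump levels ordered from weakest to strongest.
ORDER = ["none", "patch", "minor", "major"]


def bump_for(message: str) -> str:
    """Return 'major', 'minor', 'patch' or 'none' for a single commit message."""
    lines = message.splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return "none"  # not a Conventional Commit, so it does not drive a release
    breaking_footer = any(
        line.startswith(("BREAKING CHANGE:", "BREAKING-CHANGE:")) for line in lines[1:]
    )
    if match["bang"] or breaking_footer:
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    return "none"


def next_version(current: str, messages: list[str]) -> str:
    """Apply the strongest bump found in `messages` to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = max((bump_for(m) for m in messages), key=ORDER.index, default="none")
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current
```

    For example, `next_version("1.2.3", ["fix: a", "feat(api): add b"])` returns `"1.3.0"`, because the strongest change in the batch is a feature.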

  • 64. OpenRewrite

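    OpenRewrite has been a reliable tool for us for large-scale code refactoring, especially rule-based refactoring such as migrating to a new API version of a widely used library or applying updates to many services created from the same template. Beyond its strong support for Java, OpenRewrite has introduced support for other languages such as JavaScript. With frameworks like Angular adopting short LTS release cycles, keeping projects up to date is becoming increasingly important, and OpenRewrite handles this well. Using an AI coding assistant is an alternative, but for rule-based changes AI is usually slower, more expensive and less reliable. We particularly like that OpenRewrite comes bundled with a rich catalog of recipes (rule sets) that explicitly describe the changes to be made. The refactoring engine, the bundled recipes and the build tool plugins are all open-source software, which gives teams more convenience and flexibility when they need to carry out large-scale code updates.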

  • 65. Plerion

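    Plerion is an AWS-focused cloud security platform that integrates with hosting providers to uncover risks, misconfigurations and vulnerabilities in cloud infrastructure, servers and applications. Similar to Wiz, Plerion uses risk-based prioritization to rank the issues it detects, aiming to help users "focus on the 1% that matters". Our teams report very positive experiences with Plerion. It has given clients significant insights and reinforced the importance of proactive security monitoring for their organizations.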

  • 66. Software engineering agents

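    Since we first discussed software engineering agents six months ago, the industry has still not arrived at a shared definition of the term "agent". One important development has emerged, however. It is not fully autonomous coding agents, whose capabilities remain doubtful, but supervised agentic modes inside the IDE. These modes let developers drive the implementation through chat, with tools that not only modify code in multiple files but also execute commands, run tests and respond to IDE feedback such as linting or compilation errors.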

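    This approach, sometimes called chat-oriented programming (CHOP) or prompt-to-code, keeps developers in control while shifting more responsibility to the AI than traditional autocomplete-style assistants do. Leading tools in this space include Cursor, Cline and Windsurf, with GitHub Copilot slightly behind but catching up quickly. The usefulness of these agentic modes depends both on the model used (Claude's Sonnet series is the current state of the art) and on how deeply the tool integrates with the IDE to provide a good developer experience.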

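    We have found these workflows compelling and promising, with a notable increase in coding speed. However, keeping the scope of each problem small helps developers review AI-generated changes more effectively. The approach works best with low-abstraction prompts and AI-friendly codebases, meaning codebases that are well structured and properly tested. As these modes improve, they will also increase the risk of developers becoming complacent about AI-generated code. To mitigate this, we recommend pair programming and other disciplined review practices, especially for production code.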

  • 67. Tuple

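    Tuple is a tool optimized for remote pair programming, originally designed to fill the gap left after Slack discontinued Screenhero. Since we last mentioned it in the Radar, Tuple has seen wider adoption, has addressed many of its earlier problems and limitations, and now supports Windows. A key improvement is enhanced desktop sharing, with a new privacy feature that lets users hide private application windows (such as text messages) while sharing their screen and focus on shared tools (such as a browser window). Previously, UI limitations made Tuple feel more like a dedicated pair programming tool than a general-purpose collaboration tool. With these updates, users can now collaborate on content beyond the IDE.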

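    Bear in mind, however, that the remote pair has access to the entire desktop. If this is not configured properly, it can be a security concern, especially if the pair is not fully trusted. We strongly recommend educating teams on Tuple's privacy settings, best practices and etiquette before they start using it.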

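    We encourage teams to try the latest version of Tuple in their development workflows. It is consistent with our advice on pragmatic remote pairing and offers low-latency pairing, an intuitive UX and significant usability improvements.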

  • 68. Turborepo

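    Turborepo helps manage large JavaScript or TypeScript monorepos by analyzing, caching, parallelizing and optimizing build tasks to speed up the build process. In large monorepos, projects often depend on each other, and rebuilding every dependency on each change is inefficient and time-consuming, a process that Turborepo streamlines. Unlike Nx, Turborepo's default configuration uses multiple package.json files, one per project. This allows different versions of a dependency (for example, different versions of React) to coexist in a single monorepo, a practice Nx discourages. Although this could be considered an anti-pattern, it does solve certain use cases, such as migrating from multiple repositories to a monorepo, when teams may temporarily need several versions of a dependency. In our experience, Turborepo is relatively simple to set up and performs well.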
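
    The general idea behind this kind of build caching can be sketched in a few lines of Python: visit packages in dependency order, derive a cache key from each package's own inputs plus the keys of its dependencies, and rebuild only when the key changes. This is an illustrative sketch under simplifying assumptions, not Turborepo's actual algorithm. The names `task_key` and `build` are hypothetical, a string stands in for a package's source files, and parallel execution is left out.

```python
import hashlib
from graphlib import TopologicalSorter  # standard library since Python 3.9


def task_key(name, sources, deps, keys):
    """Hash a package's own inputs together with the keys of its dependencies."""
    digest = hashlib.sha256()
    digest.update(sources[name].encode())
    for dep in sorted(deps[name]):
        digest.update(keys[dep].encode())
    return digest.hexdigest()


def build(sources, deps, cache):
    """Visit packages in dependency order and return the names that were rebuilt.

    `deps` maps each package to the set of packages it depends on.
    `cache` maps a package name to the key of its last successful build.
    """
    rebuilt, keys = [], {}
    for name in TopologicalSorter(deps).static_order():
        keys[name] = task_key(name, sources, deps, keys)
        if cache.get(name) != keys[name]:
            rebuilt.append(name)  # a real tool would run the package's build script here
            cache[name] = keys[name]
    return rebuilt
```

    With `deps = {"utils": set(), "ui": {"utils"}, "web": {"ui"}, "docs": set()}`, a change to the `ui` sources rebuilds `ui` and `web` only, while `utils` and `docs` are served from the cache.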

Assess

  • 69. AnythingLLM

    AnythingLLM is an open-source desktop application for chatting with large documents or pieces of content, backed by out-of-the-box integration with LLMs and vector databases. It has a pluggable architecture for embedder models and can be used with most commercial LLMs as well as open-weight models managed by Ollama. In addition to RAG, different skills can be created and organized as agents to perform custom tasks and workflows. Users can organize documents, and their interactions with them, into workspaces that act as long-lived threads with different contexts. Recently, it also became possible to deploy it as a multi-user web application with a simple Docker image. Some of our teams are using it as a local personal assistant and finding it a powerful and useful utility.

  • 70. Gemma Scope

    Mechanistic interpretability, the study of the inner workings of large language models, is becoming an increasingly important field. Tools like Gemma Scope and the open-source library Mishax provide insights into the Gemma 2 family of open models. Interpretability tools play a crucial role in debugging unexpected behavior, identifying components responsible for hallucinations, biases or other failure cases, and ultimately building trust by offering deeper visibility into models. While this field may be of particular interest to researchers, it's worth noting that with the recent release of DeepSeek-R1, model training is becoming more feasible for companies beyond the established players. As GenAI continues to evolve, both interpretability and safety will only grow in importance.

  • 71. Hurl

    Hurl is a Swiss Army knife for making sequences of HTTP requests, defined in plain text files using Hurl-specific syntax. Beyond sending requests, Hurl can validate responses, for example by checking that a request returns a specific HTTP status code; assert conditions on response headers or content using XPath, JSONPath or regular expressions; and extract response data into variables, which can then be used to chain requests.

    With this feature set, Hurl is useful for simple API automation and also serves as an automated API testing tool. Its ability to generate detailed test reports in HTML or JSON makes it more useful in testing workflows. While dedicated tools like Bruno and Postman offer GUIs and additional features, we like Hurl for its simplicity. As with Bruno, which also uses plain text files, Hurl tests can be stored in the code repository.

  • 72. Jujutsu

    Git is the dominant distributed version control system (VCS), holding the vast majority of market share. Yet, despite over a decade of dominance, developers still struggle with its complex workflows for branching, merging, rebasing and conflict resolution. This ongoing frustration has fueled a wave of tools designed to ease the pain — some offering visualizations to clarify complexity, others providing their own graphical interfaces to abstract it away entirely.

    Jujutsu takes this a step further, offering a full-fledged alternative to Git while maintaining compatibility by using Git repositories as a storage backend. This allows developers to use existing Git servers and services while benefiting from Jujutsu's streamlined workflows. Positioned as "both simple and powerful," Jujutsu emphasizes ease of use for developers of all experience levels. One standout feature is its first-class conflict resolution, which has the potential to significantly improve the developer experience.

  • 73. kubenetmon

    Monitoring and understanding the network traffic associated with Kubernetes can prove a challenge, particularly when your infrastructure spans multiple zones, regions or clouds. kubenetmon, built by ClickHouse and recently open sourced, hopes to solve this problem by offering detailed Kubernetes data transfer metering across the major cloud providers. If you're running Kubernetes and have been frustrated by opaque data transfer costs on your bill, it may be worth exploring kubenetmon.

  • 74. Mergiraf

    Resolving merge conflicts is probably one of the least liked activities in software development. And while there are techniques that reduce the complexity of merges — for example, practicing continuous integration in the original sense of merging to a shared mainline at least daily — we're seeing too much effort spent on merges. Long-lived feature branches are one culprit, but AI-assisted coding also has a tendency to increase the size of change sets. Help may come in the form of Mergiraf, a new tool that resolves merge conflicts by looking at the syntax tree rather than treating code as lines of text. As a git merge driver, it can be set up so that git subcommands like merge and cherry-pick automatically use Mergiraf instead of the default heuristics.
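
    To illustrate why merging on structure can succeed where a line-based merge reports a conflict, here is a toy three-way merge in Python. It works on nested dictionaries as a stand-in for a syntax tree, so it is only a sketch of the general idea and not Mergiraf's algorithm. The names `merge3` and `MISSING` are hypothetical.

```python
MISSING = object()  # marks a key that is absent on one side


def merge3(base, ours, theirs):
    """Three-way merge of nested dicts; returns (merged, conflict_paths)."""
    conflicts = []

    def merge(b, o, t, path):
        if o == t:  # both sides agree, including both deleting the key
            return o
        if o == b:  # only theirs changed
            return t
        if t == b:  # only ours changed
            return o
        if isinstance(o, dict) and isinstance(t, dict):
            b = b if isinstance(b, dict) else {}
            out = {}
            # Recurse key by key, so edits to different children never collide.
            for key in list(o) + [k for k in t if k not in o]:
                value = merge(
                    b.get(key, MISSING),
                    o.get(key, MISSING),
                    t.get(key, MISSING),
                    path + [key],
                )
                if value is not MISSING:
                    out[key] = value
            return out
        conflicts.append("/".join(path))  # both sides changed the same leaf
        return o  # keep ours, but report the conflict

    return merge(base, ours, theirs, []), conflicts
```

    If the base is `{"deps": {"a": "1.0"}}`, one branch adds `"b"` and the other adds `"c"` under `deps`, the result contains all three keys with no conflicts. A line-based merge of the serialized file would typically flag the two adjacent insertions as a conflict.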

  • 75. ModernBERT

    The successor to BERT (Bidirectional Encoder Representations from Transformers), ModernBERT is a next-generation family of encoder-only transformer models designed for a wide range of natural language processing (NLP) tasks. As a drop-in replacement, ModernBERT improves both performance and accuracy while addressing some of BERT's limitations — notably including support for dramatically longer context lengths thanks to Alternating Attention. Teams with NLP needs should consider ModernBERT before defaulting to a general-purpose generative model.

  • 76. OpenRouter

    OpenRouter is a unified API for accessing multiple large language models. It provides a single integration point for mainstream LLM providers, simplifies experimentation, reduces vendor lock-in and optimizes costs by routing requests to the most appropriate model. Popular tools like Cline and Open WebUI use OpenRouter as their endpoint. During our Radar discussion, we questioned whether most projects truly need to switch between models, given that OpenRouter has to charge a markup on top of this encapsulation layer to make a profit. However, we also recognize that OpenRouter provides various load-balancing strategies to help optimize costs. One particularly useful feature is its ability to work around API rate limits: if your application exceeds the rate limit of a single LLM provider, OpenRouter can help you get past that limit and achieve better throughput.
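
    The rate-limit benefit comes down to having more than one place to send a request. The Python sketch below shows the simplest form of that idea, trying providers in order and falling through when one reports that it is rate limited. It is a hypothetical illustration, not OpenRouter's API or its routing logic; `RateLimited`, `route`, `busy` and `echo` are made-up names, and the provider callables stand in for real HTTP clients.

```python
class RateLimited(Exception):
    """Raised by a provider callable when it refuses a request (for example HTTP 429)."""


def route(prompt, providers):
    """Try (name, callable) providers in order; fall through on RateLimited."""
    skipped = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimited:
            skipped.append(name)  # remember who refused, then try the next provider
    raise RateLimited(f"all providers rate limited: {', '.join(skipped)}")


# Stand-in providers for the example; real ones would call an LLM API.
def busy(prompt):
    raise RateLimited("429 Too Many Requests")


def echo(prompt):
    return f"echo: {prompt}"
```

    Calling `route("hi", [("primary", busy), ("backup", echo)])` returns `("backup", "echo: hi")`. A production router would also weigh price, latency and model capability, which this sketch ignores.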

  • 77. Redactive

    Redactive is an enterprise AI enablement platform designed to help regulated organizations securely prepare unstructured data for AI applications, such as AI-powered assistants and copilots. It integrates with content platforms like Confluence, creating secure text indices for retrieval-augmented generation (RAG) searches. By serving only live data and enforcing real-time user permissions from source systems, Redactive ensures AI models access accurate, authorized information without compromising security. Additionally, it provides engineering teams with tools to build AI use cases safely using any LLM. For organizations exploring AI-driven solutions, Redactive offers a streamlined approach to data preparation and compliance, balancing security and accessibility for teams experimenting with AI capabilities in a controlled environment.

  • 78. System Initiative

    We continue to be excited by System Initiative. This experimental tool represents a radical new direction for DevOps work. We really like the creative thinking that has gone into this tool and hope it will encourage others to break with the status quo of infrastructure-as-code approaches. System Initiative is now out of beta and available free and open source under an Apache 2.0 license. While the tool’s developers use it to manage production infrastructure, it still has a way to go before it can scale to meet the demands of large enterprises. However, we continue to think it's worth checking out to experience a completely different approach to DevOps tooling.

  • 79. TabPFN

    TabPFN is a transformer-based model designed for fast and accurate classification on small tabular data sets. It leverages in-context learning (ICL) to make predictions directly from labeled examples without hyperparameter tuning or additional training. Pretrained on millions of synthetic data sets, TabPFN generalizes well across diverse data distributions and handles missing values and outliers effectively. Its strengths include efficient processing of heterogeneous data and robustness to uninformative features.

    TabPFN is particularly suitable for small-scale applications where speed and accuracy are crucial. However, it faces scalability challenges with larger data sets and has limitations in handling regression tasks. As a cutting-edge solution, TabPFN is worth evaluating for its potential to outperform traditional models in tabular classification, especially where transformers are less commonly applied.

  • 80. v0

    v0 by Vercel is an AI tool for generating front-end code from a screenshot, Figma design or simple prompt. It supports React, Vue, shadcn and Tailwind among other front-end frameworks. Beyond AI-generated code, v0 offers a great user experience, including the ability to preview the generated code and deploy it to Vercel in one step. While building real-world applications involves integrating multiple functionalities beyond a single screen, v0 provides a solid way to prototype and can be used to initialize a starting point for developing complex applications.

  • 81. Windsurf

    Windsurf is an AI coding assistant by Codeium that stands out for its agentic capabilities. Similar to Cursor and Cline, it lets developers drive their implementation from an AI chat that navigates and changes code and executes commands. It frequently releases interesting new features and integrations for the agentic mode. Recently, for instance, it released a browser preview that makes it easy for the agent to access DOM elements and the browser console, and a web research capability that lets Windsurf look for documentation and solutions on the internet when appropriate. Windsurf provides access to a range of popular models, and users can activate and reference web search, library documentation and MCP integration as additional context providers.

  • 82. YOLO

    The YOLO (You Only Look Once) series, developed by Ultralytics, continues to advance computer vision models. The latest release, YOLO11, delivers significant improvements in both precision and efficiency over previous versions. YOLO11 can perform image classification at high speed with minimum resources, making it suitable for real-time applications in edge devices. We also found that the ability to use the same framework to do pose estimation, object detection, image segmentation and other tasks is very powerful. This significant development also reminds us that using ‘traditional’ machine-learning models for specific tasks can be more powerful than general AI models, such as LLMs.

Hold

No blips

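Unable to find something you expected to see? Each edition of the Radar features blips that reflect what we came across during the previous six months, so what you are looking for may already have appeared in an earlier edition. Because we have so much we want to talk about, we sometimes drop items that have not changed in a long time. The Radar also reflects our subjective experience rather than a comprehensive market analysis, so you may not find the technology you care about most.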
