Cypress is still a favorite among our teams where developers manage end-to-end tests themselves, as part of a healthy test pyramid, of course. We decided to call it out again in this Radar because recent versions of Cypress have added support for Firefox, and we strongly suggest testing on multiple browsers. The dominance of Chrome and Chromium-based browsers has led to a worrying trend of teams seemingly testing only with Chrome, which can lead to nasty surprises.
Figma has proven to be the go-to tool for collaborative design, not only for designers but for multidisciplinary teams too; it allows developers and other roles to view and comment on designs through the browser without the desktop version. Compared to its competitors (e.g., InVision or Sketch), which require more than one tool for versioning, collaborating and design sharing, Figma combines all of these features in one tool, which makes it easier for our teams to discover new ideas together. Our teams find Figma very useful, especially for enabling and facilitating remote and distributed design work. In addition to its real-time design and collaboration capabilities, Figma also offers an API that helps to improve the DesignOps process.
A few years ago, Docker — and containers in general — radically changed how we think about packaging, deploying and running our applications. But despite this improvement in production, developers still spend a lot of time setting up development environments and regularly run into "but it works on my machine" style problems. Dojo aims to fix this by creating standard development environments, versioned and released as Docker images. Several of our teams use Dojo to streamline developing, testing and building code from local development through production pipelines.
In 2018 we mentioned DVC in conjunction with versioning data for reproducible analytics. Since then it has become a favorite tool for managing experiments in machine learning (ML) projects. Because it's based on Git, DVC gives software developers a familiar environment for bringing their engineering practices to ML. And because it versions the code that processes data along with the data itself and tracks stages in a pipeline, it helps bring order to the modeling activities without interrupting the analysts’ flow.
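To picture how data can be versioned next to code in Git, the following minimal Python sketch computes a content hash for a data file and builds a small pointer record that could be committed in place of the data. It illustrates the general idea only: the `data_pointer` function, the file name and the pointer layout are hypothetical and are not DVC's actual file format or API.

```python
import hashlib
import json
from typing import Dict, Union


def data_pointer(path_name: str, content: bytes) -> Dict[str, Union[str, int]]:
    """Build a small record that identifies one exact version of a data file.

    The record is tiny, so it can live in Git while the data itself is
    stored elsewhere. (Hypothetical layout, for illustration only.)
    """
    return {
        "path": path_name,
        "sha256": hashlib.sha256(content).hexdigest(),
        "size": len(content),
    }


# Two versions of the same file produce two different pointers.
version_1 = data_pointer("data/train.csv", b"a,b\n1,2\n")
version_2 = data_pointer("data/train.csv", b"a,b\n1,3\n")

# Serialized form that would be committed alongside the processing code.
committed_text = json.dumps(version_1, sort_keys=True, indent=2)
```

Because the pointer changes whenever the data changes, a Git commit that contains both the pointer and the processing code pins down one reproducible combination of the two.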
The day-to-day work of machine learning often boils down to a series of experiments: selecting a modeling approach, network topology and training data, and then optimizing or tweaking the model. Data scientists must use experience and intuition to hypothesize changes and then measure the impact those changes have on the overall performance of the model. As this practice has matured, our teams have found an increasing need for experiment tracking tools for machine learning. These tools help investigators keep track of the experiments and work through them methodically. Although no clear winner has emerged, tools such as MLflow and platforms such as Comet or Neptune have introduced rigor and repeatability into the entire machine learning workflow.
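At its core, experiment tracking means recording each hypothesis together with its settings and measured results so that runs can be compared later. The sketch below shows that idea with a hypothetical in-memory log; the `Run` and `ExperimentLog` names and the sample numbers are made up for illustration and do not reflect the API of MLflow, Comet or Neptune.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    """One experiment: the hypothesis tested, its settings and its results."""
    hypothesis: str
    params: Dict[str, float]
    metrics: Dict[str, float] = field(default_factory=dict)


class ExperimentLog:
    """Hypothetical in-memory log; real tools persist runs and add a UI."""

    def __init__(self) -> None:
        self.runs: List[Run] = []

    def record(self, run: Run) -> None:
        self.runs.append(run)

    def best(self, metric: str) -> Run:
        # Highest value wins; runs that never reported the metric are ignored.
        candidates = [run for run in self.runs if metric in run.metrics]
        return max(candidates, key=lambda run: run.metrics[metric])


log = ExperimentLog()
log.record(Run("baseline", {"learning_rate": 0.1}, {"accuracy": 0.81}))
log.record(Run("lower learning rate", {"learning_rate": 0.01}, {"accuracy": 0.86}))
best_run = log.best("accuracy")  # the "lower learning rate" run
```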
We mentioned Goss, a tool for provisioning testing, in passing in previous Radars, for example, when describing the technique of TDD'ing containers. Although Goss isn't always an alternative to Serverspec, simply because it doesn't offer the same number of features, you may want to consider it when its features meet your needs, especially since it comes as a small, self-contained binary (rather than requiring a Ruby environment). A common anti-pattern with using tools such as Goss is double-entry bookkeeping, where each change in the actual infrastructure as code files requires a corresponding change in the test assertions. Such tests are maintenance heavy and, because of the close correspondence between code and test, failures mostly occur when an engineer updates one side and forgets the other. And these tests rarely catch genuine problems.
Jaeger is an open source distributed tracing system. Similar to Zipkin, it was inspired by the Google Dapper paper and complies with OpenTelemetry. We've used Jaeger successfully with Istio and Envoy on Kubernetes and like its UI. Jaeger exposes tracing metrics in the Prometheus format so they can be made available to other tools. However, a new generation of tools such as Honeycomb integrates traces and metrics into a single observability stream for simpler aggregate analysis. Jaeger joined CNCF in 2017 and has recently been elevated to CNCF's highest level of maturity, indicating its widespread deployment into production systems.
We continue to be ardent supporters of infrastructure as code, and we continue to believe that a robust monitoring solution is a prerequisite for operating distributed applications. Sometimes an interactive tool such as the AWS web console can be a useful addition. It allows us to explore all kinds of resources in an ad-hoc fashion without having to remember every single obscure command. Using an interactive tool to make manual modifications on the fly is still a questionable practice, though. For Kubernetes we now have k9s, which provides an interactive interface for basically everything that kubectl can do. And to boot, it's not a web application but runs inside a terminal window, evoking fond memories of Midnight Commander for some of us.
kind is a tool for running local Kubernetes clusters using Docker container nodes. With kubetest integration, kind makes it easy to do end-to-end testing on Kubernetes. We've used kind to create ephemeral Kubernetes clusters to test Kubernetes resources such as Operators and Custom Resource Definitions (CRDs) in our CI pipelines.
mkcert is a convenient tool for creating locally trusted development certificates. Using certificates from real certificate authorities (CAs) for local development can be challenging if not impossible (for hosts such as example.test, localhost or 127.0.0.1). In such situations self-signed certificates may be your only option. mkcert improves on this by creating a local CA, installing it in the system root store and generating certificates signed by that CA. For anything other than local development and testing, we strongly recommend using certificates from real CAs to avoid trust issues.
MURAL describes itself as a "digital workspace for visual collaboration" and allows teams to interact with a shared workspace based on a whiteboard/sticky notes metaphor. Its features include voting, commenting, notes and "follow the presenter." We particularly like the template feature that allows a facilitator to design and then reuse guided sessions with a team. Each of the major collaboration suites has a tool in this space (for example, Google Jamboard and Microsoft Whiteboard) and these are worth investigating, but we've found MURAL to be slick, effective and flexible.
Open Policy Agent (OPA) has rapidly become a favored component of many distributed cloud-native solutions that we build for our clients. OPA provides a uniform framework and language for declaring, enforcing and controlling policies for various components of a cloud-native solution. It's a great example of a tool that implements security policy as code. We've had a smooth experience using OPA in multiple scenarios, including deploying resources to K8s clusters, enforcing access control across services in a service mesh and implementing fine-grained security controls as code for accessing application resources. A recent commercial offering, Styra's Declarative Authorization Service (DAS), eases the adoption of OPA for enterprises by adding a management tool, or control plane, to OPA for K8s with a prebuilt policy library, impact analysis of the policies and logging capabilities. We look forward to the maturity and extension of OPA beyond operational services to (big) data-centric solutions.
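OPA policies are written in its own language, Rego. The Python sketch below is therefore not OPA code; it only illustrates the underlying idea of policy as code, where rules are kept as declarative data and a single decision function answers allow or deny for each request. The rule set, field names and roles are hypothetical.

```python
from typing import Any, Dict, List

# Policy expressed as data (hypothetical rules): which roles may use which
# HTTP methods under which path prefix.
POLICY: List[Dict[str, Any]] = [
    {"path_prefix": "/reports", "methods": {"GET"}, "roles": {"analyst", "admin"}},
    {"path_prefix": "/admin", "methods": {"GET", "POST"}, "roles": {"admin"}},
]


def allow(request: Dict[str, str]) -> bool:
    """Default deny: a request passes only if at least one rule matches it."""
    for rule in POLICY:
        if (
            request["path"].startswith(rule["path_prefix"])
            and request["method"] in rule["methods"]
            and request["role"] in rule["roles"]
        ):
            return True
    return False


analyst_reads_report = allow({"path": "/reports/q1", "method": "GET", "role": "analyst"})
analyst_edits_admin = allow({"path": "/admin/users", "method": "POST", "role": "analyst"})
```

Keeping the rules separate from the services that ask for decisions is what allows the same policy to be reviewed, versioned and tested like any other code.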
UX research demands data collection and analysis to make better decisions about the products we need to build. Our teams find Optimal Workshop useful because it makes it easy to validate prototypes and configure tests for data collection. Features such as first-click testing, card sorting and heatmaps of user interaction help to both validate prototypes and improve website navigation and information display. It's an ideal tool for distributed teams since it allows them to conduct remote research.
As mentioned in our description of Crowdin, you now have a choice of platforms to manage the translation of a product into multiple languages instead of emailing large spreadsheets. Our teams report positive experiences with Phrase, emphasizing that it's easy to use for all key user groups. Translators use a convenient browser-based UI. Managers can add new fields and synchronize translations with other teams in the same UI. Developers can access Phrase locally and from a build pipeline. A feature that deserves a specific mention is the ability to apply versioning to translations through tags, which makes it possible to compare the look of different translations inside the actual product.
ScoutSuite is an expanded and updated tool based on Scout2 (featured in the Radar in 2018) that provides security posture assessment across AWS, Azure, GCP and other cloud providers. It works by automatically aggregating configuration data for an environment and applying rules to audit the environment. We've found this very useful across projects for doing point-in-time security assessments.
Since we first mentioned visual regression testing tools in 2014, the use of the technique has spread and the tools landscape has evolved. BackstopJS remains an excellent choice with new features being added regularly, including support for running inside Docker containers. Loki was featured in our previous Radar. Applitools, CrossBrowserTesting and Percy are SaaS solutions. Another notable mention is Resemble.js, an image diffing library. Although most teams use it indirectly as part of BackstopJS, some of our teams have been using it to analyze and compare images of web pages directly. In general, our experience shows that visual regression tools are less useful in the early stages when the interface goes through significant changes, but they certainly prove their worth as the product matures and the interface stabilizes.
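The core of visual regression testing is an image comparison between a baseline screenshot and a new one. As a minimal sketch of that comparison, the Python below counts the fraction of pixels that differ between two equally sized images, represented here as nested lists of RGB tuples. The `mismatch_ratio` function and its `tolerance` parameter are hypothetical and do not mirror the API of Resemble.js or BackstopJS.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def mismatch_ratio(baseline: Image, candidate: Image, tolerance: int = 0) -> float:
    """Return the fraction of pixels with a channel differing by more than `tolerance`."""
    same_shape = len(baseline) == len(candidate) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(baseline, candidate)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0

    changed = 0
    for row_a, row_b in zip(baseline, candidate):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if any(abs(x - y) > tolerance for x, y in zip(pixel_a, pixel_b)):
                changed += 1
    return changed / total


WHITE: Pixel = (255, 255, 255)
BLACK: Pixel = (0, 0, 0)
baseline_image: Image = [[WHITE, WHITE], [WHITE, WHITE]]
candidate_image: Image = [[WHITE, WHITE], [WHITE, BLACK]]
ratio = mismatch_ratio(baseline_image, candidate_image)  # one of four pixels differs: 0.25
```

A build would typically fail when the ratio exceeds an agreed threshold, which is also why such checks are noisy while the interface is still changing a lot.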
Visual Studio Live Share is a suite of extensions for Visual Studio Code and Visual Studio. At a time when teams are searching for good remote collaboration options, we want to call attention to the excellent tooling here. Live Share provides a good, low-latency remote-pairing experience, and requires significantly less bandwidth than the brute-force approach of sharing your entire desktop. Importantly, developers can work with their preferred configuration, extensions and key mappings during a pairing session. In addition to real-time collaboration for editing and debugging code, Live Share allows voice calls and sharing terminals and servers.
Apache Superset is a great business intelligence (BI) tool for data exploration and visualization on top of large data lake and data warehouse setups. It works, for example, with Presto, Amazon Athena and Amazon Redshift and can be nicely integrated with enterprise authentication. Moreover, you don't have to be a data engineer to use it; it’s meant to benefit all engineers exploring data in their everyday work. It's worth pointing out that Apache Superset is currently undergoing incubation at the Apache Software Foundation (ASF), meaning it's not yet fully endorsed by ASF.
Open standards are one of the foundational pillars of building distributed systems. For example, the OpenAPI (formerly Swagger) specification, as an industry standard to define RESTful APIs, has been instrumental to the success of distributed architectures such as microservices. It has enabled a proliferation of tooling to support building, testing and monitoring RESTful APIs. However, such standardizations have been largely missing in distributed systems for event-driven APIs.
AsyncAPI is an open source initiative to create a much needed event-driven and asynchronous API standardization and development tooling. The AsyncAPI specification, inspired by the OpenAPI specification, describes and documents event-driven APIs in a machine-readable format. It's protocol agnostic, so it can be used for APIs that work over many protocols, including MQTT, WebSockets, and Kafka. We're eager to see the ongoing improvements of AsyncAPI and further maturity of its tooling ecosystem.
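To give a sense of what a machine-readable description of an event-driven API makes possible, the Python sketch below holds a small document as a dictionary and checks a message payload against it. The layout is loosely modeled on an AsyncAPI 2.x document (channels, operations, message payload schemas) but has not been validated against the specification; the channel name, fields and the `payload_errors` helper are hypothetical, and the check covers only required fields and a few basic types.

```python
from typing import Any, Dict, List

# Hypothetical document, loosely following the AsyncAPI 2.x layout.
SPEC: Dict[str, Any] = {
    "asyncapi": "2.0.0",
    "info": {"title": "Account events (hypothetical)", "version": "1.0.0"},
    "channels": {
        "user/signedup": {
            "subscribe": {
                "message": {
                    "payload": {
                        "type": "object",
                        "required": ["userId", "email"],
                        "properties": {
                            "userId": {"type": "string"},
                            "email": {"type": "string"},
                        },
                    }
                }
            }
        }
    },
}

# Only the handful of schema types this sketch understands.
_PYTHON_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def payload_errors(
    spec: Dict[str, Any], channel: str, operation: str, payload: Dict[str, Any]
) -> List[str]:
    """List the ways a payload deviates from the schema declared for a channel."""
    schema = spec["channels"][channel][operation]["message"]["payload"]
    errors = [
        f"missing field: {name}"
        for name in schema.get("required", [])
        if name not in payload
    ]
    for name, rule in schema.get("properties", {}).items():
        if name in payload and not isinstance(payload[name], _PYTHON_TYPES[rule["type"]]):
            errors.append(f"wrong type for {name}: expected {rule['type']}")
    return errors


valid = payload_errors(
    SPEC, "user/signedup", "subscribe", {"userId": "42", "email": "a@example.test"}
)
invalid = payload_errors(SPEC, "user/signedup", "subscribe", {"userId": 42})
```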
If you're looking for a service to support dynamic feature toggles (and bear in mind that simple feature toggles work well too), check out ConfigCat. We'd describe it as "like LaunchDarkly but cheaper and a bit less fancy" and find that it does most of what we need. ConfigCat supports simple feature toggles, user segmentation, and A/B testing and has a generous free tier for low-volume use cases or those just starting out.
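The combination of user segmentation and gradual rollout can be pictured with a small sketch: a flag is on for everyone in a targeted segment, and otherwise for a stable percentage of users chosen by hashing the user ID. This is a generic illustration in Python, not the ConfigCat or LaunchDarkly SDK; the function names, the `country` attribute and the flag name are hypothetical.

```python
import hashlib
from typing import Dict, Set


def rollout_bucket(flag: str, user_id: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for a given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(
    flag: str, user: Dict[str, str], segment_countries: Set[str], percentage: int
) -> bool:
    """Enable the flag for a targeted segment, else for `percentage` percent of users."""
    if user.get("country") in segment_countries:
        return True
    return rollout_bucket(flag, user["id"]) < percentage


user = {"id": "user-123", "country": "DE"}
in_segment = is_enabled("new-checkout", user, {"DE"}, 0)        # on: user is in the segment
fully_rolled_out = is_enabled("new-checkout", user, set(), 100)  # on: every bucket is below 100
switched_off = is_enabled("new-checkout", user, set(), 0)        # off: no bucket is below 0
```

Hashing the flag name together with the user ID keeps each user's experience stable across requests while giving different flags independent rollout groups.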
You can build most software following a simple two-step process: check out a repository, and then run a single build script. The process of setting up a full coding environment can still be cumbersome, though. Gitpod addresses this by providing cloud-based, "ready-to-code" environments for GitHub or GitLab repositories. It offers an IDE based on Visual Studio Code that runs inside the web browser. By default, these environments are launched on the Google Cloud Platform, although you can also deploy on-premises solutions. We see the immediate appeal, especially for open source software where this approach can lower the bar for casual contributors. However, it remains to be seen how viable this approach will be in corporate environments.
With the increasing adoption of Kubernetes and service mesh, API gateways have been experiencing an existential crisis in cloud-native distributed systems. After all, many of their capabilities (such as traffic control, security, routing and observability) are now provided by the cluster’s ingress controller and mesh gateway. Gloo is a lightweight API gateway that embraces this change; it uses Envoy as its gateway technology, while providing added value such as a cohesive view of the APIs to the external users and applications. It also provides an administrative interface for controlling Envoy gateways and runs and integrates with multiple service mesh implementations such as Linkerd, Istio and AWS App Mesh. While its open source implementation provides the basic capabilities expected from an API gateway, its enterprise edition has a more mature set of security controls such as API key management or integration with OPA. Gloo is a promising lightweight API gateway that plays well with the ecosystem of cloud-native technology and architecture, while avoiding the API gateway trap of enabling business logic to glue APIs for the end user.
One of the strengths of Kubernetes is its flexibility and range of configuration possibilities along with the API-driven, programmable configuration mechanisms and command-line visibility and control using manifest files. However, that strength can also be a weakness: when deployments are complex or when managing multiple clusters, it can be difficult to get a clear picture of the overall status through command-line arguments and manifests alone. Lens attempts to solve this problem with an integrated environment for viewing the current state of the cluster and its workloads, visualizing cluster metrics and changing configurations through an embedded text editor. Rather than a simple point-and-click interface, Lens brings together the tools an administrator would run from the command line into a single, navigable interface. This tool is one of several approaches that are trying to tame the complexity of Kubernetes management. We've yet to see a clear winner in this space, but Lens strikes an interesting balance between a graphical UI and command-line–only tools.
Manifold is a model-agnostic visual debugger for machine learning (ML). Model developers spend a significant amount of time iterating on and improving an existing model rather than creating a new one. By shifting the focus from model space to data space, Manifold supplements the existing performance metrics with visual characteristics of the data set that influence the model performance. We think Manifold will be a useful tool to assess in the ML ecosystem.
Building web applications that look just as intended on a large number of devices and screen sizes can be cumbersome. Sizzy is a SaaS solution that shows many viewports in a single browser window. The application is rendered in all viewports simultaneously and interactions with the application are also synched across the viewports. In our experience interacting with an application in this way can make it easier to spot potential issues earlier, before a visual regression testing tool flags the issue in the build pipeline. We should mention, though, that some of our developers who tried Sizzy for a while did, on balance, prefer to work with the tooling provided by Chrome.
Security is everyone's concern and capturing risks early is always better than facing problems later on. In the infrastructure as code space, where Terraform is an obvious choice to manage cloud environments, we now also have tfsec, which is a static analysis tool that helps to scan Terraform templates and find any potential security issues. It comes with preset rules for different cloud providers including AWS and Azure. We always like tools that help to mitigate security risks, and tfsec not only excels in identifying security risks, it's also easy to install and use.
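Static analysis of infrastructure code generally means applying a set of rules to resource definitions without deploying anything. The Python sketch below shows that pattern on resources that are assumed to have already been parsed into dictionaries; parsing Terraform files is out of scope here. The rule IDs, rule functions and sample resources are hypothetical and are not taken from tfsec's rule set or implementation.

```python
from typing import Any, Callable, Dict, List, Tuple

Resource = Dict[str, Any]
Rule = Tuple[str, Callable[[Resource], bool]]


def open_ingress(resource: Resource) -> bool:
    """Flag security group rules that accept inbound traffic from any address."""
    attrs = resource["attrs"]
    return (
        resource["type"] == "aws_security_group_rule"
        and attrs.get("type") == "ingress"
        and "0.0.0.0/0" in attrs.get("cidr_blocks", [])
    )


def public_bucket(resource: Resource) -> bool:
    """Flag storage buckets whose ACL makes them readable by anyone."""
    return resource["type"] == "aws_s3_bucket" and resource["attrs"].get("acl") == "public-read"


# Hypothetical rule IDs, chosen for this example only.
RULES: List[Rule] = [
    ("EXAMPLE001 ingress open to the internet", open_ingress),
    ("EXAMPLE002 publicly readable bucket", public_bucket),
]


def scan(resources: List[Resource]) -> List[str]:
    """Apply every rule to every resource and collect readable findings."""
    findings = []
    for resource in resources:
        for label, check in RULES:
            if check(resource):
                findings.append(f"{resource['type']}.{resource['name']}: {label}")
    return findings


resources: List[Resource] = [
    {"type": "aws_security_group_rule", "name": "allow_all",
     "attrs": {"type": "ingress", "cidr_blocks": ["0.0.0.0/0"]}},
    {"type": "aws_s3_bucket", "name": "logs", "attrs": {"acl": "private"}},
    {"type": "aws_s3_bucket", "name": "assets", "attrs": {"acl": "public-read"}},
]
findings = scan(resources)
```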