We keep receiving positive feedback on "post-Selenium" web UI testing tools such as Cypress, TestCafe and Puppeteer. Running end-to-end tests can present challenges, such as long run times, the flakiness of some tests and the difficulty of fixing failures in CI when tests run in headless mode. Our teams have had very good experiences with Cypress, which addresses common issues such as poor performance and long waits for responses and resources to load. Cypress has become the tool of choice for end-to-end testing within our teams.
Over the past couple of years, we've noticed a steady rise in the popularity of analytics notebooks. These are Mathematica-inspired applications that combine text, visualization and code in a living, computational document. Jupyter Notebooks are widely used by our teams for prototyping and exploration in analytics and machine learning. We've moved Jupyter to Adopt for this issue of the Radar to reflect that it has emerged as the current default for Python notebooks. However, we caution against using Jupyter Notebooks in production.
One of the challenges of using cloud services is being able to develop and test locally. LocalStack solves this problem for AWS by providing local test double implementations of a wide range of AWS services, including S3, Kinesis, DynamoDB and Lambda. It builds on top of best-of-breed tools such as Kinesalite, dynalite and Moto and adds isolated processes and error injection functionality. LocalStack is very easy to use, ships with a simple JUnit runner and a JUnit 5 extension and can also run inside a Docker container. For many teams, it has become the default for testing services that are deployed on AWS.
Terraform is rapidly becoming a de facto choice for creating and managing cloud infrastructures by writing declarative definitions. The configuration of the servers instantiated by Terraform is usually left to Puppet, Chef or Ansible. We like Terraform because the syntax of its files is quite readable and because it supports a number of cloud providers while making no attempt to provide an artificial abstraction across those providers. The active community will add support for the latest features from most cloud providers. Following our first, more cautious, mention of Terraform almost two years ago, it has seen continued development and has evolved into a stable product with a good ecosystem that has proven its value in our projects. The issue with state file management can now be sidestepped by using what Terraform calls a "remote state backend." We've successfully used AWS S3 for that purpose.
As more and more teams embrace DesignOps, practices and tooling in this space mature. UI dev environments provide a comprehensive environment for quickly iterating on UI components, focusing on collaboration between user experience designers and developers. We now have a few options in this space: Storybook, React Styleguidist, Compositor and MDX. You can use these tools standalone in component library or design system development as well as embedded in a web application project. Many teams were able to decrease their UI feedback cycles and improve timing of UI work in preparation for development work, which has made using UI dev environments a reasonable default for us.
As developers used to pushing many small commits daily, we rely on monitors to notify us when builds go green. AnyStatus is a lightweight Windows desktop app that rolls up metrics and events from various sources into one place. Examples include build results and releases, health checks for different services and OS metrics. Think of it as CCTray on steroids. It's also available as a Visual Studio plugin.
So much energy and effort continue to be wasted on configuring local development environments and troubleshooting the "works on my machine" problem. For many years our teams have adopted the "check out and go" approach where we use a scripted approach to ensure the local development environment is configured consistently. batect is an open source tool developed by a ThoughtWorker that makes it easy to set up and share a build environment based on Docker. batect becomes the entry point script for your build system, launching containers to perform build tasks that don't rely at all on local setup. Changes to build configuration and dependencies are simply shared through source control without requiring any changes or installations on local machines or CI agents. While we like Cage, among other tools, in this space, we see batect quickly growing in favor with our teams.
One of the challenges of search is ensuring the most relevant results for the user appear at the top of the list. This is where learning to rank (LTR) can help. LTR is the process of applying machine learning to rank documents retrieved by a search engine. If you're using Elasticsearch, you can achieve search-relevance ranking with the Elasticsearch LTR plugin. The plugin uses RankLib for generating the models during the training phase. Then, when querying Elasticsearch, you can use this plugin to "rescore" the top results. We've used it in a few projects and have been happy with the results. There's also an equivalent LTR solution for Solr users.
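The idea of rescoring only the top results can be sketched in a few lines of plain Python. This is a minimal illustration and not the plugin's API: the `rescore_top` function, the hand-written `linear_model` and the feature names `title_match` and `recency` are all hypothetical, and a real setup would use a trained model and would typically combine the model score with the engine's own score rather than replace it.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
Hit = Tuple[str, float, Features]  # (document id, engine score, features)


def rescore_top(
    hits: List[Hit],
    model: Callable[[Features], float],
    window: int,
) -> List[Tuple[str, float]]:
    """Re-rank the first `window` hits by model score; keep the rest in engine order."""
    head, tail = hits[:window], hits[window:]
    rescored = sorted(
        ((doc_id, model(features)) for doc_id, _, features in head),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Hits outside the window keep their original engine score and position.
    return rescored + [(doc_id, score) for doc_id, score, _ in tail]


def linear_model(features: Features) -> float:
    """Stand-in for a trained ranking model (weights are made up)."""
    return 2.0 * features["title_match"] + 1.0 * features["recency"]


# Hits as a search engine might return them, ordered by engine score.
hits = [
    ("a", 9.1, {"title_match": 0.2, "recency": 0.1}),
    ("b", 8.7, {"title_match": 0.9, "recency": 0.5}),
    ("c", 8.0, {"title_match": 0.5, "recency": 0.9}),
    ("d", 7.5, {"title_match": 1.0, "recency": 1.0}),
]
ranked = rescore_top(hits, linear_model, window=3)
```

Only the first three hits are reordered here. Document `d` stays last even though the model would rank it highest, which is the trade-off of limiting the expensive model to a small window.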
Helm is a package manager for Kubernetes. It comes with a repository of curated Kubernetes applications that are maintained in the official Charts repository. Helm has two components: a command line utility called Helm and a cluster component called Tiller. Securing a Kubernetes cluster is a wide and nuanced topic, but we highly recommend setting up Tiller in a role-based access control (RBAC) environment. We've used Helm in a number of client projects and its dependency management, templating and hook mechanism have greatly simplified application lifecycle management in Kubernetes. However, we recommend proceeding with caution: Helm's YAML templating can be difficult to understand, and Tiller still has some rough edges. Helm 3 is expected to address these issues.
How does an organization give autonomy to delivery teams while still making sure their deployed solutions are safe and compliant? How do you ensure that servers, once deployed, maintain a consistent configuration without drift? InSpec is positioned as a solution for continuous compliance and security, but you can also use it for general infrastructure testing. InSpec allows the creation of declarative infrastructure tests, which can then be continuously run against provisioned environments including production. Our teams particularly praise its extensible design with resources and matchers for multiple platforms. We recommend trialling InSpec as a solution to the problem of assuring compliance and security.
Good UI animation can greatly improve user experience. However, reproducing a designer's delicate animation in an app is usually a challenging task for developers. Lottie is a library for Android, iOS, web, and Windows that parses Adobe After Effects animations exported as JSON with Bodymovin and renders them natively on mobile and on the web. Both designers and developers can continue to use their familiar tools and collaborate smoothly.
Setting up highly available PostgreSQL instances can be tricky, which is why we like Patroni: it helps us speed up the setup of PostgreSQL clusters. Stolon is another tool that we've used successfully to run high-availability (HA) clusters of PostgreSQL instances in production using Kubernetes. Although PostgreSQL supports streaming replication out of the box, the challenge in an HA setup is to ensure that clients always connect to the current master. We like that Stolon enforces the connection to the right PostgreSQL master by actively closing connections to unelected masters and routing requests to the active one.
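The routing behavior described above can be pictured with a small in-memory model. This is only an illustrative sketch and not Stolon's implementation: the `Proxy` class, its method names and the instance names `pg-0` and `pg-1` are hypothetical.

```python
from typing import Dict, List


class Proxy:
    """Toy model of a proxy that only ever routes clients to the elected master."""

    def __init__(self, master: str) -> None:
        self.master = master
        self.connections: Dict[str, str] = {}  # client name -> instance name

    def connect(self, client: str) -> str:
        """Route a new connection to whichever instance is currently master."""
        self.connections[client] = self.master
        return self.master

    def on_election(self, new_master: str) -> List[str]:
        """Switch masters and close every connection to any other instance."""
        closed = sorted(
            client
            for client, instance in self.connections.items()
            if instance != new_master
        )
        for client in closed:
            del self.connections[client]
        self.master = new_master
        return closed


proxy = Proxy(master="pg-0")
first_target = proxy.connect("app-1")      # routed to pg-0
dropped = proxy.on_election("pg-1")        # failover: stale connections are closed
second_target = proxy.connect("app-1")     # the reconnect lands on pg-1
```

Closing stale connections forces clients to reconnect through the proxy, so they cannot keep writing to an instance that is no longer the master.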
Traefik is an open-source reverse proxy and load balancer. If you're looking for an edge proxy that provides simple routing without all the features of NGINX and HAProxy, Traefik is a good choice. The router provides reload-less reconfiguration, metrics, monitoring and circuit breakers that are essential when running microservices. It integrates nicely with Let's Encrypt to provide SSL termination, and with infrastructure components such as Kubernetes, Docker Swarm or Amazon ECS to automatically pick up new services or instances to include in its load balancing.
Anka is a set of tools to create, manage and distribute reproducible macOS virtual environments for building and testing iOS and macOS applications. It brings a Docker-like experience to macOS environments: instant start, a CLI to manage virtual machines and a registry to version and tag virtual machines for distribution. We discovered Anka when we proposed a macOS private cloud solution to a client. This tool is worth considering when applying a DevOps workflow to iOS and macOS environments.
Cage is an open-source wrapper around Docker Compose that lets you configure and run multiple dependent components as a larger application. It lets you orchestrate the execution of components such as Docker images, service source code from repo, scripts to load datastores and pods, which are containers that run together as a unit. Cage uses the Docker Compose v2 configuration file format. It addresses some of the Docker Compose gaps such as supporting multiple environments, including the dev environment for running a distributed application on the local developer machine and the test environment for running integration tests and production.
Traditional Linux network security approaches, such as iptables, filter on IP address and TCP/UDP ports. However, these IP addresses frequently churn in dynamic microservices environments. By leveraging Linux eBPF, Cilium provides API-aware networking and security by transparently inserting security in a way that is based on service, pod or container identity rather than on IP address. By decoupling security from addressing, Cilium could play a significant role as a new network protection layer and we recommend you check it out.
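The difference between address-based and identity-based rules can be shown with a toy model. This sketch does not use Cilium or eBPF and says nothing about how Cilium is implemented. The `Workload` type, the rule tables, the IP addresses and the label strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Workload:
    ip: str
    labels: FrozenSet[str]


# Address-based rule: one (source IP, destination IP) pair is allowed.
IP_RULES = {("10.0.0.5", "10.0.0.9")}
# Identity-based rule: any "frontend" workload may call any "orders" workload.
IDENTITY_RULES = {("app=frontend", "app=orders")}


def allowed_by_ip(src: Workload, dst: Workload) -> bool:
    return (src.ip, dst.ip) in IP_RULES


def allowed_by_identity(src: Workload, dst: Workload) -> bool:
    return any(
        src_label in src.labels and dst_label in dst.labels
        for src_label, dst_label in IDENTITY_RULES
    )


orders = Workload("10.0.0.9", frozenset({"app=orders"}))
frontend = Workload("10.0.0.5", frozenset({"app=frontend"}))
# The same service after being rescheduled onto a new address.
rescheduled = Workload("10.0.3.17", frozenset({"app=frontend"}))
```

After rescheduling, the address-based rule no longer matches the frontend, while the identity-based rule still does, because the rule never referred to an address in the first place.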
Detekt is a static code analysis tool for Kotlin. It finds code smells and code complexity. You can run it from the command line or use its plugins for integration with popular developer tools such as Gradle (to perform code analysis via builds), SonarQube (to perform code coverage in addition to static code analysis) and IntelliJ. Detekt is a great addition to build pipelines of Kotlin applications.
Feature toggles are an important technique in continuous deployment scenarios. We've come across a number of good home-grown solutions, but we do like the approach Flagr takes: a complete feature toggle as a service, distributed as a Docker container. It comes with SDKs for all major languages, has a simple and well-documented REST API and provides a convenient frontend.
Gremlin is a SaaS solution for organizations to conduct chaos experiments and help test the resilience of their systems. It comes with a series of failure attacks — including resource, network and state failures — that can be run ad hoc or on schedule and require minimal setup (especially for Kubernetes users, who can run Helm to install Gremlin). The Gremlin client also has a nice web-based user interface, which makes it easy to execute and manage chaos experiments.
Honeycomb is an observability tool that ingests rich data from production systems and makes it manageable through dynamic sampling. Developers can log large amounts of rich events and decide later how to slice and correlate them. This interactive approach is useful when working with today's large distributed systems because we've passed the point where we can reasonably anticipate which questions we might want to ask of production systems.
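As a rough illustration of sampling that varies by event type, the sketch below keeps every Nth event per key and records the rate on each kept event so that totals can be estimated later. It is not Honeycomb's algorithm or API: the `sample` and `estimated_counts` functions, the `status` field and the hand-picked rates are assumptions, and a real dynamic sampler would adjust the rates from observed traffic rather than read them from a fixed table.

```python
from collections import Counter
from typing import Dict, Iterable, List


def sample(
    events: Iterable[dict], rates: Dict[int, int], default_rate: int = 1
) -> List[dict]:
    """Keep 1 in `rate` events per status, tagging each kept event with its rate."""
    seen: Counter = Counter()
    kept = []
    for event in events:
        key = event["status"]
        rate = rates.get(key, default_rate)
        seen[key] += 1
        # Deterministic "every Nth" sampling keeps the example reproducible.
        if seen[key] % rate == 0:
            kept.append({**event, "sample_rate": rate})
    return kept


def estimated_counts(kept: Iterable[dict]) -> Dict[int, int]:
    """Each kept event stands in for `sample_rate` original events."""
    totals: Counter = Counter()
    for event in kept:
        totals[event["status"]] += event["sample_rate"]
    return dict(totals)


# 100 routine successes and 3 rare errors.
events = [{"status": 200} for _ in range(100)] + [{"status": 500} for _ in range(3)]
kept = sample(events, rates={200: 10})
counts = estimated_counts(kept)
```

Frequent, uninteresting events are thinned out heavily while rare ones are all kept, and the stored rate lets later queries recover approximate totals.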
Humio is a fairly new player in the log management space. It's been built from the ground up to be super fast at both log ingestion and query using its built-in query language on top of a custom-designed time series database. It integrates with just about everything out there from an ingestion, visualization and alerting perspective. The log management space has been dominated by Splunk and the ELK Stack, so having alternatives is a good thing. We'll be watching Humio's development with interest.
We're excited about the impact Kubernetes has had on our industry but also concerned about the operational complexity that comes with it. Keeping a Kubernetes cluster up and running and then managing packages deployed on it requires special skills and time. Operational processes such as upgrades, migrations and backups, among others, can be a full-time job. We think that Kubernetes Operators will play a key role in reducing this complexity. The framework provides a standard mechanism to describe automated operational processes for packages running in a Kubernetes cluster. Although Operators were spearheaded and promoted by Red Hat, several community-developed Operators for common open-source packages such as Jaeger, MongoDB and Redis have begun to emerge.
One of the challenges in adopting an open-source alternative to popular commercial packages is sorting through the complicated landscape of projects to understand which components you need, which ones play nicely together and exactly which part of a total solution each component covers. This is particularly difficult in the world of observability, where the standard practice is to purchase one comprehensive but pricey package to do it all. OpenAPM makes the open-source selection process for observability tools easier. It displays the current crop of open-source packages classified by component roles, so you can interactively select compatible components. As long as you keep the tool up to date, it should help you navigate through the confusing array of possible tools.
It's easy to think of many of the processes we work within as linear chains of cause and effect. Most of the time we are working within more complex systems where positive and negative feedback loops influence outcomes. Systems is a set of tools for describing, executing and visualizing systems diagrams. Using a compact DSL and running either standalone or within a Jupyter Notebook, it's super easy to describe fairly complex processes and the flow of information through them. It's pretty much a niche tool, but an interesting and fun one.
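To make the idea of a feedback loop concrete, here is a tiny stock-and-flow simulation in plain Python. It deliberately does not use the Systems DSL. The `simulate` function, the `backlog` stock and the numbers are hypothetical and chosen only to show a negative (balancing) feedback loop.

```python
from typing import List


def simulate(rounds: int, inflow: float, completion_ratio: float) -> List[float]:
    """Track a backlog where work arrives at a fixed rate and a share is completed.

    The outflow is proportional to the backlog itself, which is the feedback:
    the larger the backlog, the more work leaves it each round.
    """
    backlog = 0.0
    history = []
    for _ in range(rounds):
        backlog += inflow                       # constant inflow
        backlog -= backlog * completion_ratio   # outflow depends on the stock
        history.append(backlog)
    return history


history = simulate(rounds=20, inflow=10.0, completion_ratio=0.5)
```

A purely linear reading would expect the backlog to grow by a fixed amount every round. With the feedback in place it rises quickly at first and then levels off near 10, where inflow and outflow balance.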
Taurus is a handy application and service performance testing tool written in Python. It wraps many performance testing executors, including Gatling and Locust. You can run it from the command line and easily integrate it with continuous delivery pipelines to run performance tests at different stages of the pipeline. Taurus also has great reporting, either as text-based console output or integrated with an interactive web UI. Our teams have found that configuring Taurus YAML files is easy because you can use multiple files to describe each test scenario and refer to the underlying executor's scenario definitions.
Terraform provider GoCD lets you build pipelines using Terraform, a mature and widely used tool in the infrastructure as code space. With this provider, you can write pipelines in the HashiCorp Configuration Language (HCL) that use all of the functionality Terraform provides, including workspaces, modules and remote state. This approach is an excellent alternative to Gomatic, which we highlighted in the Pipelines as code blip before. The Golang SDK used in this provider has automatic regression tests for the GoCD API which should minimize issues while upgrading.
We widely use Terraform to configure cloud infrastructure as code. Terratest is a Golang library that makes it easier to write automated tests for infrastructure code. A test run creates real infrastructure components (such as servers, firewalls or load balancers), deploys applications on them and validates the expected behavior using Terratest. At the end of the test, Terratest can undeploy the apps and clean up resources. This makes it particularly useful for end-to-end tests of your infrastructure in a real environment.
AWS CloudFormation is a proprietary declarative language to provision AWS infrastructure as code. Handwriting CloudFormation files is often a default approach to bootstrap AWS infrastructure automation. Although this might be a sensible way to start a small project, our teams, and the industry at large, have found that handwritten CloudFormation simply does not scale as the infrastructure grows. Noticeable pitfalls of handwritten CloudFormation files for large projects include poor readability, lack of imperative constructs, limited parameter definition and usage, and lack of type checking. Addressing these shortfalls has led to a rich ecosystem of both open-source and custom tooling. We find Terraform a sensible default that not only addresses shortfalls of CloudFormation but also has an active community to add the latest AWS features and fix bugs. In addition to Terraform, you can choose from many other tools and languages, including troposphere, sceptre, Stack Deployment Tool and Pulumi.