
Platforms

Adopt

  • In an increasingly digital world, improving developer effectiveness in large organizations is often a core concern of senior leaders. We've seen enough value with developer portals in general and Backstage in particular that we're happy to recommend it in Adopt. Backstage is an open-source developer portal platform created by Spotify that improves discovery of software assets across the organization. It uses Markdown TechDocs that live alongside the code for each service, which nicely balances the need for centralized discovery with distributed ownership of assets. Backstage supports software templates to accelerate new development and a plugin architecture that allows for extensibility and adaptability into an organization's infrastructure ecosystem. Backstage Service Catalog uses YAML files to track ownership and metadata for all the software in an organization's ecosystem; it even lets you track third-party SaaS software, for which ownership usually still needs to be tracked.

  • Delta Lake is an open-source storage layer, implemented by Databricks, that attempts to bring ACID transactions to big data processing. In our Databricks-enabled data lake or data mesh projects, our teams prefer using Delta Lake storage over the direct use of file storage types such as AWS S3 or ADLS. Until recently, Delta Lake was a closed, proprietary product from Databricks, but it's now open source and accessible to non-Databricks platforms. However, our recommendation of Delta Lake as a default choice currently extends only to Databricks projects that use Parquet file formats. Delta Lake facilitates concurrent data read/write use cases where file-level transactionality is required. We find Delta Lake's seamless integration with Apache Spark batch and micro-batch APIs very helpful, particularly features such as time travel (accessing data at a particular point in time or reverting to an earlier commit) as well as schema evolution support on write.
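
    A minimal sketch of the write-time schema evolution and time travel features, assuming a Delta-enabled Spark session such as on Databricks; paths, dates and column layouts are illustrative:

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.getOrCreate()

      # Schema evolution on write: new columns in the incoming batch are
      # merged into the table schema instead of failing the append.
      updates = spark.read.parquet("/data/orders_batch")
      (updates.write.format("delta")
          .mode("append")
          .option("mergeSchema", "true")
          .save("/delta/orders"))

      # Time travel: read the table as of an earlier version or timestamp.
      v0 = spark.read.format("delta").option("versionAsOf", 0).load("/delta/orders")
      snapshot = (spark.read.format("delta")
          .option("timestampAsOf", "2022-10-01")
          .load("/delta/orders"))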

Trial

  • Many of our teams have successfully used AWS Database Migration Service (DMS) to migrate data to and from AWS. In one of our digital transformation engagements, we achieved a nearly zero downtime cut-over to the new system as we migrated data from Microsoft SQL Server to an AWS Relational Database Service (RDS) PostgreSQL instance. Such transformations involve many moving parts that require planning and coordination across multidisciplinary teams, but for data migration we're quite happy with DMS. It automates the deployment, management and monitoring of all required resources. Over the years DMS has matured to support several source and target databases, and we continue to like it.
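
    For teams scripting their migrations, here is a hedged boto3 sketch of a full-load-plus-CDC replication task; all ARNs and the table-selection rule are illustrative placeholders:

      import json
      import boto3

      dms = boto3.client("dms")

      task = dms.create_replication_task(
          ReplicationTaskIdentifier="sqlserver-to-postgres",
          SourceEndpointArn="arn:aws:dms:...:endpoint:source",
          TargetEndpointArn="arn:aws:dms:...:endpoint:target",
          ReplicationInstanceArn="arn:aws:dms:...:rep:instance",
          MigrationType="full-load-and-cdc",  # initial load, then ongoing changes
          TableMappings=json.dumps({
              "rules": [{
                  "rule-type": "selection",
                  "rule-id": "1",
                  "rule-name": "include-sales",
                  "object-locator": {"schema-name": "sales", "table-name": "%"},
                  "rule-action": "include",
              }]
          }),
      )

      dms.start_replication_task(
          ReplicationTaskArn=task["ReplicationTask"]["ReplicationTaskArn"],
          StartReplicationTaskType="start-replication",
      )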

  • Colima is becoming a popular open alternative to Docker Desktop. It provisions the Docker container run time in a Lima VM, configures the Docker CLI on macOS and handles port forwarding and volume mounts. Colima uses containerd as its run time, which is also the run time on most managed Kubernetes services, thus improving the important dev-prod parity. With Colima you can easily use and test the latest features of containerd, such as lazy loading for container images. We've been having good results with Colima in our projects. When working with Kubernetes, we also use nerdctl, a Docker-compatible CLI for containerd. Since Kubernetes has deprecated Docker as a container run time and most managed services (EKS, GKE, etc.) are following its lead, more people will be looking for containerd-native tools, hence the importance of tools like nerdctl. In our opinion, Colima is realizing its strong potential and becoming a go-to alternative to Docker Desktop.

  • Starting with Databricks 9.1 LTS (Long Term Support), a new run time option called Databricks Photon became available: an alternative execution engine rewritten from the ground up in C++. Several of our teams have now used Photon in production and have been pleased with the performance improvements and corresponding cost savings. Actual improvements and changes in costs will depend on multiple factors such as data set size and transaction types. We recommend trialing against a realistic workload to gather data for a comparison before making any decision on Photon's use.
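
    As a hedged sketch, one way to stand up such a trial is to create a cluster on a Photon variant of the run time via the Databricks Clusters REST API; the workspace URL, token, node type and version string are illustrative placeholders:

      import requests

      resp = requests.post(
          "https://<workspace>.cloud.databricks.com/api/2.0/clusters/create",
          headers={"Authorization": "Bearer <personal-access-token>"},
          json={
              "cluster_name": "photon-trial",
              # Photon variants of the run time carry a -photon- version string.
              "spark_version": "9.1.x-photon-scala2.12",
              "node_type_id": "i3.xlarge",
              "num_workers": 2,
          },
      )
      resp.raise_for_status()
      print(resp.json()["cluster_id"])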

  • Since we first mentioned data discoverability in the Radar, LinkedIn has evolved WhereHows into DataHub, its next-generation platform that addresses data discoverability via an extensible metadata system. Instead of crawling and pulling metadata, DataHub adopts a push-based model where individual components of the data ecosystem publish metadata via an API or a stream to the central platform. This push-based integration shifts ownership from the central entity to individual teams, making them accountable for their metadata. As a result, we've used DataHub successfully as an organization-wide metadata repository and entry point for multiple autonomously maintained data products. When taking this approach, be sure to keep it lightweight and avoid the slippery slope leading to centralized control over a shared resource.
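
    A hedged sketch of that push-based model using the acryl-datahub Python emitter; the server URL, dataset URN and description are illustrative:

      from datahub.emitter.mcp import MetadataChangeProposalWrapper
      from datahub.emitter.rest_emitter import DatahubRestEmitter
      from datahub.metadata.schema_classes import DatasetPropertiesClass

      emitter = DatahubRestEmitter("http://datahub-gms:8080")

      # Each team pushes metadata for the assets it owns instead of
      # waiting for a central crawler to pull it.
      mcp = MetadataChangeProposalWrapper(
          entityUrn="urn:li:dataset:(urn:li:dataPlatform:snowflake,orders,PROD)",
          aspect=DatasetPropertiesClass(description="Curated orders data product"),
      )
      emitter.emit(mcp)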

  • DataOps.live is a data platform that automates environments in Snowflake. Inspired by DevOps practices, DataOps.live lets you treat the data platform like any other web platform by embracing continuous integration and continuous delivery (CI/CD), automated testing, observability and code management. You can roll back changes immediately without impacting the data or recover from complete failures and rebuild a fresh Snowflake tenant in minutes or hours instead of days. Our teams have had good experiences with DataOps.live because it allowed us to iterate quickly when building data products on top of Snowflake.

  • For several years now, the Linux kernel has included the extended Berkeley Packet Filter (eBPF), a virtual machine that provides the ability to attach filters to particular sockets. But eBPF goes far beyond packet filtering and allows custom scripts to be triggered at various points within the kernel with very little overhead. Because it allows sandboxed programs to run within the operating system kernel, application developers can use eBPF programs to add capabilities to the operating system at run time. Some of our projects require troubleshooting and profiling at the system call level, and our teams found that tools like bcc and bpftrace have made their jobs easier. Observability and network infrastructure also benefit from eBPF — for example, the Cilium project can implement traffic load balancing and observability without sidecar overhead in Kubernetes, and Hubble provides further security and traffic observability on top of it. The Falco project uses eBPF for security monitoring, and the Katran project uses eBPF to build more efficient L4 load balancing. The eBPF community is growing rapidly, and we're seeing more and more synergy with the field of observability.
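
    To give a flavor of the tooling, here is a minimal bcc sketch in the style of its canonical hello-world examples: it attaches a small eBPF program to the clone() syscall and streams the kernel trace output (requires the bcc Python package and root privileges):

      from bcc import BPF

      # The eBPF program itself is written in restricted C.
      prog = r"""
      int hello(void *ctx) {
          bpf_trace_printk("clone() called\n");
          return 0;
      }
      """

      b = BPF(text=prog)
      # Resolve the kernel's platform-specific syscall symbol for clone().
      b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="hello")
      b.trace_print()  # stream trace output until interrupted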

  • Feast is an open-source feature store for machine learning. It has several useful properties, including generating point-in-time correct feature sets, which prevents future feature values from leaking into models during training, and supporting both streaming and batch data sources. However, it currently only supports timestamped structured data and therefore may not be suitable if you work with unstructured data in your models. We've successfully used Feast at significant scale as an offline store during model training and as an online store during prediction.
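
    A minimal sketch of both retrieval paths, assuming a feature repository with a driver_stats feature view; entity and feature names are illustrative:

      from datetime import datetime

      import pandas as pd
      from feast import FeatureStore

      store = FeatureStore(repo_path=".")

      # Each row asks: what were the feature values for this driver as of
      # this timestamp? Feast joins point-in-time correctly, so no future
      # values leak into the training set.
      entity_df = pd.DataFrame({
          "driver_id": [1001, 1002],
          "event_timestamp": [datetime(2022, 10, 1), datetime(2022, 10, 2)],
      })

      training_df = store.get_historical_features(
          entity_df=entity_df,
          features=["driver_stats:avg_daily_trips", "driver_stats:conv_rate"],
      ).to_df()

      # At serving time, read the same features from the online store.
      online = store.get_online_features(
          features=["driver_stats:avg_daily_trips"],
          entity_rows=[{"driver_id": 1001}],
      ).to_dict()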

  • Monte Carlo is a data observability platform. Using machine learning models, it learns about data and infers issues, notifying users when they arise. It allows our teams to maintain data quality across ETL pipelines, data lakes, data warehouses and business intelligence (BI) reports. With features such as monitoring dashboards as code, a central data catalog and field-level lineage, our teams find Monte Carlo to be an invaluable tool for overall data governance.

  • In previous editions, we’ve recommended assessing bounded low-code platforms as a method for applying low-code solutions to specific use cases in very limited domains. We’ve seen some traction in this space, specifically with Retool, a low-code platform that our teams use to build solutions for internal users, predominantly to query and visualize data. It allows them to produce non-business-critical read-only solutions faster. The main reported benefits of Retool are its UI components and its ability to be integrated quickly and easily with common data sources.

  • Seldon Core is an open-source platform to package, deploy, monitor and manage machine learning models in Kubernetes clusters. With out-of-the-box support for several machine-learning frameworks, you can easily containerize your models using prepackaged inference servers, custom inference servers or language wrappers. With distributed tracing through Jaeger and model explainability via Alibi, Seldon Core addresses several last-mile delivery challenges with machine learning deployments, and our data teams like it.
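
    A minimal sketch of the language-wrapper route, assuming a scikit-learn model serialized with joblib (class and file names are illustrative); once containerized, Seldon's Python inference server can serve a class like this:

      import joblib

      class MyModel:
          def __init__(self):
              # Loaded once when the inference server starts.
              self.model = joblib.load("model.joblib")

          def predict(self, X, features_names=None):
              # Called by Seldon Core for each inference request.
              return self.model.predict_proba(X)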

  • Teleport is a tool for zero trust network access to infrastructure. Traditional setups require complex policies or jump servers to restrict access to critical resources. Teleport, however, simplifies this with a unified access plane and with fine-grained authorization controls that replace jump servers, VPNs or shared credentials. Implemented as a single binary with out-of-the-box support for several protocols (including SSH, RDP, Kubernetes API, MySQL, MongoDB and PostgreSQL wire protocols), Teleport makes it easy to set up and manage secured access across Linux, Windows or Kubernetes environments. Since we first mentioned it in the Radar, a few teams have used Teleport and our overall positive experience prompted us to highlight it.

  • Modern observability relies on collecting and aggregating an exhaustive set of granular metrics to fully understand, predict and analyze system behavior. But when applied to a cloud native system composed of many redundant and cooperating processes and hosts, the cardinality (or number of unique time series) becomes unwieldy because it grows combinatorially with each additional service, container, node, cluster, etc. When dealing with high-cardinality data, we've found that VictoriaMetrics performs well. VictoriaMetrics is particularly useful for operating Kubernetes-hosted microservice architectures, and the VictoriaMetrics operator makes it easy for teams to implement their own monitoring in a self-service way. We also like its componentized architecture and ability to continue collecting metrics even when the central server is unavailable. Although our team has been happy with VictoriaMetrics, this is a rapidly evolving area, and we'd recommend keeping an eye on other high-performance, Prometheus-compatible time series databases such as Cortex or Thanos.
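
    Because VictoriaMetrics exposes a Prometheus-compatible HTTP API, existing PromQL queries and tooling carry over; a minimal sketch, where the host, port and metric name are illustrative:

      import requests

      resp = requests.get(
          "http://victoria-metrics:8428/api/v1/query",
          params={"query": "sum(rate(http_requests_total[5m])) by (service)"},
      )
      resp.raise_for_status()
      for series in resp.json()["data"]["result"]:
          print(series["metric"], series["value"])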

Assess

  • Bun is a new JavaScript runtime, similar to Node.js or Deno. Unlike Node.js or Deno, however, Bun is built using WebKit's JavaScriptCore instead of Chrome's V8 engine. Designed as a drop-in replacement for Node.js, Bun is a single binary (written in Zig) that acts as a bundler, transpiler and package manager for JavaScript and TypeScript applications. Bun is currently in beta, so expect bugs or compatibility issues with a few Node.js libraries. However, it’s been built from the ground up with several optimizations, including fast startup and improved server-side rendering, and we believe it’s worthwhile to assess.

  • Databricks Unity Catalog is a data governance solution for assets such as files, tables or machine learning models in a lakehouse. Although you'll find several platforms in the enterprise data governance space, if you're already using other Databricks solutions, you should certainly assess Unity Catalog. We want to highlight that while these governance platforms usually implement a centralized solution for better consistency across workspaces and workloads, the responsibility to govern should be federated by enabling individual teams to govern their own assets.
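
    A hedged illustration of that federated approach on a Unity Catalog-enabled workspace, where each team grants access to the assets it owns via standard SQL; catalog, schema, table and group names are illustrative, and exact privilege names have varied across Unity Catalog versions:

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.getOrCreate()

      # The owning team manages grants on its own catalog and tables.
      spark.sql("CREATE CATALOG IF NOT EXISTS sales")
      spark.sql("GRANT USE CATALOG ON CATALOG sales TO `analysts`")
      spark.sql("GRANT SELECT ON TABLE sales.reporting.orders TO `analysts`")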

  • Dragonfly is a new in-memory data store with Redis- and Memcached-compatible APIs. It leverages the new Linux-specific io_uring API for I/O and implements novel algorithms and data structures on top of a multithreaded, shared-nothing architecture. Because of these clever implementation choices, Dragonfly achieves impressive performance results. Although Redis continues to be our default choice for in-memory data store solutions, we do think Dragonfly is an interesting choice to assess.
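
    Because Dragonfly speaks the Redis wire protocol, existing clients work unchanged; a minimal sketch with redis-py, where the host name and key are illustrative:

      import redis

      # Point an unmodified Redis client at a Dragonfly instance.
      r = redis.Redis(host="dragonfly", port=6379)
      r.set("session:42", "alice", ex=3600)  # same commands, expires in an hour
      print(r.get("session:42"))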

  • In previous Radars, we've written about TinyML — the practice of running trained models on small devices with onboard sensors to make decisions or extract features without a roundtrip to the cloud. Edge Impulse, an end-to-end hosted platform for developing models optimized to run on small edge devices such as microcontrollers, has made the process of collecting sensor data and then training and deploying a model as simple as possible. The platform guides the developer through the entire pipeline, including the task of collecting and labeling training data. It's easy to get started using your mobile phone for both data collection and running the classifier, while model training and refinement happen in the more powerful, cloud-hosted environment. The resulting recognition algorithms can also be optimized, compiled and uploaded to a wide range of microcontroller architectures. Although Edge Impulse is a commercial venture, the platform is free for developers and makes the entire process fun and engaging even for those who are new to machine learning. The low barrier to entry for creating a working application means that we'll be seeing more edge devices with smart decision-making built in.
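
    As a hedged sketch of programmatic data collection, the following posts a labeled sample to what we understand to be Edge Impulse's ingestion service in its data acquisition format; treat the endpoint, headers and payload shape as assumptions, and the API key, label and readings as placeholders:

      import json
      import requests

      # One accelerometer sample in Edge Impulse's data acquisition format.
      sample = {
          "protected": {"ver": "v1", "alg": "none"},
          "signature": "0" * 64,
          "payload": {
              "device_type": "demo",
              "interval_ms": 10,
              "sensors": [{"name": "accX", "units": "m/s2"}],
              "values": [[0.1], [0.2], [0.4]],
          },
      }

      resp = requests.post(
          "https://ingestion.edgeimpulse.com/api/training/data",
          headers={
              "x-api-key": "ei_...",
              "x-label": "wave",
              "x-file-name": "wave.json",
              "Content-Type": "application/json",
          },
          data=json.dumps(sample),
      )
      resp.raise_for_status()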

  • GCP Vertex AI is a unified artificial intelligence platform that allows teams to build, deploy and scale machine-learning (ML) models. Vertex AI includes pretrained models, which can be used directly, fine-tuned or combined with AutoML, as well as infrastructure such as feature stores and pipelines for ML models. We like Vertex AI's integrated capabilities, which help to make it feel like a coherent AI platform.
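
    A hedged sketch with the google-cloud-aiplatform SDK, training and deploying an AutoML tabular model; the project, region, bucket, column and machine type are illustrative:

      from google.cloud import aiplatform

      aiplatform.init(project="my-project", location="us-central1")

      # Register a tabular dataset from Cloud Storage.
      dataset = aiplatform.TabularDataset.create(
          display_name="churn", gcs_source="gs://my-bucket/churn.csv"
      )

      # Train with AutoML, then deploy the model behind an endpoint.
      job = aiplatform.AutoMLTabularTrainingJob(
          display_name="churn-automl",
          optimization_prediction_type="classification",
      )
      model = job.run(dataset=dataset, target_column="churned")
      endpoint = model.deploy(machine_type="n1-standard-4")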

  • Gradient is a platform for building, deploying and running machine-learning applications, very similar to Google's Colab. Notebooks can be created from templates, helping you to get started with PyTorch or TensorFlow or with applications like Stable Diffusion. In our experience, Gradient is well-suited for GPU-intensive models, and we like that the web-based environment is persistent.

  • IAM Roles Anywhere is a new service from AWS that lets you obtain temporary security credentials in IAM for workloads such as servers, containers and applications that run outside of AWS. We find it particularly useful in hybrid cloud setups where workloads are split across AWS and non-AWS resources. Instead of creating long-lived credentials, IAM Roles Anywhere lets you create short-lived credentials to access AWS resources using X.509 certificates. We believe this approach streamlines the access pattern across the hybrid cloud and recommend you check it out.
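
    A hedged sketch of the consuming side: assuming ~/.aws/config defines a profile whose credential_process invokes AWS's aws_signing_helper with your certificate and the trust anchor, profile and role ARNs, an SDK client picks up the short-lived credentials transparently. Paths, ARNs and the profile name are illustrative:

      import boto3

      # Assumed ~/.aws/config:
      # [profile roles-anywhere]
      # credential_process = aws_signing_helper credential-process \
      #     --certificate /etc/pki/client.pem --private-key /etc/pki/client.key \
      #     --trust-anchor-arn arn:aws:rolesanywhere:...:trust-anchor/... \
      #     --profile-arn arn:aws:rolesanywhere:...:profile/... \
      #     --role-arn arn:aws:iam::...:role/workload

      session = boto3.Session(profile_name="roles-anywhere")
      print(session.client("sts").get_caller_identity()["Arn"])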

  • Keptn is a control plane for delivery and operations that relies on CloudEvents for instrumentation. Like one of the techniques we mentioned in observability for CI/CD pipelines, Keptn visualizes its orchestration as traces. The declarative definition of the delivery pipeline aims to separate SRE intentions from the underlying implementation, relying on other observability, pipeline and deployment tooling to respond to the appropriate events. We're particularly excited by the idea of adding service-level objective (SLO) verifications as architectural fitness functions to CI/CD pipelines: Keptn lets you define service-level indicators (SLIs) as key-value pairs, with the value representing the query to your observability infrastructure. It will then evaluate the result against the defined SLOs as a quality gate. Keptn takes the same approach to automated operations, allowing a declarative definition that specifies the intent of scaling a ReplicaSet in response to a degradation of average response time, for example. Created by Dynatrace, Keptn also integrates with Prometheus and Datadog.

  • Undoubtedly, data discoverability has become an important focal point for companies, since it enables data to be shared and used efficiently by different people. We've included platforms such as DataHub and Collibra in previous editions of the Radar. However, our teams are constantly assessing options in this space and have recently shown interest in OpenMetadata, a platform dedicated to metadata management using open standards. Our teams like this open-source platform because it improves the development experience through its simple architecture, automation-focused deployment and strong focus on data discoverability.

  • OrioleDB is a new storage engine for PostgreSQL. Our teams use PostgreSQL a lot, but its storage engine was originally designed for hard drives. Although there are several options to tune it for modern hardware, achieving optimal results can be difficult and cumbersome. OrioleDB addresses these challenges by implementing a cloud-native storage engine with explicit support for solid-state drives (SSDs) and nonvolatile random-access memory (NVRAM). To try the new engine, first apply the enhancement patches to PostgreSQL's current table access methods and then install OrioleDB as an extension. We believe OrioleDB has great potential to address several long-standing issues in PostgreSQL, and we encourage you to carefully assess it.
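
    Once a patched PostgreSQL build is running, enabling the engine is a per-table choice via the table access method clause; a hedged sketch with psycopg2, where connection details and the table are illustrative:

      import psycopg2

      conn = psycopg2.connect("dbname=app user=app host=localhost")
      with conn, conn.cursor() as cur:
          cur.execute("CREATE EXTENSION IF NOT EXISTS orioledb")
          # Select the storage engine per table with USING.
          cur.execute("""
              CREATE TABLE events (
                  id      bigint PRIMARY KEY,
                  payload jsonb
              ) USING orioledb
          """)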

Hold

 

Unable to find something you expected to see?

 

Each edition of the Radar features blips reflecting what we came across during the previous six months. We might have covered what you are looking for on a previous Radar already. We sometimes cull things just because there are too many to talk about. A blip might also be missing because the Radar reflects our experience; it is not based on a comprehensive market analysis.


Download Technology Radar Volume 27

English | Español | Português | 中文


Visit our archive to read previous volumes