Apache Flink has seen increasing adoption since our initial assessment in 2016. Flink is now recognized as a leading stream-processing engine and has also gradually matured in the fields of batch processing and machine learning. One of Flink's key differentiators from other stream-processing engines is its use of consistent checkpoints of an application's state. In the event of failure, the application is restarted and its state is loaded from the latest checkpoint, so that the application can continue processing as if the failure had never happened. This reduces the complexity of building and operating external systems for fault tolerance. We see more and more companies using Flink to build their data-processing platforms.
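The recovery model described above (periodically snapshot consistent state, reload it on failure) can be sketched in a few lines of plain Python. This is a conceptual toy, not Flink's API:

```python
import copy

class CheckpointingCounter:
    """Toy illustration of checkpoint-based recovery (not Flink's API)."""

    def __init__(self):
        self.state = {}       # operator state: event counts per key
        self.checkpoint = {}  # last consistent snapshot
        self.processed = 0

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1
        self.processed += 1
        if self.processed % 3 == 0:            # periodic checkpoint
            self.checkpoint = copy.deepcopy(self.state)

    def recover(self):
        # On failure, reload the latest checkpoint; unsaved work is
        # lost here and must be replayed from the source.
        self.state = copy.deepcopy(self.checkpoint)

counter = CheckpointingCounter()
for event in ["a", "b", "a"]:
    counter.process(event)    # checkpoint taken after the third event
counter.process("c")          # not yet checkpointed
counter.recover()             # simulate a failure: "c" is lost
print(counter.state)          # {'a': 2, 'b': 1}
```

In real Flink, checkpoints are triggered by barriers flowing through the dataflow and persisted to durable storage, and sources rewind on restart so that events processed after the last checkpoint are replayed.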
Once exclusive to tech giants, self-driving technology isn't rocket science anymore, as demonstrated by Apollo Auto. The goal of the Baidu-owned Apollo program is to become the Android of the autonomous driving industry. The Apollo platform has components such as perception, simulation, planning and intelligent control that enable car companies to integrate their own autonomous driving systems into their vehicles' hardware. The developer community is still young, but many vendors are joining to contribute ports. One of our projects helped a client pass self-driving license exams with an Apollo-based autopilot system. Apollo also supports an evolutionary architecture approach to adopting advanced features gradually, which enabled us to integrate more sensors and functions in an agile, iterative way.
GCP Pub/Sub is Google Cloud's event streaming platform. It's a popular piece of infrastructure for many of our architectures running on Google Cloud Platform, including mass event ingestion, communication between serverless workloads and streaming data-processing workflows. One of its unique features is its support for both pull and push subscriptions: consumers can either pull messages from a subscription at their own pace or have Pub/Sub push each message to an endpoint of their choice. Our teams have enjoyed its reliability and scale and that it just works as advertised.
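As a sketch of the two subscription styles, here's what setting them up might look like with the gcloud CLI (the topic, subscription and endpoint names are hypothetical):

```shell
gcloud pubsub topics create orders

# Pull subscription: consumers fetch messages at their own pace
gcloud pubsub subscriptions create orders-pull --topic=orders

# Push subscription: Pub/Sub POSTs each message to the endpoint
gcloud pubsub subscriptions create orders-push --topic=orders \
  --push-endpoint=https://example.com/pubsub/handler
```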
Mongoose OS remains one of our preferred open-source microcontroller operating systems and embedded firmware development frameworks. It's worth noting that Mongoose OS fills a noticeable gap for embedded software developers: the gap between Arduino firmware suitable for prototyping and bare-metal microcontrollers' native SDKs. Our teams have successfully used Cesanta's new end-to-end device management platform, mDash, for small-scale greenfield hardware projects. Major Internet of Things (IoT) cloud platform providers today support the Mongoose OS development framework for their device management, connectivity, and over-the-air (OTA) firmware upgrades. Since we last reported on Mongoose OS, the number of supported boards and microcontrollers has grown to include STM, Texas Instruments and Espressif. We continue to enjoy its seamless support for OTA updates and its built-in security at the individual device level.
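Day-to-day development with Mongoose OS revolves around the mos tool; a typical inner loop looks something like this (the platform name is an example, and flags may vary by mos version):

```shell
mos build --platform esp32   # cross-compile the firmware
mos flash                    # flash it to the attached board
mos console                  # tail the device log over serial
```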
ROS (Robot Operating System) is a set of libraries and tools that help software developers create robot applications. It's a development framework that provides hardware abstraction, device drivers, libraries, visualizers, message passing, package management and more. Apollo Auto is based on ROS. In another of our ADAS simulation projects, we've also used ROS's message recording and playback system (bag). The technology isn't new, but it has regained developers' attention with the rise of ADAS.
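The bag facility is driven from the command line; a typical capture-and-replay session looks roughly like this (the file name is hypothetical):

```shell
rosbag record -a -O drive1.bag   # record all published topics to drive1.bag
rosbag info drive1.bag           # summarize topics, message counts, duration
rosbag play drive1.bag           # replay the messages, e.g., into a simulation
```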
For many of our teams Terraform has become the default choice for defining cloud infrastructure. However, some of our teams have been experimenting with AWS Cloud Development Kit (AWS CDK), and they like what they've seen so far. In particular, they like the use of first-class programming languages instead of configuration files, which allows them to use existing tools, test approaches and skills. Like similar tools, care is still needed to ensure deployments remain easy to understand and maintain. Given that support for C# and Java is coming soon, and ignoring for now some gaps in functionality, we think AWS CDK is worth watching as an alternative to other configuration file–based approaches.
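To illustrate the appeal of a first-class language, here's a minimal sketch of what a CDK app in Python might look like, assuming the aws_cdk Python bindings; the stack and bucket names are hypothetical, and the loop shows something plain configuration files can't express directly:

```python
from aws_cdk import core
from aws_cdk import aws_s3 as s3

class StorageStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        # One versioned bucket per environment, driven by ordinary code
        for env in ["staging", "prod"]:
            s3.Bucket(self, f"assets-{env}", versioned=True)

app = core.App()
StorageStack(app, "storage")
app.synth()   # emits a CloudFormation template for deployment
```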
Azure DevOps services include a set of managed services such as hosted Git repos, CI/CD pipelines, automated testing tooling, backlog management tooling and an artifact repository. Azure DevOps Pipelines have been maturing over time. We particularly like the ability to define pipelines as code and the ecosystem of extensions on the Azure DevOps Marketplace. At the time of writing, our teams are still running into a few immature features, including the lack of an effective UI for pipeline visualization and navigation and the inability to trigger a pipeline from artifacts or other pipelines.
Azure Pipelines is a product of Azure DevOps that offers a cloud-based solution for implementing pipelines as code, for projects hosted in an Azure DevOps Git server or another Git solution such as GitHub or Bitbucket. The interesting part of this solution is the ability to run your scripts on Linux, macOS and Windows agents without the overhead of managing your own virtual machines. This represents a big step forward, especially for teams that work in Windows environments with .NET Framework solutions; we're also assessing this service for continuous delivery on iOS.
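As a sketch, running the same steps across all three agent pools might look like this in azure-pipelines.yml (the image names and step are illustrative):

```yaml
strategy:
  matrix:
    linux:
      imageName: 'ubuntu-16.04'
    macos:
      imageName: 'macOS-10.14'
    windows:
      imageName: 'windows-2019'

pool:
  vmImage: $(imageName)

steps:
  - script: echo "build and test"
    displayName: Run on $(imageName)
```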
Most projects with multilingual support start with development teams building features in one language and managing the rest through offline translation via emails and spreadsheets. Although this simple setup works, things can quickly get out of hand. You may have to keep answering the same questions for different language translators, sucking the energy out of the collaboration between translators, proofreaders and the development team. Crowdin is one of a handful of platforms that help streamline the localization workflow of your project. With Crowdin the development team can continue building features while the platform channels the text that needs translation into an online workflow. We like that Crowdin nudges teams to incorporate translations continuously and incrementally rather than managing them in large batches toward the end.
Crux is an open-source document database with bitemporal graph queries. Most database systems are temporal, meaning they help us model facts along with the times at which they occurred. Bitemporal database systems let you model not just the valid time at which a fact occurred but also the transaction time at which it was recorded. If you need a document store with graph capabilities for querying the content, then give Crux a try. It's currently in alpha and lacks SQL support, but you can use a Datalog query interface for reading and traversing relationships.
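Bitemporality is easier to see with a toy example. The following sketch in plain Python (not Crux's Datalog API) shows how recording both times lets you ask what was believed about a given moment:

```python
class BitemporalStore:
    """Toy bitemporal store: every put records valid time and transaction time."""

    def __init__(self):
        self.facts = []  # (entity, valid_time, tx_time, doc)

    def put(self, entity, doc, valid_time, tx_time):
        self.facts.append((entity, valid_time, tx_time, doc))

    def get(self, entity, valid_time, tx_time):
        # Latest fact for the entity that was valid at valid_time
        # AND already known to the database by tx_time.
        candidates = [
            (vt, tt, doc) for (e, vt, tt, doc) in self.facts
            if e == entity and vt <= valid_time and tt <= tx_time
        ]
        return max(candidates)[2] if candidates else None

db = BitemporalStore()
# On day 3 we learn Alice's address as of day 1 (a late-arriving fact)
db.put("alice", {"city": "Berlin"}, valid_time=1, tx_time=3)
db.put("alice", {"city": "Paris"}, valid_time=5, tx_time=5)

db.get("alice", valid_time=2, tx_time=2)   # None: fact not yet recorded
db.get("alice", valid_time=2, tx_time=4)   # {'city': 'Berlin'}
db.get("alice", valid_time=6, tx_time=6)   # {'city': 'Paris'}
```

The second query is the bitemporal trick: as of transaction time 4, we can correctly answer questions about valid time 2, even though the fact arrived late.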
Delta Lake is an open-source storage layer by Databricks that attempts to bring transactions to big data processing. One of the problems we often encounter when using Apache Spark is the lack of ACID transactions. Delta Lake integrates with the Spark API and addresses this problem through its use of a transaction log and versioned Parquet files. With its serializable isolation, it allows concurrent readers and writers to operate on Parquet files. Other welcome features include schema enforcement on write and versioning, which allows us to query and revert to older versions of data if necessary. We've started to use it in some of our projects and quite like it.
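The mechanism is easy to sketch: immutable data files become visible only when a commit is appended to an ordered log, and reading the log only up to an earlier entry yields an earlier version of the table. A toy illustration in plain Python (not Delta Lake's API):

```python
import json

class ToyDeltaTable:
    """Toy versioned table: an append-only transaction log over immutable files."""

    def __init__(self):
        self.files = {}  # file name -> rows (stand-in for Parquet files)
        self.log = []    # ordered commits, one JSON entry per transaction

    def commit(self, added_rows):
        name = f"part-{len(self.files)}.parquet"
        self.files[name] = added_rows
        # Appending to the log is the atomic step that makes the write visible
        self.log.append(json.dumps({"add": name}))

    def read(self, version=None):
        # Reading an older log prefix is "time travel" to an older version
        entries = self.log if version is None else self.log[: version + 1]
        rows = []
        for entry in entries:
            rows.extend(self.files[json.loads(entry)["add"]])
        return rows

table = ToyDeltaTable()
table.commit([{"id": 1}])
table.commit([{"id": 2}])
table.read()            # [{'id': 1}, {'id': 2}]  (latest version)
table.read(version=0)   # [{'id': 1}]             (time travel to version 0)
```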
Kubernetes's serverless ecosystem is growing. We talked about Knative in a previous Radar; now we're seeing Fission gaining traction. Fission lets developers focus on writing short-lived functions and mapping them to HTTP requests while the framework handles the rest of the plumbing and automation of Kubernetes resources behind the scenes. Fission also lets you compose functions, integrate with third-party providers via webhooks and automate the management of Kubernetes infrastructure.
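A minimal workflow looks something like this (the environment, function and route names are examples; check the Fission docs for the exact flags of your version):

```shell
# Create a language runtime, deploy a function into it, expose it over HTTP
fission env create --name python --image fission/python-env
fission function create --name hello --env python --code hello.py
fission route create --method GET --url /hello --function hello
```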
FoundationDB is an open-source multimodel database, acquired by Apple in 2015 and then open sourced in April 2018. At its core, FoundationDB is a distributed key-value store that provides strictly serializable transactions. One of its interesting aspects is its concept of layers for offering additional data models: essentially stateless components built on top of the core key-value store, such as the Record layer and the Document layer. FoundationDB sets a high standard with its simulation testing, running daily tests that simulate various system failures. With its performance, rigorous testing and easy operability, FoundationDB is not just a database; it can also serve as a core primitive on which to build other distributed systems.
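The layers idea can be illustrated with a toy sketch in Python: a stateless document layer that maps documents onto a flat, ordered key-value store, in the spirit of (but far simpler than) FoundationDB's actual Document layer:

```python
class KVStore:
    """Stand-in for FoundationDB's core ordered key-value store."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def range(self, prefix):
        # Ordered range reads are what make layers efficient on the real thing
        return sorted((k, v) for k, v in self.data.items() if k.startswith(prefix))

class DocumentLayer:
    """Stateless layer: encodes documents as flat key-value pairs."""

    def __init__(self, kv):
        self.kv = kv  # holds no state of its own, only a handle to the core

    def insert(self, collection, doc_id, doc):
        for field, value in doc.items():
            self.kv.set(f"{collection}/{doc_id}/{field}", value)

    def find(self, collection, doc_id):
        prefix = f"{collection}/{doc_id}/"
        return {k[len(prefix):]: v for k, v in self.kv.range(prefix)}

kv = KVStore()
docs = DocumentLayer(kv)
docs.insert("users", "42", {"name": "Ada", "role": "admin"})
docs.find("users", "42")   # {'name': 'Ada', 'role': 'admin'}
```

Because the layer is stateless, any number of layer instances can serve the same data; all durability and transaction guarantees come from the key-value core underneath.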
Not everyone needs a self-hosted OAuth2 solution, but if you do, we found Hydra — a fully compliant open-source OAuth2 server and OpenID Connect provider — quite useful. We really like that Hydra doesn't provide any identity management solutions out of the box; so no matter what flavor of identity management you have, it's possible to integrate it with Hydra through a clean API. This clear separation of identity from the rest of the OAuth2 framework makes it easier to integrate Hydra with an existing authentication ecosystem.
Kuma is a platform-agnostic service mesh for Kubernetes, VMs and bare-metal environments. Kuma is implemented as a control plane on top of Envoy and as such can instrument any Layer 4/Layer 7 traffic to secure, observe, route and enhance connectivity between services. Most service mesh implementations natively target the Kubernetes ecosystem, which in itself is not bad but does hinder the adoption of a service mesh for existing non-Kubernetes applications. Rather than waiting for large platform transformation efforts to complete, you can use Kuma to modernize your network infrastructure now.
We talked about Kubernetes in the past and it continues to be the default choice for deploying and managing containers in production clusters. However, it's getting increasingly difficult to provide a similar experience offline for developers. Among other options, we've found MicroK8s to be quite useful. To install the MicroK8s snap, pick a release channel (stable, candidate, beta or edge), and you can get Kubernetes running with a few commands. You can also keep track of mainstream releases and choose to upgrade your setup automatically.
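Getting a local cluster up looks roughly like this (the channel shown is an example):

```shell
sudo snap install microk8s --classic --channel=1.16/stable
microk8s.status --wait-ready   # block until the cluster is up
microk8s.kubectl get nodes     # bundled kubectl, no extra setup
```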
We've long tracked AR/VR (Augmented/Virtual Reality) in our Radar, but its appeal has been limited to specific platforms and tethering options. The Oculus Quest changes the game, becoming one of the first consumer mass-market standalone VR headsets that requires no tethering or support outside a smartphone. This device opens the door for a huge jump in potential exposure to VR applications, whose demand will in turn drive the market toward more aggressive innovation. We applaud the democratization of VR this device helps usher in and can't wait to see what's on the horizon.
The ecosystem of tools and frameworks around neural networks has been evolving rapidly. Interoperability between them, however, has been a challenge. It's not uncommon in the ML industry to quickly prototype and train a model in one tool and then deploy it in a different tool for inference. Because the internal formats of these tools aren't compatible, we need to implement and maintain messy converters to keep the models compatible. The Open Neural Network Exchange (ONNX) format addresses this problem. In ONNX, neural networks are represented as graphs using standard operator specifications; together with a serialization format for trained weights, this allows neural network models to be transferred from one tool to another. This opens up lots of possibilities, including the Model Zoo, a collection of pretrained models in ONNX format.
Ideally, containers should be managed and run by the container runtime without root privileges. This is not trivial, but when achieved it reduces the attack surface and avoids whole classes of security problems, notably privilege escalation out of the container. The community has discussed this as rootless containers for quite a while, and it is part of the open container runtime specification and its standard implementation, runc, which underpins Kubernetes. Docker 19.03 now introduces rootless containers as an experimental feature. Although fully functional, the feature doesn't yet work with several other features, such as cgroups resource controls and AppArmor security profiles.
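A sketch of trying it out, assuming the helper script that ships with 19.03 (details vary by distribution and setup):

```shell
# Install and start the daemon as an unprivileged user (experimental)
curl -fsSL https://get.docker.com/rootless | sh
dockerd-rootless.sh --experimental &

# Point the client at the user-owned socket
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
docker run --rm alpine id   # the daemon and container run without host root
```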
We often associate data warehousing with central infrastructure that is hard to scale and manage as demands around data grow. Snowflake, however, is a SQL data warehouse as a service built from the ground up for the cloud. With a set of neatly crafted features, such as database-level atomicity, support for structured and semi-structured data, in-database analytics functions and, above all, a clear separation of the storage, compute and services layers, Snowflake addresses most of the challenges faced in data warehousing.
Teleport is a security gateway for remote access to cloud-native infrastructure. One of Teleport's interesting features is its ability to double as a Certificate Authority (CA) for your infrastructure. You can issue short-lived certificates and build richer role-based access control (RBAC) for your Kubernetes infrastructure (or just for SSH). With the increased focus on infrastructure security, it's important to keep track of changes. However, not all events require the same level of auditing. With Teleport you can stick with logging for most events but go the extra mile by recording the user's screen for more privileged root sessions.
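Issuing a short-lived user certificate is a one-liner with Teleport's tctl tool (the user name and TTL here are examples):

```shell
# Sign a certificate for alice that expires after eight hours
tctl auth sign --user=alice --ttl=8h --out=alice
```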