Techniques
Adopt
-
We keep getting good feedback from teams applying product management to internal platforms. One key feature to remember, though: It's not just about team structure or renaming existing platform teams; it’s also about applying product-centric working practices within the team. Specifically, we've received feedback that teams face challenges with this technique unless they have a product-centric mindset. This likely means additional roles, such as a product manager, alongside changes to other areas, such as requirements gathering and the measurement of success. Working this way means establishing empathy with internal consumers (the development teams) and collaborating with them on the design. Platform product managers create roadmaps and ensure the platform delivers value to the business and enhances the developer experience. We continue to see this technique as key to building internal platforms to roll out new digital solutions quickly and efficiently.
-
The options for CI/CD infrastructure as a service have become so manifold and mature that the cases in which it's worth managing your entire CI infrastructure yourself are becoming very rare. Using managed services like GitHub Actions, Azure DevOps or GitLab CI/CD comes with all the common advantages (and trade-offs) of managed cloud services. You don't have to spend time, effort and hardware costs on maintenance and operations of this often complex infrastructure. Teams can take advantage of elasticity and self-service, whereas provisioning more of the right agents or getting a new plugin or feature is often a bottleneck in companies that host CI themselves. Even the use cases that require running builds and verification on your own hardware can now mostly be covered with self-hosted runners (we've written about some for GitHub Actions: actions-runner-controller and Philips's self-hosted GitHub runner). Note, however, that you won't get out-of-the-box security just because you are using a managed service; while mature services provide all the security features you need, you'll still need to use them to implement zero trust security for your CI/CD infrastructure.
-
Starter kits and templates are widely used in software projects to speed up initial setup, but they can pull in many unnecessary dependencies for a particular project. It's important to practice dependency pruning — periodically taking a hard look at these dependencies and pruning any that are not used. This helps reduce build and deploy times and decrease the project's attack surface by removing potential vulnerabilities. Although this isn't a new technique, given the increasing frequency of attacks on software supply chains, we advocate for renewed attention to it.
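As a rough sketch of where a pruning pass might start, assuming a Python project, the snippet below compares a set of declared dependency names with the top-level modules the source files actually import. The function names are hypothetical, and the comparison assumes a package's install name matches its import name, which is not always true, so the result is a list of candidates to review rather than a list to delete.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Collect the top-level module names imported by one Python source text."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which are never third-party packages
            modules.add(node.module.split(".")[0])
    return modules


def unused_dependencies(declared: set[str], sources: list[str]) -> set[str]:
    """Return declared dependencies that no source text imports (pruning candidates)."""
    used = set()
    for source in sources:
        used |= imported_modules(source)
    return declared - used
```

In practice the declared set would come from the project's dependency manifest and the sources from walking the repository; dependencies that are only used by tooling or loaded dynamically will show up as false positives.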
-
Automatically estimating, tracking and predicting cloud infrastructure run cost is crucial for today's organizations. The cloud providers' savvy pricing models, combined with the proliferation of pricing parameters and the dynamic nature of today's architecture, can lead to surprisingly expensive run costs. Even though this technique has been in Adopt since 2019, we want to highlight the importance of considering run cost as an architecture fitness function, especially today, due to accelerated cloud adoption and the growing attention to FinOps practices. Many commercial platforms provide tools that can consolidate and clarify cloud costs for business leaders. Some of them are designed to show cloud run costs to finance organizations or originating business units.
However, cloud consumption decisions are usually made at the engineering level, where systems are designed. It's important that the engineers making design decisions have some way of predicting the cost impact of their architectural decisions. Some teams automate this prediction early in the development lifecycle. Tools like Infracost help teams predict cost impact when thinking about possible changes to infrastructure as code. This computation can be automated and woven into the CD pipeline. Note that cost will be impacted by architectural decisions combined with actual usage levels; to do this properly, you need good projections of expected usage levels. Early and frequent feedback on run cost can prevent it from soaring. When the predicted cost deviates from what was expected or acceptable, the team can discuss whether it's time to evolve the architecture.
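To make the idea of run cost as a fitness function concrete, here is a minimal sketch, assuming the team can express each planned resource as a unit price and a projected number of usage hours. The `Resource` type, the prices and the budget are all hypothetical; a real pipeline would take its estimate from a cost-estimation tool rather than hand-entered figures.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    unit_price_per_hour: float       # assumed price, not a real provider rate
    expected_hours_per_month: float  # usage projection supplied by the team


def predicted_monthly_cost(resources: list[Resource]) -> float:
    """Sum price times projected usage across all planned resources."""
    return sum(r.unit_price_per_hour * r.expected_hours_per_month for r in resources)


def run_cost_fitness(resources: list[Resource], monthly_budget: float) -> tuple[bool, float]:
    """Return (passed, predicted_cost); the check passes while cost stays within budget."""
    cost = predicted_monthly_cost(resources)
    return cost <= monthly_budget, cost
```

A pipeline step could call `run_cost_fitness` on every infrastructure change and fail the build, or simply open a discussion, when the first element of the result is `False`.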
Trial
-
The earlier accessibility is considered in software delivery, the easier and cheaper it is to ensure what's built works for as many people as possible. Tools that help communicate accessibility annotations in designs help teams consider important elements like document structure, semantic HTML and alternative texts from the beginning of their work. This enables them to ensure user interfaces meet global accessibility standards and address common failures that are actually fairly easy to avoid. Figma offers a range of accessibility notation plugins: The A11y Annotation Kit, Twitter's Accessibility Annotation Library and the Axe toolset's Axe for Designers.
-
We've always been advocates of writing less code. Simplicity is one of the core values underlying our sensible defaults for software development. For example, we try not to anticipate needs and only introduce code that satisfies immediate business requirements and nothing else. One way to achieve this is to create engineering platforms that make this possible on an organizational basis.
This is also the stated aim of many low-code platforms surging in popularity right now. Platforms like Mendix or Microsoft Power Apps can expose common business processes for reuse and simplify the problems of getting new functionality deployed and in the hands of users. These platforms have made great strides in recent years with testability and support for good engineering practices. They're particularly useful for simple tasks or event-triggered apps. However, asking them to adapt to a nearly infinite range of business requirements brings complexity. Although developers might be writing less (or zero) code, they must also become experts in an all-encompassing commercial platform. We would advise businesses to consider if they need all the functionality these products bring or if they're better off pursuing bounded low-code platforms, either by developing their own platform as an internal product or by carefully constraining the use of commercial low-code products to those simple tasks at which they excel.
-
One of the big challenges in developing APIs is capturing and communicating their business value. APIs are, by their nature, technical artifacts. Whereas developers can easily comprehend JSON payloads, OpenAPI (Swagger) specs and Postman demos, business stakeholders tend to respond better to demos they can interact with. The value of the product is more clearly articulated when you can see and touch it, which is why we sometimes find it worthwhile to invest in demo frontends for API-only products. When a custom graphical UI is built alongside an API product, stakeholders can see analogies to paper forms or reports that might be more familiar to them. As the interaction model and richness of the demo UI evolve, stakeholders can make more informed decisions about the direction the API product should take. Working on the UI has the added benefit of increasing developers' empathy for business users. This isn't a new technique; we've been doing this successfully, when necessary, for as long as API products have been around. However, because this technique isn't widely known, we thought it worthwhile calling attention to it.
-
Lakehouse architecture is an architectural style that combines the scalability of data lakes with the reliability and performance of data warehouses. It enables organizations to store and analyze large volumes of diverse data in a single platform as opposed to having them in separate lake and warehouse tiers, using the same familiar SQL-based tools and techniques. While the term is often associated with vendors like Databricks, open alternatives such as Delta Lake, Apache Iceberg and Apache Hudi are worth considering. Lakehouse architecture can complement data mesh implementations. Autonomous data product teams can choose to leverage a Lakehouse within their data products.
-
When we first included it in the Radar three years ago, verifiable credentials (VC) was an intriguing standard with some promising potential applications, but it wasn't widely known or understood outside the community of enthusiasts. This was particularly true when it came to the credential-granting institutions, such as state governments, who would be responsible for implementing the standards. Three years and one pandemic later, the demand for cryptographically secure, privacy-respecting and machine-verifiable electronic credentials has grown and, as a result, governments are starting to wake up to VC's potential. The W3C standard puts credential holders at the center, which is similar to our experience when using physical credentials: users can put their verifiable credentials in their own digital wallets and show them to anyone at any time without the permission of the credentials' issuer. This decentralized approach also helps users to better manage and selectively disclose their own information which greatly improves data privacy protection.
Several of our teams have engaged in projects involving verifiable credentials technology in the past six months. Not surprisingly, the scenarios vary across countries and government departments. Our teams have explored different combinations of decentralized identifiers, verifiable credentials and verifiable presentations on multiple projects. This is a developing field, and now that we've had more experience, we want to keep track of it in the Radar.
Assess
-
One of the many places in the software delivery process to consider accessibility requirements early on is during web component testing. Testing framework plugins like chai-a11y-axe provide assertions in their API to check for the basics. But in addition to using what testing frameworks have to offer, accessibility-aware component test design further helps to provide all the semantic elements needed by screen readers and other assistive technologies.
First, instead of using test IDs or classes to find and select the elements you want to validate, identify elements by ARIA roles or other semantic attributes that assistive technologies rely on. Some testing libraries, like Testing Library, even recommend this in their documentation. Second, do not just test for click interactions; also consider users who cannot use a mouse or see the screen, and add tests for keyboard and other interactions.
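The first point can be illustrated with a small, self-contained sketch. The `get_by_role` helper below is hypothetical and uses only Python's standard HTML parser with a deliberately tiny table of implicit roles; a real component test would use the role-based queries its testing library provides. It shows why selecting by role surfaces problems that a test ID hides: a `div` styled as a button is simply not found.

```python
from html.parser import HTMLParser

# Implicit ARIA roles for a few common elements (a deliberately tiny subset).
IMPLICIT_ROLES = {"button": "button", "nav": "navigation", "main": "main"}


class RoleCollector(HTMLParser):
    """Record the role and attributes of every element that exposes a role."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (role, attributes) pairs

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An explicit role attribute wins; otherwise fall back to the implicit role.
        role = attributes.get("role") or IMPLICIT_ROLES.get(tag)
        if role:
            self.found.append((role, attributes))


def get_by_role(html: str, role: str) -> list[dict]:
    """Return the attributes of every element exposing the given role."""
    collector = RoleCollector()
    collector.feed(html)
    return [attrs for found_role, attrs in collector.found if found_role == role]


page = (
    '<main>'
    '<div class="btn" data-testid="save">Save</div>'
    '<button aria-label="Save draft">Save</button>'
    '</main>'
)
# Only the real <button> is found; the styled <div> is invisible to a role query.
save_buttons = get_by_role(page, "button")
```

A test written against `data-testid="save"` would pass for the `div` even though a screen reader would not announce it as a button.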
-
Like many in the software industry, we've been exploring the rapidly evolving AI tools that can support us in writing code. We see many people feed ChatGPT with an implementation, and then ask it to generate tests for that implementation. However, because we're big believers in TDD, and we don't always want to feed an external model with our potentially sensitive implementation code, one of our experiments in this space is a technique we call AI-aided test-first development. In this approach, we get ChatGPT to generate tests for us, and then a developer implements the functionality. Specifically, we first describe the tech stack and the design patterns we're using in a prompt "fragment" that is reusable across multiple use cases. Then we describe the specific feature we want to implement, including the acceptance criteria. Based on all that, we ask ChatGPT to generate an implementation plan for that feature in our architectural style and tech stack. Once we sanity check that implementation plan, we ask it to generate tests for our acceptance criteria.
This approach has worked surprisingly well for us: It required the team to come up with a concise description of their architectural style and helped junior developers and new team members code features aligned with the team’s existing style. The main drawback of this approach is that even though we don't give the model our source code, we still feed it potentially sensitive information such as our tech stack and feature descriptions. Teams should ensure they're working with their legal advisors to avoid any intellectual property issues, at least until a "for business" version of these AI tools becomes available.
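The prompt structure described above might be assembled along these lines. This is only a sketch: the stack description, function names and wording of the instructions are hypothetical, and no model is called; the functions just build the two prompts (plan first, then tests) from a reusable fragment plus a feature description.

```python
# Hypothetical reusable "fragment" describing the tech stack and design patterns.
STACK_FRAGMENT = (
    "We build HTTP services in Python. Business logic lives in service classes; "
    "persistence goes through repository classes; tests are written with pytest."
)


def describe_feature(feature: str, acceptance_criteria: list[str]) -> str:
    """Format the feature and its numbered acceptance criteria."""
    criteria = "\n".join(
        f"{number}. {criterion}"
        for number, criterion in enumerate(acceptance_criteria, start=1)
    )
    return f"Feature: {feature}\nAcceptance criteria:\n{criteria}"


def build_plan_prompt(feature: str, acceptance_criteria: list[str]) -> str:
    """First prompt: ask for an implementation plan in the team's style."""
    return (
        f"{STACK_FRAGMENT}\n\n"
        f"{describe_feature(feature, acceptance_criteria)}\n\n"
        "Propose an implementation plan in this architectural style. Do not write code yet."
    )


def build_test_prompt(feature: str, acceptance_criteria: list[str], reviewed_plan: str) -> str:
    """Second prompt: after a developer has checked the plan, ask for tests only."""
    return (
        f"{STACK_FRAGMENT}\n\n"
        f"{describe_feature(feature, acceptance_criteria)}\n\n"
        f"Agreed implementation plan:\n{reviewed_plan}\n\n"
        "Generate tests for each acceptance criterion. Do not implement the feature."
    )
```

Keeping the fragment in one place is what makes it reusable across features, and it is also the text a team would review with its legal advisors before sending anything to an external model.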
-
We've featured large language models (LLMs) like BERT and ERNIE in the Radar before; domain-specific LLMs, however, are an emerging trend. Fine-tuning general-purpose LLMs with domain-specific data can tailor them for various tasks, including information retrieval, customer support augmentation and content creation. This practice has shown promising results in industries like legal and finance, as demonstrated by OpenNyAI for legal document analysis. With more organizations experimenting with LLMs and new models like GPT-4 being released, we can expect more domain-specific use cases in the near future.
However, there are challenges and pitfalls to consider. First, LLMs can be confidently wrong, so it's essential to build mechanisms into your process to ensure the accuracy of results. Second, third-party LLMs may retain and re-share your data, posing a risk to proprietary and confidential information. Organizations should carefully review the terms of use and trustworthiness of providers or consider training and running LLMs on an infrastructure they control. As with any new technology, businesses must tread carefully, understanding the implications and risks associated with LLM adoption.
-
It can be a bit daunting to make a web application compliant with assistive technologies when you yourself never use them, and you feel like you don't yet know anything about directives like the Web Content Accessibility Guidelines (WCAG). Intelligent guided accessibility tests are one category of tools that help test if you've done the right thing without needing to be an expert on accessibility. These tools are browser extensions that scan your website, summarize how assistive technology would interpret it and then ask you a set of questions to confirm whether the structure and elements you created are as intended. We've used axe DevTools, Accessibility Insights for Web or the ARC Toolkit on some of our projects.
-
Team knowledge management is a familiar concept with teams using tools such as wikis to store information and onboard new team members. Some of our teams now prefer to use Logseq as a team knowledge base. An open-source knowledge-management system, Logseq is powered by a graph database, helps users organize thoughts, notes and ideas and can be adapted for team use with Git-based storage. Logseq allows teams to build a democratic and accessible knowledge base, providing each member with a personalized learning journey and facilitating efficient onboarding. However, as with any knowledge management tool, teams will need to apply good curation and management of their knowledge base to avoid information overload or disorganization.
While similar functionality is available in tools like Obsidian, the key difference lies in Logseq's focus on consumption, with paragraph-based linking enabling team members to quickly find the relevant context without having to read an entire article.
-
Prompt engineering refers to the process of designing and refining prompts for generative AI models to obtain high-quality responses from the model. This involves carefully crafting prompts that are specific, clear and relevant to the desired task or application in order to elicit useful outputs from the model. Prompt engineering aims to enhance large language model (LLM) capabilities in tasks like question answering and arithmetic reasoning or in domain-specific contexts. For software creation, you might use prompt engineering to get an LLM to write a story, an API or a test suite based on a brief conversation with a stakeholder or some notes. Developing effective prompting techniques is becoming a valuable skill in working with AI systems. There is debate over whether prompt engineering is an art or science, and potential security risks, such as “prompt injection attacks,” should be considered.
-
When deploying infrastructure as code, we've noticed that a lot of time can be spent diagnosing and repairing production issues that result from systems being unable to communicate with one another. Because the network topology between them can be complex, the entire route may not be traversable even if individual ports and endpoints have been configured correctly. Infrastructure testing practices usually include verifying the right ports are open or closed or that an endpoint can be accessed, but we've only recently begun doing reachability analysis when testing infrastructure. The analysis generally involves more than simple yes/no determinations. For example, a tool might traverse and report on multiple routes through transit gateways. This technique is supported by tools across all the major cloud providers. Azure has a service called Network Watcher that can be scripted in automated tests and GCP supports Connectivity Tests. Now, in AWS, you can test reachability across accounts in the same organization.
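To illustrate why reachability is more than a per-port yes/no check, here is a minimal sketch that enumerates every loop-free route between two nodes in a toy topology. The node names and the adjacency map are hypothetical, and the cloud providers' services work from real routing and security configuration rather than a hand-written graph.

```python
def find_routes(topology: dict[str, list[str]], source: str, destination: str) -> list[list[str]]:
    """Return every loop-free route from source to destination (depth-first search)."""
    routes = []
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == destination:
            routes.append(path)
            continue
        for neighbour in topology.get(node, []):
            if neighbour not in path:  # never revisit a hop, so routes stay loop-free
                stack.append(path + [neighbour])
    return routes


# Hypothetical topology: each key lists the hops it is allowed to forward traffic to.
topology = {
    "app-subnet": ["transit-gw-a", "transit-gw-b"],
    "transit-gw-a": ["db-subnet"],
    "transit-gw-b": ["db-subnet"],
    "cache-subnet": ["db-subnet"],
}

routes_to_db = sorted(find_routes(topology, "app-subnet", "db-subnet"))
routes_to_cache = find_routes(topology, "app-subnet", "cache-subnet")
```

Here `routes_to_db` contains two routes, one through each transit gateway, while `routes_to_cache` is empty: the cache subnet may have the right ports open, but nothing forwards traffic to it.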
-
Large language models (LLMs) generally require significant GPU infrastructure to operate. We're now starting to see ports, like llama.cpp, that make it possible to run LLMs on different hardware — including Raspberry Pis, laptops and commodity servers. As such, self-hosted LLMs are now a reality, with open-source examples including GPT-J, GPT-JT and LLaMA. This approach has several benefits, offering better control in fine-tuning for a specific use case, improved security and privacy as well as offline access. However, you should carefully assess the capability within the organization and the cost of running such LLMs before making the decision to self-host.
-
Tracking technical debt is a perennial topic in software delivery organizations. What is technical debt and what is not? How do you prioritize it? And most importantly, how do you express the value of paying it off to your internal stakeholders? Following the Agile Manifesto’s manner of reasoning — "while there is value in the item on the right, we value the item on the left more" — we like the idea of tracking health over debt. The folks at REA in Australia share a good example of what such health tracking can look like. They track system ratings in the categories of development, operations and architecture.
Focusing on health instead of debt is a more constructive framing. It connects a team to the ultimate value of reducing debt and helps them prioritize it. Every piece of technical debt a team tackles should ideally be traceable to one of the agreed expectations. Teams should treat health ratings the same as other service-level objectives (SLOs) and prioritize improvements whenever a rating drops out of the "green zone" for a given category.
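Treating health ratings like SLOs could look something like the sketch below. The three categories come from the example above; the 1-to-5 scale, the green threshold and the system names are assumptions made purely for illustration.

```python
# Assumed scale: ratings run from 1 (poor) to 5 (healthy); 3 and above counts as green.
GREEN_THRESHOLD = 3


def ratings_outside_green(systems: dict[str, dict[str, int]], threshold: int = GREEN_THRESHOLD):
    """Return (system, category, rating) for every rating below the green threshold."""
    findings = []
    for system, ratings in sorted(systems.items()):
        for category, rating in sorted(ratings.items()):
            if rating < threshold:
                findings.append((system, category, rating))
    return findings


# Hypothetical ratings for two systems.
systems = {
    "search": {"development": 4, "operations": 2, "architecture": 3},
    "billing": {"development": 1, "operations": 5, "architecture": 4},
}
needs_attention = ratings_outside_green(systems)
```

Each finding names the category whose health has dropped, which gives the team a concrete expectation to tie the next piece of debt work to.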
-
If not properly secured, the infrastructure and tools that run our build and delivery pipelines can become a big liability. Pipelines need access to critical data and systems like source code, credentials and secrets to build and deploy software. This makes these systems very inviting to malicious actors. We therefore highly recommend applying zero trust security for CI/CD pipelines and infrastructure — trusting them as little as necessary. This encompasses a number of techniques: If available, authenticate your pipelines with your cloud provider via federated identity mechanisms like OIDC, instead of giving them direct access to secrets. Implement the principle of least privilege by minimizing the access of individual user or runner accounts, rather than employing "god user accounts" with unlimited access. Use your runners in an ephemeral way instead of reusing them, to reduce the risk of exposing secrets from previous jobs or running jobs on compromised runners. Keep the software in your agents and runners up to date. Monitor the integrity, confidentiality and availability of your CI/CD systems the same way you would monitor your production software.
We're seeing teams forget about these types of practices particularly when they’re used to working with a self-managed CI/CD infrastructure in internal network zones. While all of these practices are important in your internal networks, they become even more crucial when using a managed service, as that extends the attack surface and blast radius even more.
Hold
-
As remote work continues to increase, so does the adoption of chat collaboration platforms and ChatOps. These platforms often offer webhooks as a simple way to automate sending messages and notifications, but we're noticing a concerning trend: the casual management of webhooks — where they’re treated as configuration rather than a secret or credential. This can lead to phishing attacks and compromised internal spaces.
Webhooks are credentials that offer privileged access to an internal space and may contain API keys that can be easily extracted and utilized directly. Not treating them as secrets opens up the possibility of successful phishing attacks. Webhooks in Git repos can easily be extracted and used to send fraudulent payloads, which the user may not have any way to authenticate. To mitigate this threat, teams handling webhooks need to shift their culture and treat webhooks as sensitive credentials. Software developers building integrations with ChatOps platforms must also be mindful of this risk and ensure that webhooks are handled with proper security measures.
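In code, treating a webhook as a credential mostly means two things: never committing it, and never echoing it. The sketch below assumes the webhook URL is injected through an environment variable; the variable name `CHAT_WEBHOOK_URL` and both helper functions are hypothetical.

```python
import os
from urllib.parse import urlsplit


def load_webhook_url(env_var: str = "CHAT_WEBHOOK_URL") -> str:
    """Read the webhook from the environment instead of source code or config files."""
    url = os.environ.get(env_var)
    if not url:
        raise RuntimeError(f"{env_var} is not set; refusing to fall back to a hard-coded webhook")
    return url


def redact_webhook(url: str) -> str:
    """Return a form that is safe to log: scheme and host only, no path or token."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/<redacted>"
```

The environment variable would typically be populated from a secrets manager at deploy time, and only the redacted form should ever appear in logs or error messages.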
-
While serverless architectures can be extremely useful for solving some problems, they do come with a certain level of complexity, especially when they involve nontrivial execution and data flows across multiple interdependent Lambdas — this can sometimes result in a Lambda pinball architecture. Our teams have reported that maintaining and testing Lambda pinball architectures can be very challenging: understanding the infrastructure, deployment, diagnosis and debugging can become difficult. At a code level, simple mapping between domain concepts and the multiple Lambdas involved is practically impossible, making any changes and additions challenging. Although we believe serverless is the right fit for some problems and domains, it's not a "silver bullet" for every problem, which is why you should try to avoid Lambda pinball. One pattern that can help is to draw a distinction between public and published interfaces and apply domain boundaries with published interfaces between them.
-
While the practice of creating excess capacity in the delivery process is well-known in the product management community, we still see far too many teams planning for full utilization of team members. Reserving some capacity during sprint planning generally leads to better predictability and better quality; it promotes team resilience to unexpected events like illnesses, production issues, unexpected product requests and tech debt, while also allowing productive activities like team building and ideation that can lead to product innovation. Running at less than full utilization means teams can be more thoughtful about the robustness of the resulting software and pay closer attention to the right observability signals. Our experience is that a fully utilized team leads to a collapse in throughput as well, just as a fully utilized highway creates slow and demoralizing traffic. For example, when one of our teams had unpredictable support issues, they saw a 25% increase in throughput and a 50% decrease in cycle time volatility by planning feature velocity based on only two of the three developer pairs' capacities.
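One common way to reason about the highway analogy is a single-server queueing model (M/M/1), in which the average time a work item spends in the system is the bare working time multiplied by 1 / (1 - utilization). This model is an assumption brought in for illustration, not something our teams measured, and real teams are not single-server queues; the sketch only shows how sharply waiting grows as utilization approaches 100%.

```python
def relative_cycle_time(utilization: float) -> float:
    """Average time in an M/M/1 queue as a multiple of the bare working time."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be at least 0 and strictly below 1")
    return 1.0 / (1.0 - utilization)


# Multipliers for a few planning levels, from half-loaded to almost fully loaded.
multipliers = {level: relative_cycle_time(level) for level in (0.5, 0.67, 0.9, 0.99)}
```

Under this model a team planned at roughly two-thirds of capacity sees items take about three times their working time, while a team planned at 99% sees about a hundred times, which is the kind of nonlinearity that makes fully utilized plans unpredictable.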
Unable to find something you expected to see?
Each edition of the Radar features blips reflecting what we came across during the previous six months. We might have covered what you are looking for on a previous Radar already. We sometimes cull things just because there are too many to talk about. A blip might also be missing because the Radar reflects our experience; it is not based on a comprehensive market analysis.
