Making wholesales changes within an IT organization is risky, expensive, and prone to failure. If your combined systems have evolved into a tangled web then you need to lay the groundwork to evolve to a solution. You have several options for improvement. A one time step change of systems or practices can yield good, if mixed, results. For example, if you invest capital in a program to improve design or development skills, simplify and streamline infrastructure, retire redundant systems, or even replace parts of you enterprise then you should see some returns on this investment. Once the burst of activity is finished organizations generally fall back into a new normal state, which is not significantly different than the old normal.
Continuous delivery (CD) and lean practices are a better route for longer term, sustainable improvement of the enterprise. Most people think of CD practices when they're starting the development a single system but I think the real value comes from untangling your enterprise. Systems that are difficult to upgrade or change need CD practices. Systems that are difficult to integrate need lean thinking. Moving an enterprise to leaner practices means getting increased value out of the entire integrated systems portfolio. Isolated changes don't contribute significantly to larger value chain. This applies not only to newer greenfield systems where you have a high degree of control over the delivery practices, but especially to larger and older systems that are developed in-house or by a vendor.
#1 Review your Systems/Change Management
Start by looking in detail at the deployment and change practices around your individual key systems. Most likely there is a high degree of manual processes and controls for making changes. Apply value stream mapping or even standard flow charting techniques to the most basic building block of systems management - the ability to successfully deploy a system into a new environment. Although many systems are upgraded in place this rarely touches all parts of the system footprint which includes deployment of all artifacts to operate the system, applying an environment specific configuration, setting up database schemas and base data, configuring endpoints for integrations, and ensuring that there is monitoring and logging. Deploying a system fresh is the only sure way to have a known state and to understand the characteristics of changing any part of the whole system. Think of it as "green fielding" an existing system.
#2 Automate as much as you can
Once you think that you understand the process then actually measure deployment through to a fully functional state within the new environment. Gather time and people measures on all the steps that you identified and see if anything is missing. If the systems have little or not automation for deployments then keep it low tech at this point using timers or stopwatches for each step. Identify the roles and experienced people needed for each step in the deployment process. For example, do you need a DBA with deep proprietary knowledge or can any support engineer setup the database? Were there wait states between steps when nothing was happening? Make sure those are part of the measures. Was it difficult to verify that system was configured properly? Is everything actually operating correctly such as integration points and logging? Was it difficult to setup a base set of data so the system could run correctly? Did you need to copy or replicate any component, configuration, or data from an existing environment to be successful?
Automated processes for the development, testing, deployment, and operations of a system are preferable to manual steps or processes. No question about it. Your desired vision and end state should always be automated and easily evidenced pipelines of delivery and management for your systems.
While you're moving towards that ideal end state always consider manual improvements along the way. The aim here is to develop an ability to continually improve your systems management so you can methodically untangle and improve your enterprise. If you can easily improve manual processes then you should do it immediately - especially if it reduces wait states in your process.
So what should you automate? - Anything that you do repeatedly in each environment from development to production. Deployments are a good place for initial focus. If this is currently difficult with a particular system then break down the problem into smaller parts. Are deployable artifacts managed and versioned? Is it difficult to deploy the application without manually changing the configuration each time? Can we automate system verification with a health check page or report that includes configuration, versions of code and schemas, and current state of any end points?
#3 Start small
You may need to repeat this process more than once until you're actually successful. Ask qualitative questions to the staff who are actually doing the actual work. If other people replace certain individuals would you have a much poorer outcome? Are vendors necessary for any or all of the process? Is there heavy resistance to the idea of testing your deployment process? These represent constraints in your future ability to make improvements to individual systems within your organization.
Next, take a look at the overall process of making a small change to that system, and then moving that change to production. Again use value stream mapping to get an overall understanding of the process. You should now have a unit of measure - the ability to deploy the system to a new environment - use this to identify likely problems in the overall process. Again, measure the steps and wait states in the process. Do you upgrade a system rather than do a clean deploy? Does this take more or less time? Does an upgrade and a clean deploy yield the same results? Does it take the same time to verify the successful deployment at each stage? Does the process follow a path and stages that is similar to other systems in your enterprise? Is it difficult to verify the versions of code, configurations, and schemas before and after deployment? Is the deployment process consistent in each environment including production? Are code management practices consistent and easy to verify?
#4 Scale the success.
Once you've developed a tangible understanding of the processes for managing the individual system in your enterprise then you can start identifying areas for improvement. Brainstorm ideas with members from delivery, operations, and support teams. If you're not sure who should be involved then think about which teams you would gather in a room to diagnose a serious issue with the system. They all have a stake. Question each part of the process and ask if it is adding value and can be improved. Generate ideas, size them, estimate impact to the overall process, and prioritize them. A very helpful technique in analyzing problems or bottlenecks is root cause analysis using the '5 Whys' which is a technique for finding the real cause of the problem and dealing with it rather than simply identifying the symptoms. This will generate multiple ideas in a structured and lightweight manner that will be useful in brainstorming improvements.
#5 Setup a system of Continuous Improvement
At this point you can start to apply a continuous improvement process to each system using lean startup thinking. At it's simplest think of it as a three step cycle. First, based on your ideas and priorities, introduce an improvement into the deployment and delivery process through to production. With the modeling described above you should be able to quantify the change you're going make both in terms of the current cost and improvement you hope to see. Time is the most common measure, either in people's time or the elapsed time to complete a step. Second, measure the change both in the individual step and the overall process. It is important to measure if the change had side benefits or consequences to the overall process. Third, as a team review the impact of the change to process and learn from the experience. Even if the change didn't have the impact that you expected you've still learned valuable information about the process and system it supports. Again, review your priority list of improvements and make a change to the process of delivering the system into production. Then start the improvement cycle again. The "Build - Measure - Learn" cycle that Eric Ries describes in his Lean Startup book is an extremely powerful and focused approach for improving the management of large corporate systems.
Once you develop a predictable and continually improving cadence for deploying your critical systems then you have the basic groundwork for improving your overall enterprise footprint. You can start to plan small, lower risk changes to your architecture and integrations since the cost of moving to production is a known constraint that you can easily incorporate into an improvement plan. In essence, it's a powerful risk management tool. Deployments become a known, and lowering variable - not an impediment.
After the deployment process is stable and improving then you can start to look at other techniques for reducing your cycle times:
- Functional, stable end points in the system for smoke and functional testing
- Improved system verification points for testing and monitoring including data, integration points, and logging
- Versioned and flexible configurations which enable different test strategies and missions
- Versioned and automated deployments for database schemas including rollback
- Standard health check pages for all systems in your footprint with status, versions, and configurations
- Automated build and deploy tooling and pipelines which manage and report the overall process
An additional benefit is that you can create greenfield environments for experimentation and investigation. For example, you might want to improve the flexibility of the system configuration for a particular system but you're concerned about "polluting" your existing test environments. Create a clean install on a server with limited capability where your teams can experiment with ideas and techniques, then throw it away once they are confident in their plan. In the early days of continuous improvement you won't have the luxury of automated build, test, and deploy pipelines so you need a safe haven while you're building the capabilities.
Finally, questions always arise about scaling these types of continuous improvement efforts to a large enterprise with dozens, or even hundreds of systems. Many of the activities I describe sound labor intensive. Much of the work can be done with simple tools like whiteboards and spreadsheets. However, the skills and expertise that your teams develop at evaluating and improving the systems that they support and integrate is the most important outcome. Once a sustainable and measurable practice of improvement is in place for key systems then you have the opportunity to increase organizational alignment through the sharing of knowledge and skills across functional groups and in different system domains. Scaling comes through broader knowledge and an environment where skills and abilities cross-functional boundaries. The true measure of progress will be the accelerated learning across your organization.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.