Constraint-driven automation - A Case Study

My last gig as a tech lead was on Bums on the Saddle, an ecommerce startup where we had to get working software with minimal functionality into production within a week. We then had to follow it up with frequent releases and finish the minimum viable product (MVP) in 8 weeks. After the MVP, we had to incorporate feedback and the store’s backend needs. We signed up for all of this and more in a Lean Startup-style gig.

In this project, we questioned everything. By everything I mean everything: TDD, CI, pairing, keeping to the left side of the road… everything! We wanted to follow Morpheus and see which rules could be bent and which could be broken.

We chose Spree as our ecommerce platform. Since it’s a Rails-based platform with a very healthy extension ecosystem, we were in good shape to go live soon.

Automation "First Blood" - Deployments

Our first real problem was deployments. We were deploying to prod anywhere from 4 to 10 times a day. Doing it manually was proving very costly: it required constant developer attention, and the cost of making a mistake was very high. It was proving to be a real constraint. We especially felt the pinch when the number of developers went up from 1 to 3, which meant significantly more deployments!

It was finally time for our first automation: deploying with Capistrano. We got the basic cap script working in a few hours and enhanced it throughout the lifecycle of the project, adding the ability to deploy a given branch, roll back, rebuild the staging database, etc. Wait, what did you say about staging? Oh yes, let’s talk about staging…
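For readers who have not used a tool like Capistrano, here is a rough sketch of the idea behind "deploy a given branch" and "rollback". It is not the project's cap script and not Capistrano's API. It is a plain-Ruby stand-in that mimics the usual layout: one directory per release plus a "current" symlink pointing at the live one. The class name, release labels and branch names are all made up for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for a deploy tool, NOT Capistrano's API.
# It mimics the layout such tools use: releases/<stamp>/ plus a
# "current" symlink pointing at the live release.
class ReleaseDir
  def initialize(root)
    @releases_path = File.join(root, "releases")
    @current_path  = File.join(root, "current")
    FileUtils.mkdir_p(@releases_path)
  end

  # Create a release for the given branch and make it live.
  # A real deploy would check out the branch here; we only record its name.
  def deploy(branch, stamp)
    path = File.join(@releases_path, stamp)
    FileUtils.mkdir_p(path)
    File.write(File.join(path, "BRANCH"), branch)
    point_current_at(path)
  end

  # Make the release before the live one live again.
  def rollback
    all   = Dir.children(@releases_path).sort.map { |d| File.join(@releases_path, d) }
    index = all.index(File.readlink(@current_path))
    raise "no earlier release to roll back to" if index.nil? || index.zero?
    point_current_at(all[index - 1])
  end

  # Name of the branch recorded in the live release.
  def live_branch
    File.read(File.join(@current_path, "BRANCH"))
  end

  private

  # Swap the symlink (not atomic; good enough for a sketch).
  def point_current_at(path)
    File.unlink(@current_path) if File.symlink?(@current_path)
    File.symlink(path, @current_path)
  end
end

Dir.mktmpdir do |root|
  app = ReleaseDir.new(root)
  app.deploy("master", "001")
  app.deploy("payment-gateway", "002")  # deploy a given branch
  app.rollback                          # back to release 001
  puts app.live_branch                  # prints "master"
end
```

Because every release keeps its own directory, rolling back is only a symlink swap, which is what makes it cheap enough to do several times a day.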

Introducing a staging server

Our sign-off mechanism in the early days was “Product owner takes a look at the dev box and then signs off on prod” (OK, don’t cringe. We are pushing the limits, remember?). This worked fine for us, a team of 4 people in engineering and 1 person from the customer’s team. At that point we were still building up the store’s catalogue, and the store was live with an NEFT (bank transfer) payment mechanism.

However, the moment we started working on a payment gateway integration, we were stuck. We could no longer test on production. That meant we had to get a different box. The standard solution is to get a staging server, and we did just that. We also made sure staging was a mirror of prod in terms of OS and applications. We versioned our configuration… wait… that required new automation to solve a different constraint.

Versioned Configuration

The configuration was no longer easy to maintain since there were 2 of everything. We were making silly mistakes – the sitemap was being served in staging but not in prod, etc. Since configuration now lived in 2 places, we versioned it. The Apache, load balancer, database and Rails server configurations were all checked in. Our cap script made sure that the configurations were all reset and checked out at each deployment.
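The "reset at each deployment" step can be sketched roughly as below. This is an assumption-laden illustration, not the project's script: the real thing relied on a version control checkout, while this sketch swaps in a plain file comparison so it can run on its own. The helper names `config_drift` and `reset_config`, the directory names and the file contents are all hypothetical.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical helpers; directory and file names below are made up.

# Relative paths of versioned files that are missing or different on the box.
def config_drift(versioned_dir, deployed_dir)
  Dir.glob("**/*", base: versioned_dir).sort.select do |rel|
    source = File.join(versioned_dir, rel)
    next false unless File.file?(source)  # skip directories
    target = File.join(deployed_dir, rel)
    !File.file?(target) ||
      Digest::SHA256.file(source).hexdigest != Digest::SHA256.file(target).hexdigest
  end
end

# Overwrite drifted files with the versioned copy; returns what was reset.
def reset_config(versioned_dir, deployed_dir)
  config_drift(versioned_dir, deployed_dir).each do |rel|
    target = File.join(deployed_dir, rel)
    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.cp(File.join(versioned_dir, rel), target)
  end
end

Dir.mktmpdir do |root|
  repo = File.join(root, "repo")  # stands in for the checked-in configuration
  box  = File.join(root, "box")   # stands in for the server's live configuration
  FileUtils.mkdir_p([File.join(repo, "apache"), File.join(box, "apache")])
  # Placeholder contents, not real Apache directives.
  File.write(File.join(repo, "apache", "site.conf"), "# sitemap enabled\n")
  File.write(File.join(box, "apache", "site.conf"), "# sitemap disabled\n")

  p reset_config(repo, box)  # prints ["apache/site.conf"]
  p config_drift(repo, box)  # prints []
end
```

Running the same check against both staging and prod is one way to catch the "served in staging but not in prod" kind of mistake before a customer does.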

Integration tests crept up

Things were going smoothly until we realized we had to change the basic checkout flow that Spree offers. This meant that we had to change some model objects, and in turn break some assumptions made by Spree.

Enter integration tests. We wrote 4 basic tests which checked that the Spree functionality – data and logic – still worked with the changes we added through our inflections.

Unit tests for those pesky computations

Finally came the custom tax and shipping calculators. Adding a new case is so easy when you already have 10 test cases running: if the change you make for the 11th case introduces a bug, an existing test breaks! We made sure we had that safety net in our case.
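To make that concrete, here is a minimal sketch of a table of cases guarding a calculator. The `FlatRateTaxCalculator` class, the 5% rate and the amounts are all hypothetical; this is neither the store's tax rule nor Spree's calculator API, just a plain Ruby object tested with Minitest. Amounts are integer paise so that rounding stays exact.

```ruby
require "minitest/autorun"

# Hypothetical calculator, not Spree's calculator API: a plain object
# standing in for a custom tax rule. Amounts are integer paise.
class FlatRateTaxCalculator
  def initialize(rate_percent)
    @rate_percent = rate_percent
  end

  # Tax in paise, rounded half up.
  def compute(amount_paise)
    raise ArgumentError, "amount must be non-negative" if amount_paise.negative?
    (amount_paise * @rate_percent / 100r).round  # 100r keeps the math exact
  end
end

class FlatRateTaxCalculatorTest < Minitest::Test
  # amount in paise => expected tax in paise at 5%.
  # A new requirement becomes one more row; the old rows keep guarding
  # the behaviour that already works.
  CASES = {
    0       => 0,
    9       => 0,     # 0.45 rounds down
    10      => 1,     # 0.5 rounds up
    999     => 50,    # 49.95 rounds up
    100_000 => 5_000
  }.freeze

  def test_known_amounts
    calculator = FlatRateTaxCalculator.new(5)
    CASES.each do |amount, expected|
      assert_equal expected, calculator.compute(amount), "tax on #{amount} paise"
    end
  end

  def test_rejects_negative_amounts
    assert_raises(ArgumentError) { FlatRateTaxCalculator.new(5).compute(-1) }
  end
end
```

Saved as a single file and run with `ruby`, the tests execute when the script exits. If someone later "simplifies" the rounding, the 10 paise and 999 paise rows are the ones likely to fail first.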

Evolution of automation in this project

By the end of the 12-week gig, we had 4 different kinds of automation: unit tests, integration tests, infrastructure & configuration, and finally deployment. What is interesting to me is how that automation evolved. The key is that constraints drove our automation. Whatever stood in the way of fast feedback or cycle time got automated right away. Whatever was a nice-to-have was left as-is.

How different is the automation evolution on your project? Or do you follow the set path of processes like XP?