Alumni blogs

Lots of our people have lots of opinions. Here are just a few of them.

ThoughtWorks embraces the individuality of the people in the organization and hence the opinions expressed in the blogs may contradict each other and also may not represent the opinions of ThoughtWorks.

Is good design to be equated with functional?

Musings on my past, good design, functionality, ergonomics, customer experience, taps, light switches and a juicer.

Is good design to be equated with functional?

That was the question. For the next 40 minutes I scribbled the answer to the ‘A’ Level History of Art question. Twenty-five years later, two things strike me. Firstly, that my answer must have satisfied the examiner, because I got a good grade. Secondly, that after all these years, that question still sticks in my mind. (My answer, not so much.) It sticks in my mind because I’ve spent most of my working life addressing…

Blog post by Marc McNeill
21 October 2014

Original Link

Neo4j: Modelling sub types

A question which sometimes comes up when discussing graph data modelling is how you go about modelling sub/super types.

In my experience there are two reasons why we might want to do this:

  • To ensure that certain properties exist on bits of data
  • To write drill down queries based on those types

At the moment the former isn’t built into Neo4j, and you’d only be able to achieve it by wiring up some code in a pre-commit hook of a transaction event handler, so we’ll focus on the latter.

The typical example used for showing how to design…

Blog post by Mark Needham
20 October 2014

Original Link

Elasticsearch, Ruby and Unicorn

We have been using Ruby on Rails as well as Elasticsearch for a while. To avoid downtime during deployment, we have been using Unicorn more or less configured like this blog post describes. While running a single instance of Elasticsearch was pretty trivial with Karel’s new Elasticsearch Ruby Gem – moving to a Clustered setup forced us to understand the configuration of the Gem a bit better. I thought I’d sum up a few lessons learned here just in case it might be useful to someone:

There are quite a few options you can pass into the Elasticsearch client that allow…

Blog post by Erling Wegger Linde
20 October 2014

Original Link

Python: Converting a date string to timestamp

I’ve been playing around with Python over the last few days while cleaning up a data set and one thing I wanted to do was translate date strings into a timestamp.

I started with a date in this format:

date_text = "13SEP2014"

So the first step is to translate that into a Python date – the strftime section of the documentation is useful for figuring out which format code is needed:

import datetime

# Parse day, abbreviated month name and four-digit year, e.g. "13SEP2014"
date_text = "13SEP2014"
date = datetime.datetime.strptime(date_text, "%d%b%Y")

$ python
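
The excerpt stops before the timestamp step, so here is a minimal sketch of one way to finish the conversion. It assumes the date string means midnight UTC, which the excerpt does not state, and `to_timestamp` is a hypothetical helper name rather than anything from the original post.

```python
import calendar
import datetime


def to_timestamp(date_text, fmt="%d%b%Y"):
    """Parse a date string and return seconds since the Unix epoch.

    Assumption: the parsed date is interpreted as midnight UTC.
    """
    parsed = datetime.datetime.strptime(date_text, fmt)
    # timegm treats the time tuple as UTC, unlike time.mktime (local time).
    return calendar.timegm(parsed.timetuple())


print(to_timestamp("13SEP2014"))  # 1410566400
```

If the dates are in local time instead, `time.mktime` would be the closer fit, at the cost of results that depend on the machine's time zone.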

Blog post by Mark Needham
20 October 2014

Original Link

Neo4j: LOAD CSV – The sneaky null character

I spent some time earlier in the week trying to import a CSV file extracted from Hadoop into Neo4j using Cypher’s LOAD CSV command and initially struggled due to some rogue characters.

The CSV file looked like this:

$ cat foo.csv

I wrote the following LOAD CSV query to extract some of the fields and compare others:

load csv with headers from "file:/Users/markneedham/Downloads/foo.csv" AS line
RETURN,, = "2"
==> +--------------------------------------+
==> | | | = "2" |
==> +--------------------------------------+
==> |    | "2"     | false          |
==> +--------------------------------------+
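
The post title points at null characters as the culprit. One possible way to clean such a file before running LOAD CSV is sketched below. It assumes the rogue characters are literal NUL bytes (`\x00`); `strip_nulls` and any output file name are hypothetical, not taken from the original post.

```python
def strip_nulls(src_path, dst_path):
    """Copy src_path to dst_path with NUL bytes removed.

    Returns the number of NUL bytes that were dropped.
    """
    removed = 0
    # Work in binary mode so the bytes are handled before any decoding.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            removed += line.count(b"\x00")
            dst.write(line.replace(b"\x00", b""))
    return removed
```

Calling `strip_nulls("foo.csv", "foo_clean.csv")` and pointing LOAD CSV at the cleaned copy would be one way to use it. A non-zero return value confirms that the file did contain NUL bytes.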

Blog post by Mark Needham
18 October 2014

Original Link

R: Linear models with the lm function, NA values and Collinearity

In my continued playing around with R I’ve sometimes noticed ‘NA’ values in the linear regression models I created but hadn’t really thought about what that meant.

On the advice of Peter Huber I recently started working my way through Coursera’s Regression Models course, which has a whole slide explaining its meaning:

[Image: slide from Coursera’s Regression Models course]

So in this case ‘z’ doesn’t help us in predicting Fertility, since it doesn’t give us any information that we can’t already get from ‘Agriculture’ and ‘Education’.
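
The redundancy described above can be illustrated with a small sketch. It assumes ‘z’ is the exact sum of ‘Agriculture’ and ‘Education’ (the slide is not reproduced here) and uses made-up numbers, not the data set from the post. Adding ‘z’ as a column leaves the rank of the design matrix unchanged, which is the situation in which R’s lm reports NA for a coefficient.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry here.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # column depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Made-up values for illustration only.
agriculture = [17, 45, 39, 36, 43]
education = [12, 9, 5, 7, 15]
z = [a + e for a, e in zip(agriculture, education)]  # assumed: z is their sum

# Each row is one observation: intercept, Agriculture, Education (and z).
without_z = [[1, a, e] for a, e in zip(agriculture, education)]
with_z = [[1, a, e, zi] for a, e, zi in zip(agriculture, education, z)]

print(matrix_rank(without_z))  # 3
print(matrix_rank(with_z))     # 3
```

Four columns with a rank of three means one coefficient cannot be estimated independently of the others.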

Although in this case we know why ‘z’ doesn’t have a coefficient, sometimes it may not be clear which other variable…

Blog post by Mark Needham
18 October 2014

Original Link

Introducing Bifrost: Archive Kafka data to Amazon S3

We're happy to announce the public release of a tool we've been using in production for a while now: Bifrost

We use Bifrost to incrementally archive all our Kafka data into Amazon S3; these transaction logs can then be ingested into our streaming data pipeline (we only need to use the archived files occasionally when we radically change our computation).


There are a few other projects that scratch the same itch: notably Secor from Pinterest and Kafka's old hadoop-consumer. Although Secor doesn't rely on running Hadoop jobs, it still uses Hadoop's SequenceFile file format; sequence files allow…

Blog post by Paul Ingles
17 October 2014

Original Link

Monitoring Go programs with Riemann

We use Riemann to monitor the operations of the various applications and services that compose our system. Riemann helps us aggregate systems and application metrics together and quickly assemble dashboards to understand how things are behaving now.

Most of these services and applications are written in Clojure, Java or Ruby, but recently we needed to monitor a service we implemented with Go.

Hopefully this is interesting for Go authors that haven’t come across Riemann or Clojure programmers who’ve not considered deploying Go before. I think Go is a very interesting option for building systems-type services.

Whence Go?

Recently at…

Blog post by Paul Ingles
17 October 2014

Original Link

XP Wabi Sabi (Refactored)

All requested features delivered. Speculation avoided. Mindful of our tendency toward completeness, necessary code is added, unnecessary code is removed. Refactored.

Implementations incomplete - shadows of their real-world counterparts, yet precisely the functions and properties required. The desire to add more, tempered by the satisfaction of not doing so.

Technique and knowledge are increased to decrease their application.


I posted that on the WabiSabi page of the c2 wiki on or about October 28th, 2002.

I typically feel the same about my old writing and my old code ("what was I thinking?"), but I like this (even if…

Blog post by William Caputo
17 October 2014

Original Link

XP Customer and Developer Bill of Rights

To help customers and the development team work together, XP came up with a customer and developer bill of rights.

Customer Bill of Rights

  • You have the right to an overall plan, to know what can be accomplished when and at what cost.
  • You have the right to get the most possible value out of every programming week.
  • You have the right to see progress in a running system, proven to work by passing repeatable tests that you specify.
  • You have the right to change your mind, to substitute functionality, and to change priorities without paying exorbitant costs.
  • You have…

Blog post by Jonathan Rasmusson
16 October 2014

Original Link