To use points or not to use them is a frequent topic of discussion and conjecture, and the humble point is often lost or misunderstood in all the talk. So I thought I'd dedicate this post to a Point 101: a refresher on what story points are and how they help in estimation.
A story point is a subjective unit of estimation used by Agile teams to estimate user stories.
They represent the amount of effort required to implement a user story. Some agilists argue that it is a measure of complexity, but that is only true if the complexity or risk involved in implementing a user story translates into the effort involved in implementing it.
A point covers the amount of effort required to get the story done. This should ideally include both the development and testing effort needed to implement the story in a production-like environment.
Story point estimation is done using relative sizing, by comparing one story with a sample set of previously sized stories. Relative sizing across stories tends to be much more accurate over a larger sample than trying to estimate the effort involved in each individual story.
As an analogy, it is much easier to say that Delhi to Bangalore is twice the distance of Mumbai to Bangalore than saying that the distance from Delhi to Bangalore is 2061 kms.
Teams are able to estimate much more quickly without spending too much time in nailing down the exact number of hours or days required to finish a user story.
The most common way is to categorize stories into buckets of 1, 2, 4, 8, 16 points and so on. Some teams prefer to use the Fibonacci series (1, 2, 3, 5, 8). Once the stories are ready, the team starts by sizing a card it considers to be of "smaller" complexity.
For example, a team might assign the "Login user" story 2 points and then put 4 points on a "customer search" story, since it probably involves double the effort of the "Login user" story. This exercise continues until every story has a story point attached to it.
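As a minimal sketch (not from the article), the relative-sizing step can be thought of as judging a new story as some multiple of a baseline story's effort, then snapping the result to the nearest bucket on the 1, 2, 4, 8, 16 scale described above:

```python
# Buckets from the common scale mentioned above.
BUCKETS = [1, 2, 4, 8, 16]

def size_story(baseline_points, effort_ratio):
    """Points for a story judged to take `effort_ratio` times the
    effort of an already-sized baseline story, snapped to the
    nearest bucket."""
    raw = baseline_points * effort_ratio
    return min(BUCKETS, key=lambda b: abs(b - raw))

# "Login user" is the 2-point baseline; "customer search" feels like
# roughly double the effort, landing in the 4-point bucket.
print(size_story(2, 2))  # -> 4
```

In practice the team makes this judgment by discussion, not by formula; the snippet only illustrates the "compare against a baseline, then bucket" mechanic.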
The team that is responsible for getting a story done should ideally be part of the estimation. The team's QAs should be part of the estimation exercise, and should call out when a story involves additional testing effort.
For example, supporting a customer search screen on 2 new browsers might be a 1-point development effort but a lot more from a testing perspective. QAs should call this out and size the story to reflect the adequate testing effort.
This can be done by providing 3 different point values for the best, likely and worst case scenarios. It is quite effective when estimating a large sample set of stories, especially during the first release of a project, when little code has been written. Doing this provides a range across which estimates may vary depending on the outcomes of certain assumptions. For example, a best-case estimate for the "Login user" story could be 2 points assuming integration with a local LDAP server, but if that assumption changes to a 3rd-party provider integration, the worst case could be 8 points.
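Rolling up three-point estimates across a backlog gives a release-level range. A small sketch, where the story names and numbers are illustrative assumptions:

```python
# (best, likely, worst) point estimates per story; values are made up.
estimates = {
    "Login user":      (2, 3, 8),  # 8 if LDAP becomes a 3rd-party integration
    "Customer search": (3, 4, 8),
}

best   = sum(e[0] for e in estimates.values())
likely = sum(e[1] for e in estimates.values())
worst  = sum(e[2] for e in estimates.values())
print(f"Backlog range: {best} to {worst} points, likely {likely}")
```

The spread between `best` and `worst` is itself useful information: a wide range signals that the team's assumptions need validating before the schedule is committed.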
To plan a release, the team needs to calculate its velocity: the number of points the team can deliver in an iteration. This is typically done using yesterday's weather, averaging the velocity achieved by the team over the last 3 iterations.
If the team is starting afresh, a raw velocity exercise is done instead, where the team decides how many stories it can finish in an iteration. This is done by repeatedly picking different sample sets of (previously-sized) stories that the team believes can be done within an iteration. Averaging the total points across the different picks gives the team's iteration velocity.
For example, if the result of 3 picks was 6, 8 and 10 points for a 2 week iteration then (10+8+6)/3 = 8 points is the raw velocity for the team for 2 weeks. A schedule can then be laid out assuming the team finishes 8 points in a 2 week iteration.
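The raw-velocity arithmetic from the example above is just an average over the trial picks:

```python
# Total points of each trial pick of stories the team believes
# fits in one 2-week iteration (numbers from the example above).
picks = [6, 8, 10]

raw_velocity = sum(picks) / len(picks)
print(raw_velocity)  # -> 8.0 points per 2-week iteration
```

Once a few real iterations have completed, the team should switch from this raw figure to the actual yesterday's-weather average.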
Different teams will have different measures of story points based on the set of stories they are sizing. Unless they are building the same system, the effort required to finish a 1-point story by team A will differ from that required by team B in their system. This difference will reflect in the velocities of teams A and B.
If there is a large program of work split amongst multiple teams, it is tempting to standardize the point scale across those teams. This defeats the purpose of story points, which are by definition a unit of measure subjective to each team.
Spike stories are played to better understand how to implement a particular feature, or as a proof of concept. Since in a spike very little is known about the amount of effort involved, it is typically time boxed with an outcome that the team can agree upon. This can be approximately converted into points by looking at the velocity trend.
For example, if a week-long spike needs to be planned and the team's velocity is 16 points per 2-week iteration, then we can assign 8 points to the spike story.
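This conversion can be sketched as a simple pro-rating of the velocity over the spike's time box (assuming, as in the examples above, that velocity is expressed per 2-week iteration):

```python
def spike_points(timebox_weeks, velocity_per_iteration, iteration_weeks=2):
    """Pro-rate the team's velocity over the spike's time box to get
    an approximate point value for a time-boxed spike story."""
    return velocity_per_iteration * timebox_weeks / iteration_weeks

# A week-long spike on a team running at 16 points per 2-week iteration:
print(spike_points(1, 16))  # -> 8.0
```

This is only an approximation for planning purposes; the spike itself remains time-boxed rather than estimated.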
Cost per point will typically be (Cost of an iteration) / (Velocity per iteration (in points)). In cases where there is an additional stabilization sprint or regression iteration, the cost of that iteration should also be included.
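The cost-per-point formula, including the optional stabilization or regression iteration, looks like this (the cost figures below are made-up assumptions for illustration):

```python
def cost_per_point(iteration_cost, velocity, stabilization_cost=0.0):
    """Spread the cost of an iteration (plus any stabilization or
    regression iteration cost) over the points delivered."""
    return (iteration_cost + stabilization_cost) / velocity

print(cost_per_point(40_000, 8))          # -> 5000.0 per point
print(cost_per_point(40_000, 8, 8_000))   # -> 6000.0 with stabilization
```

Note how folding in the stabilization cost raises the effective cost per point, which is why it should not be left out of the calculation.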
The effort and time required to arrive at an accurate number of days or hours for a story outweighs the benefit of having such an estimate.
Moreover, estimating in days or hours puts undue pressure on the team to deliver within that number of days, leaving the team unable to work at a sustainable pace and possibly burning out.
Story points are an internal measure of the effort involved in implementing a user story. They do not, in any way, reflect the amount of business value a user story provides. There might be cases where a 1-point story provides far more business value than a 4-point story in the same system. Business value is best left for the product owner and business stakeholders to determine.
It is a popular belief that if the team were to estimate in ideal days, it would be much easier to track whether the estimation is accurate, by checking the actual days elapsed on a story against the estimate. This is, however, counter-productive: the team spends hours estimating a few stories to arrive at a magic number of days, and is then pressured to deliver on that magic number.
When a team is relatively sizing stories in points, a trend slowly emerges where similarly sized stories take similar amounts of time to implement. If there is a bad estimate, it bubbles up automatically as an exception.
If a story A was classified in the 2 points bucket, a similar story B coming in months later should be classified in the same bucket. If the team has learnt more about implementing them between when story A and story B were played, this will show up as an increase in velocity of the team.
It is good to set up a "relative sizing triangulation board" for the team, with placeholder stories from the initial estimation session, for the team to refer to while sizing a new story.
Hope this set of questions clarifies how story points can help you estimate on your Agile project. What are your thoughts on story points?
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.