In Defense of Story Points
Story points often get a bad rap, but with the right approach, they’re a powerful tool for managing expectations. Here’s how our development team at Sanctuary Computer uses them effectively.
The mention of story points is bound to elicit a few sighs or eye rolls from developers who have been frustrated by their inaccuracy before. When I joined Sanctuary, I was skeptical. I’d been burned by underestimates and had to work long hours to make up the difference and deliver features on time anyway. Two years and several projects later, I won’t pretend that they’re perfect, but I do think they get a bad reputation due to a few pitfalls that can be easily avoided. So far, they’re the best tool we have for setting healthy expectations with our team.
Why estimate at all?
In a client services setting, our team needs to set expectations with clients about how much something will cost. That number usually gets crystallized in a proposal and then we do everything we can to stick to it and do right by our partners. There are other types of engagements, but a lot of the time it boils down to this.
What are story points and how do they work?
At Sanctuary Computer, we use story points to estimate complexity. Here are a few examples, pulled straight from our internal documentation:
1 Point
A trivial change that has zero chance of breaking anything. A copy change, adding comments to a file, changing minor CSS. Maximum investment: 1/2 day.
2 Points
A small-to-medium change that is mostly known, but requires a lil' investigation (maybe you don’t know which file to make the change in). Maximum investment: 1 day.
3 Points
A change that is straightforward, but likely touches 5+ files, and may need to interface lightly with parts of the codebase you aren’t familiar with. Maximum investment: 2 days.
5 Points
The “largest” estimate that doesn’t eventually require further breaking down. The work has limited scope but requires decision making, planning, building, testing and more. Maximum investment: 3.5 days.
And so on, up to 21. (If you’re curious about why we’re using Fibonacci numbers, keep reading!)
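As a rough sketch, the scale above can be written down as a lookup table. The Python below is illustrative only: the names `MAX_INVESTMENT_DAYS` and `max_investment_days` are made up for this example, and it covers only the four point values spelled out above, since the time boxes for 8, 13, and 21 are not listed here.

```python
from typing import Optional

# Maximum investment, in working days, for each point value described above.
# Values for 8, 13, and 21 are intentionally left out: they are not listed here.
MAX_INVESTMENT_DAYS = {1: 0.5, 2: 1.0, 3: 2.0, 5: 3.5}


def max_investment_days(points: int) -> Optional[float]:
    """Return the time box in days, or None if this sketch does not cover it."""
    return MAX_INVESTMENT_DAYS.get(points)
```

Calling `max_investment_days(3)` returns `2.0`, while `max_investment_days(8)` returns `None` because the sketch has no value for it.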
Story points map to time (and therefore budget) via a team’s velocity, i.e. the average number of story points completed by the team per week. Divide the total number of points in an estimate by the velocity to get the number of weeks of development needed, convert those weeks into billable hours, multiply by our hourly rate, and boom, you’ve got a budget!
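To make the arithmetic concrete, here is a minimal sketch of how points might turn into a budget. It assumes that weeks are converted to billable hours through an `hours_per_week` figure. That parameter, the function names, and every number in the example are hypothetical and not our actual rates or velocity.

```python
def weeks_needed(total_points: float, velocity: float) -> float:
    """Weeks of development = total estimated points / points completed per week."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return total_points / velocity


def budget(total_points: float, velocity: float,
           hours_per_week: float, hourly_rate: float) -> float:
    """Convert weeks to billable hours, then multiply by the hourly rate."""
    return weeks_needed(total_points, velocity) * hours_per_week * hourly_rate


# Hypothetical project: 120 points, 20 points/week, 30 billable hours/week, rate of 100.
print(budget(total_points=120, velocity=20, hours_per_week=30, hourly_rate=100))  # prints 18000.0
```

In this made-up case, 120 points at 20 points per week works out to 6 weeks, or 180 billable hours.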
Development teams don’t estimate the number of hours directly, because emotional attachment to timelines and ideas about how long something “should” take can cloud a developer’s ability to look at a task objectively. (Story points are not unique to us; they have been used in Agile software development for a while.)
At their worst, story points do the exact opposite of what they are supposed to do: they create a competitive and stressful environment. Sanctuary has not been immune to the pitfalls of using story points, and we still aren’t. However, there are a few guidelines we try to abide by that have been helpful:
Guideline #1: Don’t use story points to measure individual performance.
Teams that distill an individual’s performance down to the number of story points they knock out per week will be low-functioning teams. Doing so fosters a culture where engineers work in silos instead of helping each other out and race to grab the pieces of work they’re most comfortable with instead of expanding their skill sets. It also adds to the volume of bugs in a codebase, because engineers have more incentive to mark a task as “done” than to make sure it is done right or to review their peers’ code closely.
Furthermore, story points only tell the truth in the aggregate. Humans are notoriously bad at estimating things accurately: sometimes a task we thought would take two hours takes two days, and other times the opposite happens. A large volume of estimates is needed for these errors to balance out, meaning a team’s velocity over an extended period will be more reliable than an individual’s performance.
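A small sketch can show why the aggregate is kinder than the individual numbers. The helper names below are invented for this illustration. The two tasks mirror the example above, with "two days" assumed to mean 16 working hours.

```python
from typing import List, Tuple

Task = Tuple[float, float]  # (estimated hours, actual hours)


def relative_error(estimated: float, actual: float) -> float:
    """How far off an estimate was, as a fraction of the estimate."""
    return abs(actual - estimated) / estimated


def mean_task_error(tasks: List[Task]) -> float:
    """Average of the per-task relative errors."""
    return sum(relative_error(e, a) for e, a in tasks) / len(tasks)


def aggregate_error(tasks: List[Task]) -> float:
    """Relative error of the summed estimate against the summed actuals."""
    total_estimated = sum(e for e, _ in tasks)
    total_actual = sum(a for _, a in tasks)
    return relative_error(total_estimated, total_actual)


# One task estimated at 2 hours took 16; another estimated at 16 hours took 2.
tasks = [(2, 16), (16, 2)]
print(mean_task_error(tasks))  # prints 3.9375
print(aggregate_error(tasks))  # prints 0.0
```

Each estimate here is badly wrong on its own, yet the totals match exactly. Real data will not cancel this neatly, but the more estimates there are, the more room the misses have to offset one another.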
At Sanctuary, we aim to monitor the overall team’s velocity week over week instead of any one individual’s.
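One way to keep the focus on the team is to leave the assignee out of the calculation altogether. The sketch below is an assumption about how that could look, not a description of our tooling. `team_velocity` and the `(week, points)` input shape are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def team_velocity(completed: Iterable[Tuple[int, int]]) -> float:
    """Average points completed per week across the whole team.

    `completed` holds (week number, points) pairs. Who did the work is
    deliberately not part of the input. Weeks with no completed work must be
    passed explicitly as (week, 0), or they will not count toward the average.
    """
    points_per_week = defaultdict(int)
    for week, points in completed:
        points_per_week[week] += points
    if not points_per_week:
        raise ValueError("need at least one completed item")
    return sum(points_per_week.values()) / len(points_per_week)


# Made-up data: week 1 totals 8 points, week 2 totals 12 points.
print(team_velocity([(1, 3), (1, 5), (2, 8), (2, 2), (2, 2)]))  # prints 10.0
```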
Guideline #2: Involve the developers who will actually do the work.
Whenever possible, the developers who will do the work should be the developers who estimate the work. At the end of the day, story points translate to deadlines, and deadlines are much more effective when they are self-imposed. There is nothing worse than being handed a deadline that you did not agree to and do not think is possible.
Each developer comes to the table with their own areas of expertise and will help the estimates account for things that are often forgotten, such as testing, accessibility, extra communication with stakeholders needed for complex features, etc.
It is important to remember that getting buy-in from the team goes beyond asking them for a quick review. It means making sure that they have the time, confidence, and context to review the estimates objectively and disagree with you if needed. Check your ego (did I estimate this low just to prove I could get it done quickly?) and check the power dynamics (did I create a safe environment for more junior engineers on my team to ask questions and speak up if they thought an estimate was too low?).
Guideline #3: Make many small guesses instead of one big guess.
We estimate work using Fibonacci numbers: 1, 2, 3, 5, 8, 13, 21. As the numbers increase, the distance between them also increases. We use Fibonacci numbers because it’s pretty easy to tell the difference between a 1-point and a 2-point item, but much harder to tell the difference between a 5-point and a 6-point item, so if it feels like a 6, we round up to an 8. As the complexity and scope of work increase, the margin of error on the estimate also needs to increase.
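The "round up to the next number" rule is simple enough to sketch in a few lines. `SCALE` and `round_up_to_scale` are names invented for this example. Raising an error above 21 is an assumption about what should happen, not a rule stated here.

```python
from bisect import bisect_left

# The estimation scale: gaps between values widen as the numbers grow.
SCALE = [1, 2, 3, 5, 8, 13, 21]


def round_up_to_scale(raw_estimate: float) -> int:
    """Return the smallest value on the scale that is >= the raw estimate."""
    if raw_estimate <= 0:
        raise ValueError("estimate must be positive")
    index = bisect_left(SCALE, raw_estimate)
    if index == len(SCALE):
        raise ValueError("larger than 21: break the work down instead")
    return SCALE[index]


print(round_up_to_scale(6))  # prints 8
print(round_up_to_scale(5))  # prints 5
```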
Along those lines, the sum of a bunch of small guesses will be more accurate than one big guess. Even when we need to estimate something before having the exact requirements or designs, we try to break the scope of work down into small pieces and estimate those. For example, a single guess about how long it will take to build out all of the “Account” functionality is likely to be off. But a guess that is composed of many small guesses (how long it takes to build sign in, sign out, edit profile details, and other things that are typically found in Account functionality) will be more accurate, even if the final feature set for Account ends up being different.
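Here is a minimal sketch of that breakdown, using the Account example. The point values are invented for illustration and the function names are hypothetical. The threshold of 5 comes from the scale described earlier, where 5 is the largest estimate that doesn’t need further breaking down.

```python
from typing import Dict, List

LARGEST_UNSPLIT_ESTIMATE = 5  # anything above this should be broken down further


def total_points(breakdown: Dict[str, int]) -> int:
    """Sum the small guesses into one estimate for the whole feature."""
    return sum(breakdown.values())


def needs_further_breakdown(breakdown: Dict[str, int]) -> List[str]:
    """List the pieces that are still too big to estimate with confidence."""
    return [name for name, points in breakdown.items()
            if points > LARGEST_UNSPLIT_ESTIMATE]


# Hypothetical point values for the pieces of "Account" functionality.
account = {"sign in": 3, "sign out": 1, "edit profile details": 5}
print(total_points(account))             # prints 9
print(needs_further_breakdown(account))  # prints []
```

If one of the pieces came back as an 8 or a 13, it would show up in the second list as a prompt to split it before trusting the total.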
Ultimately, the culture around story points in an organization is as crucial as the estimation process itself. While far from perfect, story points remain the best system we’ve found for setting realistic expectations about project costs and timelines.