Introduction
Four years ago, I started my management career by managing a team of five software engineers. In addition to my management responsibilities, I was in the trenches with the engineers doing development, design reviews, code reviews and deployments. At that stage of my growth, I knew every detail of the project. Eventually, I grew to manage multiple teams: I managed my team leads, and they managed the engineers. I could no longer follow every detail myself, so I had to depend on data from tools and from my leads. It was a transition that I had anticipated. I went through multiple iterations with the help of my teams, colleagues and mentors to define the right set of data to measure. High-performance teams need very little intervention. As a manager, you have to intervene when your teams need your support, and to know when to intervene, you need the right data. I’ve compiled a list of the data and metrics that I use to best support my teams.
I’ve divided the data into two categories, tactical and strategic. For each, I’ve listed the metrics and the actions you can take to support your team. JIRA, CodeClimate and Pluralsight Flow are the tools that I use to collect my data.
Tactical Data
Tactical data helps you identify items that need your immediate attention. You probably should chat with the individuals involved as soon as possible to resolve the matter.
Blocked tickets
Tickets that are blocked due to internal or external dependencies. You probably want to give your team a chance to resolve the blockers within a timeboxed period. Teams often resolve internal dependencies faster than external ones. Not all blockers are the same, so you should help your team identify whether a dependency is internal or external and coach them to resolve it themselves. As a leader with a broader network in the organization, you may have more context on the matter and can help your team by providing background and connecting them with the specific individuals who can unblock them. Train your team to resolve similar dependencies themselves in the future.
Stuck tickets
Tickets that are taking longer to complete than initially estimated. Someone is working hard to complete the work but is delayed, most likely because of unidentified dependencies and risks. You can derisk by:
- Adding extra resources
- Reducing scope without compromising the business
- Removing blockers
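The check for stuck tickets can be sketched as a comparison between elapsed time and the original estimate. This is a minimal sketch, assuming the tickets have already been exported from your tracker into plain dictionaries. The field names `key`, `started` and `estimate_days` are hypothetical, and the calculation uses calendar days rather than working days.

```python
from datetime import date


def find_stuck_tickets(tickets, today):
    """Return (key, days_over) for in-progress tickets past their estimate.

    Each ticket is a dict with hypothetical fields: "key", "started"
    (a date) and "estimate_days" (the original estimate in calendar days).
    """
    stuck = []
    for ticket in tickets:
        elapsed_days = (today - ticket["started"]).days
        days_over = elapsed_days - ticket["estimate_days"]
        if days_over > 0:
            stuck.append((ticket["key"], days_over))
    # Largest overrun first, so the most delayed work is reviewed first.
    return sorted(stuck, key=lambda item: item[1], reverse=True)


# Illustrative data only; these tickets are made up.
tickets = [
    {"key": "APP-1", "started": date(2024, 3, 1), "estimate_days": 3},
    {"key": "APP-2", "started": date(2024, 3, 4), "estimate_days": 10},
    {"key": "APP-3", "started": date(2024, 3, 5), "estimate_days": 2},
]
stuck = find_stuck_tickets(tickets, today=date(2024, 3, 11))
```

The result is a starting point for a conversation, not a verdict: a ticket can be over its estimate for good reasons.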
Too much work in progress (WIP)
The number of tickets in progress exceeds the number of developers on your team. Developers may be working on several items in parallel because of randomization. Parallelizing work is expensive because of the context switching. It results in slow progress, neglected work and blind spots. Make sure each developer works on one item at a time to maximize performance.
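The WIP rule above (no more tickets in progress than developers) is easy to check automatically. The following is a rough sketch, assuming tickets exported as dictionaries with hypothetical `status` and `assignee` fields and a status named "In Progress"; adjust the names to match your own board.

```python
from collections import Counter


def wip_report(tickets, developers, in_progress_status="In Progress"):
    """Summarize work in progress against the size of the team."""
    in_progress = [t for t in tickets if t["status"] == in_progress_status]
    per_developer = Counter(t["assignee"] for t in in_progress)
    return {
        "wip_count": len(in_progress),
        "developer_count": len(developers),
        # True when the team has more tickets in progress than people.
        "over_limit": len(in_progress) > len(developers),
        # Developers holding more than one in-progress ticket.
        "overloaded": {d: n for d, n in per_developer.items() if n > 1},
    }


# Illustrative data only.
developers = ["ana", "ben"]
tickets = [
    {"key": "APP-1", "status": "In Progress", "assignee": "ana"},
    {"key": "APP-2", "status": "In Progress", "assignee": "ana"},
    {"key": "APP-3", "status": "In Progress", "assignee": "ben"},
    {"key": "APP-4", "status": "Done", "assignee": "ben"},
]
report = wip_report(tickets, developers)
```

The `overloaded` entry points to the people to talk to first, since they are the ones paying the context-switching cost.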
Too much churn
Tickets that move back and forth between testing, code review and development multiple times. You should find the root cause by checking for the following:
- Technical debt
- Missing requirements
- Communication issues
- Lack of context
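One way to quantify churn is to count how many times a ticket moves backward in the workflow. This is a sketch under assumptions: the status names and their order in `WORKFLOW` are hypothetical, and the status history is assumed to be available as an ordered list per ticket.

```python
# Hypothetical workflow order; replace with the statuses on your board.
WORKFLOW = ["Development", "Code Review", "Testing", "Done"]


def count_backward_moves(history, workflow=WORKFLOW):
    """Count transitions that move a ticket to an earlier workflow stage."""
    rank = {status: position for position, status in enumerate(workflow)}
    return sum(
        1
        for previous, current in zip(history, history[1:])
        if rank[current] < rank[previous]
    )


# Illustrative history: sent back once from review and once from testing.
history = [
    "Development", "Code Review", "Development", "Code Review",
    "Testing", "Development", "Code Review", "Testing", "Done",
]
backward_moves = count_backward_moves(history)
```

A ticket with several backward moves is a good candidate for a root cause discussion along the lines of the list above.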
High priority production defects
Production defects are defects that have a financial impact or impede the business. These tickets are significant enough that you must give the team all the support you can. Keep track of these defects with timely updates from the team. Cancel your meetings if you need to support your team. Book a meeting room or start a video conference so the team can swarm on the problem and solve it.
Strategic Data
Strategic data helps you identify items that have a long-term impact. It helps you spot patterns in your team and the organization that may be affecting the business in the long run. You don’t have to accept your current state as the norm; challenge the status quo and keep pushing for efficiency with the least amount of churn. Perfection is an ongoing journey, not a state.
System health and cost
System health is a measure of our customer experience and our cost consciousness. Short response times and low error rates reflect the quality of our end users' experience and our lean spending. You can measure response times and errors and calculate the savings. It is best if system health and cost are visible to everyone in the organization as a scorecard. Visible data encourages collaboration and prompts engineers to come up with their own improvement plans. Execute those plans in phases, without any impact on the business. The following metrics are a good starting point:
- Response time
- CPU usage
- Memory usage
- Number of server instances
- Errors/Crashes
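Two of the metrics above, response time and errors, can be summarized from raw request data for a scorecard. The sketch below is illustrative only: it assumes request records are available as `(duration_ms, succeeded)` tuples, which is a hypothetical shape, and it uses the nearest-rank method to pick the percentile.

```python
def health_summary(requests, percentile=95):
    """Summarize response time and error rate.

    requests: list of (duration_ms, succeeded) tuples (hypothetical shape).
    percentile: whole-number percentile, computed with the nearest-rank method.
    """
    if not requests:
        raise ValueError("no requests to summarize")
    durations = sorted(duration for duration, _ in requests)
    # Nearest rank: the smallest rank covering `percentile` percent of samples.
    rank = max(1, (percentile * len(durations) + 99) // 100)
    errors = sum(1 for _, succeeded in requests if not succeeded)
    return {
        "percentile_ms": durations[rank - 1],
        "error_rate": errors / len(requests),
    }


# Illustrative data: 20 requests from 10 ms to 200 ms, two of which failed.
requests = [(ms, ms not in (50, 120)) for ms in range(10, 210, 10)]
summary = health_summary(requests)
```

In practice your monitoring tool probably reports these numbers already; the point is that the scorecard needs only a handful of values that everyone can read.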
Developer productivity
Tools like Pluralsight Flow provide metrics and dashboards that let you measure the productivity of your team against industry standards. If you do not have access to such tools, you can still measure productivity with Git commands or GitHub, using automated or manual data collection. I use the following data to identify where my team needs help:
- Code review duration
- Churn on a code review
- Code review contribution/collaboration
- Days per week that developers commit code
You should coach your team on the value of collaboration and of reducing the time it takes to get features out the door. You may have to work with the team or with individuals to steer them toward higher productivity.
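If you collect the data with Git commands, the number of days per week on which each developer commits code can be derived from `git log`. This is a minimal sketch, not the way Pluralsight Flow computes its metric. It assumes a local clone and uses the `%ae` (author email) and `%ad` (author date) format placeholders with `--date=short`; the function names are my own.

```python
import subprocess
from collections import defaultdict
from datetime import date


def read_git_log(repo_path, since="4 weeks ago"):
    """Return one 'author-email|YYYY-MM-DD' line per commit."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--pretty=format:%ae|%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def commit_days_per_week(log_text):
    """Map (author, ISO year, ISO week) to the number of distinct commit days."""
    days = defaultdict(set)
    for line in log_text.splitlines():
        if "|" not in line:
            continue  # Skip blank or malformed lines.
        author, day_text = line.rsplit("|", 1)
        day = date.fromisoformat(day_text.strip())
        iso = day.isocalendar()
        days[(author, iso[0], iso[1])].add(day)
    return {key: len(value) for key, value in days.items()}


# Illustrative log text; with a real repository use read_git_log("path/to/repo").
sample_log = "\n".join([
    "ana@example.com|2024-01-01",
    "ana@example.com|2024-01-01",
    "ana@example.com|2024-01-03",
    "ben@example.com|2024-01-02",
    "ana@example.com|2024-01-08",
])
commit_days = commit_days_per_week(sample_log)
```

I would treat this number as a signal for where to offer help, never as a ranking of individuals.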
Architecture health
I measure architecture health by how fast we can get a feature out and how long we can keep adding new code without coming to a halt. In a well-architected application, a developer can add a new feature and take risks without introducing defects, in the shortest amount of time possible. The following metrics help you measure architecture health:
- Lead time to ship changes
- Churn on specific files or modules
- Code coverage
- Maintainability (duplication, function/file length, coding standards)
- Reproducibility of production defects in local environments
You must refactor, add tests and reduce complexity in your code so that developers can work without impediments.
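Churn on specific files can be approximated from Git history by counting how many commits touched each file. The sketch below is one possible approach, assuming a local clone; it relies on `git log --name-only` with an empty `--pretty=format:` so that only file paths are printed. The function names and the three-month window are my own choices.

```python
import subprocess
from collections import Counter


def read_changed_files(repo_path, since="3 months ago"):
    """Return the paths touched by each commit, one path per line."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def most_changed_files(log_text, top=10):
    """Return the `top` most frequently changed files with their commit counts."""
    counts = Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )
    return counts.most_common(top)


# Illustrative output of three commits; with a real repository use
# read_changed_files("path/to/repo").
sample_log = (
    "src/billing.py\nsrc/api.py\n\n"
    "src/billing.py\nsrc/api.py\n\n"
    "src/billing.py\nREADME.md\n"
)
hotspots = most_changed_files(sample_log, top=2)
```

Files that top this list and also have low code coverage are usually where refactoring pays off first.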
Conclusion
Keep in mind that people act with the best intentions, but the results may vary. Using data, you can keep your team aligned to yield the best results. Be a great manager by stepping in when you have to and letting your engineers do their job without interruptions. If the data does not reflect reality, revisit and change how and what you measure. As an experienced leader, you must also decide when to stop improving, because some improvements yield diminishing returns.