Community aging chart

To understand a free/open source software project, you need to know how the community that contributes to it is evolving. For a young project, it is important to attract new developers, while a mature project also needs to retain experienced ones.

Among the many aspects to explore, two metrics are especially important:

Turnover

Shows how people are entering and leaving the community. Indirectly, it gives you an indication of how attractive the community is and how well it retains people once they join.

Age

This term refers to “length of time in the project” and measures how long ago each current member joined it. This tells you how many people are available at different stages of experience, from old-timers to newbies.

Together, these metrics can be used to estimate engagement, to predict the future structure and size of the community, and to detect early any potential problems that could prevent healthy growth.

The community aging chart

Both turnover and age structure can be estimated from data in software development repositories. The main source of this information is the source code management repository (such as Git), which provides information about the active developers authoring the software. The issue tracking system is another useful source of information.
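As a rough illustration of how this data can be pulled from Git, the sketch below collects the first and last authoring date of each author from `git log` output. Identifying people by author email is an assumption made here for simplicity (real tools usually merge several identities of the same person), and the function names are hypothetical, not part of Cauldron.

```python
import subprocess
from datetime import datetime


def read_git_log(repo_path):
    """Return one 'email|ISO-date' line per commit (requires git installed)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ae|%aI"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def activity_spans(log_text):
    """Map each author email to (first_commit, last_commit) datetimes.

    Assumption: one email address corresponds to one person.
    """
    spans = {}
    for line in log_text.splitlines():
        if "|" not in line:
            continue  # skip blank or malformed lines
        # Split on the last '|' so an unusual email cannot break parsing.
        email, _, stamp = line.rpartition("|")
        when = datetime.fromisoformat(stamp)
        first, last = spans.get(email, (when, when))
        spans[email] = (min(first, when), max(last, when))
    return spans
```

Calling `activity_spans(read_git_log("path/to/repo"))` would give, for each author, the date they joined and the date they were last seen, which is the raw material for both turnover and age.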

The community aging chart can be used to visualize turnover and age structure data obtained from these repositories. It represents the “age” of developers in the project, in a way that provides insight into its structure. The following figure shows the community aging chart for contributors in Git repositories of the CHAOSS project in September 2020.

The Y axis shows different “generations” of project members. The chart is divided into periods of six months, with the oldest generation at the top and the youngest at the bottom. For each generation, the green bar (Attracted) represents the number of people who joined the project during that period, while the blue bar (Retained) represents how many people in that generation are still active in the community.

The ratio of the blue bar to the green bar for each generation is its retention ratio. By comparing the lengths of each pair of bars, we can quickly learn which generations were most successfully retained, and which ones mostly abandoned the project. A ratio of 50% means that half the people in the generation are still retained. For the newest generation, retention will normally be 100%, since people who entered the community recently are still considered to be active (although that depends on the inactivity period, as I’ll explain in a moment).
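The retention ratio is simple arithmetic, as the following sketch shows. The generation labels and counts are made up for illustration, and `retention_ratio` is a hypothetical helper, not a Cauldron function.

```python
def retention_ratio(attracted, retained):
    """Fraction of a generation still active: blue bar / green bar."""
    if attracted == 0:
        return None  # nobody joined in this period, so the ratio is undefined
    if not 0 <= retained <= attracted:
        raise ValueError("retained must be between 0 and attracted")
    return retained / attracted


# Hypothetical generations, oldest first: (label, attracted, retained)
generations = [
    ("2019-H1", 20, 5),
    ("2019-H2", 12, 6),
    ("2020-H1", 8, 8),
]
for label, attracted, retained in generations:
    print(f"{label}: {retention_ratio(attracted, retained):.0%}")
```

With these numbers the three generations come out at 25%, 50% and 100%, so the oldest generation has mostly left while the newest is fully retained.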

The evolution of green bars tells us about the evolution of attraction over time. Most successful projects start with low attraction, but at some point they become very attractive, and the bars grow quickly. When a project enters maturity, its attraction usually becomes more stable, and can even decline, just because it is no longer “sexy enough” for potential newbies. A large project with declining attraction can remain extremely successful, though.

The evolution of blue bars tells us about the current age structure of the community. If bars at the top are large, but those at the bottom are small, the community is retaining early generations very well, but having difficulties retaining new blood. Conversely, if bars at the top are small while those at the bottom are large, newcomers are staying, while experienced people have already left. Blue bars can only be as large as green bars (you cannot retain more people from a certain generation than you originally attracted). Therefore, “large” and “small” for blue bars are always relative to green bars.

The community aging chart is built taking into account three parameters:

Generation period

The length of each generation. People in the community are grouped by the period in which they joined, and the chart uses this granularity (six months in the figure above).

Inactivity period

How long we wait before considering that somebody has left the community. We cannot know whether anyone has really left: maybe they are on vacation, or on medical leave. So we have to choose a certain time period, and decide that “if somebody was not active during the last M months, we consider that person to have left the community”. That M is the inactivity period, which is usually equal to the generation period, but could be different. Cauldron uses 3 months since the person’s last contribution.

Snapshot date

The date at which we determine who is retained. Although the above figure is generated with the current date as the snapshot date, it’s valuable to generate similar charts to show who was retained at various past dates. Comparisons of charts for different snapshot dates say a lot about the evolution of the project’s ability to attract and retain members. In Cauldron, you can change the dates with the filter at the top right.
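To make the role of the three parameters concrete, here is a minimal sketch of how the data behind the chart could be computed from a list of (person, activity date) pairs. It is not Cauldron's implementation: the function names, the alignment of generations to calendar periods, and the sample events are all assumptions made for illustration. The defaults follow the values mentioned in this article (six-month generations, three months of inactivity).

```python
import calendar
from collections import Counter
from datetime import date


def months_before(day, months):
    """Return the date `months` months before `day`, clamped to month end."""
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(day.day, last_day))


def aging_chart(events, snapshot, generation_months=6, inactivity_months=3):
    """Return [(generation_start, attracted, retained)], oldest first.

    events: iterable of (person, activity_date) pairs.
    Periods in which nobody joined are omitted from the result.
    """
    first, last = {}, {}
    for person, day in events:
        if day > snapshot:
            continue  # ignore activity after the snapshot date
        first[person] = min(first.get(person, day), day)
        last[person] = max(last.get(person, day), day)

    cutoff = months_before(snapshot, inactivity_months)
    attracted, retained = Counter(), Counter()
    for person, joined in first.items():
        # Assumption: generations are counted in whole calendar months.
        index = (joined.year * 12 + joined.month - 1) // generation_months
        attracted[index] += 1
        if last[person] >= cutoff:  # active within the inactivity period
            retained[index] += 1

    rows = []
    for index in sorted(attracted):
        year, month0 = divmod(index * generation_months, 12)
        rows.append((date(year, month0 + 1, 1), attracted[index], retained[index]))
    return rows


# Made-up activity data for three hypothetical contributors.
events = [
    ("ana", date(2019, 2, 10)), ("ana", date(2020, 8, 20)),
    ("bo", date(2019, 3, 5)),
    ("cy", date(2020, 7, 1)),
]
for start, attracted, retained in aging_chart(events, date(2020, 9, 30)):
    print(start, attracted, retained)
```

With these made-up events, the generation starting in January 2019 attracted two people and retained one, while the generation starting in July 2020 attracted and retained one. Calling the same function with an earlier snapshot date, such as `date(2019, 6, 30)`, gives the chart as it would have looked at that time.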

Comparing the community aging chart from the past with the current chart shows the difference in the potential of the project to grow over time. In most development communities, people inactive for a long period are very unlikely to show up again. That means that the sum of the retention bars in the chart snapshotted one year ago is the maximum population that the community is going to have one year later, save for the generations entering during the intervening year.
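The bound described above can be written down as a small calculation. This sketch assumes, as the paragraph does, that people who have become inactive do not come back; the function name and the counts are hypothetical.

```python
def max_future_population(retained_then, attracted_since):
    """Upper bound on the active population at a later date.

    retained_then: retained counts per generation in the older snapshot.
    attracted_since: attracted counts for generations that joined afterwards.
    Assumption: nobody who was already inactive returns to the project.
    """
    return sum(retained_then) + sum(attracted_since)


# Hypothetical chart from one year ago, plus the two generations since then.
print(max_future_population([5, 6, 8], [4, 3]))
```

Here the community can have at most 26 active members today: the 19 people retained a year ago, plus the 7 people who joined during the intervening year.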

This article is largely based on https://www.oreilly.com/content/measure-your-open-source-communitys-age-to-keep-it-healthy/ and has been adapted for Cauldron. It was originally written by Jesus M. Gonzalez-Barahona.