(This post was crossposted with minor modifications on Medium.)
Many open source project maintainers suffer from a serious overdose of GitHub notifications. Many have turned them off completely as a result.
We (GitMate.io) are constantly researching how people handle a flood of incoming issues, with the aim of improving the situation by applying modern technologies to the problem. (Oh, and we love free software!)
By analyzing the biggest open source repositories on GitHub (more info on the data below) we’ve seen that a contributor to any of those projects responds to only 2.3% of all issues on average. (Let a contributor be a person who commented on at least two issues which they didn’t open.)
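To make the definition above concrete, here is a minimal sketch of how such a response rate could be computed. It is not the notebook we used. The issue structure (the `author` and `commenters` keys), the sample names, and the choice to ignore comments on one's own issues are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def response_rates(issues, min_issues=2):
    """Map each contributor to the share of all issues they responded to.

    `issues` is a list of dicts with the hypothetical keys "author" and
    "commenters". Only people who commented on at least `min_issues`
    issues they did not open count as contributors.
    """
    responded = defaultdict(int)
    for issue in issues:
        # Count each person once per issue and skip comments on own issues.
        for user in set(issue["commenters"]) - {issue["author"]}:
            responded[user] += 1

    total = len(issues)
    return {
        user: count / total
        for user, count in responded.items()
        if count >= min_issues
    }


# Made-up sample data, only to show the shape of the input.
issues = [
    {"author": "alice", "commenters": ["bob", "carol"]},
    {"author": "bob", "commenters": ["bob", "carol"]},
    {"author": "carol", "commenters": ["bob"]},
    {"author": "dave", "commenters": ["alice"]},
]

rates = response_rates(issues)
average_rate = mean(rates.values())
```

In this toy data only `bob` and `carol` qualify as contributors. `alice` commented on just one issue she did not open, so she is left out of the average.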
This makes it clear that, for any larger open source project, “Watching” the repository results in a lot of spam for most people. If they don’t respond, notifying them added no value to the discussion after all.
We can also observe that only very few project managers deal with any significant portion of the issues. Only 6 of the human contributors in our data respond to more than every fifth issue. Here are our heroes:
25.05%: golang/go -> ianlancetaylor, Watching
47.48%: moby/moby -> thaJeztah, Watching
27.31%: moby/moby -> cpuguy83, Watching
36.67%: owncloud/core -> PVince81, Not watching
47.12%: saltstack/salt -> gtmanfred, Watching
25.54%: saltstack/salt -> Ch3LL, Watching
However, we do see that 29.1% (117) of all contributors (402) are still subscribed to all notifications of the repository (watching it).
Switching to Polling
Many contributors switch to polling instead of watching the main repository.
However, we still see that the main maintainers keep watching the repository: without them, it’s very easy to miss out on new issues and it’s hard to make sure that the right people take a look at the right issues in a decentralized system.
In many communities we see homegrown bots emerging that apply labels and sometimes assign people based on keywords. This works especially well for automatically created issues (e.g. from Sentry) but is not a full solution.
We’ve tried it. Contributors started mentioning keywords deliberately, and it didn’t really work for user-reported issues.
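For readers who haven't seen one of these bots, a keyword-based labeler can be sketched roughly as follows. The rules and label names are hypothetical and not taken from any particular project. The sketch only decides which labels to apply and does not talk to the GitHub API.

```python
import re

# Hypothetical keyword-to-label rules; a real bot would load these from config.
RULES = {
    "bug": ["traceback", "crash", "exception"],
    "docs": ["documentation", "typo", "readme"],
}


def labels_for(title, body, rules=RULES):
    """Return the sorted labels whose keywords appear in the issue text."""
    text = f"{title}\n{body}".lower()
    # Split into whole words so "crash" does not match inside "crashpad".
    words = set(re.findall(r"[a-z0-9]+", text))
    return sorted(
        label for label, keywords in rules.items() if words & set(keywords)
    )
```

The weakness shows up right away: a user who writes "the app dies on startup" never hits the `bug` rule, while a contributor who knows the rules can trigger any label on purpose.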
We wouldn’t be GitMate if we didn’t strive for more. Our data suggests that people are spending way too much time on their notifications. We’ve maintained coala.io in the past and we know that reading through all of them is impossible even for core maintainers. Static keyword-based automation doesn’t seem to be enough.
For quite a while we’ve been hacking on an artificial intelligence that helps you deal with this problem by analyzing exactly what every person on your team is discussing on GitHub or GitLab and mentioning the people who are important for solving any new issue.
GitMate is built as a fully automated triaging solution. Right now it already mentions related developers in new issues, finds duplicates, labels issues and closes old issues. It is already used by companies like ownCloud and Kiwi.com, and we’re looking for more beta testers.
About the Data…
We’ve scraped data from a lot of GitHub repositories. We only wanted to look at the biggest ones (measured by scraped file size, i.e. roughly the amount of text communicated across all issues). We’ve excluded ‘ZeroK-RTS/CrashReports’ because no humans seem to be operating that repository. The results refer to statistics drawn from those repositories.
We have filtered out any account with bot in the username as well as the ownclouders account which is using GitMate.
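As a rough illustration, the filter described above could look like the sketch below. The function names and the case-insensitive matching are assumptions; only the "bot" substring rule and the ownclouders exclusion come from our description.

```python
# Accounts excluded by name, in addition to the "bot" substring rule.
EXCLUDED_ACCOUNTS = {"ownclouders"}


def is_human(username, excluded=EXCLUDED_ACCOUNTS):
    """Guess whether an account is human, based only on its username."""
    name = username.lower()
    return "bot" not in name and name not in excluded


def filter_humans(usernames):
    """Keep only the usernames that pass the is_human heuristic."""
    return [user for user in usernames if is_human(user)]
```

A substring check like this is crude: it would also drop a human whose username happens to contain "bot", so treat it as a heuristic.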
If you’re interested in more information, we can share our Jupyter Notebook and the data with you. Just send us an email at firstname.lastname@example.org.