Worldwide Gender Differences in Public Code Contributions and how they have been affected by the COVID-19 pandemic

2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS) (2022)

Abstract
Gender imbalance is a well-known phenomenon observed throughout the sciences, and it is particularly severe in software development and Free/Open Source Software communities. Little is known yet about the geography of this phenomenon, in particular at large scales in both time and space. We help fill this gap with a longitudinal study of the population of contributors to publicly available software source code. We analyze the development history of 160 million software projects, for a total of 2.2 billion commits contributed by 43 million distinct authors over a period of 50 years. We classify author names by gender using name frequencies, and author geographical locations using heuristics based on email addresses and time zones. We study the evolution over time of contributions to public code by gender and by world region. For the world overall, we confirm previous findings about the low but steadily increasing ratio of contributions by female authors. When breaking down by world region, we find that the long-term growth of female participation is a worldwide phenomenon. We also observe a decrease in the ratio of female participation during the COVID-19 pandemic, suggesting that women's ability to contribute to public code has been more hindered than that of men.

Software developers around the world work together to produce publicly available software (or public code). They do so using public identities and disclosing information about their work, including their names and when each software change was made. We use this information to characterize the gender gap in public code, that is, the difference in participation in public software development between men and women. Specifically, we study the development history of 160 million pieces of public software, developed over a period of 50 years by 43 million authors. We characterize the gender gap on this corpus over time and by world region.
To determine author genders we rely on public data about name frequencies by gender around the world. To determine author locations we use email addresses, name frequencies around the world, and the time zone associated with each software change. We confirm that the gender gap in public code is huge. Female authors are only 8.1% of the total and have authored only 13.5% of software versions. The gender gap is however shrinking, with women's participation having increased steadily over the past 12 years. This improvement is a global phenomenon, observable in most world regions. We also observe a decrease in the ratio of female participation during the COVID-19 pandemic, suggesting that women have been more hindered than men in their ability to contribute to public code.
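As a rough illustration of name-frequency-based gender classification, the sketch below assigns a gender to an author only when one gender clearly dominates the counts for the first name, then computes the share of female authors among those classified. It is not the authors' implementation: the table NAME_COUNTS, the 0.9 threshold, and the function names are all hypothetical, and the abstract does not say how ambiguous names are handled.

```python
from collections import Counter

# Hypothetical toy table: first name -> (female count, male count).
# A real study would load public name-frequency data instead.
NAME_COUNTS = {
    "maria": (980, 20),
    "john": (5, 995),
    "andrea": (500, 500),
}


def classify_gender(full_name, counts=NAME_COUNTS, threshold=0.9):
    """Return 'female', 'male', or 'unknown' for an author name.

    The 0.9 threshold is an assumption made for this sketch.
    """
    parts = full_name.strip().lower().split()
    if not parts:
        return "unknown"
    female, male = counts.get(parts[0], (0, 0))
    total = female + male
    if total == 0:
        return "unknown"
    if female / total >= threshold:
        return "female"
    if male / total >= threshold:
        return "male"
    # Names used by both genders stay unclassified.
    return "unknown"


def female_ratio(author_names):
    """Share of female authors among those with a detected gender."""
    tally = Counter(classify_gender(name) for name in author_names)
    known = tally["female"] + tally["male"]
    return tally["female"] / known if known else None
```

Leaving ambiguous or unseen names as "unknown" and excluding them from the ratio is one possible design choice; it trades coverage for fewer misclassifications.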
Keywords
gender, diversity, open source, commit, software heritage, covid19