
Sustainability 2022, 14, 4705

org/other/mediawiki_history [accessed 19 February 2022].) The code we deployed in Python 3 is available (Community Vital Signs GitHub repository, https://github.com/WikiCommunityHealth/community-vital-signs [accessed 19 February 2022]), as are the resulting databases (https://vitalsigns.wmcloud.org/datasets/ [accessed 19 February 2022]). The interactions with community members were all held at Wikimedia international conferences, via video call, during 2021.

4. Results

In this section, we dedicate a separate subsection to each of the three objectives of the study. In Section 4.1, we explore the state of growth and decline of Wikipedia language editions. In Section 4.2, we present the results for the indicators that measure community renewal and answer each of the research questions. In Section 4.3, we discuss the feedback received from Wikimedians on the indicators.

4.1. Community Growth, Stagnation, and Decline

To pursue objective [O1], assessing growth/stagnation/decline patterns in Wikipedia communities, we inspected the evolution of the number of active editors over time, compared the trends obtained for different language editions, and performed clustering to identify general trends. We computed the monthly number of active editors for each of the 308 Wikipedia language editions and focused on the communities with at least 100 active editors in August 2021, of which there are 52.
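The counting and filtering step described above can be sketched as follows. This is a minimal illustration, not the code from the project repository: the input format (tuples of wiki, user, and date), the function names, and the threshold of five edits per month used to call an editor "active" are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Assumption for this sketch: an editor is "active" in a month if they
# made at least MIN_EDITS edits in that calendar month.
MIN_EDITS = 5
MIN_ACTIVE = 100            # minimum community size used in the text
REFERENCE_MONTH = "2021-08"  # August 2021, as in the text


def monthly_active_editors(edits, min_edits=MIN_EDITS):
    """Count active editors per wiki and month.

    edits: iterable of (wiki, user, 'YYYY-MM-DD') tuples (hypothetical format).
    Returns {wiki: {'YYYY-MM': number_of_active_editors}}.
    """
    edits_per_user = Counter()
    for wiki, user, day in edits:
        edits_per_user[(wiki, day[:7], user)] += 1

    active = defaultdict(Counter)
    for (wiki, month, _user), n_edits in edits_per_user.items():
        if n_edits >= min_edits:
            active[wiki][month] += 1
    return {wiki: dict(months) for wiki, months in active.items()}


def select_communities(active, month=REFERENCE_MONTH, min_active=MIN_ACTIVE):
    """Return the wikis with at least min_active active editors in month."""
    return sorted(
        wiki for wiki, months in active.items()
        if months.get(month, 0) >= min_active
    )
```

With real data, `select_communities(monthly_active_editors(edits))` would return the language editions that pass the size filter; the monthly counts of those editions are the time series used for clustering.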

To group communities exhibiting similar trends, we applied the k-means clustering algorithm to the time series, using dynamic time warping to measure the similarity between temporal sequences while focusing on their general pattern [40]. We ran the k-means algorithm with parameters σ = 6 and learning rate = 0.1; we obtained six clusters, shown in Figure 2.
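To illustrate why dynamic time warping suits this comparison, the sketch below implements a textbook DTW distance and a nearest-centroid assignment, which is the assignment step of a k-means iteration. It is not the implementation used in the study: the function names and the toy series are invented for the example, and the centroid update step (for instance, barycenter averaging) is left out.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Uses the absolute difference as local cost and no warping window.
    """
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible alignments
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def assign_to_nearest(series, centroids):
    """Assign each series to the index of its DTW-nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: dtw_distance(s, centroids[k]))
        for s in series
    ]


# Toy series (invented): same rise-and-decline shape, shifted by one step.
rise_then_decline = [0, 1, 2, 3, 2, 1, 1]
shifted = [0, 0, 1, 2, 3, 2, 1]
steady_growth = [0, 1, 2, 3, 4, 5, 6]

pointwise = sum(abs(x - y) for x, y in zip(rise_then_decline, shifted))
warped = dtw_distance(rise_then_decline, shifted)
```

In this toy case the point-by-point difference between the two shifted curves is positive while their DTW distance is zero, so two communities that peak at slightly different times can still fall into the same cluster. Libraries such as tslearn offer a complete DTW-based k-means (`TimeSeriesKMeans` with `metric="dtw"`) if a full clustering pipeline is needed.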

We observe that many language editions of different sizes belong to the first cluster (cluster 0), characterized by a first phase of growth of about 7 years (84 months), followed by a period of stagnation and decline until stabilization, with more or less accentuated oscillations around a lower number of editors. This roughly corresponds to the decline observed for English (included in cluster 0) and other major language editions since 2007 [14,15].

The second cluster (cluster 1) also includes some of the biggest language editions, such as French, Japanese, and Spanish. The trend is similar to that of the previous cluster, with the difference that after the rise and peak, instead of decline, we observe a more stable stagnation pattern. We may observe a tendency to decline in the first years of stagnation and a smooth rise again in the last years.

The remaining clusters exhibit a different pattern, with a common tendency to keep growing after the initial rise, although at different rates. Cluster 2 includes smaller European language communities, characterized by stronger oscillations around a smoothly growing trend in the second phase. Cluster 3 represents some Asian language communities that, interestingly, exhibit a decline/stagnation period followed by strong growth. Communities in cluster 4 see a first period of rapid growth (of different duration for different communities), followed by a less steep but still growing trend. Finally, cluster 5 groups together communities of different sizes, characterized by a more or less stable growing trend.