Introduction

The Wikimedia Foundation’s Editing team is working to improve how contributors communicate on Wikipedia using talk pages through a series of incremental improvements that will be released over time.

As part of this effort, the Editing team introduced a new workflow for replying to specific comments with the intention of making participating productively on talk pages easier and more intuitive. The reply tool is an extra button that appears at the end of a post on a talk page (as shown in the screenshot below). When you click on it, it opens a reply form that automatically signs and indents wikitext talk page comments and offers a quick way for pinging other users among other features 1.

**Figure 1**: Example of the reply tool. The reply tool is an extra button that appears at the end of a post on a talk page.

The team ran an AB test of the Reply Tool from 11 February 2021 through 10 March 2021 2 to assess the efficacy of this new feature, specifically for Junior Contributors (defined as having under 100 cumulative edits). The test included logged-in users who had not previously interacted with the reply tool (defined as users whose discussiontools-editmode preference is empty) and who viewed one of the 22 participating Wikipedias (see full list of participating Wikipedias) during the AB test. During this test, 50% of users included in the test had the Reply tool automatically enabled, and 50% did not. Users at these Wikipedias were still able to turn the tool on or off in Special:Preferences.

You can find more information about features of this tool and project updates on the project page.

Purpose

The primary goal of the AB test was to test the hypothesis that using the reply tool will increase the likelihood of a Junior Contributor publishing a comment they start without a significant increase in disruption.

For this analysis, our key performance indicator to evaluate the impact of the tool is comment completion rate, which we define as:

Of the contributors that opened the Reply tool or page editing to make a comment, the percent of contributors that successfully published at least one comment on a talk page

We assessed disruption by looking at the percent of comments made to talk pages that are reverted within 48 hours and the percent of contributors who are blocked after making an edit to a talk page.

The results of this analysis will be used to determine if the reply tool should be deployed as default to all Wiki projects, as an opt-out user preference. Please see further details about the hypotheses, key performance indicators, and decision scenarios in the task description.

Methodology

The AB test was run on a per Wikipedia basis and contributors included in the test were randomly assigned to either the control (reply tool disabled by default) or treatment (reply tool enabled by default) based on their user ID. Contributors within each group also had the option to explicitly turn the tool on or off in their preferences; however, these contributors remained in the same group they were bucketed in for the duration of the test.

Upon conclusion of the test on 10 March 2021, we recorded a total of 10,175 comment attempts initiated across both test groups by 4,690 distinct contributors across all experience levels. A total of 2,612 (55.7%) of these contributors were identified as Junior Contributors. Data was collected in EditAttemptStep.

In this test, a user can complete a comment using the Reply Tool or using reply workflows available with wikitext full-page and section editing. For the purpose of this analysis, these two types of editing experiences are defined as follows:

Reply Tool Comment: Any edit to a talk page namespace made with the reply tool. The reply tool allows edits using both visual and source (wikitext) modes. Reply tool edits exclude edits to create new sections or new pages on a talk page using either the new discussion tool or wikitext editing. Reply tool events were sampled at 100%.

Recorded in EditAttemptStep as: event.action = 'init', event.integration = 'discussiontools', event.init_type = 'page'

Page Editing Comment: Any edit to a talk page namespace that was not made with the reply tool, excluding edits to create new sections or new pages using either the new discussion tool or wikitext editing. Note that some of these edits may have been corrective edits (e.g. fixing a signature), but current instrumentation does not distinguish between this type of edit and a reply. These events were sampled at a rate of 1/16, or 6.25%.

Recorded in the EditAttemptStep as: event.action = 'init', event.integration = 'page' , event.init_type = 'section' or 'page', event.init_mechanism = 'click'
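The two definitions above can be sketched as a small classifier over 'init' events. This is a minimal illustration, not the analysis code: it assumes each event has already been flattened into a plain dictionary keyed by the field names quoted above (without the `event.` prefix), and the function name and return labels are hypothetical.

```python
from typing import Optional


def classify_init_event(event: dict) -> Optional[str]:
    """Label an EditAttemptStep 'init' event by editing experience.

    Returns "reply_tool", "page_editing", or None when the event matches
    neither definition. The dict layout is an assumption for illustration.
    """
    if event.get("action") != "init":
        return None
    integration = event.get("integration")
    init_type = event.get("init_type")
    # Reply tool: integration = 'discussiontools', init_type = 'page'
    if integration == "discussiontools" and init_type == "page":
        return "reply_tool"
    # Page editing: integration = 'page', init_type = 'section' or 'page',
    # init_mechanism = 'click'
    if (
        integration == "page"
        and init_type in ("section", "page")
        and event.get("init_mechanism") == "click"
    ):
        return "page_editing"
    return None
```

Events that match neither rule return `None` so they can be dropped before any rates are computed.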

See the following Phabricator tickets for further details regarding instrumentation and implementation of the AB test:

Comment completion rate for Junior Contributors

We first calculated the comment completion rate for Junior Contributors by editing experience (i.e. Was the contributor able to successfully save at least one comment with or without using the reply tool?) overall and across each participating Wikipedia.

For this analysis, we are defining the comment completion rate as the percent of contributors that successfully published (event.action = 'saveSuccess') at least one comment after opening a particular editing interface (event.action = 'init') during the time of the AB test. Note that this does not take into account the number of attempts it took for the user to publish or the duration of their editing sessions.
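As a rough sketch of this definition, the rate can be computed per editing interface from (user, action) pairs. The input shape and function name are assumptions made for illustration; the actual analysis works on EditAttemptStep data.

```python
def comment_completion_rate(events):
    """Share of users with at least one 'saveSuccess' among users with an 'init'.

    `events` is an iterable of (user_id, action) pairs for a single editing
    interface. Repeated attempts by the same user count once, matching the
    "at least one comment" definition. Returns None if nobody attempted.
    """
    attempted = set()
    completed = set()
    for user_id, action in events:
        if action == "init":
            attempted.add(user_id)
        elif action == "saveSuccess":
            completed.add(user_id)
    # Only count completions by users who also opened the interface.
    completed &= attempted
    if not attempted:
        return None
    return len(completed) / len(attempted)
```

Because the calculation is per user rather than per attempt, a contributor who abandons three attempts and then publishes once still counts as a single completed user.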

Overall comment completion rate by Junior Contributors

Junior contributors comment completion rate across all participating Wikipedias

| Editing experience | Number of users attempted | Number of users completed | Comment completion rate^1^ |
|---|---|---|---|
| Page editing^2^ | 1301 | 359 | 27.6% |
| Reply tool^3^ | 1311 | 956 | 72.9% |

^1^ Defined as percent of contributors that made a comment attempt and published at least 1 comment.

^2^ Sampling rate for Non-Reply Tool events is 6.25%

^3^ Sampling rate for Reply Tool events is 100%

**Figure 2**: Percent of Junior Contributors that completed at least one comment attempt on a talk page during the AB test.

Comment completion rate by participating Wikipedia

Junior contributors comment completion rate by participating Wikipedia

| Wikipedia | Editing experience^1,2^ | Number of users attempted | Number of users completed | Completion rate^3^ |
|---|---|---|---|---|
| Afrikaans Wikipedia | Reply tool | 1 | 1 | 100% |
| Bengali Wikipedia | Page editing | 17 | 8 | 47.1% |
| Bengali Wikipedia | Reply tool | 8 | 6 | 75% |
| Chinese Wikipedia | Page editing | 52 | 15 | 28.8% |
| Chinese Wikipedia | Reply tool | 43 | 25 | 58.1% |
| Dutch Wikipedia | Page editing | 38 | 9 | 23.7% |
| Dutch Wikipedia | Reply tool | 45 | 40 | 88.9% |
| Egyptian Wikipedia | Page editing | 6 | 2 | 33.3% |
| Egyptian Wikipedia | Reply tool | 3 | 2 | 66.7% |
| French Wikipedia | Page editing | 191 | 65 | 34% |
| French Wikipedia | Reply tool | 303 | 240 | 79.2% |
| Hebrew Wikipedia | Page editing | 91 | 36 | 39.6% |
| Hebrew Wikipedia | Reply tool | 57 | 35 | 61.4% |
| Hindi Wikipedia | Page editing | 25 | 5 | 20% |
| Hindi Wikipedia | Reply tool | 11 | 6 | 54.5% |
| Indonesian Wikipedia | Page editing | 47 | 6 | 12.8% |
| Indonesian Wikipedia | Reply tool | 14 | 9 | 64.3% |
| Italian Wikipedia | Page editing | 174 | 52 | 29.9% |
| Italian Wikipedia | Reply tool | 237 | 173 | 73% |
| Japanese Wikipedia | Page editing | 81 | 13 | 16% |
| Japanese Wikipedia | Reply tool | 44 | 29 | 65.9% |
| Korean Wikipedia | Page editing | 24 | 7 | 29.2% |
| Korean Wikipedia | Reply tool | 15 | 8 | 53.3% |
| Persian Wikipedia | Page editing | 88 | 36 | 40.9% |
| Persian Wikipedia | Reply tool | 55 | 32 | 58.2% |
| Polish Wikipedia | Page editing | 58 | 16 | 27.6% |
| Polish Wikipedia | Reply tool | 74 | 46 | 62.2% |
| Portuguese Wikipedia | Page editing | 83 | 20 | 24.1% |
| Portuguese Wikipedia | Reply tool | 142 | 116 | 81.7% |
| Spanish Wikipedia | Page editing | 241 | 47 | 19.5% |
| Spanish Wikipedia | Reply tool | 209 | 154 | 73.7% |
| Swahili Wikipedia | Page editing | 1 | 0 | 0% |
| Swahili Wikipedia | Reply tool | 1 | 1 | 100% |
| Thai Wikipedia | Page editing | 12 | 3 | 25% |
| Thai Wikipedia | Reply tool | 8 | 6 | 75% |
| Ukrainian Wikipedia | Page editing | 37 | 13 | 35.1% |
| Ukrainian Wikipedia | Reply tool | 28 | 19 | 67.9% |
| Vietnamese Wikipedia | Page editing | 35 | 6 | 17.1% |
| Vietnamese Wikipedia | Reply tool | 13 | 8 | 61.5% |

^1^ Sampling rate for Non-Reply Tool events is 6.25%

^2^ Sampling rate for Reply Tool events is 100%

^3^ Defined as percent of contributors that make a comment attempt and publish at least 1 comment.

**Figure 3**: Percent of Junior Contributors that completed a comment attempt on a talk page. There were a limited number of AB test events recorded for Swahili, Afrikaans, and Egyptian Wikipedia and no recorded AB test events for Amharic and Oromo Wikipedia. As a result, these Wikipedia projects were removed from the chart above as we are not able to conclude any effects from the reply tool on these specific projects.

Junior contributors had a much higher success rate posting a comment using the reply tool compared to page editing. Overall, 72.9% of all Junior Contributors that made a comment attempt were able to successfully publish at least 1 comment with the reply tool, while only 27.6% of all Junior Contributors successfully published a comment using page editing. This represents a 164% (2.6x) observed increase in comment completion rate.
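The 164% (2.6x) figure follows directly from the user counts in the overall table. A short worked check, with a hypothetical helper name:

```python
def relative_increase(baseline_rate, new_rate):
    """Relative change of new_rate over baseline_rate (1.0 means +100%)."""
    return (new_rate - baseline_rate) / baseline_rate


# Counts taken from the overall Junior Contributor table above.
page_editing_rate = 359 / 1301
reply_tool_rate = 956 / 1311

ratio = reply_tool_rate / page_editing_rate
increase = relative_increase(page_editing_rate, reply_tool_rate)

print(f"{page_editing_rate:.1%} -> {reply_tool_rate:.1%}: {ratio:.1f}x, +{increase:.0%}")
# prints "27.6% -> 72.9%: 2.6x, +164%"
```

The same helper applies to the per-Wikipedia rows, for example the 42% increase reported for Persian Wikipedia.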

This trend is reflected consistently for each participating Wikipedia as well. Junior Contributors had a higher comment completion rate using the reply tool compared to non-reply tool editor interfaces on every participating Wikipedia.

Indonesian, Japanese, Dutch and Spanish Wikipedias saw the highest percent increases in comment completion rates with the reply tool. We observed the two lowest percent increases in comment completion rate for Persian (42% increase) and Hebrew Wikipedias (55% increase). These are both right-to-left languages, which might affect the reply tool experience and workflows for contributors on these projects; however, the AB test recorded too little data for right-to-left languages to confirm any impact from language direction.

Modeling the impact of the reply tool

We next explored different models to infer the impact of the reply tool on whether a comment was completed while accounting for the random effects of the user and wiki. This allows us to check whether the observed increase above is statistically significant (that is, unlikely to have occurred by random chance).

Comment attempts made on the same Wikipedia, and comment attempts made by the same user, are related to each other. Therefore, we can more accurately infer the impact of the reply tool by accounting for the effect of the user and wiki on the probability of a Junior Contributor completing an edit.

We used a Bayesian Hierarchical regression model to model this structure. In this model, the user and Wikipedia are random effects and whether the reply tool was used is the fixed effect or predictor variable.
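One way to write down a model with this structure is the logistic regression below. This is a sketch under assumptions: the report does not state the exact specification, so the random intercepts, their normal distributions, and the symbol names are illustrative rather than the fitted model.

$$
\begin{aligned}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \beta_0 + \beta_1\,\mathrm{reply\_tool}_i + u_{\mathrm{user}[i]} + v_{\mathrm{wiki}[i]} \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_k \sim \mathcal{N}(0, \sigma_v^2)
\end{aligned}
$$

Here $y_i$ indicates whether a comment was published, $\mathrm{reply\_tool}_i$ is 1 when the reply tool was used and 0 for page editing, and $u$ and $v$ are the user-level and wiki-level random effects.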

Posterior summary of model parameters

| | Point estimate | 95% CI^1^ |
|---|---|---|
| **Parameter** | | |
| (Intercept) | −1.036 | (−1.263, −0.832) |
| Using reply tool | 1.974 | (1.775, 2.234) |
| **Function of parameter(s)** | | |
| Multiplicative effect on odds | 7.200 | (5.903, 9.334) |
| Maximum lift | 49.4%^2^ | (44.4%, 55.8%) |
| Average lift | 45.6%^3^ | (41.6%, 50.5%) |

^1^ CI: Credible Interval

^2^ Maximum lift calculated using the divide-by-4 rule

^3^ Average lift = Pr(Success | Reply Tool) − Pr(Success | Page Editing) = $\operatorname{logit}^{-1}(\beta_0 + \beta_1) - \operatorname{logit}^{-1}(\beta_0)$

Since the model parameters are on the log-odds scale, we applied the following transformations to interpret them:

* We used the “divide-by-4” rule suggested by Gelman, Hill, and Vehtari 2021 3 to approximate the maximum increase in the probability of success corresponding to which editing interface (reply tool or page editing) was used. Using the Bayesian model, we can also directly calculate the average lift.
* We exponentiated the effect (exp(β1)) to determine the multiplicative effect on the odds of a Junior Contributor successfully publishing at least 1 comment.
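These transformations can be reproduced from the point estimates in the posterior summary. The sketch below plugs in the two point estimates only; the reported values and credible intervals were presumably computed from the full posterior, so small differences (such as 45.7% here versus the reported 45.6% average lift) are expected.

```python
import math


def inv_logit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


beta0 = -1.036  # intercept: log-odds of success with page editing
beta1 = 1.974   # effect of using the reply tool, on the log-odds scale

# Multiplicative effect on the odds of publishing a comment.
odds_multiplier = math.exp(beta1)

# Divide-by-4 rule: upper bound on the change in probability.
max_lift = beta1 / 4

# Average lift: difference in predicted probabilities.
avg_lift = inv_logit(beta0 + beta1) - inv_logit(beta0)

print(f"odds x{odds_multiplier:.2f}, max lift {max_lift:.1%}, avg lift {avg_lift:.1%}")
```

The divide-by-4 rule works because the logistic curve is steepest at a probability of 0.5, where its slope is β1/4, so no predicted change in probability can exceed that value.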

Based on estimates from the model, we found that the odds of a Junior Contributor successfully publishing a comment are about 7 times higher when they open the reply tool than when they use page editing.

We also found an average increase of 45.6 percentage points (maximum 49.4 percentage points) in the probability of a Junior Contributor publishing a comment when they switch from using page editing to the reply tool.

We can confirm statistical significance at the 0.05 level for all of these estimates, as indicated by 95% credible intervals that exclude the no-effect value (0 for the reply tool coefficient and the lifts, 1 for the multiplicative effect on odds).

Accounting for experience level

Comment completion rate across all experience levels

In line with the purpose of this test, we primarily focused on determining whether the reply tool had an impact on Junior Contributors’ comment completion rate; however, we also reviewed the reply tool's impact on comment completion rates across all contributors to provide insight into differences due to experience level.

In this analysis, we defined two experience levels, junior and non-junior: Junior Contributors are contributors with under 100 cumulative edits and non-junior contributors are contributors with over 100 cumulative edits.

Note that this binary definition doesn’t fully capture the gradual growth of an editor. For example, using this binary definition, a contributor with 99 edits would only have to make one more edit to be reclassified as a non-junior contributor, and their probability of completing an edit would suddenly increase. Also, changes in lower edit counts (e.g. 1 to 2 edits) indicate a higher impact than changes in higher edit counts (e.g. 5,000 to 5,001 edits). However, we used the binary definition as it aligns with how we’ve defined the target audience and helps simplify the model for the purposes of this analysis.

Contributors comment completion rate across all experience levels and participating Wikipedias

| Editing experience | Number of users attempted | Number of users completed | Completion rate^1^ |
|---|---|---|---|
| Page editing^2^ | 2650 | 1369 | 51.7% |
| Reply tool^3^ | 2040 | 1406 | 68.9% |

^1^ Defined as percent of contributors that make a comment attempt and publish at least 1 comment.

^2^ Sampling rate for Non-Reply Tool events is 6.25%

^3^ Sampling rate for Reply Tool events is 100%

**Figure 4**: Percent of contributors that completed a comment attempt on a talk page across all contributor experience levels and participating Wikipedias.

Across all contributor experience levels, there was a 33% increase in the percent of contributors that were able to successfully publish a comment using the reply tool compared to contributors using page editing methods (68.9% vs. 51.7%). This percent change is much lower than what we found when focusing only on Junior Contributors’ comment completion rates, indicating that experience level has a significant effect on the impact of the reply tool.

68.9% of contributors across all experience levels were able to publish at least one comment using the reply tool. This is slightly lower than the percent of Junior Contributors (72.9%) that published a comment using the reply tool. However, the comment completion rate for page editing comments is much higher when looking at contributors across all experience levels: 51.7% of contributors across all experience levels were able to complete at least one comment using non-reply tool editing interfaces, compared to only 27.6% of Junior Contributors.

Based on this observed data, it appears that experience level has a small impact on the ability of contributors to publish a comment using the reply tool but a large impact on their ability to publish a comment using page editing.

Comment completion rate by experience level

Contributors comment completion rate by experience level across all participating Wikipedias

| Experience level | Editing experience | Number of users attempted | Number of users completed | Completion rate^1^ |
|---|---|---|---|---|
| Non-Junior Contributor^2^ | Page editing | 1365 | 1010 | 74% |
| Non-Junior Contributor | Reply tool | 738 | 450 | 61% |
| Junior Contributor^3^ | Page editing | 1301 | 359 | 27.6% |
| Junior Contributor | Reply tool | 1311 | 956 | 72.9% |

^1^ Defined as percent of contributors that make a comment attempt and publish at least 1 comment.

^2^ Defined as having over 100 cumulative edits

^3^ Defined as having under 100 cumulative edits

**Figure 5**: Percent of contributors that completed a comment attempt on a talk page by contributors’ experience level. A Junior Contributor is a contributor with under 100 cumulative edits and a non-Junior Contributor is a contributor with over 100 cumulative edits

When comparing Junior Contributors' comment completion rate to that of Non-Junior Contributors, we see a clear difference between the two experience levels. Junior Contributors' comment completion rate with page editing (27.6%) is much lower than the rate observed for Non-Junior Contributors (74%) during the AB test. However, using the reply tool, Junior Contributors’ comment completion rate (72.9%) was almost the same as Non-Junior Contributors' comment completion rate using page editing.

Modeling the impact of the reply tool

Since comment completion rates seem to vary significantly based on the contributor’s experience level, we adjusted the Bayesian Hierarchical Regression Model to include the Contributors’ experience level as an interaction term in the model in addition to the effects of the user and wiki on comment completion rate.
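A model of this kind could take the form below. This is a sketch under assumptions: the report does not give the exact specification, so the coefficient names and the random-intercept structure are illustrative only.

$$
\operatorname{logit}(p_i) = \beta_0 + \beta_1\,\mathrm{reply\_tool}_i + \beta_2\,\mathrm{junior}_i + \beta_3\,(\mathrm{reply\_tool}_i \times \mathrm{junior}_i) + u_{\mathrm{user}[i]} + v_{\mathrm{wiki}[i]}
$$

Here $p_i$ is the probability that a comment attempt is published, $\mathrm{reply\_tool}_i$ and $\mathrm{junior}_i$ are 0/1 indicators, and $u$ and $v$ are the user-level and wiki-level random effects. The interaction coefficient $\beta_3$ lets the effect of the reply tool differ between Junior and Non-Junior Contributors.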

**Figure 6**: The conditional effects of a Contributor’s experience level and type of editor used on the likelihood of completing a comment.

The above plot shows the predicted effects of the Contributor’s experience level and the type of editor that they used (reply tool vs page editing) on the probability of successfully publishing a comment that they started on a talk page.

Based on the model, we can confirm the following:

- A junior contributor using the reply tool is significantly more likely to successfully publish a comment than a junior contributor using page editing.
- A junior contributor using the reply tool is roughly as likely to publish a comment as a non-junior contributor using page editing.
- A non-junior contributor is more likely to publish a comment using page editing than using the reply tool.

The higher page editing comment completion rate we see for Non-Junior Contributors may be due to a tendency to stick with what they already know. In addition, our page editing definition currently includes corrective edits made to the page, which are frequently made by more senior contributors and are not possible with the reply tool.

Guardrail analysis

We also wanted to ensure that enabling the reply tool did not result in an increase in the number of disruptive edits being made to talk pages.

To evaluate any disruption caused by the reply tool, we determined the percent of comments made to talk pages that were reverted within 48 hours and the percent of contributors blocked after making a comment to a talk page.

Comment revert rate for Junior Contributors

Methodology

For this analysis, we reviewed data recorded in mediawiki_history to identify the percent of comments posted with the reply tool (identified by the revision tag discussiontools-reply) on talk pages that were reverted within 48 hours 4.

We compared the revert rate for comments published using the reply tool to the revert rate for comments made using page editing during the same timeframe.

The reviewed data excludes wikitext edits to create new pages and edits to start new topics using the new discussion tool.
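The revert rate described here can be sketched as follows. The record layout (`published` and `reverted` timestamps per comment) is a hypothetical simplification for illustration and does not reflect the actual mediawiki_history column names.

```python
from datetime import datetime, timedelta

REVERT_WINDOW = timedelta(hours=48)


def revert_rate(comments):
    """Share of comments reverted within 48 hours of being published.

    `comments` is a list of dicts with a 'published' datetime and a
    'reverted' datetime (or None if the comment was never reverted).
    Returns None when there are no comments.
    """
    if not comments:
        return None
    reverted = sum(
        1
        for c in comments
        if c.get("reverted") is not None
        and c["reverted"] - c["published"] <= REVERT_WINDOW
    )
    return reverted / len(comments)
```

Computing this separately for comments carrying the discussiontools-reply tag and for all other talk page comments gives the two rates being compared.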

Overall revert rate by editor type

Junior contributors comment revert rate across all participating Wikipedias

| Editing experience^1^ | Number of comments reverted | Number of comments published | Revert rate^2^ |
|---|---|---|---|
| Page editing | 2648 | 24615 | 10.76% |
| Reply tool | 60 | 2716 | 2.21% |

^1^ Data comes from mediawiki_history

^2^ Defined as percent of comments reverted within 48 hours.

**Figure 7**: Percent of comments made by Junior Contributors on talk pages that are reverted within 48 hours of being published.

Overall, across all participating Wikipedias, we observed a 79.5% decrease in the revert rate for comments made with the reply tool compared to page editing (2.21% vs. 10.76%). The reply tool seems to enable Junior Contributors not only to successfully complete a comment but also to reduce the number of errors in the published comment that might lead to the comment being reverted.

Revert rate by wiki

Junior Contributors comment revert rate by participating Wikipedia

| Wikipedia | Editing experience^1^ | Number of comments reverted | Number of comments published | Revert rate^2^ |
|---|---|---|---|---|
| Afrikaans Wikipedia | Page editing | 0 | 50 | 0% |
| Afrikaans Wikipedia | Reply tool | 0 | 5 | 0% |
| Amharic Wikipedia | Page editing | 0 | 9 | 0% |
| Amharic Wikipedia | Reply tool | 0 | 1 | 0% |
| Bengali Wikipedia | Page editing | 93 | 898 | 10.36% |
| Bengali Wikipedia | Reply tool | 1 | 11 | 9.09% |
| Chinese Wikipedia | Page editing | 63 | 1003 | 6.28% |
| Chinese Wikipedia | Reply tool | 2 | 89 | 2.25% |
| Dutch Wikipedia | Page editing | 29 | 539 | 5.38% |
| Dutch Wikipedia | Reply tool | 2 | 170 | 1.18% |
| Egyptian Wikipedia | Page editing | 4 | 94 | 4.26% |
| Egyptian Wikipedia | Reply tool | 0 | 2 | 0% |
| French Wikipedia | Page editing | 139 | 3882 | 3.58% |
| French Wikipedia | Reply tool | 11 | 651 | 1.69% |
| Hebrew Wikipedia | Page editing | 160 | 1379 | 11.6% |
| Hebrew Wikipedia | Reply tool | 0 | 110 | 0% |
| Hindi Wikipedia | Page editing | 241 | 726 | 33.2% |
| Hindi Wikipedia | Reply tool | 0 | 15 | 0% |
| Indonesian Wikipedia | Page editing | 64 | 523 | 12.24% |
| Indonesian Wikipedia | Reply tool | 1 | 17 | 5.88% |
| Italian Wikipedia | Page editing | 134 | 2332 | 5.75% |
| Italian Wikipedia | Reply tool | 5 | 493 | 1.01% |
| Japanese Wikipedia | Page editing | 252 | 1636 | 15.4% |
| Japanese Wikipedia | Reply tool | 3 | 103 | 2.91% |
| Korean Wikipedia | Page editing | 163 | 1349 | 12.08% |
| Korean Wikipedia | Reply tool | 1 | 23 | 4.35% |
| Oromo Wikipedia | Page editing | 0 | 3 | 0% |
| Persian Wikipedia | Page editing | 306 | 3367 | 9.09% |
| Persian Wikipedia | Reply tool | 1 | 102 | 0.98% |
| Polish Wikipedia | Page editing | 114 | 877 | 13% |
| Polish Wikipedia | Reply tool | 2 | 143 | 1.4% |
| Portuguese Wikipedia | Page editing | 307 | 1805 | 17.01% |
| Portuguese Wikipedia | Reply tool | 3 | 330 | 0.91% |
| Spanish Wikipedia | Page editing | 429 | 2818 | 15.22% |
| Spanish Wikipedia | Reply tool | 25 | 336 | 7.44% |
| Swahili Wikipedia | Page editing | 0 | 21 | 0% |
| Swahili Wikipedia | Reply tool | 0 | 6 | 0% |
| Thai Wikipedia | Page editing | 14 | 168 | 8.33% |
| Thai Wikipedia | Reply tool | 0 | 18 | 0% |
| Ukrainian Wikipedia | Page editing | 65 | 662 | 9.82% |
| Ukrainian Wikipedia | Reply tool | 2 | 68 | 2.94% |
| Vietnamese Wikipedia | Page editing | 71 | 474 | 14.98% |
| Vietnamese Wikipedia | Reply tool | 1 | 23 | 4.35% |

^1^ Data comes from mediawiki_history. Sampling rate is 100% for all events.

^2^ Defined as percent of comments reverted within 48 hours.

**Figure 8**: Percent of comments made by Junior Contributors on talk pages that are reverted within 48 hours of being published. No published talk page comments were recorded for Oromo Wikipedia during the AB test and limited data were recorded for Afrikaans, Amharic, Swahili, and Egyptian Wikipedias. As a result, these Wikipedia projects were removed from the chart above as we are not able to accurately determine a revert rate representative of the population.

Some highlights of trends by participating Wikipedia:

  • Comments made with the reply tool had lower revert rates than comments made with non-reply tool editing on every participating Wikipedia where any reverts were recorded.
  • We observed the highest reply tool revert rates on Bengali Wikipedia (9.09%), Spanish Wikipedia (7.44%), and Indonesian Wikipedia (5.88%). Reply tool revert rates for all the other participating Wikipedias were under 5%.

Revert rate by experience level

Contributors comment revert rate by experience level

| Experience level^1^ | Editing experience^2^ | Number of comments reverted | Number of comments published | Revert rate^3^ |
|---|---|---|---|---|
| Non-Junior Contributor | Page editing | 4177 | 272605 | 1.53% |
| Non-Junior Contributor | Reply tool | 136 | 7147 | 1.9% |
| Junior Contributor | Page editing | 2648 | 24615 | 10.76% |
| Junior Contributor | Reply tool | 60 | 2716 | 2.21% |

^1^ Junior Contributor is defined as having under 100 cumulative edits. Non-Junior Contributor is defined as having over 100 cumulative edits.

^2^ Data comes from mediawiki_history

^3^ Defined as percent of comments reverted within 48 hours.

**Figure 9**: Percent of comments made by contributors on talk pages that are reverted, by experience level. A Junior Contributor is a contributor with under 100 cumulative edits and a non-Junior Contributor is a contributor with over 100 cumulative edits.

Similar to our finding for comment completion rate, experience level has an impact on the revert rate for comments published using page editing. The reply tool revert rate is only slightly higher for Junior Contributors (2.21%) than for Non-Junior Contributors (1.9%), but page editing revert rates differ substantially between the two experience level groups (10.76% vs. 1.53%).

Percent of Junior Contributors blocked after posting a comment

We also reviewed the number of Junior Contributors blocked after posting a comment using the reply tool.

Data comes from the mediawiki_user_history table. All block events were identified in the data by caused_by_event_type = "alterblocks". The data includes any Contributors that were blocked after posting a comment; however, we do not know if they were blocked specifically due to the comment posted. Data is also currently limited to dates of the AB test. Users may have been blocked following this analysis.
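A minimal sketch of this check is shown below. Only the `caused_by_event_type = "alterblocks"` filter comes from the description above; the input shapes, the other field names, and the function name are assumptions made for illustration.

```python
from datetime import datetime


def blocked_after_comment(first_comment_at, user_events):
    """Return the set of users with a block event at or after their first comment.

    `first_comment_at` maps user_id -> datetime of the user's first comment
    in the test period. `user_events` is an iterable of dicts with keys
    'user_id', 'caused_by_event_type', and 'event_timestamp' (hypothetical
    layout). The block is not necessarily caused by the comment itself.
    """
    blocked = set()
    for ev in user_events:
        if ev["caused_by_event_type"] != "alterblocks":
            continue
        commented = first_comment_at.get(ev["user_id"])
        if commented is not None and ev["event_timestamp"] >= commented:
            blocked.add(ev["user_id"])
    return blocked


def percent_blocked(first_comment_at, user_events):
    """Percent of commenting users who were blocked after commenting."""
    if not first_comment_at:
        return None
    blocked = blocked_after_comment(first_comment_at, user_events)
    return 100.0 * len(blocked) / len(first_comment_at)
```

Blocks recorded before a user's first comment are ignored, which matches the "blocked after posting a comment" framing used in this section.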

Blocked users overall

**Figure 10**: Percent of Junior Contributors blocked after making a comment on a talk page.

Blocked users by wiki

Junior Contributors blocked after publishing a comment by participating Wikipedia

| Wikipedia | Editing experience^1^ | Number of users blocked | Number of users that made a comment | Percent of users blocked^2^ |
|---|---|---|---|---|
| Afrikaans Wikipedia | Page editing | 0 | 21 | 0% |
| Afrikaans Wikipedia | Reply tool | 0 | 2 | 0% |
| Amharic Wikipedia | Page editing | 0 | 7 | 0% |
| Amharic Wikipedia | Reply tool | 0 | 1 | 0% |
| Bengali Wikipedia | Page editing | 0 | 279 | 0% |
| Bengali Wikipedia | Reply tool | 0 | 7 | 0% |
| Chinese Wikipedia | Page editing | 7 | 290 | 2.41% |
| Chinese Wikipedia | Reply tool | 1 | 38 | 2.63% |
| Dutch Wikipedia | Page editing | 1 | 189 | 0.53% |
| Dutch Wikipedia | Reply tool | 0 | 54 | 0% |
| Egyptian Wikipedia | Page editing | 0 | 33 | 0% |
| Egyptian Wikipedia | Reply tool | 0 | 2 | 0% |
| French Wikipedia | Page editing | 15 | 1475 | 1.02% |
| French Wikipedia | Reply tool | 5 | 331 | 1.51% |
| Hebrew Wikipedia | Page editing | 9 | 354 | 2.54% |
| Hebrew Wikipedia | Reply tool | 0 | 41 | 0% |
| Hindi Wikipedia | Page editing | 1 | 307 | 0.33% |
| Hindi Wikipedia | Reply tool | 0 | 8 | 0% |
| Indonesian Wikipedia | Page editing | 7 | 198 | 3.54% |
| Indonesian Wikipedia | Reply tool | 1 | 12 | 8.33% |
| Italian Wikipedia | Page editing | 16 | 701 | 2.28% |
| Italian Wikipedia | Reply tool | 5 | 210 | 2.38% |
| Japanese Wikipedia | Page editing | 9 | 448 | 2.01% |
| Japanese Wikipedia | Reply tool | 0 | 37 | 0% |
| Korean Wikipedia | Page editing | 1 | 207 | 0.48% |
| Korean Wikipedia | Reply tool | 0 | 7 | 0% |
| Oromo Wikipedia | Page editing | 0 | 2 | 0% |
| Persian Wikipedia | Page editing | 27 | 1058 | 2.55% |
| Persian Wikipedia | Reply tool | 1 | 42 | 2.38% |
| Polish Wikipedia | Page editing | 10 | 299 | 3.34% |
| Polish Wikipedia | Reply tool | 2 | 58 | 3.45% |
| Portuguese Wikipedia | Page editing | 25 | 746 | 3.35% |
| Portuguese Wikipedia | Reply tool | 4 | 148 | 2.7% |
| Spanish Wikipedia | Page editing | 13 | 1074 | 1.21% |
| Spanish Wikipedia | Reply tool | 3 | 193 | 1.55% |
| Swahili Wikipedia | Page editing | 0 | 9 | 0% |
| Swahili Wikipedia | Reply tool | 0 | 5 | 0% |
| Thai Wikipedia | Page editing | 2 | 68 | 2.94% |
| Thai Wikipedia | Reply tool | 0 | 7 | 0% |
| Ukrainian Wikipedia | Page editing | 2 | 222 | 0.9% |
| Ukrainian Wikipedia | Reply tool | 2 | 24 | 8.33% |
| Vietnamese Wikipedia | Page editing | 2 | 203 | 0.99% |
| Vietnamese Wikipedia | Reply tool | 0 | 10 | 0% |

^1^ Data comes from mediawiki_user_history

^2^ Percent of junior contributors blocked after posting a comment during the AB test

Overall, across all participating Wikipedias, 1.94% of Junior Contributors who used the reply tool were blocked after posting a comment with it, which is only slightly higher than the percent of Junior Contributors that were blocked after making a comment using page editing during the same timeframe (1.79%).

Under 3.5% of Junior Contributors that used the reply tool were blocked on every participating Wikipedia except Ukrainian Wikipedia and Indonesian Wikipedia. On Ukrainian Wikipedia, 8.33% of Junior Contributors using the reply tool were blocked; however, only 24 users posted comments with the reply tool on this project during the reviewed timeframe, and we confirmed that the two users blocked were the same ones blocked after using page editing as well. Indonesian Wikipedia had only 12 users that made a comment with the reply tool during the AB test, so it is difficult to confirm whether this blocked rate is representative of the population.

Blocked users by experience level

Contributors blocked after publishing a comment by experience level

| Editing experience^1^ | Experience level | Number of users blocked | Number of users that made a comment | Percent of users blocked^2^ |
|---|---|---|---|---|
| Page editing | Non-Junior Contributor | 103 | 8500 | 1.21% |
| Page editing | Junior Contributor | 147 | 8190 | 1.79% |
| Reply tool | Non-Junior Contributor | 19 | 1048 | 1.81% |
| Reply tool | Junior Contributor | 24 | 1237 | 1.94% |

^1^ Data comes from mediawiki_user_history

^2^ Percent of contributors blocked after posting a comment during the AB test

**Figure 11**: Percent of contributors blocked after making a comment on a talk page by experience level. Junior Contributor is defined as having under 100 cumulative edits. Non-Junior Contributor is defined as having over 100 cumulative edits.

There’s not a large difference in the percent of users blocked between experience levels. Under 2% of users were blocked in both experience levels and using both editing experience methods (page editing and reply tool).

Curiosities

We also explored whether the reply tool led a greater number of Junior Contributors to start participating productively on talk pages and whether it led a greater percentage of Junior Contributors to continue participating productively on talk pages.

Number of Junior Contributors

This metric was defined as the number of distinct Junior Contributors who make at least one edit to a page in a talk namespace that is not reverted within 48 hours. Since different sampling rates were applied to each editor type, we removed any events that were oversampled (sampling rate increased to 100%) to allow us to directly compare the numbers between the two groups.
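
The metric definition can be sketched as follows. This is a toy example in base R, assuming a hypothetical data frame `talk_edits` with one row per talk-namespace edit; the column names are invented for illustration and are not the fields used in the actual analysis.

```r
# Hypothetical toy data: one row per talk-namespace edit.
# seconds_to_revert is NA when the edit was never reverted.
talk_edits <- data.frame(
  user_id = c(1, 1, 2, 3, 4),
  cumulative_edits = c(5, 6, 250, 12, 40),
  seconds_to_revert = c(NA, 3600, NA, 7200, 200000)
)

revert_window <- 48 * 60 * 60  # 48 hours in seconds

# Keep edits that were never reverted, or reverted only after the 48-hour window.
kept <- is.na(talk_edits$seconds_to_revert) |
  talk_edits$seconds_to_revert > revert_window

# Junior Contributor: under 100 cumulative edits.
is_junior <- talk_edits$cumulative_edits < 100

# Count distinct Junior Contributors with at least one kept edit.
n_junior_contributors <- length(unique(talk_edits$user_id[kept & is_junior]))
n_junior_contributors
```

In this toy data, users 1 and 4 qualify: user 2 is not a Junior Contributor, and user 3's only edit was reverted within 48 hours.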

Number of Junior Contributors that made a comment during the AB test by test group and interface¹

| Test group | Editing experience | Number of users that attempted a comment | Number of users that published a comment |
|---|---|---|---|
| control² | Page editing | 654 | 199 |
| control² | Reply tool | 2 | 2 |
| test³ | Page editing | 592 | 160 |
| test³ | Reply tool | 135 | 89 |

¹ Based on a sampling rate of 6.25% for all events. Any oversampled events were removed so data for the two editor types could be directly compared

² Users were not shown the reply tool by default

³ Users were shown the reply tool by default

A few explanations regarding the numbers above:

* There are Contributors for each editing experience type in each AB test group. This is because contributors within each group also had the option to explicitly turn the tool on or off in their preferences; however, these contributors remained in the group they were bucketed in for the duration of the test.
* Reply tool comment attempts that appear in control are only from people who manually enabled the feature and then made a comment.
* There are a number of page editing Contributors in the test group because page editing is still available even when the reply tool is enabled. Contributors on a talk page might continue to use page editing because they are not aware of the new reply tool, prefer to use what they know, or are making corrective edits to the page.

**Figure 12**: Number of Junior Contributors that made a comment by AB test group. The test group users were shown the reply tool enabled as default. Based on a sampling rate of 6.25% for all events. Any oversampled events were removed so data for the two editor types could be directly compared.


Across all participating Wikipedias, 23.8% more Junior Contributors published a comment when the reply tool was enabled by default than when it was not.
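
The relative difference can be recomputed from the published-comment counts in the table above. This sketch assumes the reported figure is based on the total number of users that published a comment in each group (page editing plus reply tool); the data frame and variable names are hypothetical.

```r
# Published-comment counts copied from the table above.
published <- data.frame(
  test_group = c("control", "control", "test", "test"),
  editing_experience = c("Page editing", "Reply tool", "Page editing", "Reply tool"),
  n_published = c(199, 2, 160, 89)
)

# Total users that published a comment in each group.
control_total <- sum(published$n_published[published$test_group == "control"])
test_total <- sum(published$n_published[published$test_group == "test"])

# Relative increase of test over control, in percent.
pct_increase <- 100 * (test_total - control_total) / control_total
pct_increase
```

Under that assumption the totals are 201 (control) and 249 (test), which gives an increase just under 24%, in line with the reported 23.8% up to rounding.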

Retention of Junior Contributors

In addition, we looked into whether Junior Contributors that made at least one comment during the time of the AB test (our cohort) returned to make another comment on a talk namespace. Specifically, we calculated the percentage of Junior Contributors who made at least one edit to a page in a talk namespace in each of the following retention windows:

* 1 week after making a comment (2-8 days). Note: Since user activity naturally comes in bursts, we excluded the time (first 24 hours) immediately following the Contributor's first edit.
* 2 weeks after making a comment (9-15 days)

Data for this analysis came from events logged in mediawiki_history. We only reviewed comments that were not reverted within 48 hours. Due to the availability of data at the time of this analysis, we only reviewed retention that occurred within the first and second week after the Contributor’s first edit.
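
The retention windows described above can be sketched as follows. This is a toy example in base R, assuming a hypothetical data frame `edits` with one row per non-reverted talk-namespace edit; it also assumes the windows are counted as whole days since each user's first comment (days 2 to 8 and days 9 to 15), which may differ from the exact boundaries used in the analysis.

```r
# Hypothetical toy data: one row per non-reverted talk-namespace edit.
edits <- data.frame(
  user_id = c(1, 1, 2, 2, 3, 4, 4),
  edit_date = as.Date(c("2021-02-12", "2021-02-15", "2021-02-12", "2021-02-23",
                        "2021-02-14", "2021-02-20", "2021-02-20"))
)

# Days elapsed since each user's first comment.
edit_day <- as.numeric(edits$edit_date)
first_day <- ave(edit_day, edits$user_id, FUN = min)
edits$days_since_first <- edit_day - first_day

# The cohort is every user with at least one comment.
cohort <- unique(edits$user_id)

# Percent of the cohort with at least one edit between day lo and day hi.
retained <- function(lo, hi) {
  in_window <- edits$days_since_first >= lo & edits$days_since_first <= hi
  100 * length(unique(edits$user_id[in_window])) / length(cohort)
}

week_one_pct <- retained(2, 8)
week_two_pct <- retained(9, 15)
c(week_one = week_one_pct, week_two = week_two_pct)
```

In this toy data, user 1 returns on day 3 and user 2 returns on day 11, so one of four users is retained in each window. User 4's second edit falls on the same day as the first and is excluded.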

Note that while the test was completed on 11 March 2021, users that were in the AB test continued to have the same experience, which allows us to compare user behavior that occurred after the AB test (see T276967).

```r
# Join the week 1 and week 2 retention rates (matched on their shared columns)
retention_rates_two <- inner_join(week_one_retention, week_two_retention)
```

Junior Contributors retention rate¹

| Editing experience² | Week 1 (2-8 days)³ | Week 2 (9-15 days)³ |
|---|---|---|
| Page editing | 13.34% | 8.31% |
| Reply tool | 14.11% | 7.37% |

¹ Defined as percent of contributors that made a comment during the AB test and returned to make another comment.

² Sampling rate for Non-Reply Tool events is 6.25%

³ Defined as days since first comment during the AB test.

**Figure 13**: Percent of Junior Contributors that made a comment and returned to make another comment. Week 1 is defined as 2-8 days following their first edit. Week 2 is defined as 9 to 15 days following their first edit


The retention rates observed for page editing and the reply tool differ by less than one percentage point in both windows. Most Contributors that returned to make an edit within the first two weeks made their return edit within week 1 (2 to 8 days following their first edit).

By Experience Level

**Figure 14**: Percent of Contributors that made a comment and returned to make another comment by experience level


Broken down by experience level, we see a higher retention rate for Non-Junior Contributors than for Junior Contributors with both editing experience types; however, there is not a lot of variation between the page editing and reply tool retention rates at either experience level. Most Contributors that returned to make an edit within the first two weeks made their return edit within week 1 (2 to 8 days following their first edit).


  1. Screenshot available on Wikimedia Commons, licensed under the MIT License.

  2. Note that we excluded data from 25 February 2021 to 1 March 2021 in this analysis due to an error in the sampling configuration that resulted in the loss of non-reply tool edit events.

  3. Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. https://doi.org/10.1017/9781139161879.

  4. 48 hours is a common cutoff, as research suggests that, at least for the English Wikipedia, nearly all reverts take place within 48 hours. Source: Research: Revert. Mediawiki. https://meta.wikimedia.org/wiki/Research:Revert.