The Wikimedia Foundation’s Editing team is working to improve how contributors communicate on Wikipedia using talk pages through a series of incremental improvements that will be released over time.
As part of this effort, the Editing team introduced a new workflow for replying to specific comments, intended to make productive participation on talk pages easier and more intuitive. The reply tool is an extra button that appears at the end of a post on a talk page (as shown in the screenshot below). When you click on it, it opens a reply form that automatically signs and indents wikitext talk page comments and offers a quick way to ping other users, among other features 1.
The team ran an AB test of the Reply Tool from 11 February 2021 through 10 March 2021 2 to assess the efficacy of this new feature, specifically for Junior Contributors (defined as having under 100 cumulative edits). The test included logged-in users who had not previously interacted with the reply tool (defined as users whose discussiontools-editmode preference is empty) and who viewed one of the 22 participating Wikipedias (see full list of participating Wikipedias) during the AB test. During this test, 50% of the included users had the reply tool automatically enabled, and 50% did not. Users at these Wikipedias were still able to turn the tool on or off in Special:Preferences.
You can find more information about features of this tool and project updates on the project page.
The primary goal of the AB test was to test the hypothesis that using the reply tool will increase the likelihood of a Junior Contributor publishing a comment they start without a significant increase in disruption.
For this analysis, our key performance indicator to evaluate the impact of the tool is comment completion rate, which we define as:
Of the contributors that opened the Reply tool or page editing to make a comment, the percent of contributors that successfully published at least one comment on a talk page
We assessed disruption by looking at the percent of comments made to talk pages that are reverted within 48 hours and the percent of contributors who are blocked after making an edit to a talk page.
The results of this analysis will be used to determine if the reply tool should be deployed as default to all Wiki projects, as an opt-out user preference. Please see further details about the hypotheses, key performance indicators, and decision scenarios in the task description.
The AB test was run on a per Wikipedia basis and contributors included in the test were randomly assigned to either the control (reply tool disabled by default) or treatment (reply tool enabled by default) based on their user ID. Contributors within each group also had the option to explicitly turn the tool on or off in their preferences; however, these contributors remained in the same group they were bucketed in for the duration of the test.
Upon conclusion of the test on 10 March 2021, we recorded a total of 10,175 comment attempts initiated across both test groups by 4,690 distinct contributors across all experience levels. A total of 2,612 (55.7%) of these contributors were identified as Junior Contributors. Data was collected in EditAttemptStep.
In this test, a user can complete a comment using the Reply Tool or using reply workflows available with wikitext full-page and section editing. For the purpose of this analysis, these two types of editing experiences are defined as follows:
Reply Tool Comment: Any edit to a talk page namespace made with the reply tool. The reply tool allows edits using both visual and source modes. Reply tool edits exclude edits that create new sections or new pages on a talk page using either the new discussion tool or wikitext editing. Reply tool events were sampled at 100%.
Recorded in EditAttemptStep as: event.action = 'init', event.integration = 'discussiontools', event.init_type = 'page'
Page Editing Comment: Any edit to a talk page namespace that was not made with the reply tool and that did not create a new section or new page using either the new discussion tool or wikitext editing. Note that it’s possible that some of these edits were corrective edits (e.g. fixing a signature), but current instrumentation does not distinguish between this type of edit and a reply. These events were sampled at a rate of 1/16, or 6.25%.
Recorded in EditAttemptStep as: event.action = 'init', event.integration = 'page', event.init_type = 'section' or 'page', event.init_mechanism = 'click'
See the following Phabricator tickets for further details regarding instrumentation and implementation of the AB test:
In line with the purpose of this test, we focused primarily on determining whether the reply tool had an impact on Junior Contributors’ comment completion rate; however, we also reviewed the reply tool's impact on comment completion rates across all contributors to provide insight into differences due to experience level.
In this analysis, we’ve defined two experience levels, junior and non-junior: Junior Contributors are contributors with under 100 cumulative edits, and Non-Junior Contributors are contributors with over 100 cumulative edits.
Note that this binary definition doesn’t fully capture the gradual growth of an editor. For example, using this binary definition, a contributor with 99 edits would have to make just 1 more edit to be redefined as a Non-Junior Contributor, and their modeled probability of completing an edit would suddenly increase. Also, changes at lower edit counts (e.g. 1 to 2 edits) indicate a higher impact than changes at higher edit counts (e.g. 5,000 to 5,001 edits). However, we used the binary definition as it aligns with how we’ve defined the target audience and helps simplify the model for the purposes of this analysis.
Contributors comment completion rate | |||
---|---|---|---|
Across all experience levels and participating Wikipedias | |||
Editing experience | Number of users attempted | Number of users completed | Completion rate1 |
Page editing2 | 2650 | 1369 | 51.7% |
Reply tool3 | 2040 | 1406 | 68.9% |
1
Defined as percent of contributors that make a comment attempt and publish at least 1 comment.
2
Sampling rate for Non-Reply Tool events is 6.25%
3
Sampling rate for Reply Tool events is 100%
|
Across all contributor experience levels, there was a 33% relative increase in the percent of contributors that were able to successfully publish a comment using the reply tool (68.9%) compared to contributors using page editing methods (51.7%). This percent change is much lower than what we found when focusing only on Junior Contributors’ comment completion rates, indicating that experience level has a significant effect on the impact of the reply tool.
68.9% of Contributors across all experience levels were able to publish at least one comment using the reply tool. This is slightly lower than the percent of Junior Contributors (72.9%) that published a comment using the reply tool. However, the comment completion rate for page editing comments is much higher when looking at Contributors across all experience levels: 51.7% of Contributors across all experience levels were able to complete at least one comment using non-reply tool editing interfaces, compared to only 27.6% of Junior Contributors.
Based on this observed data, it appears that experience level has a small impact on the ability of contributors to publish a comment using the reply tool but a large impact on their ability to publish a comment using page editing.
Contributors comment completion rate by experience level | ||||
---|---|---|---|---|
Across all participating Wikipedias | ||||
Experience level | Editing experience | Number of users attempted | Number of users completed | Completion rate1 |
Non-Junior Contributor2 | Page editing | 1365 | 1010 | 74% |
Non-Junior Contributor | Reply tool | 738 | 450 | 61% |
Junior Contributor3 | Page editing | 1301 | 359 | 27.6% |
Junior Contributor | Reply tool | 1311 | 956 | 72.9% |
1
Defined as percent of contributors that make a comment attempt and publish at least 1 comment.
2
Defined as having over 100 cumulative edits
3
Defined as having under 100 cumulative edits
|
When comparing the Junior Contributor comment completion rate to that of Non-Junior Contributors, we see a clear difference between the two experience levels. Junior Contributors’ comment completion rate with page editing methods (27.6%) is much lower than the rate observed for Non-Junior Contributors (74%) during the AB test. However, using the reply tool, Junior Contributors’ comment completion rate (72.9%) was almost the same as the Non-Junior Contributors’ comment completion rate using page editing.
Since comment completion rates seem to vary significantly based on the contributor’s experience level, we adjusted the Bayesian Hierarchical Regression Model to include the Contributors’ experience level as an interaction term in the model in addition to the effects of the user and wiki on comment completion rate.
The above plot shows the predicted effects of the Contributor’s experience level and the type of editor that they used (reply tool vs page editing) on the probability of successfully publishing a comment that they started on a talk page.
Based on the model, we can confirm the following:
- A junior contributor using the reply tool is significantly more likely to successfully publish a comment than a junior contributor using page editing.
- A junior contributor using the reply tool is roughly as likely to publish a comment as a non-junior contributor using page editing.
- A non-junior contributor is more likely to publish a comment using page editing than using the reply tool.
The higher page editing comment completion rate we see for Non-Junior Contributors may be due to a tendency to stick with what they already know. In addition, our page editing definition currently includes corrective edits made to the page, which are frequently made by more Senior Contributors and are not possible with the reply tool.
We also wanted to ensure that enabling the reply tool did not result in an increase in the number of disruptive edits being made to talk pages.
To evaluate any disruption caused by the reply tool, we determined the percent of comments made to talk pages that were reverted within 48 hours and the percent of contributors blocked after making a comment to a talk page.
For this analysis, we reviewed data recorded in mediawiki_history to identify the percent of comments posted with the reply tool (identified by the revision tag discussiontools-reply) on talk pages that were reverted within 48 hours 4.
We compared the revert rate for comments published using the reply tool to the revert rate for comments made using page editing during the same timeframe.
The reviewed data excludes wikitext edits to create new pages and edits to start new topics using the new discussion tool.
Junior contributors comment revert rate across all participating Wikipedias | |||
---|---|---|---|
Across all participating Wikipedias | |||
Editing experience1 | Number of comments reverted | Number of comments published | Revert rate2 |
Page editing | 2648 | 24615 | 10.76 % |
Reply tool | 60 | 2716 | 2.21 % |
1
Data comes from mediawiki_history
2
Defined as percent of comments reverted within 48 hours.
|
Overall, across all participating Wikipedias, we observed a 79.5% decrease in the revert rate for comments made with the reply tool compared to page editing. The reply tool seems to enable Junior Contributors not only to successfully complete a comment but also to reduce the number of errors in the published comment that might lead to the comment being reverted.
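As a rough illustration of how the 48-hour revert flag and the relative change above can be computed, the sketch below uses a toy table with hypothetical column names (`interface`, `seconds_to_revert`) and then recomputes the 79.5% figure from the Junior Contributor totals reported in this section. It is an example under stated assumptions, not the query that was run against mediawiki_history.

```r
# Toy revision table; column names are hypothetical, not the mediawiki_history schema.
comments <- data.frame(
  interface = c("page", "page", "page", "page", "reply_tool", "reply_tool"),
  # Seconds between the comment and its revert; NA means it was never reverted.
  seconds_to_revert = c(3600, NA, 200000, 86400, NA, NA),
  stringsAsFactors = FALSE
)

revert_window <- 48 * 60 * 60  # 48 hours, in seconds

# A comment counts as reverted only if the revert landed inside the window.
comments$reverted_48h <- !is.na(comments$seconds_to_revert) &
  comments$seconds_to_revert <= revert_window

# Percent of comments reverted within 48 hours, per editing interface.
revert_rate <- tapply(comments$reverted_48h, comments$interface, mean) * 100

# Relative change between two rates, in percent.
pct_change <- function(new, old) 100 * (new - old) / old

# Junior Contributor totals reported in this section.
page_rate  <- 100 * 2648 / 24615
reply_rate <- 100 * 60 / 2716
observed_change <- round(pct_change(reply_rate, page_rate), 1)
```

With the reported totals, `observed_change` is negative because the reply tool revert rate is the lower of the two.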
Junior Contributors comment revert rate by participating Wikipedia | ||||
---|---|---|---|---|
Wikipedia | Editing experience1 | Number of comments reverted | Number of comments published | Revert rate2 |
Afrikaans Wikipedia | Page editing | 0 | 50 | 0 % |
Afrikaans Wikipedia | Reply tool | 0 | 5 | 0 % |
Amharic Wikipedia | Page editing | 0 | 9 | 0 % |
Amharic Wikipedia | Reply tool | 0 | 1 | 0 % |
Bengali Wikipedia | Page editing | 93 | 898 | 10.36 % |
Bengali Wikipedia | Reply tool | 1 | 11 | 9.09 % |
Chinese Wikipedia | Page editing | 63 | 1003 | 6.28 % |
Chinese Wikipedia | Reply tool | 2 | 89 | 2.25 % |
Dutch Wikipedia | Page editing | 29 | 539 | 5.38 % |
Dutch Wikipedia | Reply tool | 2 | 170 | 1.18 % |
Egyptian Wikipedia | Page editing | 4 | 94 | 4.26 % |
Egyptian Wikipedia | Reply tool | 0 | 2 | 0 % |
French Wikipedia | Page editing | 139 | 3882 | 3.58 % |
French Wikipedia | Reply tool | 11 | 651 | 1.69 % |
Hebrew Wikipedia | Page editing | 160 | 1379 | 11.6 % |
Hebrew Wikipedia | Reply tool | 0 | 110 | 0 % |
Hindi Wikipedia | Page editing | 241 | 726 | 33.2 % |
Hindi Wikipedia | Reply tool | 0 | 15 | 0 % |
Indonesian Wikipedia | Page editing | 64 | 523 | 12.24 % |
Indonesian Wikipedia | Reply tool | 1 | 17 | 5.88 % |
Italian Wikipedia | Page editing | 134 | 2332 | 5.75 % |
Italian Wikipedia | Reply tool | 5 | 493 | 1.01 % |
Japanese Wikipedia | Page editing | 252 | 1636 | 15.4 % |
Japanese Wikipedia | Reply tool | 3 | 103 | 2.91 % |
Korean Wikipedia | Page editing | 163 | 1349 | 12.08 % |
Korean Wikipedia | Reply tool | 1 | 23 | 4.35 % |
Oromo Wikipedia | Page editing | 0 | 3 | 0 % |
Persian Wikipedia | Page editing | 306 | 3367 | 9.09 % |
Persian Wikipedia | Reply tool | 1 | 102 | 0.98 % |
Polish Wikipedia | Page editing | 114 | 877 | 13 % |
Polish Wikipedia | Reply tool | 2 | 143 | 1.4 % |
Portuguese Wikipedia | Page editing | 307 | 1805 | 17.01 % |
Portuguese Wikipedia | Reply tool | 3 | 330 | 0.91 % |
Spanish Wikipedia | Page editing | 429 | 2818 | 15.22 % |
Spanish Wikipedia | Reply tool | 25 | 336 | 7.44 % |
Swahili Wikipedia | Page editing | 0 | 21 | 0 % |
Swahili Wikipedia | Reply tool | 0 | 6 | 0 % |
Thai Wikipedia | Page editing | 14 | 168 | 8.33 % |
Thai Wikipedia | Reply tool | 0 | 18 | 0 % |
Ukrainian Wikipedia | Page editing | 65 | 662 | 9.82 % |
Ukrainian Wikipedia | Reply tool | 2 | 68 | 2.94 % |
Vietnamese Wikipedia | Page editing | 71 | 474 | 14.98 % |
Vietnamese Wikipedia | Reply tool | 1 | 23 | 4.35 % |
1
Data comes from mediawiki_history. Sampling rate is 100% for all events
2
Defined as percent of comments reverted within 48 hours.
|
Some per-Wikipedia trend highlights:
Contributors comment revert rate by experience level | ||||
---|---|---|---|---|
Experience level1 | Editing experience2 | Number of comments reverted | Number of comments published | Revert rate3 |
Non-Junior Contributor | Page editing | 4177 | 272605 | 1.53 % |
Non-Junior Contributor | Reply tool | 136 | 7147 | 1.9 % |
Junior Contributor | Page editing | 2648 | 24615 | 10.76 % |
Junior Contributor | Reply tool | 60 | 2716 | 2.21 % |
1
Junior Contributor is defined as having under 100 cumulative edits. Non-Junior Contributor is defined as having over 100 cumulative edits
2
Data comes from mediawiki_history
3
Defined as percent of comments reverted within 48 hours.
|
Similar to our finding for comment completion rate, experience level has an impact on the revert rate for comments published using page editing. With the reply tool, the revert rate for Junior Contributors (2.21%) is only slightly higher than for Non-Junior Contributors (1.9%), but with page editing there is a large difference between the two experience levels (10.76% vs. 1.53%).
We also reviewed the number of Junior Contributors blocked after posting a comment using the reply tool.
Data comes from the mediawiki_user_history table. All block events were identified in the data by caused_by_event_type = "alterblocks". The data includes any Contributors that were blocked after posting a comment; however, we do not know if they were blocked specifically due to the comment posted. Data is also currently limited to dates of the AB test. Users may have been blocked following this analysis.
Junior Contributors blocked after publishing a comment by participating Wikipedia | ||||
---|---|---|---|---|
Wikipedia | Editing experience1 | Number of users blocked | Number of users that made a comment | Percent of users blocked2 |
Afrikaans Wikipedia | Page editing | 0 | 21 | 0 % |
Afrikaans Wikipedia | Reply tool | 0 | 2 | 0 % |
Amharic Wikipedia | Page editing | 0 | 7 | 0 % |
Amharic Wikipedia | Reply tool | 0 | 1 | 0 % |
Bengali Wikipedia | Page editing | 0 | 279 | 0 % |
Bengali Wikipedia | Reply tool | 0 | 7 | 0 % |
Chinese Wikipedia | Page editing | 7 | 290 | 2.41 % |
Chinese Wikipedia | Reply tool | 1 | 38 | 2.63 % |
Dutch Wikipedia | Page editing | 1 | 189 | 0.53 % |
Dutch Wikipedia | Reply tool | 0 | 54 | 0 % |
Egyptian Wikipedia | Page editing | 0 | 33 | 0 % |
Egyptian Wikipedia | Reply tool | 0 | 2 | 0 % |
French Wikipedia | Page editing | 15 | 1475 | 1.02 % |
French Wikipedia | Reply tool | 5 | 331 | 1.51 % |
Hebrew Wikipedia | Page editing | 9 | 354 | 2.54 % |
Hebrew Wikipedia | Reply tool | 0 | 41 | 0 % |
Hindi Wikipedia | Page editing | 1 | 307 | 0.33 % |
Hindi Wikipedia | Reply tool | 0 | 8 | 0 % |
Indonesian Wikipedia | Page editing | 7 | 198 | 3.54 % |
Indonesian Wikipedia | Reply tool | 1 | 12 | 8.33 % |
Italian Wikipedia | Page editing | 16 | 701 | 2.28 % |
Italian Wikipedia | Reply tool | 5 | 210 | 2.38 % |
Japanese Wikipedia | Page editing | 9 | 448 | 2.01 % |
Japanese Wikipedia | Reply tool | 0 | 37 | 0 % |
Korean Wikipedia | Page editing | 1 | 207 | 0.48 % |
Korean Wikipedia | Reply tool | 0 | 7 | 0 % |
Oromo Wikipedia | Page editing | 0 | 2 | 0 % |
Persian Wikipedia | Page editing | 27 | 1058 | 2.55 % |
Persian Wikipedia | Reply tool | 1 | 42 | 2.38 % |
Polish Wikipedia | Page editing | 10 | 299 | 3.34 % |
Polish Wikipedia | Reply tool | 2 | 58 | 3.45 % |
Portuguese Wikipedia | Page editing | 25 | 746 | 3.35 % |
Portuguese Wikipedia | Reply tool | 4 | 148 | 2.7 % |
Spanish Wikipedia | Page editing | 13 | 1074 | 1.21 % |
Spanish Wikipedia | Reply tool | 3 | 193 | 1.55 % |
Swahili Wikipedia | Page editing | 0 | 9 | 0 % |
Swahili Wikipedia | Reply tool | 0 | 5 | 0 % |
Thai Wikipedia | Page editing | 2 | 68 | 2.94 % |
Thai Wikipedia | Reply tool | 0 | 7 | 0 % |
Ukrainian Wikipedia | Page editing | 2 | 222 | 0.9 % |
Ukrainian Wikipedia | Reply tool | 2 | 24 | 8.33 % |
Vietnamese Wikipedia | Page editing | 2 | 203 | 0.99 % |
Vietnamese Wikipedia | Reply tool | 0 | 10 | 0 % |
1
Data comes from mediawiki_user_history
2
Percent of junior contributors blocked after posting a comment during the AB test
|
Overall, across all participating Wikipedias, 1.94% of Junior Contributors were blocked after posting a comment with the reply tool, which is only slightly higher than the percent of Junior Contributors who were blocked after making a comment using page editing during the same timeframe (1.79%).
Fewer than 3.5% of Junior Contributors who used the reply tool were blocked on each participating Wikipedia, with the exception of Ukrainian Wikipedia and Indonesian Wikipedia. 8.33% of Ukrainian Wikipedia Junior Contributors who used the reply tool were blocked; however, only 24 users posted comments with the reply tool on this project during the reviewed timeframe, and we confirmed that the two blocked users were the same two users blocked after commenting with page editing. Indonesian Wikipedia had only 12 users who made a comment with the reply tool during the AB test, so it is difficult to confirm whether this blocked rate is representative of the population.
Contributors blocked after publishing a comment by experience level | ||||
---|---|---|---|---|
Editing experience1 | Experience level | Number of users blocked | Number of users that made a comment | Percent of users blocked2 |
Page editing | Non-Junior Contributor | 103 | 8500 | 1.21 % |
Page editing | Junior Contributor | 147 | 8190 | 1.79 % |
Reply tool | Non-Junior Contributor | 19 | 1048 | 1.81 % |
Reply tool | Junior Contributor | 24 | 1237 | 1.94 % |
1
Data comes from mediawiki_user_history
2
Percent of contributors blocked after posting a comment during the AB test
|
There’s not a large difference in the percent of users blocked between experience levels. Under 2% of users were blocked in both experience levels and using both editing experience methods (page editing and reply tool).
We also explored whether the reply tool led a greater number of Junior Contributors to start participating productively on talk pages and whether it led a greater percentage of Junior Contributors to continue participating productively on talk pages.
This metric was defined as the number of distinct Junior Contributors who make at least one edit to a page in a talk namespace that is not reverted within 48 hours. Since different sampling rates were applied to each editor type, we removed any events that were oversampled (sampling rate increased to 100%) to allow us to directly compare the numbers between the two groups.
Number of Junior Contributors that made a comment during the AB test by test group and interface1 | ||
---|---|---|
Editing experience | Number of users that attempted a comment | Number of users that published a comment |
control2 | ||
Page editing | 654 | 199 |
Reply tool | 2 | 2 |
test3 | ||
Page editing | 592 | 160 |
Reply tool | 135 | 89 |
1
Based on a sampling rate of 6.25% for all events. Any oversampled events were removed so data for the two editor types could be directly compared
2
Users were not shown the reply tool by default
3
Users were shown the reply tool by default
|
A few explanations regarding the numbers above:
* There are Contributors for each editing experience type in each AB test group. This is because contributors within each group also had the option to explicitly turn the tool on or off in their preferences; however, these contributors remained in the same group they were bucketed in for the duration of the test.
* Reply tool comment attempts that appear in control come only from people who manually enabled the feature and then made a comment.
* There are a number of page editing Contributors in the test group because page editing is still available even when the reply tool is enabled. Contributors on a talk page might continue to use page editing because they are not aware of the new reply tool, prefer to use what they know, or are making corrective edits to the page.
Across all participating Wikipedias, 23.8% more Junior Contributors completed a comment attempt when the reply tool was enabled by default than when it was not.
In addition, we looked into whether Junior Contributors who made at least one comment during the AB test (our cohort) returned to make another comment in a talk namespace. Specifically, we calculated the percentage of Junior Contributors who made at least one edit to a page in a talk namespace in each of the following retention windows:
* 1 week after making a comment (2-8 days). Note: Since user activity naturally comes in bursts, we excluded the first 24 hours immediately following the Contributor’s first edit.
* 2 weeks after making a comment (9-15 days)
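The retention windows above can be sketched as follows. The data frames, column names, and dates are hypothetical toy values, and the sketch assumes that day 1 is the calendar day of the first comment, so that days 2-8 and 9-15 form the two windows; the actual analysis may have counted time differently.

```r
# Toy cohort: one row per contributor with the date of their first comment.
cohort <- data.frame(
  user_id = 1:4,
  first_comment = as.Date(c("2021-02-12", "2021-02-12", "2021-02-15", "2021-02-20"))
)

# Toy talk namespace edits made by the same contributors.
talk_edits <- data.frame(
  user_id = c(1L, 1L, 2L, 3L, 4L),
  edit_date = as.Date(c("2021-02-12", "2021-02-16", "2021-02-23",
                        "2021-03-20", "2021-02-21"))
)

# Attach each edit to the contributor's first comment date.
edits <- merge(talk_edits, cohort, by = "user_id")

# Assumption: day 1 is the day of the first comment, so day 2 starts the window.
edits$day_number <- as.integer(edits$edit_date - edits$first_comment) + 1L

# Percent of the cohort with at least one edit between from_day and to_day.
retention_rate <- function(edits, cohort, from_day, to_day) {
  in_window <- edits$day_number >= from_day & edits$day_number <= to_day
  returned <- unique(edits$user_id[in_window])
  100 * mean(cohort$user_id %in% returned)
}

week_one <- retention_rate(edits, cohort, 2, 8)
week_two <- retention_rate(edits, cohort, 9, 15)
```

In this toy cohort, two of the four contributors return in the first window and one returns in the second; edits made on day 1 or after day 15 do not count toward either window.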
Data for this analysis came from events logged in mediawiki_history. We only reviewed comments that were not reverted within 48 hours. Due to the availability of data at the time of this analysis, we only reviewed retention that occurred within the first and second week after the Contributor’s first edit.
Note that while the test was completed on 11 March 2021, users that were in the AB test continued to have the same experience, which allows us to compare user behavior that occurred after the AB test (see T276967).
# Join all the data
retention_rates_two <- inner_join(week_one_retention, week_two_retention)
Junior contributors retention rate1 | ||
---|---|---|
Editing experience2 | Week 1 (2-8 days)3 | Week 2 (9-15 days)3 |
Page editing | 13.34 % | 8.31 % |
Reply tool | 14.11 % | 7.37 % |
1
Defined as percent of contributors that made a comment during the AB test and returned to make another comment.
2
Sampling rate for Non-Reply Tool events is 6.25%
3
Defined as days since first comment during the AB test.
|
There is not a lot of variation between the retention rates observed for page editing and the reply tool. Most Contributors that returned to make an edit within the first two weeks made their return edit within week 1 (2 to 8 days following their first).
Broken down by experience level, we see a higher retention rate for Non-Junior Contributors compared to Junior Contributors for both editing experience types; however, there is not a lot of variation between the page editing and reply tool retention rates for either experience level. Most Contributors that returned to make an edit within the first two weeks made their return edit within week 1 (2 to 8 days following their first).
1. Screenshot available on Wikimedia Commons, licensed under the MIT License.↩
2. Note that we excluded data from 25 February 2021 to 1 March 2021 in this analysis due to an error in the sampling configuration that resulted in the loss of non-reply tool edit events.↩
3. Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. https://doi.org/10.1017/9781139161879.↩
4. 48 hours is a common cutoff, as research suggests that, at least for the English Wikipedia, nearly all reverts take place within 48 hours. Source: Research: Revert. Mediawiki. https://meta.wikimedia.org/wiki/Research:Revert.↩
Comment completion rate for Junior Contributors
We first calculated the comment completion rate for Junior Contributors by editing experience (i.e. Was the contributor able to successfully save at least one comment with or without using the reply tool?) overall and across each participating Wikipedia.
For this analysis, we are defining the comment completion rate as the percent of contributors that successfully published (event.action = 'saveSuccess') at least one comment after opening a particular editing interface (event.action = 'init') during the time of the AB test. Note that this does not take into account the number of attempts it took for the user to publish or the duration of their editing sessions.
Overall comment completion rate by Junior Contributors
1 Defined as percent of contributors that made a comment attempt and publish at least 1 comment.
2 Sampling rate for Non-Reply Tool events is 6.25%
3 Sampling rate for Reply Tool events is 100%
Figure 2: Percent of Junior Contributors that completed at least one comment attempt on a talk page during the AB test.
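The completion rate definition above can be made concrete with a small sketch. It assumes a simplified event table with hypothetical column names (`user_id`, `integration`, `action`) standing in for the EditAttemptStep fields, and toy rows rather than real AB test data; it is not the code used for this report.

```r
# Toy event log; each row is one logged event for one contributor.
events <- data.frame(
  user_id     = c(1, 1, 2, 3, 3, 3, 4, 5, 5),
  integration = c("discussiontools", "discussiontools", "discussiontools",
                  "page", "page", "page", "page", "page", "page"),
  action      = c("init", "saveSuccess", "init",
                  "init", "init", "saveSuccess", "init", "init", "init"),
  stringsAsFactors = FALSE
)

completion_rate <- function(events) {
  # Contributors who opened an editing interface (action == "init").
  started <- unique(events[events$action == "init", c("user_id", "integration")])
  # Contributors who published at least once (action == "saveSuccess").
  saved <- unique(events[events$action == "saveSuccess", c("user_id", "integration")])
  # Flag each starter who also has a successful save in the same interface.
  started$completed <- paste(started$user_id, started$integration) %in%
    paste(saved$user_id, saved$integration)
  attempted <- tapply(started$completed, started$integration, length)
  completed <- tapply(started$completed, started$integration, sum)
  data.frame(
    integration = names(attempted),
    attempted   = as.vector(attempted),
    completed   = as.vector(completed),
    rate_pct    = round(100 * as.vector(completed) / as.vector(attempted), 1),
    stringsAsFactors = FALSE
  )
}

rates <- completion_rate(events)
```

Because the rate is computed per contributor, repeated attempts by the same person count once, which matches the note above that the number of attempts is not taken into account.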
Comment completion rate by participating Wikipedia
1 Sampling rate for Non-Reply Tool events is 6.25%
2 Sampling rate for Reply Tool events is 100%
3 Defined as percent of contributors that make a comment attempt and publish at least 1 comment.
Figure 3: Percent of Junior Contributors that completed a comment attempt on a talk page. There were a limited number of AB test events recorded for Swahili, Afrikaans, and Egyptian Wikipedia and no recorded AB test events for Amharic and Oromo Wikipedia. As a result, these Wikipedia projects were removed from the chart above as we are not able to conclude any effects from the reply tool on these specific projects.
Junior contributors had a much higher success rate posting a comment using the reply tool compared to page editing. Overall, 72.9% of all Junior Contributors that made a comment attempt were able to successfully publish at least 1 comment with the reply tool, while only 27.6% of all Junior Contributors successfully published a comment using page editing. This represents a 164% (2.6x) observed increase in comment completion rate.
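For reference, the relative increase quoted above follows directly from the two observed completion rates:

$$
\frac{72.9\% - 27.6\%}{27.6\%} \approx 1.64 \;(\text{a } 164\% \text{ increase}), \qquad \frac{72.9\%}{27.6\%} \approx 2.6
$$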
This trend is reflected consistently for each participating Wikipedia as well. Junior Contributors had a higher comment completion rate using the reply tool compared to non-reply tool editor interfaces on every participating Wikipedia.
Indonesian, Japanese, Dutch and Spanish Wikipedias saw the highest percent increases in comment completion rates with the reply tool. We observed the two lowest percent increases in comment completion rate for Persian (42% increase) and Hebrew (55% increase) Wikipedias. These are both right-to-left languages, which might affect the reply tool experience and workflows for contributors on these projects; however, the AB test recorded too little data for right-to-left languages to confirm any effect of language direction.
Modeling the impact of the reply tool
We next explored different models to infer the impact of the reply tool on whether a comment was completed, while accounting for the random effects of the user and wiki. This allows us to confirm whether the observed increase above is statistically significant (unlikely to have occurred by chance).
Comment attempts completed on the same Wikipedia and by the users on that Wikipedia are related to each other. Therefore, we can more accurately infer the impact of the reply tool by accounting for the effect of the user and wiki on the success probability of a Junior Contributor completing an edit.
We used a Bayesian Hierarchical regression model to model this structure. In this model, the user and Wikipedia are random effects and whether the reply tool was used is the fixed effect or predictor variable.
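One way to write down the structure described above is the following sketch. The indexing (comment attempt $i$ by user $j[i]$ on wiki $k[i]$), the normal distributions for the random effects, and the symbols $u$, $w$, $\sigma_u$, and $\sigma_w$ are assumptions made for illustration; the report does not state the exact specification or priors.

$$
\begin{aligned}
y_i &\sim \text{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \beta_0 + \beta_1 x_i + u_{j[i]} + w_{k[i]} \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad w_k \sim \mathcal{N}(0, \sigma_w^2)
\end{aligned}
$$

Here $y_i = 1$ if the comment was published, $x_i = 1$ if the reply tool was used and $0$ for page editing, $\beta_0$ is the intercept, and $\beta_1$ is the fixed effect of the reply tool on the log-odds scale.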
1 CI: Credible Interval
2 Maximum lift calculated using the divide-by-4-rule
3 Average lift = Pr(Success | Reply Tool) - Pr(Success | Page Editing) = $\operatorname{logit}^{-1}(\beta_0 + \beta_1) - \operatorname{logit}^{-1}(\beta_0)$
Since the model parameters are on the log-odds scale, we needed to apply the following transformations to make sense of them:
* We used the “divide-by-4” rule suggested by Gelman, Hill, and Vehtari 2021 3 to approximate the maximum increase in the probability of success corresponding to which editing interface (reply tool or page editing) was used. Using the Bayesian model, we can also directly calculate the average lift.
* We exponentiated the effect (exp(β1)) to determine the multiplicative effect on the odds of a Junior Contributor successfully publishing at least 1 comment.
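The log-odds transformations described above can be sketched in a few lines of base R. The coefficient values here are hypothetical, chosen only to be roughly in line with the observed completion rates in this report; they are not the fitted posterior estimates.

```r
# Hypothetical coefficients on the log-odds scale (NOT the fitted values).
beta0 <- -0.96  # intercept: log-odds of publishing with page editing
beta1 <- 1.95   # effect of using the reply tool

# Multiplicative effect of the reply tool on the odds of publishing.
odds_ratio <- exp(beta1)

# Divide-by-4 rule: upper bound on the change in probability.
max_lift <- beta1 / 4

# Average lift: difference in predicted probabilities.
# plogis() is the inverse logit function in base R.
avg_lift <- plogis(beta0 + beta1) - plogis(beta0)
```

With these illustrative inputs, the odds ratio comes out near 7, the divide-by-4 bound near 0.49, and the average lift near 0.45. The divide-by-4 value is always at least as large as the actual difference in probabilities, which is why it is read as a maximum.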
Based on estimates from the model, we found that Junior Contributors who open the reply tool have about 7 times the odds of successfully publishing a comment compared with Junior Contributors who use page editing.
We also found an average increase of 45 percentage points (maximum 49 percentage points) in the probability of a Junior Contributor publishing a comment when they switch from using page editing to the reply tool.
We can confirm statistical significance at the 0.05 level for all of these estimates (as indicated by credible intervals that do not cross 1).