User:Amgine/Wikidata random sample test
This is a simple check comparing recent user contributions on Wikidata against the same users' contributions across all WMF wikis. My hypothesis is that at least one of these 10 contributors will have reduced their non-Wikidata contributions, or that xyr most recent edits will be solely to Wikidata; either outcome is a reasonable proxy measure of contributor poaching.
Methodology
I will select 10 random items on Wikidata and examine the most recent human contributor to each, checking the timestamps of that user's 10 most recent contributions on Wikidata and of the 10 most recent global contributions. A more thorough model would examine contribution timestamps in time blocks before and after the user's initial Wikidata contribution, taking into account contribution size, frequency, diversity of articles edited, and total number of edits.
Methodology change: after 10 random items turned up nothing but bot edits, I am instead selecting a random edit from the top 500 Recent Changes, excluding bots. This necessarily biases the sample toward humans currently editing Wikidata.
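The revised selection step could be sketched roughly as follows. This is only an illustration, not the procedure actually used for the sample, which was done by hand. The parameter names belong to the MediaWiki API's `list=recentchanges` module; the function names and the assumption that each change record carries a boolean `bot` key (as in `formatversion=2` output with `rcprop=flags`) are mine.

```python
import random


def recent_changes_params(limit=500):
    """Build query parameters for list=recentchanges, excluding bot edits."""
    return {
        "action": "query",
        "list": "recentchanges",
        "rcshow": "!bot",                      # server-side bot filter
        "rctype": "edit",
        "rcprop": "user|title|timestamp|flags",
        "rclimit": limit,
        "format": "json",
        "formatversion": 2,
    }


def pick_random_human_edit(changes, rng=random):
    """Pick one change uniformly at random from those not flagged as bot.

    `changes` is a list of dicts; a truthy "bot" key marks a bot edit
    (assumed record shape). Returns None if no human edit is present.
    """
    humans = [c for c in changes if not c.get("bot", False)]
    if not humans:
        return None
    return rng.choice(humans)
```

The local filter is redundant when `rcshow=!bot` is honoured, but it guards against accounts that run unflagged automation only partially; it cannot detect bots that never set the flag, which is one more source of bias in the sample.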
Currently listing last contribution to all global wikis the user has edited at least 10 times. NB: this is probably inaccurate for template/module contributors whose work has been imported to other wikis.
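The 10-edit threshold and the comparison behind the hypothesis could be expressed along these lines. This is a minimal sketch under assumptions: the record shape (`wiki`, `editcount`, `last_edit`) is hypothetical and would have to be assembled from per-wiki contribution data, and `wikidatawiki` is used as the identifier for Wikidata.

```python
from datetime import datetime


def active_wikis(records, min_edits=10):
    """Keep wikis with at least `min_edits` edits, most recently edited first."""
    kept = [r for r in records if r["editcount"] >= min_edits]
    return sorted(kept, key=lambda r: r["last_edit"], reverse=True)


def non_wikidata_lag_days(records, wikidata="wikidatawiki", min_edits=10):
    """Whole days between the last Wikidata edit and the last edit elsewhere.

    A large positive value means the user's recent activity is concentrated
    on Wikidata. Returns None if either side of the comparison is missing.
    """
    kept = active_wikis(records, min_edits)
    on_wikidata = [r["last_edit"] for r in kept if r["wiki"] == wikidata]
    elsewhere = [r["last_edit"] for r in kept if r["wiki"] != wikidata]
    if not on_wikidata or not elsewhere:
        return None
    return (max(on_wikidata) - max(elsewhere)).days
```

A single lag figure is a crude signal; it says nothing about edit volume, so it should be read alongside the per-wiki listings rather than in place of them.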
Data
- Amara: 25 edits in the current session
- Ladsgroup: 2 edits in the current session
- TintoMeches: 63 edits in the current session
- Cekli829: 8 edits in the current session
- Jbribeiro1: 2 edits in the current session
Discussion
Notes: At this point, halfway through gathering the sample, the following impressions are developing:
- 100% of the sample are drawn from existing contributors.
- The sampled individuals use bots and/or automated editing across many languages under their SUL account (that is, regardless of the local community's rules regarding same).
- A majority of the sample are regular contributors to Commons.
- I *think* at least two of the contributors follow links from Wikidata to sister project pages in order to edit them. That is, they edit sister projects to normalize them to Wikidata, rather than the other way around.