@Smmenen Could any part of your reaction to this article also stem from your bias toward Gush, and possibly from a fear that it might be on the chopping block in the next B&R announcement? Just curious. No offense intended.
It is possible that my sensitivity to Gush could make me more inclined to respond to inaccurate or unfounded statements regarding it. I probably would not have responded to a comparable comment about Dredge in the same way.
But it's more the fact that Vintage commentators and pundits on websites and social media have a really bad habit of making completely unsupported claims about data far too often, and it's long been a pet peeve of mine. That's why I started collecting data more than ten years ago.
The idea, suggested on the Twitch stream, that my "data" was biased because I "wrote a book about Gush" makes no sense unless you think I was selectively omitting data or lying about the calculations.
@Smmenen But what is the actual quality of the data?
That's a very odd question. Do you think Wizards has lied about the daily results?
The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date and compiled them. So did the author here.
The raw data is here.
The cleaned data is here.
And the aggregate data is here.
While it's possible that Wizards is lying about the decks that actually performed as they claimed, that seems very unlikely. Asking about the quality of the data is odd because it's the information that Wizards publishes. We simply collected it from their website.
The data is presented as a 'random sample', but is it?
Huh? No, it's not presented as a random sample. It's almost the entire population. You don't sound like you understand what is being presented here.
Sampling is a statistical technique for drawing inferences about a population when the population is too large to count. In this case, we have almost the entire population, so sampling is unnecessary. The only data missing are from days on which two dailies fired; according to Wizards, that is less than 20% of dailies.
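To illustrate the population-versus-sample distinction, here is a minimal sketch using the Shops counts quoted later in this thread (72 of 241 reported decks); the 50-deck sample size is an arbitrary choice for illustration:

```python
import random

# Near-complete population: 241 reported decks, 72 of them Shops
# (counts taken from the breakdown quoted later in this thread).
population = ["Shops"] * 72 + ["Other"] * (241 - 72)

# With the whole population in hand, the share is an exact count,
# not an estimate -- there is no sampling error to worry about.
exact_share = population.count("Shops") / len(population)
print(f"exact share: {exact_share:.1%}")

# Sampling only enters the picture when you *can't* count everything.
# A random sample of, say, 50 decks yields an estimate with a margin
# of error (normal-approximation 95% interval, for illustration):
random.seed(0)
sample = random.sample(population, 50)
est = sample.count("Shops") / len(sample)
moe = 1.96 * (est * (1 - est) / len(sample)) ** 0.5
print(f"sampled estimate: {est:.1%} +/- {moe:.1%}")
```

The point is only that margin-of-error reasoning applies to the second computation, not the first.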
I'm not accusing you of bias, but I believe the data you present is biased.
I think you misunderstand the data. You believe I was presenting a sample rather than the whole population of data.
You make a compelling argument based on it, but it's not really the 'whole story'. The Q1 data says that Shops was 30%, but the March data shows a drop to ~22% (if my cursory review of the data is correct). That implies that Shops was as much as ~34% of the pre-March meta. It seems disingenuous to use Q1 when Jan-Feb is so different from March. You criticize the author for treating a 6% difference as approximately equivalent, but a 12% month-to-month variance is represented as non-significant...
First of all, that's the "opposite" of the whole story. Removing data makes it less than "the whole story."
In any case, there is tremendous month-to-month variance in Vintage, and always has been, going back to the earliest data sets we've ever collected. Look at the percentage of Gush decks in the Premier events: there was only one in the January Top 16, 2 in the February Top 16, and 7 in the March Top 16. That doesn't make the "data" biased. It just means that there is tremendous variance.
That could easily have explained why the author's data differed from the data I collected, except that when he shared his data set, it became clear that that's not the case here.
The selection of quarterly data is not "disingenuous" in even the remotest sense. First of all, it's consistent with historical approaches.
Secondly, we know the DCI makes its decisions a month in advance, so they largely didn't have the benefit of March data when making their decision. If anything, Jan-Feb is the most relevant period.
In any case, my criticism of the author here has nothing to do with variance or the selected date range - it has to do with the fact that, according to his own data, Mentor and Shops decks are not even close to "basically the same" share of the metagame. 16% v. 22% is an enormous difference in metagame representation: it is equal to 32 decks in his data set, a larger share than most archetypes in the metagame occupy.
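As a back-of-the-envelope check (the ~533-deck total below is implied by the quoted figures, not stated anywhere in the thread):

```python
# 16% (Mentor) vs. 22% (Shops), with the 6-point gap stated to equal
# 32 decks in the author's data set. The implied data set size:
mentor_share, shops_share = 0.16, 0.22
gap_decks = 32
implied_total = gap_decks / (shops_share - mentor_share)
print(f"implied data set size: ~{implied_total:.0f} decks")
```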
The second source of bias is not looking at, or accounting for, non-randomness. Dailies (and to an extent P9 Challenges) fire on the same days at the same times - you can 'count' on them. It's like looking at data from a single shop and expecting new people to come in every day - it just doesn't happen. Montolio has 11 finishes on Shops in Q1, BlackLotusT1 has 11 finishes on Shops in Q1, and The Atog Lord has 7 finishes on Shops in Q1 - that's almost 40% of the Q1 Shops decks in the data! Are good players the problem, or is the deck really overpowered?
The same thing happens in paper data. If Brian Kelly plays in 5 tournaments out of a 20-tournament data set in a single quarter and makes Top 8 each time, he shows up 5 times in the paper data set. To my knowledge, no one has ever objected that we should be concerned about this problem in Vintage or Magic paper data sets.
I would say, in defense of the article in question, a 6% difference in decks being presented as approximately equal is probably more reasonable than you argue given the quality of the underlying statistics (the error bars are likely enormous). If the MTGO data were a true random sample (it's not)
That's right. It's not a random sample. It's almost the entire population. Sampling is a statistical technique to draw inferences about a population when the population is too large to count. In this case, we have almost the entire population, so sampling is unnecessary.
If I were sampling, the idea that the reported data was more "biased" would have merit. But these aren't samples.
Gush is pretty close to on par with Shops over Q1. To say otherwise makes the data seem more robust than it really is.
Um, no, actually, it's not. In paper, yes, that's true.
But on Magic Online, it's consistently clear that Shops are about a 33% larger part of the reported results.
Just to be clear, here is the breakdown of Q1 Dailies by archetype:
Shops - 32% of all reported daily decks (72 of 241 reported decks)
Dredge - 15%
Oath - 10%
Mentor - 10%
Delver - 7%
And then everything else is under 5%.
But if you add up all of the Gush decks in our population (all Mentor, Delver, Doomsday, etc.), as we did in the tab, you get 20.74% (21%).
Which happens to be the same percentage as all of the Gush decks in the Top 16 of the premier events.
So, no, Gush decks are not "pretty close to on par" with Shops in Q1. Shops are 32% and Gush is 21%. That's not even close to "pretty close." That's not in the same vicinity, let alone the same galaxy.
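The aggregation behind those two figures can be sketched as follows; the Mentor and Delver shares come from the breakdown above, while the residual for the remaining Gush shells (Doomsday, etc.) is inferred from the stated 20.74% total rather than given directly:

```python
# Summing every Gush-based archetype into one "pillar", then comparing
# it to Shops, per the breakdown above.
gush_shells = {"Mentor": 0.10, "Delver": 0.07, "other Gush shells": 0.0374}
gush_total = sum(gush_shells.values())   # ~0.2074, i.e. ~21%
shops_total = 0.32

print(f"Gush pillar: {gush_total:.1%}")
print(f"Shops:       {shops_total:.1%}")
print(f"gap:         {shops_total - gush_total:.1%}")
```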
It has nothing to do with "robustness." A 6% difference is a huge difference when you consider that it is almost as large as all of the Delver decks in the population, for example. There is no world in which Mentor is "basically" the same amount as Shops.