[Free Article] What Now



  • @Smmenen said:

    Shops are 30% of Q1 2016 Daily 3-1 or 4-0 decks, and 31% of Top 16, 30% of Top 8, 42% of Top 4, and 50% of Top 2 MTGO P9 challenge tournaments in Q1.

    Gush is 21% of Q1 Daily 3-1 or 4-0 decks, 21% of Top 16, 22% of Top 8, 33% of Top 4, and 17% of Top 2 MTGO P9 challenge tournaments in Q1.

    The statement made in this article - that Gush decks are, in aggregate, performing about the same as Shops in aggregate - is simply not true. Gush decks are significantly behind Shops by almost every metric in the MTGO results.

    If you don't want to "give weight" to MTGO results, that's fine. But it seems pretty obvious that Wizards does. I frankly think they should give weight to both.

    That said, I also believe that the single best data point every month is the P9 Challenge, as it is much larger than most paper events, more competitive with stronger players, and global. So to ignore MTGO seems foolish.

    As flawed as MTGO is, paper has many flaws as well, such as proxies, budget decks, etc., that don't exist to nearly the same extent online.

    I would very much like to know where the author of this article got his data.

    @Smmenen But what is the actual quality of the data? The data is presented as a 'random sample', but is it? I'm not accusing you of bias, but I believe the data you present is. You make a compelling argument based on it, but it's not really the 'whole story'. The Q1 data says that Shops was 30%, but the March data alone shows a drop to ~22% (if my cursory review of the data is correct). That implies that Shops was as much as ~34% of the pre-March meta. It seems disingenuous to use Q1 when Jan-Feb is so different from March. You criticize the author for treating a 6% difference as approximately equivalent, but a 12% month-to-month variance is represented as non-significant...

    The second source of bias is not looking at or accounting for non-randomness. Dailies (and to an extent P9 Challenges) fire on the same days at the same times - you can 'count' on them. It's like looking at data from a single shop and expecting new people to come in every day - it just doesn't happen. Montolio has 11 finishes on Shops in Q1, BlackLotusT1 has 11 finishes on Shops in Q1, and The Atog Lord has 7 finishes on Shops in Q1 - that's almost 40% of the Q1 Shops decks in the data! Are good players the problem, or is the deck really overpowered?

    I would say, in defense of the article in question, a 6% difference in decks being presented as approximately equal is probably more reasonable than you argue given the quality of the underlying statistics (the error bars are likely enormous). If the MTGO data were a true random sample (it's not), or were pared down to represent a more random sample (a bad idea, because the data set would get even smaller and the error bars larger), or included all decks played (the best idea, but is it possible at this point?), Gush is pretty close to on par with Shops over Q1. To say otherwise makes the data seem more robust than it really is.



  • @Ten-Ten said:

    @Smmenen can any of your reaction to this article also be due to your bias for Gush, and possibly your fear that it might be on the chopping block in the next B&R announcement? Just curious. No offense intended.

    I'd prefer we not make this about something it isn't. It's pretty clear that both sides of this most recent restriction are emotionally charged - to say the least.

    From a neutral standpoint it feels as though Steve has been relatively transparent as to his stance both pre- and post-restriction. For what it's worth, I'm not seeing this transparency from the majority of naysayers.

    On a more aggressive note: data can be interpreted many ways - something I've learned from working in the medical field. That being said, do this blog post or the many outrageous, vitriol-induced posts that have been slung over the past week back up any of their claims with directed citation? Nope.

    I'm no fan of Steven but this seems pretty clear cut to me.


  • TMD Supporter

    @Ten-Ten said:

    @Smmenen can any of your reaction to this article also be due to your bias for Gush, and possibly your fear that it might be on the chopping block in the next B&R announcement? Just curious. No offense intended.

    It is possible that my sensitivity to Gush could make me more inclined to respond to false or unfounded statements regarding it. I probably would not have responded to a comment made about Dredge in the same way.

    But it's more the fact that Vintage commentators and pundits on websites and social media have a really bad habit of making completely unsupported claims about data far too often, and it's long been a pet peeve of mine. That's why I started collecting data more than ten years ago.

    The idea, suggested in the Twitch stream, that my "data" was biased because I "wrote a book about Gush" makes no sense unless you think I was selectively omitting data or lying about the calculations.

    @Fred_Bear said:

    @Smmenen But what is the actual quality of the data?

    That's a very odd question. Do you think Wizards has lied about the daily results?

    The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date and compiled them. So did the author here.

    The raw data is here.

    The cleaned data is here.

    And the aggregate data is here.

    While it's possible that Wizards is lying about the decks that actually performed as they claimed, that seems very unlikely. Asking about the quality of the data is odd because it's the information that Wizards publishes. We simply collected it from their website.

    The data is presented as a 'random sample', but is it?

    Huh? No, it's not presented as a random sample. It's almost the entire population. You don't sound like you understand what is presented here.

    Sampling is a statistical technique to draw inferences about a population when the population is too large to count. In this case, we have almost the entire population, so sampling is unnecessary. The only things missing are results from days on which two dailies fired. According to Wizards, that is less than 20% of dailies.
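
    For what it's worth, here is a back-of-envelope bound on how much that missing slice could possibly matter (a sketch, not a measurement: the 20% figure is Wizards' stated ceiling, and the two extremes assume the missing dailies contained either no Shops decks or nothing but Shops decks):

    ```python
    # Back-of-envelope bound on the effect of the unreported dailies
    # (a sketch, not a measurement). Assumes at most 20% of dailies went
    # unreported and takes the two extreme cases for their composition.
    observed_share = 0.30  # Shops share among the reported 3-1/4-0 decks
    missing_frac = 0.20    # Wizards' stated ceiling on unreported dailies

    lo = observed_share * (1 - missing_frac)                 # missing decks contain no Shops
    hi = observed_share * (1 - missing_frac) + missing_frac  # missing decks are all Shops
    print(f"true Shops share lies in [{lo:.0%}, {hi:.0%}]")  # [24%, 44%]
    ```

    In practice, of course, the unreported dailies presumably look like the reported ones, which leaves the 30% figure essentially where it is; the extremes are just the logical worst case.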

    I'm not accusing you of bias, but I believe the data you present is.

    I think you misunderstand the data. You believe I was presenting a sample rather than the whole population of data.

    You make a compelling argument based on it, but it's not really the 'whole story'. The Q1 data says that Shops was 30%, but the March data alone shows a drop to ~22% (if my cursory review of the data is correct). That implies that Shops was as much as ~34% of the pre-March meta. It seems disingenuous to use Q1 when Jan-Feb is so different from March. You criticize the author for treating a 6% difference as approximately equivalent, but a 12% month-to-month variance is represented as non-significant...

    First of all, that's the "opposite" of the whole story. Removing data makes it less than "the whole story."

    In any case, there is tremendous month-to-month variance in Vintage, and always has been, going back to the earliest data sets we've ever collected. Look at the % of Gush decks in the Premier events. There was only one in the January Top 16, 2 in the February Top 16, and 7 in the March Top 16. That doesn't make the "data" biased. It just means that there is tremendous variance.

    That could have easily explained why the author's data differed from what I collected, except that when he shared his data set, it became clear that that's not the case here.

    The selection of quarterly data is not "disingenuous" in the remotest sense. First of all, it's consistent with historical approaches:
    http://themanadrain.com/topic/138/vintage-metagame-data-archive

    Secondly, we know the DCI makes its decisions a month in advance, so they largely didn't have the benefit of March data when making their decision; if anything, Jan-Feb is the most relevant period.

    In any case, my criticism of the author here has nothing to do with variance or the date range for selected input - it has to do with the fact that, according to his own data, Mentor and Shops decks are not even close to "basically the same" share of the metagame. 16% v. 22% is an enormous difference in metagame representation - it equals 32 decks in his data set and is a larger percentage than most archetypes occupy in the metagame.

    The second source of bias is not looking at or accounting for non-randomness. Dailies (and to an extent P9 Challenges) fire on the same days at the same times - you can 'count' on them. It's like looking at data from a single shop and expecting new people to come in every day - it just doesn't happen. Montolio has 11 finishes on Shops in Q1, BlackLotusT1 has 11 finishes on Shops in Q1, and The Atog Lord has 7 finishes on Shops in Q1 - that's almost 40% of the Q1 Shops decks in the data! Are good players the problem, or is the deck really overpowered?

    The same thing happens in paper data. If Brian Kelly plays in 5 tournaments out of a 20-tournament data set in a single quarter and makes Top 8 each time, he shows up 5 times in the paper data set. To my knowledge, no one has ever objected that we should be concerned about this problem in Vintage or Magic paper data sets.

    I would say, in defense of the article in question, a 6% difference in decks being presented as approximately equal is probably more reasonable than you argue given the quality of the underlying statistics (the error bars are likely enormous). If the MTGO data were a true random sample (it's not)

    That's right. It's not a random sample. It's almost the entire population. Sampling is a statistical technique to draw inferences about a population when the population is too large to count. In this case, we have almost the entire population, so sampling is unnecessary.

    If I were sampling, the idea that the reported data was more "biased" would have merit. But these aren't samples.

    Gush is pretty close to on par with Shops over Q1. To say otherwise makes the data seem more robust than it really is.

    Um, no, actually, it's not. In paper, yes, that's true.

    But on Magic Online, Shops are consistently a much larger part of the reported results - roughly 30% to Gush's 21%.

    Just to be clear, here is the breakdown of Q1 Dailies by archetype:

    1. Shops - 30% of all reported daily decks (72 decks out of 241 reported decks)

    2. Dredge - 15%

    3. Oath - 10%

    4. Mentor - 10%

    5. Delver - 7%

    And then everything else is under 5%.

    But if you add up all of the Gush decks in our population (all Mentor, Delver, Doomsday, etc.), as we did in the tab, you get to 20.74% (21%).

    Which happens to be the same percentage as all of the Gush decks in the Top 16 of the premier events.

    So, no, Gush decks are not "pretty close to par" with Shops in Q1. Shops are 30% and Gush is 21%. That's not even close to "pretty close." That's not the same vicinity, let alone the same galaxy.

    It has nothing to do with "robustness." A 6% difference is a huge difference when you consider that it is almost as large as the entire Delver share of the population. There is no world in which Mentor is "basically" the same share as Shops.
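
    To put all of those counts in one place, here is the arithmetic in a few lines (the counts are the ones quoted above and in the linked sheets; the script only does the division):

    ```python
    # The Q1 daily counts quoted above, with shares computed from them.
    # 241 is the total number of reported 3-1/4-0 decklists.
    total = 241
    counts = {
        "Shops": 72,
        "All Gush shells (Mentor, Delver, Doomsday, ...)": 50,
        "Mentor alone": 23,  # 28 if the 5 UW Landstill lists are counted
    }
    for name, n in counts.items():
        print(f"{name}: {n}/{total} = {n / total:.2%}")
    # Shops -> ~30%, all Gush -> ~21%, Mentor alone -> ~10%,
    # matching the percentages quoted above.
    ```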


  • TMD Supporter

    If I may indulge in a slightly off-topic aside, what I love about this Vintage community, and TMD in particular, is the level of rigor of the debate. I am grateful for the analysis that Steve, Kevin, Danny, and several others contributed to this particular debate, and many others.

    The Vintage metagame is an incredibly complex system, and reasonable human minds, which are inevitably biased to some extent, can disagree as to how best to analyze and distill meaning from the incongruous sets of data, particularly with respect to such ill-defined concepts as the dominance of a particular archetype, or the appropriateness of restricting a card.

    In sum, this thread, stripped of the personal attacks and ruffled feathers, is representative of what I think makes this such a great forum. Keep up the good work, gentlemen, and keep it clean.



  • @Smmenen said:

    Do you think Wizards has lied about the daily results?

    The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date and compiled them. So did the author here.

    Quick question, then... The data you present represents **EVERY** decklist, or every decklist which went 3-1 or 4-0 in the dailies? I am under the impression that your data was ONLY 3-1 and 4-0 decks from the dailies - the data shared with us on mtgo.com and mtggoldfish.com.


  • TMD Supporter

    @Fred_Bear said:

    @Smmenen said:

    Do you think Wizards has lied about the daily results?

    The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date and compiled them. So did the author here.

    Quick question, then... The data you present represents **EVERY** decklist, or every decklist which went 3-1 or 4-0 in the dailies? I am under the impression that your data was ONLY 3-1 and 4-0 decks from the dailies - the data shared with us on mtgo.com and mtggoldfish.com.

    Your question doesn't make sense. What's the difference between "every decklist that went 3-1 or better" and decklists that "only went 3-1 or better"? That's the same thing.

    The world "only" and "every" perform the same work in each part of your question by excluding decks that performed worse than 3-1.

    In any case, if you clicked the links I provided for you, you would have seen the answer. It is the complete population of decklists that performed at 3-1 or better. It's not a sample of 3-1 or better decks.

    Again:

    The raw data is here.

    The cleaned data is here.

    And the aggregate data is here.

    Wizards of the Coast asked MTGGoldfish to cease and desist collecting data, so our data was taken directly from the Wizards website. Had you actually looked at the tabs or read my previous post more carefully, I think that would have been clear.



    I think maybe he's talking about the fact that we don't have the metagame breakdowns (which isn't your fault; we just don't have that data).

    If 75% of people are playing Shops decks and making up only 30% of wins, that tells a different story than if 1% are playing Shops decks and make up 30% of the wins. Of course, both of those situations would be a problem.
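
    To make that concrete (all numbers here are purely hypothetical):

    ```python
    # Purely hypothetical numbers illustrating the point above: the same
    # 30% share of top finishes reads very differently depending on how
    # much of the field the deck was.
    def conversion(field_share: float, top_share: float) -> float:
        """Ratio above 1 means the deck over-performs its numbers."""
        return top_share / field_share

    print(f"{conversion(0.75, 0.30):.1f}")  # 0.4  -> popular but under-performing
    print(f"{conversion(0.01, 0.30):.1f}")  # 30.0 -> fringe deck crushing the field
    ```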

    I'd love to see if/how the data on Shops trended down over time, as I suspect a lot of early Shops wins came from players assuming the deck was dead and seriously underpreparing for it. Many people were super excited to play Storm Combo or Doomsday after the Chalice restriction, and most of those people lost. Of course, WotC made the decision before they could have identified any trend, which is a different problem, but one that every format has to deal with.

    I don't think we have consensus as a format on what a problem metagame even looks like. Personally I would love the top deck to be around 25-30% (even if, in this case, I don't really enjoy playing the top deck), so I looked at the same numbers and said "obviously not a problem!" If you think an optimal metagame has a top deck at 15-20% of wins, those very same numbers say "obviously a problem!" Without any pre-discussed target for what a healthy metagame looks like, it's too easy to post-rationalize what the data means, even if the data is completely accurate, which in this case I have to believe. (Note that I'm not saying anyone in this thread is doing that; it's just a peril of the sort of discussion we've been having.)


  • TMD Supporter

    @Brass-Man said:

    I think maybe he's talking about the fact that we don't have the metagame breakdowns (which isn't your fault, we just don't have that data.)

    Ah.

    If that's the point he's making, then that would render Danny's statement at issue fundamentally unsupportable - since there is no way that anyone could know that "Mentor decks are basically the same percentage of the metagame as all Mishra's Workshop decks combined."

    The assumption in this discussion is that by "metagame," we are referring to Top X metagame (either 3-1/4-0 decks or Top 16/8).

    In fact, if you go back and look at every single metagame report ever, that's what we are talking about - the Top performing decks.

    That said, although we don't have the entire metagame breakdown for most events, there are many for which we do. For example, the NYSEs, Waterburys, many of the Vintage Championships, and at least three of the MTGO P9 Challenge events are data points for which someone (in some cases me, Jaco, or Matt & Ryan) has gone in and counted every single deck in the metagame.

    For example: http://themanadrain.com/topic/146/january-and-february-mtgo-p9-challenge-data

    And:

    http://www.eternalcentral.com/so-many-insane-plays-magic-online-p9-challenge-metagame-analysis/

    From those data points, we've been able to see what % of the metagame these decks tend to be. In my experience from having closely observed this data over time, Workshops are often around 20-25% of the metagame. I think it was about 22.5% at NYSE 3 last year. It was 22% at the February MTGO P9 event but about 20% at the January MTGO P9 event.



  • @Smmenen said:

    If you clicked the links I provided for you, you would have seen the answer. It is the complete population of decklists that performed at 3-1 or better. It's not a sample of 3-1 or better decks.

    Steven, I appreciate the condescension, but, by definition, your data is a sample of the full population. More decks were played at each event than went 3-1 or 4-0. I didn't misunderstand anything, and I do believe it is disingenuous to present only those decks as the full population. A daily requires a minimum of 12 participants - 48 events (in your sheets - I did look) x 12 decks = 576+ decks in the full population. Your data includes 241 "reported decks". The analysis done by @diophan and @ChubbyRain covered a full population.

    My issue with the data still exists.
    #1 - The data does not represent a random sample and results in data with huge variation. What I mean by this is that you are looking at many snapshots rather than a continuous stream of data. This causes high variance on its own. Add to that the high variance (which you acknowledge) in a deck's play month-to-month, and add to that the high variance of a small data set, and you have data that is probably +/- 1 deck (on the conservative side) in every event. What does that mean? You show 72 Shops decks over 48 events. The high variance means that over the next 48 events, we should see 72 +/- 48 decks in your data - just based on the variance in the data. That seems about right, too. The data was weighted higher in Jan/Feb and a drop-off was seen in the March data (and at the March P9): 57 decks in 35 Jan/Feb events and 15 decks in 13 March events. High variance, but within expectations. [Note: this is why I have no problem with the author treating 16% and 22% as approximately equivalent in his article. Ultimately, 24 and 120 are statistically indistinguishable in terms of Shops' level of play over 48 daily events - that's what the 'variance' means in real numbers. See the sketch below.]
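
    Here is that bracket spelled out (a sketch; the one-deck-per-event swing is my conservative guess at the per-event variance, not a measured figure):

    ```python
    # The 72 +/- 48 bracket above, spelled out. The one-deck-per-event
    # swing is an assumed (conservative) per-event variance, not a
    # measured one.
    events = 48
    observed_shops = 72
    swing_per_event = 1

    lo = observed_shops - swing_per_event * events  # 24
    hi = observed_shops + swing_per_event * events  # 120
    print(f"plausible Shops count over {events} dailies: {lo} to {hi}")
    ```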

    #2 - Deck and Win % are not the only correlated variables. 'Huh?' Pilots matter over this data set. The premise of your data is that the deck is the independent variable that drives win percentage. That premise would not hold up to a stronger analysis of correlation. From your data set, we can point to 3 Shops pilots - Montolio, The Atog Lord, & BlackLotusT1 - who account for nearly 40% of the decks in the Shops population. If 3 players can distort the data to the point of getting a card restricted, we should be able to agree that the data set is too small to base decisions on.

    @Brass-Man is right. The data you put together is missing a key component and maybe someone at Wizards or the DCI has that information and has done a more in-depth analysis. Based on their explanation that came with the restriction, I'm doubtful.

    I also agree that a top deck being in the range of ~30% should be fine, and with the small samples that we are subject to, 30% probably means 20-40%. If it starts to creep up from there, we have issues.

    Looking at your most recent post, if Shops is historically 20-25% of large events, to use your terms, the DCI's explanation that Lodestone Golem was 'over-represented' is "fundamentally unsupportable" unless they are willing to outline what the ideal metagame looks like...


  • TMD Supporter

    @Fred_Bear said:

    @Smmenen said:

    If you clicked the links I provided for you, you would have seen the answer. It is the complete population of decklists that performed at 3-1 or better. It's not a sample of 3-1 or better decks.

    Steven, I appreciate the condescension, but, by definition, your data is a sample of the full population. More decks were played at each event than went 3-1 or 4-0. I didn't misunderstand anything, and I do believe it is disingenuous to present only those decks as the full population.

    First and foremost, if you are defining the full metagame as every deck played, then that renders Danny's claim not only unsubstantiated and unsupported, but fundamentally unsupportable as unknowable.

    The issue being debated is my disputing the claim that "Mentor decks are basically the same portion of the metagame" as Shops. I find that claim provably false, as an empirical matter.

    So, if you wish to redefine the "metagame" as every deck played, then that only strengthens my critique.

    But, based upon Danny's data set and my response, neither Danny nor I was defining "the metagame" as every deck played. Rather, the "population" was the top performing decks. In MTGO dailies, this was defined as 3-1 or 4-0 decks. In my larger data set, this included Top 8 paper tournament results and Top 16 MTGO premier event results.

    But, to reiterate, the "population" in our data was not every deck, but only the top performing decks. That's how we were both defining the population - as the top performing decks.

    This is not a novel concept.

    Take a look at the metagame report archive: http://themanadrain.com/topic/138/vintage-metagame-data-archive

    When, in 2004, Phil Stanton posted an article titled the "April 2004 Type One Metagame Breakdown," he looked only at Top 8 data.

    Or when, in 2011, Matt Elias posted an article titled "The Q1 Vintage Metagame Report," he looked only at Top 8 data.

    In both cases, the titles of the articles and the discussions used the term "Vintage metagame." Not "Vintage Top 8 Metagame."

    In the context of these discussions, it's well understood that we are discussing top performing decks, not the entire set of decks played. Danny's own data set makes that clear.

    Now, if I had advanced a claim that you were now disputing using that logic, then maybe you would have a leg to stand on. But my only reason for participating in this thread is to dispute a very specific claim presented in the article that this thread is about.

    #1 - The data does not represent a random sample and results in data with huge variation.

    That's not what random sampling is. Random sampling is a statistical method used to understand a population that is too large to count feasibly. So, instead of polling every possible voter, campaign pollsters use samples.

    Not only is it not a sample, there is nothing "random" about this data. It's a complete population of top performing decks.

    What I mean by this is that you are looking at many snapshots rather than a continuous stream of data.

    Uh. No, I'm limiting most of my data to Q1, but within that set, it's a fairly continuous stream of data. Yes, I imposed some parameters on it (the Q1 of 2016), but you have to do that to any data. The notion that it's "merely a snapshot" suggests that it's some sort of inherently biased sample, when it's the exact same methodology that Vintage metagame reporters have used since 2003.

    #2 - Deck and Win % are not the only correlated variables. 'Huh?' Pilots matter over this data set. The premise of your data is that the deck is the independent variable that drives win percentage. That premise would not hold up to a stronger analysis of correlation. From your data set, we can point to 3 Shops pilots - Montolio, The Atog Lord, & BlackLotusT1 - who account for nearly 40% of the decks in the Shops population. If 3 players can distort the data to the point of getting a card restricted, we should be able to agree that the data set is too small to base decisions on.

    This is off-topic to the issue I was debating here, but you are calling for a standard for restriction that Wizards is in no way obligated to follow.

    By saying that "a data set is too small to base a decision" you are explicitly saying that Wizards either should not restrict or is unjustified in restricting unless they have a certain quality of data.

    That's just false. That's not to say that Wizards shouldn't use data in making decisions - I've been arguing for that for years. In fact, I argued it in 2003 in one of my earliest SCG articles.

    But Wizards, like many real-world policymakers, has imperfect data sets when making policy decisions.

    Do you think the Federal Reserve has every data set it would like when setting the federal funds rate?
    Do you think the President has every data set he wants when making military policy? (See this month's issue of The Atlantic on the tremendous uncertainties in his Syria policy.)

    As I already pointed out, the problem of "individuals" skewing data sets already exists in paper magic, and it's just as true on MTGO. But that doesn't mean that the data can't be used to make banned and restricted list decisions or that doing so is somehow less valid. Wizards is perfectly justified in using imperfect data to make policy decisions, just as much as any other real world policymaker.

    I also agree that a top deck being in the range of ~30% should be fine, and with the small samples that we are subject to, 30% probably means 20-40%. If it starts to creep up from there, we have issues.

    Looking at your most recent post, if Shops is historically 20-25% of large events, to use your terms, the DCI's explanation that Lodestone Golem was 'over-represented' is "fundamentally unsupportable" unless they are willing to outline what the ideal metagame looks like...

    Complete nonsense. Wizards has no duty to outline what an "ideal" metagame looks like. Moreover, Wizards has access to all of the MTGO data. It certainly is the case that they can look at the overall composition of Workshops in these metagames, and then see how they are performing relative to their metagame presence. There is not a shred of tangible evidence to doubt that's exactly what they did here.

    In any case, that part of the discussion is a non-sequitur. I'm not debating the validity of Wizards decisions. I'm debating the validity of Danny's claim regarding Mentor and Shops.


  • TMD Supporter

    On a different but related topic, I am curious, Danny, about your different suggested sideboard choices between the Storm and Doomsday lists, and especially the different anti-Dredge, anti-Workshop, and "insulation" packages (Defense Grid / City of Solitude / Xantid Swarm).

    It isn't obvious to me why you wouldn't play more similar sideboards, particularly when the effects are similar. What were your thoughts?

    P.S. I'm annoyed at you for mentioning City of Solitude. I've been thinking about that technology for a while, and was looking forward to catching folks off guard.


  • TMD Supporter

    Great datasets and breakdowns. I'm sure we all appreciate the efforts people put into it.

    I do wonder about the value of tracking "Top 4" and "Top 2." In such small datasets, is this really relevant? I think it leads to a warped perception. Top 16 or Top 8, yes. But Top 2/4 seems much less relevant and invites reading too much into very few data points.

    I also question the dismissal of @Fred_Bear's point about how three players made up 40% of the success of one deck. You can't say that 6% is incredibly relevant in one area and then dismiss the enormous impact that these three players have had on the MTGO 3-1/4-0 population numbers. I think this level of repeat/consistent success is a rarer phenomenon in paper.

    I just don't see paper results and MTGO results as directly comparable (drops, the lack of proxies on MTGO, tournament times, tournament prizes, etc.). That said, with what is available, I think everyone is doing admirable work.



  • Look, you're obviously a smart guy and I'm not trying to dispute that, but just as you accuse others of hyperbole, you seem unwilling to relent on your own use of it...

    @Smmenen said:

    First and foremost, if you are defining the full metagame as every deck played, then that renders Danny's claim not only unsubstantiated and unsupported, but fundamentally unsupportable as unknowable.

    No it doesn't. Statistics are used to draw comparisons and conclusions. Danny is looking at the available data and drawing a conclusion based on the expected variance in the data. Can it be known 100%? No. Can it be known for the sake of an editorial? Absolutely.

    By your argument, the statistics are definitions, and that's unreasonable in terms of the original article's intent (at least by my understanding). It should be obvious to a reader that 16 does not equal 22, but the difference between 16 and 22 is not quite as vast as you want us to believe. [32 decks over 48 events is well within the variance between Mentor and Shops.]

    The issue being debated is my disputing the claim that "Mentor decks are basically the same portion of the metagame" as Shops. I find that claim provably false, as an empirical matter.

    By this argument, I could look at the data for 3/13 and claim Shops is 0% of the meta as an empirical fact. It's a crap argument. Statistics work when combined and interpreted with some common sense and you know that. What happened on 3/13 must be viewed within a larger discussion, just as March or February or January or Q1.

    There is nothing "random" about this data. It's a complete population of top performing decks.

    For a given time, you mean. This is a problem with statistics: population vs. sample. You've said before that it's the complete population, except when two dailies fired on the same day - so it's really a pretty comprehensive sample.

    What I mean by this is that you are looking at many snapshots rather than a continuous stream of data.

    Uh. No, I'm limiting most of my data to Q1, but within that set, it's a fairly continuous stream of data. Yes, I imposed some parameters on it (the Q1 of 2016), but you have to do that to any data. The notion that it's "merely a snapshot" suggests that it's some sort of inherently biased sample, when it's the exact same methodology that Vintage metagame reporters have used since 2003.

    It may be the same methodology, but is that warranted? Look at February 2016 in your spreadsheet as an example. You have paper data - 8 events ranging from 8-43 participants. The online data is 15 events with a minimum of 12 participants, all with fewer than 32 (which would result in two 4-0 decks). The 8 paper events have maybe 2-3 of the same players appearing in the top decks (60), while the 15 online events have many of the same players appearing multiple times; for example, BlackLotusT1 had 5 top finishes out of 72 total decks. The data is fundamentally different, and those differences should be accommodated. Simply viewing it through the same lens paper Magic has been viewed through for the last decade doesn't seem right.

    I mean, you can certainly do it, but what are the implications?

    By saying that "a data set is too small to base a decision" you are explicitly saying that Wizards either should not restrict or is unjustified in restricting unless they have a certain quality of data.

    That's just false. That's not to say that Wizards shouldn't use data in making decisions, but Wizards, like many real-world policymakers, has imperfect data sets.

    Seriously? Again, I think you know what I mean. The MTGO data can be manipulated to present any number of "empirically correct" arguments (i.e. anything involving FoW, Mental Misstep, and Ingot Chewer). That doesn't make for useful decision making. The Federal Reserve may not have every data set it would like in setting rates, but they don't look at hockey shooting percentages or baseball batting averages. They try to make sense of the data set that they have and work to make it as strong as they can. If a data point doesn't fit, I guarantee that the Federal Reserve doesn't key on that data point to drive decisions.

    This is the issue that many people seem to have. 12/60 (20%) Shops decks in paper (February) compared to 27/72 (37.5%) decks in MTGO (February dailies) could lead to much different decision making. The question becomes - which is more accurate? It's up to them and that's fine, but then either they can explain the reasoning or they can live with us questioning the methodology. I doubt they lose any sleep over whether or not I question something, so they will continue to do what they want...

    As I already pointed out, the problem of "individuals" skewing data sets already exists in paper magic, and it's just as true on MTGO. But that doesn't mean that the data can't be used to make banned and restricted list decisions or that doing so is somehow less valid. Wizards is perfectly justified in using imperfect data to make policy decisions, just as much as any other real world policymaker.

    To use your terminology, this is empirically false. Using February 2016 as an example, the paper Magic data is not nearly as skewed by individuals as the MTGO data over the same time period.

    And you are right, Wizards/DCI can use whatever data they like, but we are also free to question their interpretations and decisions when they conflict with alternate data analysis.

    I appreciate the analysis that you put together. I just don't think it's the whole story because you can easily be led to alternative conclusions. I understand that you are looking at it with the same historical perspective as has always been done, but I think that's skewed by the type of data MTGO generates.

    As for the original argument, using your definitions, Danny probably overstated his case. If we read it and assume Danny is extrapolating the data to describe the metagame in broader terms (as Wizards does in their B/R announcement), I think he's justified. Again, based on the expected variance in the daily events data and the level of play Workshops see in larger events (paper, P9 challenges), it is reasonable to believe that Shops is played about the same amount as Mentor. I guess I should clarify that this can never be 'known' 100%, but based on the data available, I would've bet that for the next P9 Challenge Shops and Mentor would be within 5 decks of one another (post-Golem restriction, I expect Mentor to go up in comparison to Shops).


  • TMD Supporter

    @Fred_Bear & @Smmenen

    I'm not sure what you are even debating anymore. The data speaks for itself. We can dissect it and criticize analytical methodologies ad nauseam, but at some point the head of the pin becomes overcrowded with angels.


  • TMD Supporter

    @Fred_Bear said:

    @Smmenen said:

    First and foremost, if you are defining the full metagame as every deck played, then that renders Danny's claim not only unsubstantiated and unsupported, but fundamentally unsupportable as unknowable.

    No it doesn't. Statistics are used to draw comparisons and conclusions. Danny is looking at the available data and drawing a conclusion based on the expected variance in the data. Can it be known 100%? No. Can it be known for the sake of an editorial? Absolutely.

    You are now pulling a bait and switch.

    When I presented aggregate MTGO daily data, you dismissed it as a mere "sample," explicitly calling it a questionably reliable data source.

    But when Danny does the exact same thing, it's simply "statistics" - looking at data and drawing a conclusion with a subjective view of possible variance.

    Look. I took a single sentence out of this article that I found to be objectionable based upon all of the available evidence. My data strongly disputed his claim.

    Now, Danny has presented his data, and by his own terms his statement is not supportable by the available facts. You are willing to credit Danny with a tortured reading of the facts by introducing some notion of variance, but that cuts the other direction as well. It's just as plausible, in your reading, that Shops should be 6% higher, if you are willing to credit that much variance.

    By your argument, the statistics are definitions and that's unreasonable in the terms of the original article's intent (at least by my understanding). It should be obvious to a reader that 16 does not equal 22, but the difference between 16 and 22 is not quite as vast as you want us to believe. [32 decks over 48 events is well within the variance between Mentor and Shops]

    The difference between 16% and 22% of a Vintage metagame is actually an enormous gulf. In fact, it's probably much larger than I "would have us believe." Consider a few facts that will put this in context:

    1. Very few decks constitute more than 6% of a Vintage metagame in any time period. Usually no more than 5, and sometimes as few as 2.

    2. 6% is the difference between 1% and 7% or 2% and 8%. Would anyone say that the difference between "1 and 7" isn't that big? No. It's enormous when dealing with the kinds of data we are looking at.

    3. 6% is almost equal to the total number of Delver decks in a data set. It's a very large difference.

    The issue being debated is my disputing the claim that "Mentor decks are basically the same portion of the metagame" as Shops. I find that claim provably false, as an empirical matter.

    By this argument, I could look at the data for 3/13 and claim Shops is 0% of the meta as an empirical fact. It's a crap argument. Statistics work when combined and interpreted with some common sense and you know that. What happened on 3/13 must be viewed within a larger discussion, just as March or February or January or Q1.

    Exactly. Your point here is a straw man argument that actually undermines any use of data.

    The reason that I look at quarterly data or bimonthly data (as Phil Stanton used to) is that a snapshot of a single month, or a week, or even a day, is less reliable.

    What we are looking for in the data are trends.

    Your point - that any particular snapshot of data is flawed because there is so much variance in Vintage - overshoots the mark and swallows your entire argument. That argument could be made, literally, about any time period: a quarter, 6 months, a year, even a decade. I could apply the exact same point: that ten years of data simply has too much variance when set against the previous ten years.

    Applied to the logical extreme, your point here renders any time-bound data set problematic.

    When considered in a reasonable manner, using a couple of months of data, or even a full quarter, has been deemed perfectly reasonable by most commentators, analysts, and participants in discussions such as this.

    In my metagame reports, I used to include only tournament data with a minimum of 33 players to try to reduce the variance. Phil Stanton pegged the cutoff at 50 players.

    In any case, whenever we use data, we have to impose some limits on what will be included and what won't. In this case, Danny and I looked at exactly the same data: MTGO reported dailies (although I looked at everything else as well).

    By saying that "a data set is too small to base a decision" you are explicitly saying that Wizards either should not restrict or is unjustified in restricting unless they have a certain quality of data.

    That's just false. That's not to say that Wizards shouldn't use data in making decisions, but Wizards, like many real-world policymakers, has imperfect data sets.

    Seriously? Again, I think you know what I mean. The MTGO data can be manipulated to present any number of "empirically correct" arguments (i.e. anything involving FoW, Mental Misstep, and Ingot Chewer).

    Of course it can be. But any thinking person would call out unreasonable use of data. Using your hypothetical, some idiot pointing to 3/13 to say that Shops are 0% of the metagame would be immediately dismissed out of hand.

    On the other hand, aggregating several months worth of data into a quarterly report is generally considered reasonable and sufficiently inclusive.

    This is the issue that many people seem to have. 12/60 (20%) Shops decks in paper (February) compared to 27/72 (37.5%) decks in MTGO (February dailies) could lead to much different decision making. The question becomes - which is more accurate? It's up to them and that's fine, but then either they can explain the reasoning or they can live with us questioning the methodology.

    Finally, a point I agree with. I think this is the place that most reasonable people can disagree and get into a debate about the restriction of Golem. But that's a non-sequitur here, as my focus in this thread is a particular statement in this article.

    As for the original argument, using your definitions, Danny probably overstated his case.

    Yes. Overstatements = false statements. So, after all of this back and forth, you are now conceding that my original statement was correct. Thank you.

    If we read it and assume Danny is extrapolating the data to describe the metagame in broader terms (as Wizards does in their B/R announcement), I think he's justified.

    I don't. Let's focus on this for just a minute longer.

    Danny's claim, which I dispute, was that Mentor decks were about as prevalent as Shops decks.

    Yet not only are Mentor decks not even close to as prevalent as Shops decks (according to the Q1 data, Mentor decks were only 10% of MTGO reported daily decklists, whereas Shops were 30% of reported decks), but not even all Gush decks combined are as prevalent as all Shops decks combined.

    If you look at the MTGO Q1 data we compiled, Mentor decks are less than 50% of all Gush decks (23/50).

    So, if not even all Gush decks together equal the number of all Shops decks (50 Gush decks compared to 72 Shops decks), then the much lesser statement, that Mentor decks basically approximate the number of Shops decks, becomes even more absurd (and, in fact, in the Q1 data set the numbers are 23 Mentor decks - 28 if we add the 5 UW Landstill decks - to 72 Shops decks).

    Let that sink in for a second.

    If the total number of Gush decks doesn't even come close to approximating the number of Shops decks, then it's not plausible that Mentor, which is less than 50% of the total number of Gush decks, can come any closer. It's just math.



  • I'm not accusing you of bias, but I believe the data you present is.

    I think you misunderstand the data. You believe I was presenting a sample rather than the whole population of data.

    Collection methods may be biased, but in this case the point is moot as it's all the available data (unknown knowns and all that). Grouping criteria may be biased, opinions may be biased - raw, complete datasets, not so much.



  • @Smmenen said:

    You are now pulling a bait and switch.

    When I presented aggregate MTGO daily data, you dismissed it as a mere "sample," explicitly calling it a questionably reliable data source.

    I've pulled no bait and switch. I've consistently said the data set is a sample. As I read it, Danny drew his conclusion from the same statistics, extrapolating the data set to describe the metagame. When you do this, I find it reasonable to factor variance into the data set, which over a quarter of the year can amount to a significant number of decks.

    Look. I took a single sentence out of this article that I found to be objectionable based upon all of the available evidence. My data strongly disputed his claim.

    I disagree that the data strongly disputes his claim. It can be read as disputing it, but if you look more broadly, his statement is not as black-and-white, true-or-false as you make it seem. Mentor is a heavily played Gush deck - one of the best and most played versions; in fact, in paper it out-represents Shops. I don't find it offensive for him to say they see *about* the same play.

    And you are correct, Shops could see heavier play due to variance, but that does not seem as plausible based on the larger tournament results (i.e. I made my argument self-consistent). I indicated in an earlier post that anywhere between 24 and 120 Shops decks in the data shows *about* the same level of play.

    The difference between 16% and 22% of a Vintage metagame is actually an enormous gulf. In fact, it's probably much larger than I "would have us believe." Consider a few facts that will put this in context:

    Again, you want it to be true for some and not others. 16% and 22% is a huge gulf, but not when spread over 48 events each reporting 4-7 decklists. In application, it means you might see one more Shops deck than Mentor deck at a similar event. This is why I don't understand why his statement was so offensive. The difference in play is less than 1 deck per event over nearly 50 events... It's only when you turn it into aggregate data that it becomes an 'enormous gulf'. But that's not what your statistics describe - the statistics describe small events, where they were generated.

    Exactly. Your point here is a straw man argument that actually undermines any use of data.

    Which is what I point out. Data has to be used in context, and now you flip-flop from using data to prove falsity to 'looking for trends'. That's what I've advocated all along. But to look at trends, you have to look at how the data is generated. Following top-level Shops mages on MTGO is going to artificially skew the data, but you don't seem willing to admit that the data does that.

    Applied to the logical extreme, your point here renders any time-bound data set problematic.

    It doesn't, but you just want me to be wrong. The time period should be such that the data is meaningful. I think the paper Magic data is excellent. It looks at seemingly random metagames with seemingly random participants; it's seemingly unbiased data. The MTGO data looks like the same players over and over and over again - except for the P9 data, which plays out like the paper data.

    Of course it can be. But any thinking person would call out unreasonable use of data. Using your hypothetical, some idiot pointing to 3/13 to say that Shops are 0% of the metagame would be immediately dismissed out of hand.

    On the other hand, aggregating several months worth of data into a quarterly report is generally considered reasonable and sufficiently inclusive.

    But why is it unreasonable to question data which is not in sync with other data sets? MTGO daily data shows a significant departure from paper and from large online tournaments. You are OK with this. I am not. Especially when compounding the abnormal data set seems to be what drives decision making.

    Again, I'll agree to disagree. I think he made an overstatement, but I don't find it offensive. I don't think you could distinguish 50 decks from 72 decks using sound statistical tools to analyze your data set and describe the metagame as a whole over Q1. I think it becomes even more difficult if the analysis also accounted for pilots. If it repeated over another quarter (i.e. a trend), I think you would have a statistical argument, but we'll never know.
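
    For what it's worth, here is a rough version of that check. Conditional on a deck being either Gush or Shops (122 of the 241 reported decks), "indistinguishable" would mean each is equally likely. Note the big caveat: this treats every decklist as an independent draw, which - per the pilot argument above - it isn't:

    ```python
    from scipy.stats import binomtest

    # Rough check: can 50 Gush vs. 72 Shops decks (out of 241 reported)
    # be told apart statistically? Conditional on a deck being one of the
    # two (122 decks), the null hypothesis says each is equally likely.
    # Caveat: this treats every decklist as an independent draw, which
    # the repeat-pilot argument above disputes.
    result = binomtest(k=72, n=122, p=0.5, alternative="two-sided")
    print(f"p-value: {result.pvalue:.3f}")  # ~0.06, right at the edge of significance
    ```

    A p-value hovering around 0.05 - before even accounting for the repeat pilots - is exactly the "hard to tell apart definitively" zone I'm describing.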


  • TMD Supporter

    @Fred_Bear said:

    @Smmenen said:

    You are now pulling a bait and switch.

    When I presented aggregate MTGO daily data, you dismissed it as a mere "sample," explicitly calling it a questionably reliable data source.

    I've pulled no bait and switch. I've consistently said the data set is a sample. As I read it, Danny drew his conclusion from the same statistics, extrapolating the data set to describe the metagame. When you do this, I find it reasonable to factor variance into the data set, which over a quarter of the year can amount to a significant number of decks.

    If Danny and I are both using the exact same data source - MTGO daily events - why are you crediting his data but ignoring mine?

    Moreover, if you are so concerned about sampling and time periods, shouldn't you be much more critical of his data set? If this isn't about just Danny's data, then doesn't his claim become even more tenuous?

    Again, according to the data I presented in the first post in this thread, in Q1:

    Shops are 30% of dailies
    Gush is 20.7%
    And Mentor is 10%

    Danny's explanation for why his data set is different - he calculates Shops at 22% and Mentor at 16% - is that he added October, November, and December to the data. That means that Danny had the exact same data I have, but added three more months to the beginning (months that are therefore less directly relevant in determining current trends).

    Adding those three months bolsters any claim that Mentor is getting closer to Shops, but looking just at Q1 makes it clear that Shops have pulled far ahead. According to the data, Shops went from 14% of dailies in Q4 to 30% in Q1. Danny knows this, or can know this, since his data set encompassed both quarters. By blending Q4 and Q1 together, he is smoothing over the very "variance" that illustrates a huge increase in Shops in the MTGO data set.

    Look. I took a single sentence out of this article that I found to be objectionable based upon all of the available evidence. My data strongly disputed his claim.

    I disagree that the data strongly disputes his claim. It can be read as disputing it, but if you look more broadly, his statement is not as black-and-white, true-or-false as you make it seem.

    OK, let's look at his statement more broadly. Here is the sentence and the sentence that precedes it:

    "One could make a very strong argument that Monastery Mentor decks were the best deck in the format prior to the April 4th changes. They occupied basically the same percentage of the metagame as all of the Mishra’s Workshop decks combined, and unlike its artifact based counterpart there is no real good way to combat it. "

    So, he's implicitly arguing that Mentor was the best deck before the most recent restriction, or at a minimum tentatively endorsing such an argument. And then he is presenting a statistical claim to support that argument. So, for the argument that he is either advancing or implicitly endorsing to be true, the facts upon which it relies must also be true. This is not editorializing. This is a claim. If my staff put such a claim in a report, article, or brief, I would demand they support it.

    Mentor is a heavily played Gush deck - one of the best and most played versions; in fact, in paper it out-represents Shops. I don't find it offensive for him to say they see *about* the same play.

    I don't find it offensive; I find it factually untrue.

    And you are correct, Shops could see heavier play due to variance, but that does not seem as plausible based on the larger tournament results

    Really? If you believe that, then you are ignoring the facts. If we look at the larger tournaments, Shops performs just as well, if not better. See below.

    Recall again that Danny's data includes Q4, whereas mine is just Q1. If Danny's data is accurate - Q4 and Q1 combining to make Shops only 22% of the MTGO daily results - and the Q1 data has Shops at 30%, then for Danny's data to be true, Shops must have been around 14% in Q4 in order for that to average out.
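
    The back-out is simple arithmetic (a sketch; it assumes Q4 and Q1 contributed roughly equal numbers of reported decklists, which is my assumption, not a figure from either data set):

    ```python
    # Backing out the implied Q4 Shops share from Danny's combined figure.
    # Assumes Q4 and Q1 contributed roughly equal numbers of reported
    # decklists - an assumption, not a figure from either data set.
    combined_share = 0.22  # Danny's Q4+Q1 figure
    q1_share = 0.30        # the Q1-only figure

    q4_share = 2 * combined_share - q1_share
    print(f"implied Q4 Shops share: {q4_share:.0%}")  # 14%
    ```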

    That means that Shops more than doubled between Q4 and Q1. So, if we are going to look at "larger" tournament results and more tournaments, and we really care about trends, the trend is clear: Shops had a dramatic increase in Q1.

    After all, you are now arguing that what we should care about is trends. The variance argument actually plays into my critique. Shops trended dramatically upwards in Q1, and any variance over time is variance that should be interpreted, if we credit trend data, towards Shops increasing frequency.

    The difference between 16% and 22% of a Vintage metagame is actually an enormous gulf. In fact, it's probably much larger than I "would have us believe." Consider a few facts that will put this in context:

    Again, you want it to be true for some and not others. 16% and 22% is a huge gulf, but not when spread over 48 events each reporting 4-7 decklists. In application, it means you might see one more Shops deck than Mentor deck at a similar event. This is why I don't understand why his statement was so offensive. The difference in play is less than 1 deck per event over nearly 50 events... It's only when you turn it into aggregate data that it becomes an 'enormous gulf'. But that's not what your statistics describe - the statistics describe small events, where they were generated.

    Your argument is tantamount to an argument against aggregation. It's absurd on its own terms, but taking it at face value, my point holds whether we look at dailies or Premier events.

    Let's look at Premier events. In Q1, there were only 8 Mentor decks in the MTGO P9 Top 16s. That's an overall percentage of 16.66% of those decks. In contrast, Shops were 31% of those decks.

    So, if we look at Q1 using a smaller data set of 3 events with 16 decklists per event, and with much less bias from particular players being overrepresented, how preposterous is it to claim that 16.66% is about the same as 31%?

    Exactly. Your point here is a straw man argument that actually undermines any use of data.

    Which is what I point out. Data has to be used in context, and now you flip-flop from using data to prove falsity to 'looking for trends'. That's what I've advocated all along. But to look at trends, you have to look at how the data is generated. Following top-level Shops mages on MTGO is going to artificially skew the data, but you don't seem willing to admit that the data does that.

    sigh

    While I'm glad we are now on the same page regarding "looking at trends," if you go back and look at any of the metagame analysis articles I linked to earlier, it's clear that's the entire goal.

    The point of looking at aggregate data is to discern trends. That's not flip-flopping; that's the essence of what analyzing the metagame is for.

    If MTGO dailies were as skewed as you suggest, then the data for Q1 dailies and MTGO P9 challenges wouldn't be virtually the same. Yet the Top 16 and Top 8 data from the premier events is almost statistically identical to the dailies. So this "skew" effect that you keep harping on in an attempt to undermine the validity of the dailies is not evident if we compare the dailies to the premiers. It's the same stats. The same numbers.

    Again, I'll agree to disagree. I think he made an overstatement, but I don't find it offensive. I don't think you could distinguish 50 decks from 72 decks using sound statistical tools to analyze your data set and describe the metagame as a whole over Q1. I think it becomes even more difficult if the analysis also accounted for pilots. If it repeated over another quarter (i.e. a trend), I think you would have a statistical argument, but we'll never know.

    It's not 50. It's 28 compared to 72. That's the number of Mentor decks in the data set compared to the number of Shops decks.

    I was using the 50 Gush decks to illustrate that not even the total number of Gush decks comes close to the number of Shops decks - and only 23 of the 50 Gush decks were Mentor decks.

    To accept your overall argument concerning Danny's claim, one would have to believe that 28 is "reasonably" close enough to 72, given variance. That's just nonsense. It's miles away, not inches.



  • @Smmenen said:

    If Danny and I are both using the exact same data source: MTGO daily events, why are you crediting his data, but ignoring mine?

    No need to make it personal. I'm not choosing his data over yours. I believe that, objectively, there is truth to his statement. It's not black-and-white. It's not as simple as true-and-false. He said Mentor is a good deck which sees roughly as much play as Shops. You have repeatedly tried to paint the only available data as that from top decks played in the MTGO dailies and I still believe that is a disingenuous representation of all the available data.

    I'm not going to argue with you - you're an attorney. I'm a process engineer, though, so I know how to interpret data. I'm trying to explain what I see in the data.

    The metagame data from the P9 Challenges in Q1 shows that Gush is played just as much as Shops (not Mentor specifically). The paper data analysis that you provide for Q1 shows Mentor ahead of Shops by a bit. The Dailies data, on the other hand, shows Shops with a sizeable lead. My question remains the same: why is the data so different in the Dailies? Compared to all the other available data, the Dailies look like an outlier.

    I think one reason is the people playing. They get a single entry in each large P9 Challenge, while they can rack up multiple top finishes in the Dailies. This skews the data toward good players on good decks, e.g. Montolio, BlackLotusT1, The Atog Lord, etc. A second reason is that we do not see every deck played in the dailies, only the top finishers of each event. The hope with MTGO data is that this evens out over large samples, i.e. a deck which went 2-2 on Tuesday will bounce back and go 3-1 on Wednesday, but the Vintage dailies are not a large sample. As I've repeatedly tried to point out, a swing of even one deck per event over your Q1 data is the difference between only 24 Shops decks passing muster (one fewer deck in each of 48 events) and a whopping 120 Shops decks making the cut (one more deck going 3-1 in each). Some will argue that the difference between 2-2 and 3-1 is a dice roll in Vintage. Those swings move the Daily numbers wildly without a single additional Shops deck being played. [Note: This is why it's important to know all the decks in an event. If every Shops deck being played is making the top-decks list, that's important.]
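
    To make that swing explicit, here is the arithmetic behind the 24-vs-120 figures (a sketch assuming the 72 Shops finishes over 48 dailies quoted in this thread):

    ```python
    # How far a +/- 1 deck-per-event swing moves the Q1 Shops total.
    events = 48
    shops_finishes = 72  # Shops decks at 3-1 or better in the Q1 dailies data

    low = shops_finishes - 1 * events   # one fewer Shops finish per event -> 24
    high = shops_finishes + 1 * events  # one more Shops finish per event -> 120
    print(low, high)  # 24 120
    ```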

    I'm not going to argue that 24=120, but in the world of data analysis, sometimes it's hard to tell the two apart definitively.

    Here's an alternate analysis of the dailies...

    Using a simple run chart approach:
    Average # of Shops decks finishing 3-1 or better: 1.45 ± 0.86
    Average # of decks finishing 3-1 or better: 5.00 ± 1.00

    Standard statistical practice would say to use 3 sigma, but even limiting ourselves to 1 sigma (~68% confidence), we could see as few as 28 Shops decks out of 192 top finishes (14.5%) or as many as 111 decks out of 288 (38.5%), and we shouldn't be "surprised". Those numbers could still 'represent' Shops being only ~30% of the meta. [Note: You could also use the median and variance, but I think that under-represents Shops.] [Note 2: This actually explains the Jan-to-Feb-to-Mar variance in the Q1 data, and likely explains the variance back to Q4 of 2015.]
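
    A short sketch reproducing those one-sigma bounds (it follows the pairing used above, low Shops count over low total and high over high; the results match the quoted figures up to rounding, with the low end computing to 14.6% versus the quoted 14.5%):

    ```python
    # One-sigma envelope from the run-chart figures above:
    # 48 dailies, Shops 1.45 +/- 0.86 per event, all decks 5.00 +/- 1.00 per event.
    events = 48
    shops_mean, shops_sigma = 1.45, 0.86
    total_mean, total_sigma = 5.00, 1.00

    shops_low = round((shops_mean - shops_sigma) * events)   # 28 decks
    shops_high = round((shops_mean + shops_sigma) * events)  # 111 decks
    total_low = round((total_mean - total_sigma) * events)   # 192 finishes
    total_high = round((total_mean + total_sigma) * events)  # 288 finishes

    print(f"Low:  {shops_low}/{total_low} = {shops_low / total_low:.1%}")      # 14.6%
    print(f"High: {shops_high}/{total_high} = {shops_high / total_high:.1%}")  # 38.5%
    ```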

    Looking at it from this angle: (1) I'm willing to believe that Mentor falls somewhere within those numbers. So, to the degree I trust the data, Mentor is roughly on par with Shops. Is it played exactly as much as Shops? No, probably not. Is it close? It's not an unreasonable statement. And (2) the huge variance in the Dailies data suggests that it requires further monitoring/analysis before relying on it. The small sample sizes and limited player pool are not providing a consistent representation of the meta compared to other data.

    I'm sorry I disagree with the conclusion you want to draw. You're obviously quite passionate and invested in it. I really wish we could've gotten another month or quarter of data to see where the trend really was going.


  • @Fred_Bear said:

    @Smmenen said:

    If Danny and I are both using the exact same data source: MTGO daily events, why are you crediting his data, but ignoring mine?

    No need to make it personal. I'm not choosing his data over yours. I believe that, objectively, there is truth to his statement. It's not black-and-white. It's not as simple as true-and-false.

    Of course it is.

    Imagine I said to you: "There are about the same number of Apples in the United States as Bananas."

    Is that a true or false statement?

    Of course it is.

    It's empirical and quantifiable. It's precisely the kind of statement you'll find in the first chapter of a logic or social science textbook as a testable claim.

    Now replace "Apples" with "Mentor decks," "United States" with "Vintage metagame," and "Bananas" with "Shop decks," and it becomes crystal clear that Danny's claim is the same kind of statement.

    Let's be absolutely clear about this. Danny’s statement is an empirical claim. It's a factual claim.

    He claimed that Mentor decks were “basically” the same proportion of the Vintage metagame as Shop decks. This is either true or false.

    Since it is a factual claim, it is amenable to empirical inquiry. It is precisely the kind of claim that is susceptible to factual analysis. It's inherently provable or falsifiable.

    It’s not a subjective claim (e.g., “I like ice cream.”). It’s an objective claim.

    Nor is it an inherently unprovable claim that lies in the realm of philosophy (e.g., “God exists.”).

    Danny's statement is numerical and quantifiable.

    If Danny's statement is not amenable to truth or falsity, then it's hard to imagine a claim that is. It’s the paradigmatic example of an empirically falsifiable claim.

    Now, the only question is: what data set should we use to prove or disprove the statement?

    Thus far, only two basic data sets have been developed:

    1. Danny’s MTGO Daily results (Q4 & Q1)

    2. My and Kevin Cron’s data sets, which include:
      a. Q1 Paper
      b. Q1 MTGO Daily reported results
      c. 5 months of MTGO Premier Events
        i. A subset that features just the Q1 Premier Events

    Each of those data sets produce slightly different results.

    1. Danny’s MTGO Daily results show Mentor at 16% of top-performing decks, compared to 22% for Shops.
    2. Danny relied on my and Kevin’s paper results, which show Mentor and Shops to be roughly the same.
    3. My and Kevin’s Q1 MTGO Daily results show Mentor at 28 decks (11.6%), compared to 72 Shop decks (30%).
    4. My and Kevin’s MTGO Premier Events show Mentor at 14% and 16%, respectively, and Shops at 30% and 31%, respectively (the implied ratios are worked out in the sketch below).
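
    As a quick cross-check, a sketch of the Shops-to-Mentor ratios implied by those four data sets (the Top 16/Top 8 pairing of the premier figures is an assumption on my part, based on the numbers cited earlier in this thread):

    ```python
    # (Shops share, Mentor share) for each data set summarized above.
    datasets = {
        "Danny's dailies (Q4+Q1)": (0.22, 0.16),
        "Q1 dailies":              (0.30, 0.116),
        "Premier Top 16":          (0.31, 0.1666),
        "Premier Top 8":           (0.30, 0.14),
    }
    for name, (shops, mentor) in datasets.items():
        print(f"{name}: Shops is {shops / mentor:.1f}x Mentor")
    ```

    The Q1 dailies come out near 2.6x and the premier events near 2x, which is what the summary below characterizes as "almost three times" and "twice."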

    Let's summarize:

    • The Premier event data suggests that Danny’s claim is completely wrong. Shops are twice as prevalent as Mentor.

    • The Q1 daily results have Shops decks almost THREE TIMES as prevalent as Mentor.

    Aside from the paper results, which no one disputes (and which aren't relevant by themselves, since we are talking about the aggregate), there is no universe in which Mentor is even close to the same percentage as Shops.

    What about Danny's data? There are huge flaws with Danny's data:

    1. He completely ignored Premier events.

    2. Despite having the data in his set, he conveniently ignored or elided the fact that Shops doubled its representation from Q4 to Q1, making it, in my opinion, unreasonable to include Q4.

    Danny’s data misrepresents Shops’ proportion of the metagame “before April 4,” because it includes a period, October-December, that looks nothing like Q1.

    But even if, in the “best case,” we accept Danny’s data, there is still a difference of 6%. While you don’t feel that’s a big deal, I think it’s an enormous gap. It’s the same magnitude as the difference between 1% and 7%. It amounts to 33 decklists in his sample.
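
    For what it's worth, the "6% = 33 decklists" arithmetic pins down the implied size of the sample (a derived figure, not one stated outright in the thread):

    ```python
    # If a 6% share difference corresponds to 33 decklists,
    # the underlying sample size is implied by simple division.
    gap_decks = 33
    gap_share = 0.06
    sample = gap_decks / gap_share
    print(f"Implied sample: ~{sample:.0f} decklists")  # ~550
    ```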

    Your entire argument that Danny’s claim is valid rests on the following assumptions:

    1. That the 6% difference is close enough, in light of variance, to justify the claim that Mentor is within range of Shops.
    2. That ignoring the premier events is fine in drawing your conclusion.
    3. That including the Q4 data, despite Workshops being half as prevalent then, is in no way problematic.

    I don’t think any of those assumptions is reasonable. But Danny’s argument isn’t valid unless we accept all three of them.

    I think the argument is over once you acknowledge the 6% difference. That 6% difference means that Mentor decks are not roughly the same share of the metagame as Shops. That's the end of the story.

    He said Mentor is a good deck which sees roughly as much play as Shops. You have repeatedly tried to paint the only available data as that from top decks played in the MTGO dailies and I still believe that is a disingenuous representation of all the available data.

    That’s all of the data we have. Neither Danny nor I have all of the MTGO daily decklists. So your point here is completely beside the point.

    I'm not going to argue with you - you're an attorney. I'm a process engineer, though, so I know how to interpret data. I'm trying to explain what I see in the data.

    I also know how to interpret data. I am the Director of Research at a research institute, and I write social science reports and file social science briefs in the Supreme Court at least once a year. What you are saying makes no sense. I wouldn't let my staff publish reports advancing the kinds of claims you are making here.

    Why is the data so much different in the Dailies? The data from the dailies suggests it to be an outlier compared to all other available data.

    This is totally false. It's just completely untrue. You've said this before, and I've already refuted it: the daily top performers align almost perfectly with the premier event top performers.

    As I said before:

    “If MTGO dailies were as skewed as you suggest, then the data for the Q1 dailies and the MTGO P9 Challenges wouldn't be virtually the same. Yet the Top 16 and Top 8 data from the premier events is almost statistically identical. So this "skew" effect that you keep harping on in an attempt to undermine the validity of the dailies is not evident when we compare the dailies to the premiers. It's the same stats. The same numbers.”

    Let me repeat: the Daily results are not an outlier. The Daily and Premier top-performing Shops and Gush data are almost identical.

    Yet the premier events do not suffer from the flaw you keep invoking, the "skew of overrepresentation of specific individuals."

    This is a great example of you just ignoring the facts.

    I think one reason is the people playing.

    Except that your premise isn't true. The premier events show the same top-performance stats as the dailies. It's remarkable how well they line up. The Gush and Shops data are almost identical for Q1.

    I'm not going to argue that 24=120, but in the world of data analysis, sometimes it's hard to tell the two apart definitively.

    I'm just going to let readers ruminate on that statement, as I think it speaks for itself.


 

WAF/WHF