NYSE #4 - Complete Metagame Report

@vaughnbros said:

The online meta and paper metas are still 30-40% Gush (depending on your source and time frame), and every major tournament since the Lodestone restriction has been around 40% Gush.

"Depending on your time frame" is a pretty big caveat.

Gush was 60% of the MTGO dailies in April, 40% in May, and looks to be under 30% in June.

That's a huge downward trend.

One of the main criticisms of restricting Golem was that it was premature, and that trend data showed that it was being addressed. It makes no sense to restrict cards that decline every month.

The case for restriction pretty much collapsed with this data set: restricting a card that fuels a deck that has 4 bad matchups is pretty much the textbook case for an unnecessary restriction.

You can't claim to be a "dominant deck" when you lose to Eldrazi, Workshops, Dredge, and Storm. You can't be a dominant deck when you have a sub 50% win percentage.

Also, you can't claim to be a dominant deck when you haven't won the last three most important tournaments.

The meta itself, understandably, has begun to warp itself even more so around this card, as it's play Gush or a deck specifically designed to beat it. Does this remind anyone of anything in Vintage recently?

It should, because these are the exact reasons for Lodestone's restriction.

Uhh... no it wasn't.

You are confusing and conflating metagame presence with tournament performance.

A deck can be 50% of a metagame, and 0% of a Top 8. Hell a deck could be 80% of the metagame, but 0% of the Top 8. No one would think a deck like that should be restricted. We care about performance, not presence.

How did Gush do at the NYSE?

By match win percentage alone it was the 5th best performing archetype.

The best performing archetypes were, in this order:

  1. Shops, with a 68.3% win percentage
  2. Eldrazi, with a 59% win percentage (11% of the field)
  3. Dredge, with a 51.3% win percentage (7% of the field)
  4. Oath, with a 50.7% win percentage (7% of the field)
  5. Gush, with a 46.5% win percentage (32.5% of the field)
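To make the presence vs. performance distinction concrete, here is a minimal sketch that ranks the five archetypes listed above two ways, once by match win percentage and once by share of the field. The numbers are copied from the list; the `archetypes` layout and the `rank_by` helper are just illustrative names, and Shops is left without a field share because none is given above.

```python
# Numbers copied from the list above: (archetype, win %, % of field).
# Shops has no field share listed there, so it is left as None rather than guessed.
archetypes = [
    ("Shops", 68.3, None),
    ("Eldrazi", 59.0, 11.0),
    ("Dredge", 51.3, 7.0),
    ("Oath", 50.7, 7.0),
    ("Gush", 46.5, 32.5),
]


def rank_by(rows, column):
    """Return archetype names sorted high-to-low on a column, skipping missing values."""
    known = [r for r in rows if r[column] is not None]
    return [r[0] for r in sorted(known, key=lambda r: r[column], reverse=True)]


by_performance = rank_by(archetypes, 1)  # column 1 = match win %
by_presence = rank_by(archetypes, 2)     # column 2 = % of the field

print("by win %:   ", by_performance)
print("by presence:", by_presence)
```

Gush comes out first on presence and last of the five on win percentage, which is the whole point: the two rankings measure different things.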

Yeah, we should definitely restrict 5th best performing archetypes. (eye roll)

The case for restricting Gush began to collapse with the last MTGO P9 event, when Gush was the most played archetype but had only one deck in the Top 8 and got crushed by Eldrazi and Dredge. The NYSE just completed the circle:


Gush was literally 38.5% of the field. Yet, here was how well decks did by presence and performance:

  1. Dredge was 3% of the metagame, but had a 70% win percentage
  2. Eldrazi was 14% of the metagame, but had a 64% win percentage
  3. Shops were 3% of the metagame, but had a 62% win percentage
  4. Gush was 38.5% of the metagame (less if you take out Doomsday, etc.), but only had a 51.6% win percentage.

Dominant decks don't get crushed by other matchups, let alone multiple matchups. Gush's win % against Shops - despite multiple restrictions - was 29% in the NYSE, and 28% against Eldrazi in the MTGO P9 event.

In contrast, Lodestone Golem was actually a borderline dominant deck. Unfortunately, despite two restrictions in 6 months, Shops appear still to be the best deck.

The only empirical argument for restricting Gush is that there are too many Gush decks in the metagame, but they are getting crushed by Shops, Eldrazi, Dredge, and are soft to Storm and Humans. And the trend lines are all downward. No wonder. Who wants to play a deck that has a 29% win percentage against Shops and Eldrazi? Even if those numbers can be improved, it's clear that the metagame is undergoing massive upheaval, and restricting a card in the midst of massive change would be premature, to say the least.

last edited by Smmenen


Says you with a bunch of buzz words.

@Smmenen What am I reading? A downward trend? After a start of 60% of the meta!?!?!?? You realize how absurd that number was and it clearly was not sustainable, and extrapolation is really really poor science.

Then we go into some fallacies like a deck can be 50% of a metagame and be 0% of a top 8. When has this ever happened? Percentage of the metagame is a serious issue and in general has been the determining factor for most bans/restrictions throughout history.

Let me give an example of something a little more substantive, an extreme example of the fallacy you are trying to put forward as fact:
Cancer: 22.5% of deaths, Kills ONLY 50% of those afflicted.
Alzheimer's: 3.26% of deaths. Kills 100% of those afflicted.
Suicide: 1.58% of deaths. Kills 100% of those afflicted.
Cancer is not a problem! We should ignore cancer and focus on other causes of death!

last edited by vaughnbros

@vaughnbros said:

Then we go into some fallacies like a deck can be 50% of a metagame and be 0% of a top 8. When has this ever happened?

It's not a "fallacy" (you are misusing that term).

It's called a "counterfactual" to illustrate the point that metagame presence isn't what motivates B&R list policy - but performance.

Percentage of the metagame is a serious issue and in general has been the determining factor for most bans/restrictions throughout history.

That's absolutely not true. You are literally just making things up.

For most Vintage tournaments in history, we never had complete metagame breakdowns.

How could metagame presence be used to determine most bans/restrictions in history when that data wasn't available? That's a serious question.

TOs didn't report the metagame breakdowns. Not even for most Vintage Championships. Jaco was kind enough to type up the top 100 or so decklists from the last one, but we don't even have a complete metagame breakdown for almost any other one (Ben Bleiweiss typed up the breakdown for the first one, in 2003).

We only have metagame breakdowns for the MTGO events because Matt and Ryan went in and collected it (and I did so for the first one last year)

Go read the old Vintage metagame breakdowns. E.g: http://www.starcitygames.com/magic/vintage/8912_The_December_and_January_Vintage_Metagame_Report.html

All we had was Top 8 appearances for 99% of tournaments. That's what sites like morphling.de collected.

It is Top 8 appearances, not metagame presence, that has been used "throughout history" to justify restrictions.

Through most of Vintage history, we've used "% of the metagame" as shorthand for % of Top 8s, but we were actually talking about the top performing deck metagame, not the actual complete metagame in the tournament hall.

Now Matt and Ryan have collected the total metagame results, and done something we've never had before: calculating matchup win %.

It would be absurd to restrict a deck that is the 5th best performing archetype, and has 4-5 statistically weak or bad matchups.

last edited by Smmenen

@Smmenen Please don't use the word Statistically while simultaneously trying to extrapolate results. Thank you!

@vaughnbros said:

@Smmenen Please don't use the word Statistically while simultaneously trying to extrapolate results. Thank you!

Those were two separate points.

One of my points was trend data regarding Gush as the % of the metagame, showing steady and pronounced decline. The other main point was that Gush isn't performing dominantly in the metagame.

I said that: "It would be absurd to restrict a deck that is the 5th best performing archetype, and has 4-5 statistically weak or bad matchups." That's not extrapolation - that's descriptive of what happened in the OP table.

That's the only time I used the word "statistically."

last edited by Smmenen

@Smmenen Let me rephrase. You do not understand the word. Do not use it. Thank you!

Please explain how this statement (a paraphrase of what I already said) is false or misuses the word "statistically":

According to the data presented in the original post regarding the NYSE results, Gush decks (as classified by chubbyrain) are statistically the 5th best performing archetype in terms of matchup win percentage.


last edited by Smmenen

@vaughnbros Just to be clear here... you are claiming that 1: Steve doesn't understand what the word statistically means, and 2: Should therefore not use the word statistically?... Steve Menendian?... is so ignorant of the word statistically, that the only remedy is for him to stop using the word? Really? That's where we're at in this conversation?

Again, let me say that complaining about restrictions tends to... (statistically?) have very little to do with actual necessity for restrictions and far more to do with a self-fulfilling cycle of complaint.

Using very little data, I can tell you the following facts for certain. When the metagame spiked with high Gush numbers a little over a month ago, some people began developing decks to beat Gush like Humans and Eldrazi. Some people also spent time complaining about Gush and demanding its restriction on sites such as this one. I leave it to the collective imagination how much of an overlap those two groups would have on a Venn diagram.

@Topical_Island I tried to place a logical argument and move away from derailment of the thread. Since he wanted to derail again on the semantics between fallacy and counterfactual, I am done trying to debate him on the actual matter at hand.

Yes, Menendian has no idea what the word statistically means. No statistics tests have been conducted, you can not extrapolate past your data, and in general sample sizes are too small for any decision theory to be used in this case. Does he have an advanced degree in statistics that I am unaware of?

@vaughnbros I mean, I don't want to be disrespectful to you personally, but having read back through the thread, I have to say I disagree with a lot of what you're saying in here, if it can be synthesized into a single thrust. What is the logical argument that you were going for? Something like the metagame is unhealthy?

Ok... fine. So what statistics actually is, the essence of it so far as I'm concerned, and the definition that I'm operating on, is that statistics is the effort to make information out of data... to collect the data well, and to then make an honest and accurate effort to understand what the data means. Are there a lot more details, and ins and outs? No doubt. But that is the gist of it. (Keep in mind that I don't have an advanced degree in statistics, so I might not be authorized to possess this definition, nonetheless, that is in fact the essence of statistics.)

So I hope we agree that the very colorful top of the thread constitutes "statistics"... again, no expert here. (Recently in this thread someone indicated that no "statistical tests" have been conducted... be that as it may...)

So, do those stats up there indicate that the metagame is healthy or unhealthy? Ah HA! Trick question. No Stats 101 prof is gonna sucker me in on that one. That's the small sample size question they warned me about in the study session! Drawing conclusions about a metagame from a single tournament would be a really bad conclusion in statistical terms, an amateur one that no expert in statistics would ever be guilty of.

Or drawing outrageous conclusions like saying that only three decks are viable in the format based on the outcome of a single tourney. That would be weird, right? Drawing conclusions like that would be bad statistical work, right? I mean, the last major tourney was won, outright, by a deck that was .6% of this field. I mean, I didn't do a T test to come up with that or anything, so I could be wrong, but that just seems like an unsupportable statement. Only three decks that are even viable in the whole format... that sure would be terrible, if it were reality.

Outrageous, hyperbolic sentences beget more of the same, man. Let's just pump the brakes on the wild gesticulating and black and white statements. I really don't want to talk about restrictions, or metagame health... because I'm just pretty tired of it, since it's pretty much a lot of popping off, and people who've decided what their conclusion is based on their own personal experience in games, before they ever even start to dabble in data... but I'll do it. If that's what we're really doing.

That's not what we're doing.

last edited by Topical_Island

@Topical_Island As the only person in this conversation with an advanced degree and years of experience in the field of statistics, I think the word is thrown around too much. Giving a single percentage, i.e. 60%, is what we call a point estimate. It is exactly that: a single point in time and space. It has very little meaning without context added to it, and even less without considering the sample size from which it was derived. Smaller samples generally mean larger confidence intervals and less statistical significance. In this case, using a z-test of proportions and a null hypothesis of p=.5 returns p-values of 0.057 and 0.126 for Gush vs Shops and Gush vs Eldrazi respectively. Both of which would be statistically insignificant at the standard alpha level of 0.05.

The sample size is simply too small. So let's hold off on demeaning the word "statistically" unless we actually run these tests and they actually come out significant. In general though, as I am not a frequentist, even I personally avoid using the word statistically if I can.

My figure of Gush being 30-40% was pulled from mtgtop8 and mtggoldfish, both of which are comparable in terms of collection to historical top 8 data. This is not a made-up figure; go see for yourself and add up all of the decks playing Gush on those websites.
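For anyone who wants to see how the sample size drives that kind of result, here is a minimal sketch of a two-sided one-sample z-test of a proportion against p=.5, using the normal approximation. The match counts below are hypothetical, picked only to show two samples with roughly the same 29% win rate; they are not the NYSE or MTGO counts, and this is not claimed to reproduce the 0.057 and 0.126 figures above. The function name is also just illustrative.

```python
import math


def z_test_proportion(wins, games, p0=0.5):
    """Two-sided one-sample z-test of a win rate against p0 (normal approximation)."""
    p_hat = wins / games
    # Standard error under the null hypothesis that the true win rate is p0.
    se = math.sqrt(p0 * (1 - p0) / games)
    z = (p_hat - p0) / se
    # Two-sided tail probability of a standard normal: P(|Z| > |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical match counts, NOT the actual tournament numbers.
for wins, games in [(4, 14), (29, 100)]:
    z, p = z_test_proportion(wins, games)
    print(f"{wins}/{games}: z = {z:.2f}, p = {p:.4f}")
```

With these made-up counts, the 14-match sample does not clear alpha = 0.05 while the 100-match sample does, even though the observed win rate is about the same in both.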

My statement about decks altering themselves to beat Gush? I was deriving that based on my experience, and the discussion here, primarily actually made by Mr. Menendian.

In my opinion, the data here says there are 3 decks. I was simply interpreting the results as they are laid out. The data could certainly be interpreted differently with different aggregations, different breakpoints of what you consider viable, etc. As much as other statisticians and scientists would have you believe otherwise, there is much subjectivity in data analyses.

@vaughnbros In the reasoning business, we call what you just did, authority bias. Also, when someone allows themselves enough leeway to casually claim that the format has three viable archetypes, then nitpicks someone else's viewpoint by focusing on whether or not they conformed to a strict definition of the word "statistical" (and forbidding them from using a word?... I'm still wrapping my head around that one)... we call that hypocrisy... in the biz... as it were. (If you think misusing a single word is annoying, just imagine what the above scenario looks like to someone in communication/natural language...)

If you don't like Steve's argument, then address his argument. (It very well might be wrong.) The subtext of your last post is that you understand statistical computation. Stipulated. Leave your resume in the drawer man. I agree he can be peevishly tactical in his own arguments sometimes, but address the larger ideas, please, I beg you. I swear I'll never use the word statistics again.

For the record. I think the Metagame is great right now. I love it... but of course the finals of my last tourney was a split between two unviable Landstill decks... (which I of course never meant to imply was significant or meaningful in any way... oh no... I've been bad... I have to go wash my mouth out with soap now...)

@Topical_Island Ah I forgot Steve is always right! My bad. Your worship of him is certainly something else...

I'll check my resume at the door the second it's 1. Not relevant to the discussion. 2. When Steve does the same.

@vaughnbros Did you just pull a strawman followed by a "he did it first" on the Mana Drain? Well played Sir... well played indeed. He did do it first, person who isn't one of my middle school students... he did do it first. You are right.

Leave it in the Drawer... a drawer... resumes go in... ahhh never mind.

Anyway. I disagree with your ideas... if those ideas are, or include, the belief that this metagame is "unhealthy" or anything like that. (speaking of conflating terms here, I think it's probably more accurate to talk about the play environment... a metagame being a little different) I think that this play environment is pretty normal if one accepts that the population of deck types in a healthy environment should follow a Zipf's curve, which is what I believe.

last edited by Topical_Island

@Topical_Island I'm assuming you mean Zipf's Curve? Which has a smoothing parameter so it can take many forms... That's the equivalent of saying I think the metagame should consist of magic cards.

@vaughnbros That's the one, and I'm sure you know what I mean. A play environment that's behaving as it should (i.e. being balanced by metagaming forces), will end up in a Zipf's curve... maybe I should say long tail distribution here. I don't really know. I'm fine taking your word for whichever in the meantime before I check... in the meantime I'm going to... you know... continue trying to use words to create meaning. That sort of thing.
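For reference, a minimal sketch of what both posts are describing: under a Zipf law the k-th most played archetype gets a share proportional to 1/k^s, and the exponent s is the parameter that changes the shape. The function name, the choice of 10 archetypes, and the s values are all assumptions made for illustration, not anything fitted to the tournament data.

```python
def zipf_shares(n_archetypes, s):
    """Share of the k-th most played archetype under a Zipf law with exponent s."""
    weights = [k ** -s for k in range(1, n_archetypes + 1)]
    total = sum(weights)
    # Normalize so the shares add up to 1.
    return [w / total for w in weights]


# Illustrative exponents only; s = 0 is a flat field, larger s is more top-heavy.
for s in (0.0, 1.0, 2.0):
    shares = zipf_shares(10, s)
    print(f"s = {s}: top deck {shares[0]:.1%}, tenth deck {shares[-1]:.1%}")
```

So the curve covers everything from a perfectly even field to one deck taking most of the room, depending on s, while always keeping the long tail shape for s above 0.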

Guys, I was enjoying brilliant work from Matt and Ryan, and you are making it a bit less enjoyable 😞 I respect you all and I understand how difficult it is to explain things in a forum and to agree, but please don't keep this up.

I have one question for people here: is Eldrazi as a deck going to stay, or is it going to be a quick trend that nobody will remember in a year?

@vaughnbros your dislike of Steve was apparent in the old TMD as well... (As it was with me, but that is not relevant here). I am not saying Smemmycakes is right or wrong, and I am not saying you are right or wrong, but this constant argument is annoying. I am more interested in interpreting the data presented on my own, and hearing others' opinions to create what I envision the result to be. In business and in important life matters, things like statistical analysis and stop-gap etc. are more important... I am a payroll business analyst/implementation specialist and definitely understand how to use the data to get to a clear and refined end result.... But this is a GAME, and if I come to a different result than you do based on the data provided, then why is my being wrong necessarily a bad thing? Wouldn't that help you, since I would be ill-prepared for the next major event?

@xouman yeah, sorry for my part, and yeah...

That is a great question. If I dare throw another two cents in here, I actually put way more faith in deck testing rather than stats on the play environment for questions like this. I personally try to use %s of decks played to tune and determine what to play myself, and use testing to determine what is good. But if the question is, will this type of deck still be winning games in a year, rather than will it be widely played...? Then let's test. (One thing I find a little amusing, but to each their own, is watching someone board in a bunch of cards for game 2 against Gush... why aren't they mainboard?... why not play a different deck that can mainboard things that beat Gush... why not play Eldrazi? Hey, I just figured out why so many people played Eldrazi!)

So I am saying, yes... And by yes I mean that so long as Gush is unrestricted, then the suite of Thorn, Cavern, Thalia, Thought-Knot Seer, and less close to the core, Eye of Ugin, Eldrazi Temple, Vryn Wingmare and Reality Smasher... those all seem great. They will win games against decks running Gush.

If Gush is around it is going to keep pushing down numbers of big blue decks, and lesser blue drawing cards like Thirst, and so long as Gush is around, highly aggro taxing decks are going to be around too. (The rebirth of Null Rod aggro will hurt cards like Tezz, Time Vault, Key, and Top anyway.)

We'll see when we print out White Eldrazi on laserjet later today, but I can sure vouch for the Cavern/Thalia/Wingmare part of the equation.

last edited by Topical_Island