NYSE #4 - Complete Metagame Report

@ChubbyRain said:

And here lies another thread, victim of Steve's ego...

No matter how much I may disagree with people, I don't believe I've ever stooped to the kinds of personal attacks you've lobbed my way in the last few weeks.

You can disagree with people without having to resort to insults, name calling, and ad hominem attacks.

I actually think that the data/arguments/ideas presented in my last post make for a worthwhile dialogue, not a display of ego or chest-pounding. I'm sorry you don't see it that way.

@Smmenen said:

@ChubbyRain said:

@Smmenen

  • The problem with the "good players" argument is that it discounts those players' skills at deck selection. Rich didn't pick a deck at random and neither did Brian - they chose to play Gush for a reason. If you are going to suggest that these are the best players in the room (and I agree with you on that), then you also should assume they are skilled when it comes to deck selection.

What you say is true, but my point was that I think these players would have done either better or just as well had they played the better performing deck. You played Gush, but your teammates who played Shop Eldrazi got 1st and 9th. Why didn't you? Empirically, Workshop Eldrazi was the better deck choice, right?

Possibly one reason is that you don't have as much confidence in yourself as a Shop pilot as Montolio or Brian - which suggests the limits of your "skill in deck selection" argument.

I piloted Sylvan Mentor to a 6-2 record, was tied for the highest game win percentage in the field at 73.68% and left with a Mana Drain for my efforts. I really think my deck selection was fine for the event, thank you very much. If you actually cared about my reasoning for not playing the Ravager list: I was uncertain what percentage of the field would be Oath and Null Rods, which the deck has some difficulty against. But I really suspect you are trying to "win the argument" and don't really care what my reply was.

No, I actually was interested. But that wasn't the point.

In the weeds of this kind of discussion, it's easy to lose the forest for the trees. You pointed out a paradox in the OP: that Gush had a sub 50% match win %, but seemed to do well for a number of players. You tried to explain the paradox by suggesting that "experience with Gush" was the difference.

While I think that may have been a factor, I suggested that "vintage skill" or format skill may have actually been the more likely explanation, and that these players probably would have done the same or possibly better with a better performing deck, like Eldrazi Shops.

Your response was that selecting Gush was indicative of Vintage skill - but that wasn't responsive to my point, and hence my response.

As for the deck being difficult to play, how difficult is it to rules lawyer your opponent out of multiple Scab-clan Berserker triggers?

While it's sad that you've pivoted from a point about DPS to ad hominem attacks, what you say here isn't even accurate.

My opponent in one of the early rounds missed a beneficial trigger with Scab Clan. And a turn later, they asked about the damage, and I said that I thought that Scab Clan, in paper magic, is a beneficial (quasi-optional) trigger, like Dark Confidant or Chalice of the Void. I called the judge, and they confirmed this.

That was the only trigger they missed, so it was a singular "trigger" they missed, not multiple or plural.

And, while it's easy to "smear" someone for taking advantage of an opponent missing a beneficial trigger, I've watched many players, including Rich Shay, intentionally play cards into Chalice of the Void, seeing if their opponent catches it. I don't like that rule that makes beneficial triggers optional - but that's the game we live with, and that's not "rules lawyering"; it would be playing for your opponent.

My opponent was confused at another juncture, where I put a storm spell and storm trigger on the stack, and I was waiting for them to put the SCB trigger on the stack, but they didn't do it.

Since my opponent seemed confused, I was the one who told them to call a judge (which a nearby player echoed), but I wasn't about to sit there and tell them exactly what to do. All they had to say was "scab clan trigger" or represent the trigger explicitly, and I would have scooped. Instead, they didn't say anything or indicate anything like that.

I'm truly sorry my opponent felt like I was trying to "rules lawyer" them, but nothing could be further from the truth.

It's too bad that you viewed that situation through the lens of me "being a douchebag" or a "rules lawyer." If you had actually witnessed what happened and watched objectively, I think it would have been apparent that nothing could be further from the truth. It's no different than making someone attack you for the final point rather than just scooping before a lethal attack.

The irony is that while accusing me of derailing a thread - everything we were talking about until your last post was directly relevant to the topic. Your insulting digressions are derailing the thread, not "my ego."

Seriously, you are the part of the Vintage community I want nothing to do with..

And the amazing thing about life in the free world is that you can choose who you want to associate with. You can ignore me or avoid me all you like.

But you have a habit of getting into conversations with me, replying to each reply, and then getting upset with me for continuing a dialogue... If you want to ignore me, then ignore me...

For the record, I have nothing against you personally, beyond your personal attacks, and have been impressed by many things you've brought to Vintage, including being the first to recognize and fully exploit the power of JVP, and other novel deckbuilding ideas (like the match we played at the 2014 Vintage Champs - which was awesome). And I have also been explicitly complimentary and appreciative of the data analysis you've done in the last few months. When I congratulated you on your trophy, I meant it.

I just feel bad that everyone else is made more uncomfortable with your vitriol and this unfortunate digression...

last edited by Smmenen

@JACO said:

@ChubbyRain @Smmenen we all appreciate you quoting each other to death in threads wherever Gush is mentioned, but you're both better than any insults. Please be the gentlemen we all know you to be, and save the mano a mano for our streaming Greco-Roman wrestling match, live from Vintage Champs! 😉

I'll get to work on the jello pit.

@jaco for the record I'd pay money to watch @Smmenen and @ChubbyRain go three for three in a ring.

I have to agree with Steve about the situation because not being there I currently only have his account; if his opponent wasn't triggering the Scab-Clan, I'm just going to play my game.

So now that the lover's spat is over...

The online meta and paper metas are still 30-40% Gush (depending on your source and time frame), and every major tournament since the Lodestone restriction has been around 40% Gush. The meta itself, understandably, has begun to warp itself even more so around this card, as it's play Gush or a deck specifically designed to beat it. Does this remind anyone of anything in Vintage recently?

It should, because these are the exact reasons for Lodestone's restriction. Unless we are going to start applying double standards, Gush needs to be restricted ASAP.

@vaughnbros said:

The online meta and paper metas are still 30-40% Gush (depending on your source and time frame), and every major tournament since the Lodestone restriction has been around 40% Gush.

"Depending on your time frame" is a pretty big caveat.

Gush was 60% of the MTGO dailies in April, 40% in May, and looks to be under 30% in June.

That's a huge downward trend.

One of the main criticisms of restricting Golem was that it was premature, and that trend data showed that it was being addressed. It makes no sense to restrict cards that decline every month.

The case for restriction pretty much collapsed with this data set: restricting a card that fuels a deck that has 4 bad matchups is pretty much the textbook case for an unnecessary restriction.

You can't claim to be a "dominant deck" when you lose to Eldrazi, Workshops, Dredge, and Storm. You can't be a dominant deck when you have a sub 50% win percentage.

Also, you can't claim to be a dominant deck when you haven't won the last three most important tournaments.

The meta itself, understandably, has begun to warp itself even more so around this card, as it's play Gush or a deck specifically designed to beat it. Does this remind anyone of anything in Vintage recently?

It should, because these are the exact reasons for Lodestone's restriction.

Uhh... no it wasn't.

You are confusing and conflating metagame presence with tournament performance.

A deck can be 50% of a metagame, and 0% of a Top 8. Hell a deck could be 80% of the metagame, but 0% of the Top 8. No one would think a deck like that should be restricted. We care about performance, not presence.

How did Gush do at the NYSE?

By match win percentage alone it was the 5th best performing archetype.

The best performing archetypes were, in this order:

  1. Shops, with a 68.3% win percentage
  2. Eldrazi, with a 59% win percentage (11% of the field)
  3. Dredge, with a 51.3% win percentage (7% of the field)
  4. Oath, with a 50.7% win percentage (7% of the field)
  5. Gush, with a 46.5% win percentage (32.5% of the field)

Yeah, we should definitely restrict 5th best performing archetypes. (eye roll)
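The presence-versus-performance distinction in the list above can be checked with a few lines of Python. This is only a sketch built from the figures quoted in this post; the field share for Shops is not given here, so it is recorded as `None`, and the `rank_by` helper is a name made up for illustration.

```python
# Win percentage and field share as quoted in the post above.
# Shops' field share is not stated there, so it is left as None.
archetypes = [
    {"name": "Shops", "win_pct": 68.3, "field_pct": None},
    {"name": "Eldrazi", "win_pct": 59.0, "field_pct": 11.0},
    {"name": "Dredge", "win_pct": 51.3, "field_pct": 7.0},
    {"name": "Oath", "win_pct": 50.7, "field_pct": 7.0},
    {"name": "Gush", "win_pct": 46.5, "field_pct": 32.5},
]


def rank_by(rows, key):
    """Return archetype names sorted from highest to lowest on `key`.

    Rows where the value is unknown (None) are left out of that ranking.
    """
    known = [row for row in rows if row[key] is not None]
    ordered = sorted(known, key=lambda row: row[key], reverse=True)
    return [row["name"] for row in ordered]


by_performance = rank_by(archetypes, "win_pct")
by_presence = rank_by(archetypes, "field_pct")

print("Ranked by match win %:", by_performance)
print("Ranked by field share:", by_presence)
```

The two orderings come out differently, which is the whole point of the argument: the most played archetype in this table sits at the bottom of the win-percentage ranking.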

The case for restricting Gush began to collapse with the last MTGO P9 event, when Gush was the most played archetype but had only one deck in the Top 8 and got crushed by Eldrazi and Dredge. The NYSE just completed the circle:

http://themanadrain.com/topic/329/mtgo-may-2016-power-9-challenge

Gush was literally 38.5% of the field. Yet, here was how well decks did by presence and performance:

  1. Dredge was 3% of the metagame, but had a 70% win percentage
  2. Eldrazi was 14% of the metagame, but had a 64% win percentage
  3. Shops were 3% of the metagame, but had a 62% win percentage
  4. Gush was 38.5% of the metagame (less if you take out Doomsday, etc.), but only had a 51.6% win percentage.

Dominant decks don't get crushed by other matchups, let alone multiple matchups. Gush's win % against Shops - despite multiple restrictions - was 29% in the NYSE, and 28% against Eldrazi in the MTGO P9 event.

In contrast, Lodestone Golem was actually a borderline dominant deck. Unfortunately, despite two restrictions in 6 months, Shops appear still to be the best deck.

The only empirical argument for restricting Gush is that there are too many Gush decks in the metagame, but they are getting crushed by Shops, Eldrazi, Dredge, and are soft to Storm and Humans. And the trend lines are all downward. No wonder. Who wants to play a deck that has a 29% win percentage against Shops and Eldrazi? Even if those numbers can be improved, it's clear that the metagame is undergoing massive upheaval, and restricting a card in the midst of massive change would be premature, to say the least.

last edited by Smmenen

@vaughnbros

Says you with a bunch of buzzwords.

@Smmenen What am I reading? A downward trend? After a start of 60% of the meta!?!?!?? You realize how absurd that number was and it clearly was not sustainable, and extrapolation is really really poor science.

Then we go into some fallacies like a deck can be 50% of a metagame and be 0% of a top 8. When has this ever happened? Percentage of the metagame is a serious issue and in general has been the determining factor for most bans/restrictions throughout history.

Let me give an example of something a little more substantive, an extreme example of the fallacy you are trying to put forward as fact:
Cancer: 22.5% of deaths, Kills ONLY 50% of those afflicted.
Alzheimer's: 3.26% of deaths. Kills 100% of those afflicted.
Suicide: 1.58% of deaths. Kills 100% of those afflicted.
Cancer is not a problem! We should ignore cancer and focus on other causes of death!

last edited by vaughnbros

@vaughnbros said:

Then we go into some fallacies like a deck can be 50% of a metagame and be 0% of a top 8. When has this ever happened?

It's not a "fallacy" (you are misusing that term).

It's called a "counterfactual" to illustrate the point that metagame presence isn't what motivates B&R list policy - but performance.

Percentage of the metagame is a serious issue and in general has been the determining factor for most bans/restrictions throughout history.

That's absolutely not true. You are literally just making things up.

For most Vintage tournaments in history, we never had complete metagame breakdowns.

How could metagame presence be used to determine most bans/restrictions in history when that data wasn't available???? That's a serious question.

TOs didn't report the metagame breakdowns. Not even for most Vintage Championships. Jaco was kind enough to type up the top 100 or so decklists from the last one, but we don't even have a complete metagame breakdown for almost any other one (Ben Bleiweiss typed up the breakdown for the first one, in 2003).

We only have metagame breakdowns for the MTGO events because Matt and Ryan went in and collected it (and I did so for the first one last year)

Go read the old Vintage metagame breakdowns. E.g: http://www.starcitygames.com/magic/vintage/8912_The_December_and_January_Vintage_Metagame_Report.html

All we had was Top 8 appearances for 99% of tournaments. That's what sites like morphling.de collected.

It is Top 8 appearances, not metagame presence, that has been used "throughout history" to justify restrictions.

Through most of Vintage history, we've used a short hand for % of Top 8s as "% of the metagame," but we were actually talking about the Top performing deck metagame, not the actual complete metagame in the tournament hall.

Only recently have Matt and Ryan collected the total metagame results and done something we've never had before: calculating matchup win %.

It would be absurd to restrict a deck that is the 5th best performing archetype, and has 4-5 statistically weak or bad matchups.

last edited by Smmenen

@Smmenen Please don't use the word Statistically while simultaneously trying to extrapolate results. Thank you!

@vaughnbros said:

@Smmenen Please don't use the word Statistically while simultaneously trying to extrapolate results. Thank you!

Those were two separate points.

One of my points was trend data regarding Gush as the % of the metagame, showing steady and pronounced decline. The other main point was that Gush isn't performing dominantly in the metagame.

I said that: "It would be absurd to restrict a deck that is the 5th best performing archetype, and has 4-5 statistically weak or bad matchups." That's not extrapolation - that's descriptive of what happened in the OP table.

That's the only time I used the word "statistically."

last edited by Smmenen

@Smmenen Let me rephrase. You do not understand the word. Do not use it. Thank you!

Please explain how this statement (a paraphrase of what I already said) is false or misuses the word "statistically":

According to the data presented in the original post regarding the NYSE results, Gush decks (as classified by chubbyrain) are statistically the 5th best performing archetype in terms of matchup win percentage.

Thanks!

last edited by Smmenen

@vaughnbros Just to be clear here... you are claiming that 1: Steve doesn't understand what the word statistically means, and 2: Should therefore not use the word statistically?... Steve Menendian?... is so ignorant of the word statistically, that the only remedy is for him to stop using the word? Really? That's where we're at in this conversation?

Again, let me say that complaining about restrictions tends to... (statistically?) have very little to do with actual necessity for restrictions and far more to do with a self-fulfilling cycle of complaint.

Using very little data, I can tell you the following facts for certain. When the metagame spiked with high Gush numbers a little over a month ago, some people began developing decks to beat Gush like Humans and Eldrazi. Some people also spent time complaining about Gush and demanding its restriction on sites such as this one. I leave it to the collective imagination how much of an overlap those two groups would have on a Venn diagram.

@Topical_Island I tried to place a logical argument and move away from derailment of the thread. Since he wanted to derail again on the semantics between fallacy and counterfactual, I am done trying to debate him on the actual matter at hand.

Yes, Menendian has no idea what the word statistically means. No statistics tests have been conducted, you can not extrapolate past your data, and in general sample sizes are too small for any decision theory to be used in this case. Does he have an advanced degree in statistics that I am unaware of?

@vaughnbros I mean, I don't want to be disrespectful to you personally, but having read back through the thread, I have to say I disagree with a lot of what you're saying in here, if it can be synthesized into a single thrust. What is the logical argument that you were going for? Something like the metagame is unhealthy?

Ok... fine. So what statistics actually is, the essence of it so far as I'm concerned, and the definition that I'm operating on, is that statistics is the effort to make information out of data... to collect the data well, and to then make an honest and accurate effort to understand what the data means. Are there a lot more details, and ins and outs? No doubt. But that is the gist of it. (Keep in mind that I don't have an advanced degree in statistics, so I might not be authorized to possess this definition, nonetheless, that is in fact the essence of statistics.)

So I hope we agree that the very colorful top of the thread constitutes "statistics"... again, no expert here. (Recently in this thread someone indicated that no "statistical tests" have been conducted... be that as it may...)

So, do those stats up there indicate that the metagame is healthy or unhealthy? Ah HA! Trick question. No Stats 101 prof is gonna sucker me in on that one. That's the small sample size question they warned me about in the study session! Drawing conclusions about a metagame from a single tournament would be a really bad conclusion in statistical terms, an amateur one that no expert in statistics would ever be guilty of.

Or drawing outrageous conclusions like saying that only three decks are viable in the format based on the outcome of a single tourney. That would be weird, right? Drawing conclusions like that would be bad statistical work, right? I mean, the last major tourney was won, outright, by a deck that was .6% of this field. I mean, I didn't do a T test to come up with that or anything, so I could be wrong, but that just seems like an unsupportable statement. Only three decks that are even viable in the whole format... that sure would be terrible, if it were reality.

Outrageous, hyperbolic sentences beget more of the same, man. Let's just pump the brakes on the wild gesticulating and black and white statements. I really don't want to talk about restrictions, or metagame health... because I'm just pretty tired of it, since it's pretty much a lot of popping off, and people who've decided what their conclusion is based on their own personal experience in games, before they ever even start to dabble in data... but I'll do it. If that's what we're really doing.

That's not what we're doing.

last edited by Topical_Island

@Topical_Island As the only person in this conversation with an advanced degree and years of experience in the field of statistics, I can tell you the word is thrown around too much. Giving a single percentage, i.e. 60%, is what we call a point estimate. It is exactly that: a single point in time and space. It has very little meaning without context added to it, and even less without considering the sample size from which it was derived. Smaller samples generally mean larger confidence intervals and less statistical significance. In this case, using a z-test of proportions with a null hypothesis of p = 0.5 returns p-values of 0.057 and 0.126 for Gush vs Shops and Gush vs Eldrazi respectively, both of which would be statistically insignificant at the standard alpha level of 0.05.
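For anyone who wants to run this kind of check themselves, here is a minimal sketch of a one-sample, two-sided z-test of a proportion against p = 0.5, using only the Python standard library. The match counts behind the p-values quoted above are not given in this thread, so the record below (6 wins in 21 matches) is purely hypothetical and is not meant to reproduce those figures; the function name is also just an illustration.

```python
import math


def proportion_ztest(wins, matches, p0=0.5):
    """Two-sided z-test of H0: true win rate == p0.

    Uses the standard error under the null hypothesis,
    sqrt(p0 * (1 - p0) / matches), and the normal approximation.
    Returns (z statistic, two-sided p-value).
    """
    if matches <= 0 or not 0 <= wins <= matches:
        raise ValueError("need 0 <= wins <= matches and matches > 0")
    if not 0 < p0 < 1:
        raise ValueError("p0 must be strictly between 0 and 1")
    p_hat = wins / matches
    std_err = math.sqrt(p0 * (1 - p0) / matches)
    z = (p_hat - p0) / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical record, NOT taken from the tournament data.
z, p = proportion_ztest(wins=6, matches=21)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

With records this short the normal approximation is rough, so an exact binomial test may be the safer choice; the sketch is only meant to show where a p-value like the ones above comes from.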

The sample size is simply too small. So let's hold off on demeaning the word "statistically" unless we actually run these tests and they actually come out significant. In general, though, as I am not a frequentist, even I personally avoid using the word statistically if I can.

My figure of Gush being 30-40% was pulled from mtgtop8 and mtggoldfish, both of which are comparable, in terms of collection, to historical Top 8 data. This is not a made-up figure; go see for yourself and add up all of the decks playing Gush on those websites.
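The sample-size point can be made concrete with a confidence interval. Below is a small sketch of the Wilson score interval for a win rate, again in standard-library Python. The three records are hypothetical and all share the same 30% win rate; only the number of matches changes. The function name and the default z of 1.96 (roughly a 95% interval) are choices made for this illustration.

```python
import math


def wilson_interval(wins, matches, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    Returns (lower bound, upper bound).
    """
    if matches <= 0 or not 0 <= wins <= matches:
        raise ValueError("need 0 <= wins <= matches and matches > 0")
    p_hat = wins / matches
    denom = 1 + z**2 / matches
    center = (p_hat + z**2 / (2 * matches)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1 - p_hat) / matches + z**2 / (4 * matches**2))
        / denom
    )
    return center - half_width, center + half_width


# Hypothetical records with the same 30% win rate at different sample sizes.
for wins, matches in [(3, 10), (30, 100), (300, 1000)]:
    low, high = wilson_interval(wins, matches)
    print(f"{wins}/{matches}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

The interval narrows as the number of matches grows, which is why a 30% win rate over a handful of matches says much less than the same rate over hundreds.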

My statement about decks altering themselves to beat Gush? I was deriving that based on my experience, and the discussion here, primarily actually made by Mr. Menendian.

In my opinion, the data here says there are 3 decks. I was simply interpreting the results as they are laid out. The data could certainly be interpreted differently with different aggregations, different breakpoints of what you consider viable, etc. As much as other statisticians and scientists would have you believe otherwise, there is much subjectivity in data analyses.

@vaughnbros In the reasoning business, we call what you just did, authority bias. Also, when someone allows themselves enough leeway to casually claim that the format has three viable archetypes, then nitpicks someone else's viewpoint by focusing on whether or not they conformed to a strict definition of the word "statistical" (and forbidding them from using a word?... I'm still wrapping my head around that one)... we call that hypocrisy... in the biz... as it were. (If you think misusing a single word is annoying, just imagine what the above scenario looks like to someone in communication/natural language...)

If you don't like Steve's argument, then address his argument. (It very well might be wrong.) The subtext of your last post is that you understand statistical computation. Stipulated. Leave your resume in the drawer man. I agree he can be peevishly tactical in his own arguments sometimes, but address the larger ideas, please, I beg you. I swear I'll never use the word statistics again.

For the record. I think the Metagame is great right now. I love it... but of course the finals of my last tourney was a split between two unviable Landstill decks... (which I of course never meant to imply was significant or meaningful in any way... oh no... I've been bad... I have to go wash my mouth out with soap now...)

@Topical_Island Ah I forgot Steve is always right! My bad. Your worship of him is certainly something else...

I'll check my resume at the door the second it's 1. Not relevant to the discussion. 2. When Steve does the same.

@vaughnbros Did you just pull a strawman followed by a "he did it first" on the Mana Drain? Well played Sir... well played indeed. He did do it first, person who isn't one of my middle school students... he did do it first. You are right.

Leave it in the Drawer... a drawer... resumes go in... ahhh never mind.

Anyway. I disagree with your ideas... if those ideas are, or include, the belief that this metagame is "unhealthy" or anything like that. (speaking of conflating terms here, I think it's probably more accurate to talk about the play environment... a metagame being a little different) I think that this play environment is pretty normal if one accepts that the population of deck types in a healthy environment should follow a Zipf's curve, which is what I believe.

last edited by Topical_Island

@Topical_Island I'm assuming you mean Zipf's Curve? Which has a smoothing parameter so it can take many forms... That's the equivalent of saying I think the metagame should consist of magic cards.
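To see how much the shape depends on that parameter, here is a rough Python sketch. It assumes the "smoothing parameter" refers to the exponent s in the usual Zipf form, where the share of the k-th most played archetype is proportional to k^(-s). The ten-archetype field and the function name are hypothetical.

```python
def zipf_shares(num_archetypes, exponent):
    """Metagame shares under a Zipf distribution over archetype ranks.

    The archetype at rank k gets a weight of k ** -exponent, and the
    weights are normalized so that the shares sum to 1.
    """
    if num_archetypes < 1:
        raise ValueError("need at least one archetype")
    weights = [float(rank) ** -exponent for rank in range(1, num_archetypes + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical field of 10 archetypes under three different exponents.
for s in (0.0, 1.0, 2.0):
    shares = zipf_shares(10, s)
    top_three = ", ".join(f"{share:.1%}" for share in shares[:3])
    print(f"s = {s}: top three shares are {top_three}")
```

An exponent of 0 gives a perfectly flat field, while larger exponents concentrate more of the field in the top deck, so "follows a Zipf curve" only constrains a metagame once the exponent is pinned down.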
