Those aren't mechanical/reflex-based in the same sense. I can forget how much I should mulligan without enhanced RNG, that in paper my beginning of combat doesn't happen without intervention, to present my deck to my opponent, and to announce triggers, but realistically I'm not going to forget that I can't put ancestral recall in my legacy deck or that I should put 40 cards in my limited deck. Granted, remembering the normal number of lands in a 40 vs. 60 card deck is a slight annoyance, but not something I'm going to get burned on. I'll grant you that the match structure could lend itself to different risk analysis though.
Thanks for the episode. As @ChubbyRain alluded to above, it's nice to have a level discussion about this instead of forecasting the sky to fall.
I do have an objection to Stephen's comment at the end that, paraphrasing, "it's almost inevitable and surely beneficial to have different mulligan rules for different formats". Having the mechanics of Magic be constant is a principle that should only be violated as a last resort. "Gotcha" moments, where the way one usually plays Magic forces them into an error, should be avoided. Most of us missed the ability to scry after mull'ing when that rule was first introduced. Me playing BO1 on Arena assuredly messes with my ability to mulligan in paper without "enhanced" Arena RNG. Hell, I play almost exclusively on modo and the mechanics of having to deal with physical cards are already an extreme distraction for me when playing in paper. WOTC benefits from making transitions across platforms and formats as seamless as possible. While this principle can be and is violated, the reasons for it need to be extremely compelling.
Thanks for taking up the mantle of this!
I noticed for your archetype and tags you have a comment that this "needs an algorithm". When I did this with Matt I indeed used a python script to parse WOTC's webpage and made a determination of tags and archetypes based on the presence of various cards. I can send it to you if you'd like.
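For a rough idea of the approach, here is a minimal sketch of tagging by card presence. It is not the actual script: the rule table, card names, and tag labels are made up for illustration, and it assumes the decklist has already been parsed into a list of card names.

```python
# Hypothetical rule table: each tag maps to "marker" cards.
# The cards and tag names here are illustrative, not the rules we actually used.
TAG_RULES = {
    "Taxing": {"Thalia, Guardian of Thraben", "Sphere of Resistance"},
    "Eldrazi": {"Thought-Knot Seer", "Eldrazi Temple"},
}


def tags_for(decklist, rules=TAG_RULES):
    """Return the sorted tags whose marker cards appear in the decklist."""
    cards = set(decklist)
    # A tag applies if the deck contains at least one of its marker cards.
    return sorted(tag for tag, markers in rules.items() if cards & markers)
```

A deck can pick up several tags at once, which is the point of tags as opposed to a single archetype label.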
Thanks for the yearly review as always.
When counting top8 appearances, it's a bit of work but it would be more fair to scale the number of top8s by the number of months of the year the card was legal to play. Even better would be to scale by the number of tournaments it was legal in, but that seems like too much work.
If you apply this scaling, assassin's trophy was just as prevalent as Teferi.
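The scaling I have in mind is simple enough to sketch. The function name and the numbers in the usage note are placeholders, not the actual counts from the review.

```python
def scaled_top8s(top8_count, months_legal, months_in_year=12):
    """Scale a card's top 8 count up to a full-year equivalent."""
    if not 0 < months_legal <= months_in_year:
        raise ValueError("months_legal must be between 1 and months_in_year")
    # A card legal for only part of the year gets proportionally more credit.
    return top8_count * months_in_year / months_legal
```

For example, with placeholder numbers, a card with 10 top 8s in 3 months of legality scales to 40, the same as a card with 40 top 8s over the full 12 months.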
@smmenen There are a variety of ways to use a related notion, https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity . One way would be to simply compute it for the top 8 archetype numbers.
If you have more data, say the top32 lists from every vintage challenge, you could compute it for the top 32, then top 16, and so forth, and see how much the impurity decreases as you zoom in on the winningest slice of the metagame.
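A small sketch of the computation, assuming you have a count of decks per archetype for whichever slice (top 32, top 16, top 8) you are looking at. The function name is my own.

```python
def gini_impurity(counts):
    """Gini impurity 1 - sum(p_i^2) over archetype shares p_i."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one deck")
    return 1.0 - sum((c / total) ** 2 for c in counts)
```

A slice made up of a single archetype gives 0, and the value rises toward 1 as the slice gets more evenly spread across many archetypes. Computing it for each successive slice shows whether diversity holds up or collapses toward the top.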
@stuart Other than the single powered colorless list, including the Eldrazi tag and excluding budget does the trick in the "Tag Breakdown" tab. I get a 50.88% winrate.
@spook The tags are not necessarily subsets of a particular archetype. Almost all of the Survival lists have taxing elements (most notably Thalia), but I don't believe Vasu's BUG list had any.
If you'd like to get to the bottom of where the classification differences in the Eldrazi decks are, let us know (I need a break from staring at decklists).
376 players showed up in Pittsburgh to play in Vintage's most prestigious event of the year. The tournament ended with the matchup most emblematic of the past few months: Paradoxical Outcome versus Ravager Shops. Congratulations to Brian Coval for becoming our new North American Vintage Champion!
- Brian Coval - Esper Paradoxical
- Nam Tran - Ravager Shops
- Kyle Dorgan - Esper Paradoxical
- Rich Shay - Ravager Shops
- Matt Sperling - RUG Pyromancer
- Eliot Burk - UW Landstill
- Marshall Arthur - Bant Survival
- Cosmo Kwok - Grixis Thieves
Although I have my complaints about the current state of vintage, that's quite the diverse top 8 after last year's!
Even though it didn't win the event, the story of the tournament for me was Survival solidifying itself as a real force. There are at least 3 builds (Bant, Bant+R, and BUG) all with their advantages. The winrate here likely overstates its power slightly: a small group of dedicated pilots came with it, and due to the price of the deck I imagine many skimped on hate and practice against such a small expected share of the metagame.
Classification notes: "Other" was mostly monored, but there weren't quite enough to warrant breaking it out as a separate archetype. If you have any objections to how we classified your deck, or spot an error, send us a PM.
@smmenen Perhaps my reply was too terse. Your final point about thorn and (potentially) chalice being unrestricted is what I was getting at. These restrictions were anything but tailored--they nuked archetypes to save workshop from restriction. That's what makes the situation different from restricting Trinisphere instead of shop.
The most broken reasonable-to-achieve hand from modern workshops is shop, inspector, mox, and lock piece. The odds of having 4 mana for this sequence go down significantly if you restrict workshop. I don't have a problem with a fast affinity aggro deck in vintage. The problem is that the workshop buys the deck too much tempo with a lock piece or two to kill the opponent before they can regain control. The fact that it makes bigger ballistas to mow down a board of helpless humans is the other half of the problem.
Thanks for the article Stephen. My two cents:
Based on your criteria I don't see how workshop can be unrestricted. It chokes out any other strategy trying to play creatures and the entire metagame is warped around it. When we're at the point where we're restricting cards that are fine in modern we've really gone too far shielding a sacred cow.
This might be a bit of an idiosyncratic criticism, but I listen to most of your podcasts while on multiple hour driving stints, so I was fairly disheartened with your "please have a copy of the set open on your computer" advice. Despite this, I really appreciated your trip down Arabian Nights memory lane as well as the similar "fun" topics at the end of recent podcasts.
@vaughnbros I think you're making some unfair assumptions about our classification, which to be fair could be attributed to not knowing how we make them. Sure, Matt and I are primarily blue players. However we both know that there are different types of Shops, Dredge, Eldrazi, and Thalia decks. We've often asked if it's worth splitting these archetypes into Aggro Shops vs. Stax, Pitch vs. Anti-Hate, etc. As far as archetypes are concerned, Stax and distinct subarchetypes of dredge have never been a large enough percentage of the metagame that we felt that separating out the 1 person playing Stax at a tournament has been worth it. Matt and I originally started collecting data to help prepare for tournaments ourselves. We really don't care if a single person had a 45% or 60% winrate.
When we have access to the decklists, we've used the "Tags" idea that we came up with about a year ago. It was to us a good compromise between breaking archetypes into such small percentages of the metagame that it becomes meaningless and capturing the nuances of different broad brush stroke categories. To be honest, when we don't have decklists it's a pain in the ass to do this sort of nuanced categorization. Matt and I have done almost all this data collection by ourselves, and watching replays for an additional hour so we can check if the shops deck ever casts a smokestack or the dredge deck has a FOW somewhere is often unpalatable, especially if we're trying to convince someone else to do the work because we wanted to have a Saturday off. I also think it says something about the differentiation of blue decks, and not about Matt and me, that we can almost immediately tell the difference between Xerox and Oath but need to watch 3 rounds of a shops deck to tell if it eventually casts a non aggro shops card.
@pugsuperstar Watching the replays requires paying the 250 play point entry fee. Anyone used to be able to watch replays, but they imposed this restriction to stop bots from collecting too much data and "solving" formats. In particular, mtggoldfish used to have insightful articles on how well the color combinations fared in draft, how well correlated playing various cards was with winning the game, and so forth.
The Vintage Challenges are great EV though. You get your entry fee back for top 32 (over half the players) and prizes increase for top 16, top 8, etc.
After the vintage challenges rotated to a weekly schedule, Matt and I have been trying to enter as many as possible to collect data. We've had several people, most frequently Shawn Anthony, help us out. This is greatly appreciated. However, this workload has been weighing us both down. After having a miserable time playing through a challenge I entered solely to collect data, I decided to reach out to the community for some help.
What we need from you (these don't need to be done by the same person):
Watch replays to determine what deck people are playing who played below the top32. This is typically between 8 and 18 people. This should not take more than an hour. This is the most essential part since otherwise we don't know the metagame breakdown. This is also the only task that requires entering the tournament.
Record who played whom in each round. This can be done while the tournament is running or up to several hours after it ends, or you can take a screenshot for someone to input later.
That's it! The rest we can either set up ahead of time or figure out afterwards. If you value the data that Matt and I collect we would greatly appreciate you helping us out on occasion. If we can get several people to step up, it won't be much work for any individual.
We would like to thank Montolio and desolutionist for helping with collecting data this month. Unfortunately this month and next will be a bit rough because of holidays, but we will try to collect metagame results for as many of these as we can.
Thanks to Matt for help with all of these reports.
EDIT: Included Top32 metagame + 2 archetypes I knew for 11/4.
When I test for Champs, I do a ton of playtesting. Most of my testing goes into swapping around cards that improve one matchup at the expense of another. I played as few as 2 missteps, balance, ancient grudges in the main, and a huge variety of creatures. When I test I keep track of my winrate against various archetypes and look at my prediction for the expected metagame.
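The bookkeeping amounts to a weighted average. Here is a sketch of it, with a hypothetical function name; any numbers you feed it are your own testing results and metagame guesses.

```python
def expected_winrate(winrates, metagame):
    """Average per-archetype winrates, weighted by predicted metagame share.

    winrates: archetype -> winrate against it (0 to 1)
    metagame: archetype -> predicted share (rescaled here to sum to 1)
    """
    total_share = sum(metagame.values())
    if total_share <= 0:
        raise ValueError("metagame shares must sum to a positive number")
    # Archetypes with no testing data raise KeyError rather than being guessed.
    return sum(winrates[a] * share for a, share in metagame.items()) / total_share
```

Swapping a card changes a few entries in `winrates`; the comparison that matters is whether the weighted total goes up for the metagame you actually expect.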
The deck I played at Champs had 4 missteps and 0 grudges in the maindeck. The reason was that I could not consistently beat delver, which had a high metagame representation online and which I expected to be well represented at Champs. Sitting on the pulpit and telling blue players how they should be building decks strikes me as assuming that they arrived at their decks without proper testing.
In my 10 rounds at Champs I played 1 dredge, 1 shops, 1 mono red deck, and 7 blue decks. Perhaps a more expected result would be 2 shops decks, but I can't understand how anyone thinks I am going to improve my 8-2 record (losing to dredge with 8 sideboard cards and mono red) by cutting missteps and switching my abrade for an ancient grudge in the maindeck.
We don't get to engineer the metagame for a tournament. I can't force 50% of the room to be on shops. I can't, as Matt also alluded to, make a pact with the other blue players not to play misstep. If you want to play blue and want to maximize your overall winrate you need to construct your deck in a certain way. If misstep gets restricted I am not replacing them with 3 grudges; likely they become flusterstorms, mindbreak traps, or pyroblasts. If someone can build a blue deck that has a decent winrate against the 70% of the meta that is blue and consistently beats shops, then sign me up. If such a deck exists, however, it would have an incredible winrate and be unbalanced.
Just to be clear I do agree with several points in the article. I think we should give the metagame some time to sort itself out (look at the crazy metagame online right now). The "reusable black lotus" line doesn't hold water when you can't use your lotus to cast ancestral recall. The constant complaining about B&R is incredibly annoying and frankly toxic to the community when every conversation devolves into that.