
SC2 Ladder Analysis: Part 2

Excalibur_Z   United States. August 08 2010 08:59. Posts 10287
Following my previous ladder analysis post (http://www.teamliquid.net/forum/viewmessage.php?topic_id=118212), Vanick has developed a more in-depth theory regarding the inner workings of the SC2 ladder system.

Introduction

This post is a follow-up to the original ladder analysis post and goes into further detail regarding the system. Please note that much of the content in this post is more speculative, and if a detail here is wrong it should not reflect poorly on the original analysis. I will be delving deeper into the mathematical underpinnings, though it should not be excessively complex and I will try to make it easy to follow.

Overview

To start with, we assumed that Blizzard used a system quite similar to their WoW Arena matchmaking system, albeit with refinements. The Arena system uses a Bayesian inference model to create its ladder and do its matchmaking. What this means in essence is that the rating used to represent your skill is easily updated after each match. For more details, see: http://en.wikipedia.org/wiki/Bayesian_analysis

In conjunction with this, the MMR (matchmaking rating) is actually one part of the skill probability distribution. Blizzard also uses an “uncertainty” factor. That is, when you first start in Arena there is a lot of uncertainty in your rating. As you play more games, that uncertainty decreases and the system is more “confident” in the rating it has assigned to you. I will be referring to this uncertainty factor as sigma, and it is the inverse of the system's confidence. Together, MMR and sigma form a bell curve, also known as a Gaussian, or normal, distribution. For more details, see: http://en.wikipedia.org/wiki/Gaussian_distribution . The curve represents a couple of related ideas: the range in which your skill may truly fall, as well as the fact that you do not play at exactly the same skill level every game. A more consistent player would have a narrower curve, for example.

This class of ladder and matchmaking is not new. The first system using a method similar to this was the Glicko system, used to rank chess players, which is arguably better than the famous Elo system, which encourages some strange behavior (e.g. in many cases it is better to take a draw in Elo than to risk a loss). Another well-known system is Microsoft TrueSkill, used in every Xbox 360 game for matchmaking and ranking, as well as PC games such as Dawn of War 2.

The published data on TrueSkill gives a glimpse at the underpinnings of a modern Bayesian ranking system designed for videogames. Blizzard’s implementations are obviously different from TrueSkill, though we can infer much from what we know about TrueSkill, and what we know about the SC2 ladder.
For a layman’s primer on TrueSkill: http://research.microsoft.com/en-us/projects/trueskill/details.aspx
For an in-depth description of TrueSkill: http://research.microsoft.com/apps/pubs/default.aspx?id=67956

Matchmaking

The short version of what the links above show is that it is possible (and computationally efficient) to take the MMR and uncertainty factor (also known as sigma, or standard deviation) for both players and compute a win probability from them. The MMR and sigma form a bell curve per player. The two bell curves can be combined into a 3D probability distribution with a shape like this:

[image loading]

It may help to think of it as combining the two 2D curves perpendicularly and forming this 3D shape. This shape is centered on a point in the (x,y) plane, where x represents player 1’s skill, and y represents the skill of player 2. Intuitively, the best matches will be between ratings where x=y. Thus, Blizzard attempts to keep the two ratings as close as possible. Looking at this same shape top-down (try to visualize it as a topographical map):

[image loading]

Run a line along x=y, and you will split the shape into 2 pieces. If you sum the volume under the shape on each side of this split and compare their relative sizes, you will get each player's probability of victory. If the curve is contained wholly within one side of the graph then clearly that player is overwhelmingly favored by the system (Note: this is NOT the same thing as the “Favored” display on the loading screen!). Also note that the shape does not need to be circular when viewed top-down. If players have different confidence values it will look like an ellipse.
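
As a rough sketch of the volume comparison described above: if each player's skill is modeled as an independent Gaussian (an assumption on my part; Blizzard's actual model is not public), the share of the volume on player 1's side of the x=y line has a closed form. The function name `win_probability` and the sample numbers are purely illustrative.

```python
import math

def win_probability(mmr1, sigma1, mmr2, sigma2):
    """Probability that player 1's skill exceeds player 2's.

    Assumes independent Gaussian skill beliefs, so the difference
    x - y is Gaussian with mean mmr1 - mmr2 and variance
    sigma1**2 + sigma2**2. The result is the volume on the x > y
    side of the line x = y.
    """
    spread = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    z = (mmr1 - mmr2) / spread
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical numbers: a 2600 +/- 100 player against a 2500 +/- 50 player.
p = win_probability(2600, 100, 2500, 50)
print(f"Player 1 wins with probability {p:.3f}")
```

Two players with identical MMR come out at exactly 50% regardless of their sigmas, and a larger combined sigma pulls any matchup closer to 50%, which matches the intuition that an uncertain rating is a weaker prediction.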

Note that this figure is taken from a TrueSkill presentation, and is copyright Microsoft. TrueSkill incorporates the possibility of a draw, which corresponds to a band around the x=y line. More intuitively, that band can be thought of as the “matchmaking sweet spot”, and something similar is likely used by SC2’s ladder to provide the system some wiggle room in matchmaking.

After a match finishes, the system needs to update the MMR and sigma for both players. Displayed rating will be discussed later in this post. Whenever a match finishes the winner’s MMR increases and the loser’s decreases. More interesting is what happens to the sigmas. If the match finished as expected with the MMR-favored player winning (and remember, the loading screen “favored” display is NOT this) then both players' sigmas will decrease. That is, the system gains confidence in the ratings it has assigned to the players. If the match finishes in an upset and both players' sigmas are small, then the sigmas for both players will increase as the system thinks it may have an incorrect rating assigned to both. The change in sigma scales based upon the difference in MMR and the difference in sigmas. That is, losing to someone close to your own rank will not change your sigma too much (though it will over the course of several games).

If a lower-MMR player wins then what happens depends a lot more on the precise equations Blizzard is using. If a player's sigma is large in an upset (whether he's the winner or loser) it can decrease. That is because, given the right MMR and sigma values, it's possible in theory for the system to learn about that player's skill and rate him more accurately. If a player's sigma is small, however, it can become larger after an upset if that upset was truly unexpected.

To summarize: combining the MMR and uncertainty factor of a player creates a curve. Take two of these curves and form a 3D shape. This shape shows the probability of victory when split along x=y. Matchmaking tries to have x=y, but will expand the search if no match is found quickly.
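
To make the "expand the search" idea concrete, here is a minimal sketch of a widening search window. Everything in it is hypothetical: the function name `find_opponent`, the starting window of 50 MMR points, and the growth rate are invented for illustration and say nothing about Blizzard's real values.

```python
def find_opponent(player_mmr, queue, waited_seconds,
                  base_window=50.0, growth_per_second=5.0):
    """Return the name of the closest-rated opponent inside the window.

    queue is a list of (name, mmr) tuples. The window starts at
    base_window MMR points and widens the longer the player waits.
    All numeric defaults are made up for illustration.
    """
    window = base_window + growth_per_second * waited_seconds
    candidates = [(abs(mmr - player_mmr), name)
                  for name, mmr in queue
                  if abs(mmr - player_mmr) <= window]
    if not candidates:
        return None  # keep waiting; the window will be wider next time
    # Smallest MMR gap wins, i.e. the match closest to x = y.
    return min(candidates)[1]

queue = [("A", 2310.0), ("B", 2650.0), ("C", 2900.0)]
print(find_opponent(2500.0, queue, waited_seconds=0))   # window of 50
print(find_opponent(2500.0, queue, waited_seconds=30))  # window of 200
```

With no wait nobody in this sample queue is close enough, so the search returns nothing; after 30 seconds the window has grown enough to accept the nearest candidate.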

Promotion

As initially theorized, promotion requires your MMR to be above a certain league threshold. However, because MMR changes greatly after each match and the opponent variation is so wide, often spanning multiple leagues, the system requires a particular degree of confidence before it allows promotion. Our initial theory assumed that sigma just needed to be small enough to allow promotion, but it's been confirmed that sigma never gets this small. Instead, the system gets that confidence from a moving average. Here's an example:

[image loading]

MMR is erratic. A moving average seeks to smooth out the rapidly changing data points over time by evaluating your progress over X number of games. As we previously estimated, the system doesn't use your full match history because if it did, you would eventually get stuck in a league. Once your moving average crosses a particular league threshold, that's when you'll get promoted.
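
The lag of the moving average can be sketched in a few lines. This is only an illustration under assumed numbers: the 20-game window, the 2300 threshold, and the helper name `games_until_promotion` are all hypothetical, and the real system may smooth the data differently.

```python
from collections import deque

def games_until_promotion(mmr_history, threshold, window=20):
    """Return the 1-based game count at which the moving average of the
    last `window` MMR values first exceeds `threshold`, or None.

    The window length and threshold are hypothetical; the real values
    are not known.
    """
    recent = deque(maxlen=window)  # old games fall out automatically
    for game_number, mmr in enumerate(mmr_history, start=1):
        recent.append(mmr)
        # Until `window` games exist, this averages whatever is available.
        average = sum(recent) / len(recent)
        if average > threshold:
            return game_number
    return None

# A made-up player whose MMR climbs 30 points per game from 2000.
history = [2000 + 30 * n for n in range(40)]
first_over = next(i for i, m in enumerate(history, start=1) if m > 2300)
promoted_at = games_until_promotion(history, threshold=2300, window=20)
print(first_over, promoted_at)
```

In this made-up history the raw MMR passes 2300 at game 12, but the 20-game average does not pass it until game 21, which is the kind of delay described above.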

Players like CauthonLuck and Ret who had obscene win ratios had their MMR data points skyrocket. However, the moving average lags behind. In the cases of those players, it will take much longer for the moving average to reach that required threshold. This is why players like IdrA who were affected by this problem have decided to intentionally throw games in order to get promoted, because it allows the moving average to catch up more quickly.

Possibly related are players who aren't getting promoted or demoted properly despite a high likelihood that their moving average has crossed the confidence threshold. Blizzard has said that this is indeed a bug and will be fixed by moving the affected players to new divisions.

Displayed Rating

Ok, how does all of this tie into displayed rating and the whole “favored” deal? If you remember back to WoW, ratings changed based on a direct comparison of your displayed rating to the other team’s MMR. So if your current rating was 500 and you were playing people with MMRs of 2000, your rating would jump significantly after every win because of the wide disparity. Now, we’ve identified that on the loading screen quite often players are seeing the other person as favored and the opponent (who is nominally “favored”) also sees his opponent as favored! How can this be? The theory put forth here is the system is again comparing your displayed rating to your opponent’s hidden MMR.

The reason for this is so that the system brings you toward your MMR more quickly. kzn explains:


On August 08 2010 14:30 kzn wrote:
How it works was like this: Say you've got a MMR of 2500, and you start a new team. It starts at 0 rating, but the matchmaking system will match you with other players of MMR 2500. If you lose a game, your team rating would not change at all. If you won, it would increase by 47 (a hard cap that was in place at least when I played). This was not explained as arising due to an interaction between the team rating and the opponent's MMR, however - it was explained as the system trying to get your team's rating as close as possible to your team's MMR rapidly.


Therefore, a corollary here is that when determining rating increase, the hidden threshold value for your league is added to your displayed rating, then compared to your opponent’s MMR, for purposes of computing the gain/loss to your displayed rating.

Example: ExcaliburZ and I play a game. His MMR: 2600, sigma: 100, displayed rating: 300. My MMR: 2500, sigma: 50, rating: 150. Diamond’s MMR threshold: 2300. Excal wins because he rules. What happens?
- His MMR will increase
- My MMR will decrease
- Both of our sigmas will decrease
- His rating will increase. How? By comparing my MMR (2500) against his rating + Diamond’s MMR threshold: 300 + 2300 = 2600. His gain is thus based on 2600 vs my MMR of 2500.
- My rating will decrease. In the same way: his MMR: 2600. My rating + threshold: 150 + 2300. Thus I lose points proportionally to 2450 vs 2600.
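
The arithmetic in the example above can be written out as a short sketch. The comparison values (displayed rating plus league threshold, against the opponent's MMR) come straight from the example; the Elo-style formula used to turn that comparison into a point change is purely my assumption, as are the names `adjusted_rating` and `elo_style_change` and the constants `k=24` and `scale=400`. Blizzard's real formula is unknown.

```python
def adjusted_rating(displayed_rating, league_threshold):
    """Displayed rating shifted onto the MMR scale, as theorized above."""
    return displayed_rating + league_threshold

def elo_style_change(adjusted, opponent_mmr, won, k=24.0, scale=400.0):
    """Point change from an Elo-style expected score.

    ASSUMPTION: the logistic curve, k and scale are stand-ins chosen
    for illustration, not Blizzard's actual equation.
    """
    expected = 1.0 / (1.0 + 10 ** ((opponent_mmr - adjusted) / scale))
    actual = 1.0 if won else 0.0
    return k * (actual - expected)

DIAMOND_THRESHOLD = 2300  # value used in the example above

winner_adjusted = adjusted_rating(300, DIAMOND_THRESHOLD)  # vs MMR 2500
loser_adjusted = adjusted_rating(150, DIAMOND_THRESHOLD)   # vs MMR 2600

winner_change = elo_style_change(winner_adjusted, 2500, won=True)
loser_change = elo_style_change(loser_adjusted, 2600, won=False)
print(winner_adjusted, loser_adjusted)
print(round(winner_change, 2), round(loser_change, 2))
```

Whatever the real curve looks like, the point of the sketch is the inputs: each player's change depends on his own displayed rating plus threshold measured against the other player's hidden MMR, so the two changes do not have to mirror each other.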

Conclusions

SC2 uses a Bayesian inference system for its skill determination which forms an MMR and a confidence value for each player. These form a Gaussian distribution useful in determining win probability. Promotions/demotions occur when a player exceeds/drops below a threshold with sufficient confidence. Displayed rating changes according to a combination of the rating itself, the hidden MMR, and the league thresholds.

More clarifications from Vanick:


On August 08 2010 11:33 vanick wrote:
To be clear, the player's skill is never pinpointed. The sigma is never 0. All players vary in their performance from game to game and over time as their skill increases (or decreases!).

I left a point out in my writeup that I probably should have included. TrueSkill, and likely SC2's ladder, have a factor based off the time since your last game that increases the player's uncertainty level (sigma) by an amount related to that. Even if you're playing games back to back this factor will have a minimum value that will still increase sigma. This allows the system to adapt to a player whose skill increases over time.
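
A minimal sketch of the sigma inflation Vanick describes, assuming variances add (so the drift is combined with sigma in quadrature) and assuming a simple linear dependence on idle time. The function name `inflate_sigma` and both constants are invented for illustration; neither TrueSkill's nor Blizzard's actual values are used here.

```python
import math

def inflate_sigma(sigma, days_idle, per_game_floor=5.0, per_day=2.0):
    """Widen sigma before the next game is evaluated.

    The drift grows with time away from the ladder but is never
    smaller than per_game_floor, even for back-to-back games.
    Both constants are hypothetical.
    """
    drift = max(per_game_floor, per_day * days_idle)
    # Independent uncertainties add as variances, not as sigmas.
    return math.sqrt(sigma ** 2 + drift ** 2)

print(inflate_sigma(50.0, days_idle=0))   # back-to-back games
print(inflate_sigma(50.0, days_idle=30))  # a month away
```

Because the floor is always applied, sigma can shrink after each result but never settles at zero, which is what keeps the rating responsive to a player who is improving.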


Questions

Some of these have answers. Some are open questions. You can add on; I will answer them as best I can.

Q: So how do bonus points affect the display rating changes? If the displayed rating change is based upon the comparison of the opponent's MMR with the player's displayed rating + the player's league cutoff, then wouldn't bonus points inflate the displayed rating and cause problems?
A: I'm not sure how they account for this. One possibility is they keep track of bonus points that make up your displayed rating, and ignore them when performing the calculation in the back-end.

Excal: It seems more likely that the bonus pool is only used to increase the displayed rating for division ranking purposes and ignored in back-end calculation because the bonus pool increases at the same rate for all players. This introduces a constant that is easily discarded when assessing actual skill within the system. Furthermore, if bonus points were considered in the process of point calculation, it would present an unfair advantage for players who have not yet used up their bonus pool (because their rating is therefore inflated giving them more to lose).

Q: Would it take longer to get promoted if you've played lots of games? Assume someone played a large number of games (say 100 with a 50% win/loss ratio). If he were to start winning 70% of his games, would it be harder for him to get promoted than someone with similar percentages but fewer games played?
A: It would take longer, yes. The moving average trails behind sharp increases in skill.


EDIT 10/25/2010: Made crucial updates to several sections in light of new information acquired from Blizzcon 2010.

EDIT 8/11/2010: Made an important clarification to the Matchmaking section.

EDIT 8/10/2010: Added a third question related to promotion opportunity.

EDIT 8/9/2010: Added extra information to the first question about the circumstances under which sigma may increase or decrease. Also removed a misleading sentence regarding ideal matches.

EDIT 8/7/2010: Modified the second question to make it less vague, and removed incorrect information from the Displayed Ratings section.

_________
Thanks to myself for proofreading, editing, and analytical input (hehhh self-credit).
Last edit: 2012-07-25 16:04:59
 

heyoka   Administrator   August 08 2010 09:07.
This is so awesome, thanks for taking the time to put it together.
@RealHeyoka

Surrealz   United States. August 08 2010 09:13. Posts 432
epic applied mathematics, thanks for this.
1a2a3a

Dionyseus   United States. August 08 2010 09:14. Posts 2062
Interesting read, thanks.
9/5/10 P acct: NA D 10,683 651pts 69w56L http://sc2ranks.com/char/us/290365/LetoAtreides T acct: NA D 16,137 553pts 70w67L http://sc2ranks.com/char/us/1560008/Khrone Z: NA G 16,058 465pts 28w26L http://www.sc2ranks.com/us/1997354/Omnius

Kollapse   United States. August 08 2010 09:14. Posts 125
very interesting read. thanks for taking the time
Talent hits a target no one else can hit; Genius hits a target no one else can see.

NuKedUFirst   Canada. August 08 2010 09:17. Posts 3138
Wow! Very interesting read, thanks for putting this together
FrostedMiniWeet wrote: I like winning because it validates all the bloody time I waste playing SC2.

vanick   United States. August 08 2010 09:17. Posts 53
Just to be clear since I am afraid I was inconsistent in my naming: the "confidence" value is referring to the uncertainty factor (sigma). It is often easier to think of it in terms of confidence, even though what is stored and used for the distribution is the uncertainty. High confidence merely refers to low uncertainty while low confidence would refer to high uncertainty.

gerundium   Netherlands. August 08 2010 10:05. Posts 785

On August 08 2010 08:59 Excalibur_Z wrote:

Q: So what’s the deal with people stuck in Platinum who can’t get promoted to Diamond despite clearly belonging there?
A: Short answer? It’s a bug. Longer answer: a lot of people have suggested that the system requires you to lose in order to build its confidence factor. This is almost certainly incorrect. The system in theory learns enough about you from your wins to promote you. Intuitively, if your record is 60-5 against diamond players, you ought to be in Diamond. The TrueSkill system can determine this, and I would bet dollars to donuts that Blizzard’s system can too, as designed anyways. Implementation may have introduced bugs that certain players hit under certain conditions. We don’t have enough evidence to flat out state that the system requires you to lose. It may be a workaround to the bug, however.

_________
Thanks to myself for proofreading, editing, and analytical input (hehhh self-credit).


This happened in Halo 3 as well (it uses a modified TrueSkill system fit for 4v4 matches, so it is not entirely the same); you'd have to look up which Bungie Weekly Update it is discussed in. In general, though, it was a case of a few friends playing together and getting stuck due to the certainty factor, I believe. They ended up at level 26 or so (out of 50), where they proceeded to go 46-0 or something absurd like that without ranking up.

Edit: very well done, by the way. I was reading a lot about TrueSkill when Halo 3 came around and the ranking system was a hot topic. It really hit some points home for me; the 3D distribution especially is very enlightening.
Last edit: 2010-08-08 10:06:51

jamesr12   United States. August 08 2010 10:07. Posts 1546
Very nice write-up, well done. Math major?
http://www.teamliquid.net/forum/viewmessage.php?topic_id=306479

Integra   Sweden. August 08 2010 10:51. Posts 4924
Someone who actually knows what he's talking about. Didn't think such people existed on TL.
"Dark Pleasure"

s.a.y   Croatia. August 08 2010 10:59. Posts 3580
Are you a rocket scientist?
hot_bid / R1CH fan | www.sc2croatia.com | Croatian SC2 scene

Synwave   United States. August 08 2010 11:05. Posts 2738
Holy crap man, a lot of work. I will need to reread this a few times. Awesome v2.0 explanation though!
I read the heck out of your first version btw.
♞Nerdrage is the cause of global warming♞

vanick   United States. August 08 2010 11:07. Posts 53

On August 08 2010 10:07 jamesr12 wrote:
Very nice write-up, well done. Math major?

Computer science.

And, gerundium, it's interesting to hear that about Halo 3. From my understanding, even though the system receives less information from people (or groups of people) who never lose, it does get enough information to promote them, in theory. TrueSkill has its own functionality that is supposed to allow it to rank individuals who play in random teams (as does SC2, see 2v2 random etc.). Perhaps there was a bug there?

theqat   United States. August 08 2010 11:09. Posts 2098
Cool thread! Thanks for the hard work.
 
virgozero   Canada. August 08 2010 11:23. Posts 412

More interesting is what happens to the sigmas. If the match finished as expected with the MMR favored player winning (and remember, the loading screen “favored” display is NOT this) then both players' sigmas will decrease. That is, the system gains confidence in the ratings it has assigned to the players. If the match finishes in an upset then the sigmas for both players will increase as the system thinks it may have an incorrect rating assigned to both. The change in sigma scales based upon the difference in MMR. That is, losing to someone close to your own rank will not change your sigma too much (though it will over the course of several games).

I think there is a huge problem with this system.
The system basically treats every player as an average Joe, has them continually play games, and uses MMR and sigma to determine their skill level.

However, it has already been said that for this to work, the player has to play a series of games. The system can place a player at Gold level, and then when he loses to a Silver player, the uncertainty increases. This has to keep happening until some consistency is reached. The problem is that the player GETS BETTER over the course of the games needed to pinpoint his skill level. By the time the system can safely assume a player's skill level, that skill level has already changed.

Meaning the first 3 games used to pinpoint a player's skill level are now negligible, because that player is no longer the same player he was 3 games ago.

Now, all this assumes the player is a fast learner. However, the rate at which players learn is completely random, so I wonder how they can use math to incorporate this into their system (which IMO is impossible).

For this example I used 3 games, but I am sure that for the system to reach any sort of consistency it may take at least 20 games or so (which is possibly why most people get into Diamond league in 20 games or so). And I don't know about you guys, but my 21st game and my 3rd game of SC2 are in fact different. After each game a person gets better, whether by a lot or a little. The difference is enough to change the expected W/L (considering the system puts you at some sort of equal setting).
Last edit: 2010-08-08 11:24:50

vanick   United States. August 08 2010 11:33. Posts 53
To be clear, the player's skill is never pinpointed. The sigma is never 0. All players vary in their performance from game to game and over time as their skill increases (or decreases!).

I left a point out in my writeup that I probably should have included. TrueSkill, and likely SC2's ladder, have a factor based off the time since your last game that increases the player's uncertainty level (sigma) by an amount related to that. Even if you're playing games back to back this factor will have a minimum value that will still increase sigma. This allows the system to adapt to a player whose skill increases over time.
Last edit: 2010-08-08 11:36:11

Excalibur_Z   United States. August 08 2010 11:42. Posts 10287
Updated the original post with that.
 
sYz-Adrenaline   United States. August 08 2010 11:52. Posts 1688
my brain hurts
Can you feel the rush?

virgozero   Canada. August 08 2010 11:56. Posts 412

On August 08 2010 11:33 vanick wrote:

I left a point out in my writeup that I probably should have included. TrueSkill, and likely SC2's ladder, have a factor based off the time since your last game that increases the player's uncertainty level (sigma) by an amount related to that.

Yes, but that's also very icky because we have no idea how accurate that is or how it differs from person to person. I am assuming it is a constant, which would assume all players learn at the same rate, which they don't. Sure, you can get a general consensus that in 1 week's time a player should be X better, and therefore adjust the system in accordance with X by multiplying certain variables by Y or whatever, but it still won't be accurate or anything near accurate.



Even if you're playing games back to back this factor will have a minimum value that will still increase sigma. This allows the system to adapt to a player whose skill increases over time.

Again, I don't quite understand how this can be accurate, though. This minimum value? Can you explain a little more?

Rinrun   Canada. August 08 2010 12:08. Posts 3486
My goodness this was an intriguing post, due to the fact that I actually understand the stuff going on! Great write up, great read.
MBC/Liquid/TSM for now, and for always.