Also, I'd like to write up something I found in S1 and S2 while analysing LA region data (which was really easy to analyse!).
Most platinum players who get promoted to diamond mid-to-late season are promoted to tier E, skipping tier F completely. Probably 85%, but I am just guessing, I didn't do the math. The other 14% or so are then promoted to tier D. Players in rank F are 99% players who were placed there at the beginning of the season.
At the beginning of a season, most players would be placed at rank F though, then E, then D... just like following the graph Exc made in his guide.
The same was true for gold -> platinum: most gold players would be promoted to tier A. Platinum rank S was the tier that got the fewest people from promotion and also the tier that got the fewest people from placement.
Gold was different though. It seemed all the time like both tiers of gold were mirrors, always with the exact same number of people. I could never tell which tier was which by looking at the number of divisions in each tier. They are like mirrors, and to me that proves that gold is the "0".
Ps.: rank = tier in my mind, so I bet I mixed them up a hundred times just in this post >.<
There's something peculiar with the 73 DMMR cap. For a player below Master, his opponent's DMMR will be capped at 73. Given the range F provides, I initially expected that players with 73 to ~89 DMMR (73 plus half the deviation of F for about even games) would be the ones actually capped. In the data, however, what I'm seeing is that players from 73 to 103 are capped.
First, here's the DMMR distribution of opponents of Master players:
No cap here.
Here's the DMMR distribution of the opponents of players in Bronze-Diamond leagues:
73 cap clearly visible. Let's zoom in on relevant part:
Notice how the line breaks at around 102. This is because the 73-102 range has players that are capped and their DMMR has been artificially boosted for points calculation.
At first I thought this may be due to the league tiers, but a similar graph is seen in all leagues below Master.
Now that I gave it much thought, it makes sense. If an opponent has 72 DMMR and the player was supposed to win 12 points for beating him, Blizzard will instead give the player 13 points for beating him. When we extrapolate the opponent DMMR based on the player winning 13 points, we come up with 102 and not 72.
Edit: This makes my method for finding offsets have less usable data, sigh...
Last edit: 2012-05-24 20:28:20
Excalibur_Z United States. May 25 2012 02:00. Posts 10293
Capped on the low end, not the high end, right? That is, 73 is effectively the baseline.
So what does this mean? After every league change, points are reset to 73 + spent bonus pool, or 73 adj.pts. If a player has less than 73 adj.pts, the points they earn for a game are based on if they had 73? I'm confused about what is being reset to 73 here, and not sure what DMMR represents (though I've seen the definition, I don't follow). If 73 is their new MMR value relative to their given division then that's even more bizarre.
If you are below master league, and your opponent has DMMR < 73 in relation to your tier offset, then Blizzard will add on points to your win amount (or lower points taken in case you lost) until it appears as though your opponent has DMMR >= 73. I'm sure they use a simpler rule than this on their side, I'm pretty sure the rule is: "if DMMR < 73 then DMMR = 73" and it just appears the way it does on our side since our only way to observe DMMR is through reversing F function.
Edit for clarification: This does not only affect players with adj pts below 73. You could have 200 adj pts and play an opponent with 0 DMMR in relation to your tier offset. You won and were supposed to win F(0-200) = 6 points, but instead Blizzard will give you F(73-200) = 8 points.
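A minimal sketch of the rule described above, assuming Blizzard really uses the simple "if DMMR < cap then DMMR = cap" form. The function name and its parameters are made up for illustration, F itself is not reproduced here, and the cap is a parameter because its exact value is still an open question in this thread.

```python
def effective_opponent_dmmr(opponent_dmmr, player_is_master=False, cap=73):
    """Opponent DMMR as fed into the points calculation (assumed rule).

    Hypothetical helper: below Master, anything under `cap` is raised to `cap`.
    Master players are assumed to see the opponent's DMMR unchanged.
    """
    if player_is_master:
        return opponent_dmmr
    return max(opponent_dmmr, cap)


# A 200 adj pts player facing a 0 DMMR opponent would be scored as if the
# opponent had 73 DMMR, i.e. F(73 - 200) instead of F(0 - 200).
print(effective_opponent_dmmr(0))                         # 73
print(effective_opponent_dmmr(120))                       # 120
print(effective_opponent_dmmr(0, player_is_master=True))  # 0
```

The player's own adjusted points do not enter the rule at all, which is why it also affects players well above the cap.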
From what I understand this is a change to the system introduced in season 3, and the reason for the change is to help players use up their bonus pool. The idea is that if your MMR is low in comparison to your tier offset, then you will get similarly low opponents. Beating these opponents will award you few points per win, and therefore you will have to play a significant number of games in order to use up your bonus pool. Bonus pool grows large -> players get discouraged -> players stop playing. Another possibility is that Blizzard doesn't want players too close to 0, but I doubt this is the reason, since bonus pool keeps ladder points sufficiently above 0 anyway.
It's a pity this 73 DMMR bottom cap hurts our quest to finding the offsets. It's pretty significant actually.
I completely gave up on the min/max method I've been trying out. After accounting for the fact that DMMR between 73-~103 is all possibly capped, I have no more 'maximum' data points remaining in nearly 5k games. This is due to the fact that when players play opponents from 2 different leagues, the opponent from the higher league usually has low DMMR rating. Using the min/max method taking 73 DMMR cap into consideration, this is what I have left:
Not enough to work with.
It's time for plan B.
I'm currently trying to see if I can find the function which adjusts MMR. If we can figure out how MMR changes after a game, we would have a simple way to find offsets. I'm testing glicko system to see if I can come up with RD values that make sense, but there are a few problems:
1) We only know the MMR of the player down to a certain range - Solvable by looking at enough consecutive games of the player.
2) We only know the MMR of the opponent down to a certain range - Should not be too bad I hope, since +-14-20 MMR points may not be that much. Not sure.
3) We don't know the RD value of opponent. This may be a problem IF the ladder system uses an uncertainty value that grows larger after an inactivity period. I suspect this may be the case, but not sure.
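For reference, here is a minimal sketch of the standard published Glicko-1 update (Glickman's formulas), which is what testing "RD values that make sense" would be checked against. This is not known to be what Blizzard uses. The function names are illustrative, and the opponent RD that problem 3 worries about shows up here as the `rd_opp` input.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Weight that shrinks the impact of an opponent with a large RD."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected result (0..1) against one opponent."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko_update(r, rd, games):
    """Return (new_rating, new_rd) after a batch of games.

    games: list of (opp_rating, opp_rd, score), score 1 = win, 0 = loss.
    """
    d2_inv = 0.0  # 1 / d^2 in Glickman's notation
    delta = 0.0   # sum of g * (actual - expected)
    for r_opp, rd_opp, score in games:
        e = expected_score(r, r_opp, rd_opp)
        d2_inv += Q**2 * g(rd_opp) ** 2 * e * (1 - e)
        delta += g(rd_opp) * (score - e)
    denom = 1 / rd**2 + d2_inv
    return r + (Q / denom) * delta, math.sqrt(1 / denom)


# Worked example from Glickman's Glicko paper: a 1500/200 player beats a
# 1400/30 opponent, then loses to 1550/100 and 1700/300 opponents.
new_r, new_rd = glicko_update(
    1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
)
print(round(new_r), round(new_rd, 1))
```

The example should land at roughly 1464 rating and 151.4 RD. Note how a wrong guess for the opponent's RD changes both `g` and the expected score, so it moves the rating change as well as the new RD.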
BTW Excalibur (or anyone else who knows), how did you guys get the Diamond offsets using the +12/-12 method? How many +12/-12 games did you have to work with? Seems like it would take quite a lot, because every +12/-12 game has upwards of the variance of both players combined in it, which should be +-14 twice so upwards of 28 difference. If you were able to just look at enough data and average the variance and come up with 63 (which I believe statistically should be possible with little game count due to noise cancelling itself out on average), then the same should also be possible using non +12/-12 games now that we know F. Anywhere from +8 to +16 points is nearly indistinguishable from +12 in terms of variance of the opponent DMMR. +12 gives you +- 14, while +16 gives you +- 16
If anyone has ideas on good ways to analyze the data please do share. The more heads we put on the problem the better. Feel free to post ideas here, PM me, find me on teamspeak. You can use the current file that I'm working with here, it should be easier than to use the CSV version.
Last edit: 2012-05-25 03:51:40
Excalibur_Z United States. May 25 2012 03:10. Posts 10293
The 63 and 150 values were proven based on the old weekly Top 200 posts on the Battle.net blog. They posted a list of the Top 200 along with their wins/losses and the timestamp, and I copied all the player names in order into a spreadsheet. The player names were all linked for convenience, so I logged their division names and worked backward along their match history until their wins and losses matched what was reported on the timestamp, then I recorded their points at that time. Some players had a tied ranking (say, two #25s), so I knew that if I worked backward and those players had different point values at the time of the snapshot, the difference between those values would be the tier offset. Eventually, patterns emerged and certain players in one tier were consistently X points higher than players in another tier, and from there I was able to narrow down which division belonged to which tier. The 150 value was found a couple of weeks after Master league came out, where Master players were in the same Top 200 list as Diamond players (so their offsets could be easily determined). The offsets were not determined using the +12/-12 method, and I didn't log nearly the amount of data points I needed to in order to craft reliable results (certainly not the 5k+ you're working with).
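The tied-ranking step described above could be sketched roughly like this. It assumes a snapshot has already been reduced to (rank, points) pairs; the function name and the sample numbers are invented for illustration and are not the original spreadsheet data.

```python
from collections import Counter
from itertools import combinations


def offset_candidates(snapshot):
    """Count point differences between players that share a ranking.

    snapshot: list of (rank, points) pairs taken at the same timestamp.
    A difference that keeps recurring is a candidate tier offset.
    """
    by_rank = {}
    for rank, points in snapshot:
        by_rank.setdefault(rank, []).append(points)

    diffs = Counter()
    for pts in by_rank.values():
        for a, b in combinations(pts, 2):
            if a != b:  # equal points suggest both players share a tier
                diffs[abs(a - b)] += 1
    return diffs


# Invented sample: two tied pairs that differ by the same amount,
# one untied player, and one tied pair with identical points.
sample = [(25, 1200), (25, 1263), (40, 900), (40, 963), (41, 880),
          (57, 700), (57, 700)]
print(offset_candidates(sample).most_common(1))
```

With only a handful of tied pairs per snapshot, a recurring difference only becomes convincing across many weekly lists, which matches the "eventually, patterns emerged" description.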
Ok, so the cap is actually 88, isn't it? You don't need to exclude 89-103 (if they are the minimum number you find thanks to F).
The reason that it appears to be 73 is because that is the minimum range from F for 88, am I wrong? Originally I thought it would be 85 (73+12), so I thought it weird that the cap would be exactly 73 when you said it.
When we calculate dMMR, F makes it so we have a minimum and maximum value. If this value doesn't reach 88, then it doesn't actually reach the cap in any way and we can use this number, no?
I really don't understand all the formulae you are using, so I could be very wrong, but I always had the feeling that the cap was closer to 85 than 73, and 88 could be the answer.
Also, the plugin ignores some logic. Our MMR can only decrease during a losing streak, but I lost 8 in a row 2 days ago and the graph didn't adjust to that simple logic. Should it readjust the tiers knowing that? I am no programmer, so I have no idea if that's even possible to do.
Maybe we should focus first on finding which division is which tier and make a list. Can't you cross-reference all the data, so that once you are sure division X is t0, whenever that division appears again its tier is already known for sure, and then maybe everything else gets discovered fast? If we know the tiers for sure we can work with the data much better.
It can't be that hard to find out which division is which tier. I did that for the LA region and it's easy.
How are divisions born?
A division is born when a fitting player is placed/promoted/demoted into that skill/MMR range.
If that division reaches 100 players and a 101st player requires placement/promotion, then a new division will be born.
Say we have 5 Code E Diamond divisions that each had 100 players, but some of them lost players to promotions/demotions, so one has 90 players, another 98, another 99 and the other 2 have 100. New players who need to be placed in that skill range will first be allocated to the division with 90 players until it reaches 98, then distributed between those 2 divisions until they reach 99, and so on until they are all full. A new division will only be created if every division in that tier is 100% full.
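The filling rule described above can be sketched as follows. This is only a model of the observed behaviour, not Blizzard's actual code; the function name is made up and the 100-player capacity is taken from the post.

```python
def place_player(divisions, capacity=100):
    """Place one player into a tier and return the index of the division used.

    divisions: list of current player counts for one tier (mutated in place).
    The emptiest non-full division gets the player; a new division is only
    created when every existing one is full.
    """
    open_divs = [i for i, n in enumerate(divisions) if n < capacity]
    if not open_divs:
        divisions.append(1)  # a new division is born with its first player
        return len(divisions) - 1
    target = min(open_divs, key=lambda i: divisions[i])
    divisions[target] += 1
    return target


# The example from the post: 90, 98, 99, 100, 100.
tier_e = [90, 98, 99, 100, 100]
for _ in range(8):
    place_player(tier_e)
print(tier_e)  # [98, 98, 99, 100, 100]

for _ in range(5):
    place_player(tier_e)
print(tier_e)  # [100, 100, 100, 100, 100]

place_player(tier_e)
print(tier_e)  # [100, 100, 100, 100, 100, 1]
```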
In the LA region there were only 3 rank S platinum divisions, and all 3 of them had only 80 players. For some reason 2 of them had once reached 100 players and the 3rd one was born. Then, after some point, people there were promoted out far more often than people came into that tier, and the end result was these 3 divisions of the same tier having only 80 players each! That was pretty interesting to see.
Do you realize what I am talking about here? NA, EU and KO move too fast to keep track of, but we can easily keep track of the SEA region and discover 100% of the tiers just by watching the sc2ranks site and the growth of the divisions.
I think we can explore a lot more than we are doing. Is there a way to track the creation/birth of new divisions so we could automatically know for sure which tier each one is?
Let's take plat divisions. If we know we have only 3 rank S with only 80 players each, 5 rank B with 96 players each, and 10 rank A with 100 players each, that means the next division being born must be a rank A. Can't we find a way to have this automated?
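Automating that deduction could look roughly like the sketch below, assuming we already have per-tier player counts (for example scraped from sc2ranks) and that a new division only appears in a tier whose divisions are all full. The function name and data layout are hypothetical.

```python
def tiers_that_can_spawn(tier_divisions, capacity=100):
    """Return the tiers in which a newly born division could belong.

    tier_divisions: dict mapping tier name -> list of player counts.
    A tier qualifies only if every one of its divisions is full.
    """
    return [
        tier
        for tier, counts in tier_divisions.items()
        if counts and all(n >= capacity for n in counts)
    ]


# The platinum example from the post.
plat = {"S": [80, 80, 80], "B": [96] * 5, "A": [100] * 10}
print(tiers_that_can_spawn(plat))  # ['A']
```

If more than one tier comes back, the new division stays ambiguous until its own player count or its players' points narrow it down.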
If we know the tiers, it will be so much easier to speculate on everything else.
On May 25 2012 03:49 SDream wrote: Ok, so the cap is actually 88, isn't it? You don't need to exclude 89-103 (if they are the minimum number you find thanks to F).
We see in the data that 73 DMMR is the first value that appears (ignoring 18 data points that have less, probably mostly / entirely due to bad data that slipped by). This is when calculating the middle of the range using F. This means that Blizzard does the same, and 73 is the first number that they don't modify, hence the cap is there. These 73 numbers all have +- uncertainty values attached to them, from +- 14 (minimum for +12 adj pts games) to +- 17.5 (for +7 game). However we know that for these numbers only the + part is possible. I made a graph of the uncertainty values and it looks like this:
You can see that all the ranges are centered around the DMMR that we calculate. We don't see a flock of the minimal amounts to 73 or anything like this. I hope I made things clearer.
About your second post, a question:
Suppose we knew all the divisions in SEA, what would be the next step? I'm asking because in a way we are already in a similar scenario. We know the offset of Master because it's one tier (we can define Master offset to be whatever we want). How do we use Master known offset to calculate Diamond offsets? It's not that clear how do we go about it. Estimating a player's division tier is doable with reasonable accuracy given enough games. The question is how do you use that to determine offset.
Last edit: 2012-05-25 04:18:15
Excalibur_Z United States. May 25 2012 04:17. Posts 10293
Using SEA as a control may produce strange results because haven't we already seen from some initial scraping unexpected information? Even in the days of discovering division tiers, SEA had a larger-than-expected collection of F-Rank Diamond divisions, if memory serves. I'm sure it would be a bit faster to pick up on trends happening there since it's smaller, but will the information we get be easily translated to other regions?
If a player with 69 adjusted points wins 13 points in his victory, doesn't that tell you that the cap isn't 73? Wouldn't he win 12 points instead? Edit: well, the cap doesn't matter here >.> anyway, I am pretty sure the number I found is based on the cap though.
If a player with 106 AP wins 11 points in his victory, doesn't that tell you that the cap isn't 103? If it were, wouldn't he win 12 points instead?
To me it is obvious that there is only one number here, it's 88, and not a range. In our data it appears as a range because we rely on F.
As for SEA data, well, it's like science, you can't just want to find the good stuff, you need to investigate stuff and hope to find something cool there.
Remember all the bugs I would find with LA region Excalibur? And that would turn out to be true on NA region as well, I think that COULD help, or not. But I think it's an approach that could potentially help.
Not necessarily. If a player with 69 adj pts wins 13 points, that means the opponent DMMR was 84-112, which is above the 73 cap. If a player with 73 adj pts wins 12 points, the opponent DMMR was 59-87. Just because this includes numbers below 73 DMMR in the range doesn't mean one of them is actually possible considering the cap. A player with 72 adj points can't win 12 points though (vs opponent below Master), because that would mean his opponent DMMR is given as below 73 by F. The middle of the range can't be below 73.
I think we may be thinking of the same thing but calling it by a different name. The reason I consider 73 as the cap instead of 87 (the minimum number that can have a +- range entirely above 72) is that I think of the value that F gives as the middle of the range and throw a +- range on top of it. If we defined F as returning maximal or minimal possible DMMR value then we would have been calling some other number as the 'cap'. Is that what Blizzard does? Possibly. But I think I'll stick with my convention until I see a good reason to change it. And unless we have a mistake somewhere in F, the only difference it makes is semantics.
My point is that someone who gets a dmmr of e.g. "85-120" is potentially bad data, because the range reaches 88. There is only a very small chance of it actually being bad data, but we can't rely on chances. A guy who gets a dmmr of "89-150" is 100% good data, because there is no chance it can reach the cap.
I gave the matter more thought and you are right SDream. 88 is the DMMR cap and not 73. Indeed any DMMR we calculate by reversing F that has 88 DMMR as a possibility is a potentially capped value.
Anyone who wishes to be convinced this is the case just enter adj pts and DMMR values into an F function (not the reverse), and then reverse it and you will see similar data as we have in our data base.
Excalibur_Z United States. May 26 2012 01:41. Posts 10293
Do your tabled values change if you plug in 88? 88 is a weird value to choose... maybe it used to be 0 and it was a sweeping change put into effect to prevent the old problem low Bronze players had with getting stuck at 0 points?
What needs to be the new criteria for discovering offsets? Exclude all games with players who have <88 adj.pts? Filter non-+12/-12 games from that list?
88 could also be statistically significant if it was a change put into place to resolve Bronze Zero. What if that's the difference between -24 and -20 (or maybe -21, I forget if a zero MMR player earns a minimum of 3 or 4 adj.pts for a victory now)?
Until now I considered DMMR values that are <= 103 to be potentially capped. Now I consider DMMR values that have 88 in their range as potentially capped. It's about the same but the latter is more accurate, especially if a game comes up that has a big variance.
Example: Player with 161 adj pts beats opponent and gets 10 points. We calculate using reverse F DMMR value of 102 +- 14.5. Both methods capture that this DMMR value is potentially capped.
Player with 461 adj pts beats opponent and gets 3 points. We calculate using reverse F DMMR value of 120 +- 32.5. This is a bit of an extreme example but it should be clear why the second method is more accurate. This DMMR value is potentially capped and the old method didn't spot it.
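The two filters can be compared with a small sketch. It assumes the DMMR centre and its +- half-width have already been obtained by reversing F (not reproduced here); the function names are made up, and the 103 and 88 values are the ones discussed in this thread.

```python
def potentially_capped_old(center, threshold=103):
    """Old filter: flag any reversed DMMR whose centre is <= 103."""
    return center <= threshold


def potentially_capped(center, half_width, cap=88):
    """New filter: flag any reversed DMMR whose +- range contains the cap."""
    return center - half_width <= cap <= center + half_width


# The two examples from the post: (centre, half-width)
for center, half_width in [(102, 14.5), (120, 32.5)]:
    print(center, half_width,
          potentially_capped_old(center),
          potentially_capped(center, half_width))
# 102 14.5 True True
# 120 32.5 False True
```

The second game is the one the old method misses: its centre is well above 103, but the wide range still reaches down to 87.5.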
+12/-12 games are no different than any other game in this regard. One player may be capped, both or none.
Whether or not 88 is derived from an offset I don't know. Maybe some data from the coming league lock period will tell us additional information.
Excalibur_Z United States. May 26 2012 02:28. Posts 10293
There is one thing left that I really want to understand but I get confused among all that data o.o
Take a Rank S (t6) Diamond player in S1-S2 who has a solid 50% win ratio, is on neither a winning nor a losing streak, and is pretty active (a solid number of games)... if that player had 150 adjusted points, that would put him exactly at the border of diamond and master league.
Now the question is, in the current season, what would this player (Tier S diamond) need to stand at the exact same point (0 master dmmr)? 150 AP or 150+88 AP? Or something different?
If we can answer this one question, chances are this will show true for every other league/tier (bronze-dia) and it will make it easier to understand the offsets.
-------------------------
Additionally, I'd forgotten about the league lock. There's always someone in bronze-platinum who suddenly gets good and reaches master league. If he could use the plugin for us, this data would help so much *-*
Another good data will be S8 placement. If my MMR is X and I get placed into diamond, then we can understand that the offset was lower than we thought. The placement phase could be pretty interesting, any way to get this data in the database?
Good question. Here's a graph from our data of the DMMR of Diamond players who play against Masters opponents. Keep in mind when playing against Master the Diamond guy's DMMR is uncapped and in relation to the Master offset, so it's all 1 tier.
-40 DMMR average.
Around 120-124 DMMR seems to be the highest populated area, with few dots above it in rare cases. If we keep in mind the system waits until your moving average MMR has passed the promotion line into masters, I think 73 may be the barrier and soon after your MMR passes it you get promoted into Master. Then again you could ask from the other angle how far does a Master player have to drop in order to be demoted into Diamond. It's a bit trickier to calculate the DMMR of the master players in relation to Master offset, but if we assume for a moment the Master players are generally close but slightly higher than their Diamond opponents, then it's possible 0 may be the number. Or it may even be a single number, but I doubt it, a range seems more likely to me.