|
On July 12 2012 09:14 VediVeci wrote: In his defence it's not Bayesian inference either, it's Gaussian Density filtering (I don't believe that the latter is a subset of the former, though I could be wrong; Gaussian Density filtering is over my head right now).
The language is a little confusing, because Bayes defined a particular optimization technique for making probabilistic guesses about a situation's outcome, but a much wider class of techniques is referred to as "Bayesian" because they are mathematically or philosophically similar. In fact, I'd call Elo "Bayesian" by the latter definition, because it's trying to converge on the prediction that's most likely to be accurate; it's just not using an explicit error function to do so.
Anyway, from the guy's talk at UCI, I got the impression that Gaussian density filtering was just a particular technique applied in the context of a Bayesian algorithm, and not actually the name for the entire matching technique. However, I don't really know about that.
|
On July 12 2012 19:18 skeldark wrote: Searched through the new posts: -Not a single valid argument why the numbers are wrong. -Not a single calculation over my source data. (You can ignore my results; I published the source data, you can analyse it yourself.)
MMR: You tell me to watch information that you only know because I or not_that discovered it. You say that the MMR I analyse is not 100% correct without understanding that it's race independent. Besides the fact that none of you know how I analyse MMR, how I correct deviations, and that the method has been working flawlessly for 100,000 games by now. Every possible mistake I make in the MMR calculation doesn't affect the result of this calculation, because my MMR calculation is race independent. A simple fact that I point out in the OP and most here ignore.
Definition of imbalance: I am not responsible for people who misinterpret my data. Many of you "statistic guys" do so too! You complain that my definition of imbalance is not the one that people on TL use.
I thought this was clear to people with a statistics background, but to point it out clearly: I detect imbalance in MMR values. Not the reason, because that is not mathematically traceable. Not for me, not for Blizzard, not for anyone.
I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
Get 10 times as much data and there might be a statistical value in it.
|
On July 12 2012 19:18 skeldark wrote: Searched through the new posts: -Not a single valid argument why the numbers are wrong.
The point isn't your numbers, it's your technique. You can accidentally apply the wrong technique and get the right numbers, but the point is that if you do that, we'll never know. If your technique isn't correct, the whole thing is untrustworthy.
-Not a single calculation over my source data. (You can ignore my results; I published the source data, you can analyse it yourself.)
You're the one who did the work, you really need to correct these problems yourself or have your work disregarded by people who understand the statistics involved.
Edit: I can't even make a half-assed attempt to look at your data until the weekend, sorry. Even then, I can't promise being able to put in the time to do a proper analysis.
|
On July 12 2012 19:33 Morfildur wrote: I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
That's not really a fair criticism. 5592 players is a huge sample, even by the standards of much more complicated studies in fields like medicine. Only Blizzard will ever do better than that.
|
On July 12 2012 19:36 Lysenko wrote:
On July 12 2012 19:33 Morfildur wrote: I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
That's not really a fair criticism. 5592 players is a huge sample, even by the standards of much more complicated studies in fields like medicine. Only Blizzard will ever do better than that.
It's less than 2% of the total player base, it would be about a fourth of the masters players. That's not a huge sample...
|
On July 12 2012 19:34 Lysenko wrote:
You're the one who did the work, you really need to correct these problems yourself or have your work disregarded by people who understand the statistics involved.
No, I'm not.
There are no problems. You just misunderstand what I did. I show an imbalance in MMR values per race that cannot be explained by random error. That's all I did. If you want something else, do it yourself. Arguments about the skill-system have NOTHING to do with my calculation. If you don't understand this fact, then you don't understand what I did.
OFFTOPIC: The video is old. Whatever method they use, I use too, because I back-engineer their MMR value. I don't calculate it on my own.
|
On July 12 2012 19:38 Morfildur wrote:
On July 12 2012 19:36 Lysenko wrote:
On July 12 2012 19:33 Morfildur wrote: I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
That's not really a fair criticism. 5592 players is a huge sample, even by the standards of much more complicated studies in fields like medicine. Only Blizzard will ever do better than that.
It's less than 2% of the total player base, it would be about a fourth of the masters players. That's not a huge sample...
My data is biased towards Masters (near 40%, I think), but it's only a question of time.
I get 5k games per day with 5k potential new accounts.
|
On July 12 2012 19:38 Morfildur wrote: It's less than 2% of the total player base, it would be about a fourth of the masters players. That's not a huge sample...
It's far more than enough to draw inferences about the population as a whole. Generally the uncertainty of uncorrelated aggregate data from a given sample size improves with 1/sqrt(n), so his sample can be analyzed to maybe 99% accuracy and the entire Starcraft population to about 99.8% accuracy. Not a big difference.
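To put rough numbers on the 1/sqrt(n) rule of thumb above, here is a minimal sketch. The sample size of 5592 comes from the thread; the population figure of 280,000 is only an assumption backed out of the "less than 2%" remark, not a known player count, and `relative_uncertainty` is a hypothetical helper name.

```python
import math


def relative_uncertainty(n):
    """Rough 1/sqrt(n) scaling of sampling uncertainty for n uncorrelated observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 / math.sqrt(n)


sample_size = 5592          # sample size quoted in the thread
population_size = 280_000   # ASSUMPTION: 5592 is taken to be about 2% of all players

for label, n in [("sample", sample_size), ("whole population", population_size)]:
    u = relative_uncertainty(n)
    print(f"{label}: n={n}, relative uncertainty ~{u:.2%}, 'accuracy' ~{1 - u:.2%}")
```

Under these assumptions the sample lands near 98.7% and the full population near 99.8%, which is the "not a big difference" being described. The rule only holds for uncorrelated data, so it says nothing about selection bias in who submits replays.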
|
On July 12 2012 19:38 skeldark wrote: I show an imbalance in MMR values per race that cannot be explained by random error.
The mistakes we're pointing out in your analysis are systematic, not random. If they were random, they wouldn't be problematic. Systematic errors can often produce a result where you THINK you have pinned things down to a certain accuracy but you're actually off by a much larger amount. That's why it's such a big deal.
Arguments about the skill-system have NOTHING to do with my calculation.
If the skill system you're using (which I gather is Elo) produces a different distribution of results than you're assuming in your calculations (and the difference between a logistic distribution and a normal distribution is small but real), then you could EASILY be off by 20 or 30 Elo points in your estimates of the uncertainty of your average.
The video is old. Whatever method they use, I use too, because I back-engineer their MMR value. I don't calculate it on my own.
That makes no sense. Scores between these different skill rating systems don't translate from one to the other. Also, we're all pretty sure the information in that video has not changed in a long time.
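For anyone wondering how big the logistic-versus-normal difference mentioned above is, here is a rough sketch comparing the two expected-score curves. The logistic form is the standard Elo formula with a 400-point scale. The normal form assumes a per-player performance spread of 200 points, so a spread of 200*sqrt(2) on the rating difference; that parameter is an assumption for illustration, and neither curve is claimed to be what Blizzard uses.

```python
import math


def logistic_expected(diff):
    """Expected score under the logistic Elo curve (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def normal_expected(diff, per_player_sd=200.0):
    """Expected score if performances are normal with the ASSUMED per-player SD."""
    sd_of_difference = per_player_sd * math.sqrt(2.0)
    return 0.5 * (1.0 + math.erf(diff / (sd_of_difference * math.sqrt(2.0))))


for diff in (0, 100, 200, 400, 600):
    lo = logistic_expected(diff)
    no = normal_expected(diff)
    print(f"diff={diff:4d}  logistic={lo:.4f}  normal={no:.4f}  gap={no - lo:+.4f}")
```

The two curves agree at a difference of zero and drift apart by roughly a percentage point or so for larger gaps, which fits the "small but real" description.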
|
On July 12 2012 20:07 Lysenko wrote:
On July 12 2012 19:38 skeldark wrote: I show an imbalance in MMR values per race that cannot be explained by random error.
The mistakes we're pointing out in your analysis are systematic, not random. If they were random, they wouldn't be problematic. Systematic errors can often produce a result where you THINK you have pinned things down to a certain accuracy but you're actually off by a much larger amount. That's why it's such a big deal.
If the skill system you're using (which I gather is Elo) produces a different distribution of results than you're assuming in your calculations (and the difference between a logistic distribution and a normal distribution is small but real), then you could EASILY be off by 20 or 30 Elo points in your estimates of the uncertainty of your average.
The video is old. Whatever method they use, I use too, because I back-engineer their MMR value. I don't calculate it on my own.
That makes no sense. Scores between these different skill rating systems don't translate from one to the other. Also, we're all pretty sure the information in that video has not changed in a long time.
I think I finally understand your problem. You think I run my own skill-system!
I don't use any skill system! I didn't write a skill-system and blindly assume it is the same one Blizzard uses! I'm NOT calculating the skill, I back-engineer it. I don't care what system generates the number. Forget all the technical details and just imagine that I have direct access to Blizzard's ladder DB. No need to know the function if you know the result of the function!
I am off by +-25 Elo points, most likely even more. And it does not matter!
I can add random numbers to ANY MMR point and my argument still stands! That's what I'm talking about. The mistake in MMR doesn't affect the result! I take the MMR number from BLIZZARD; I don't calculate the MMR number myself!
|
skeldark, I feel so sorry for you...
I can only hope that I am just part of a silent majority who got it from the OP. The part that usually doesn't feel the need to write anything.
You have done interesting work!
@Imbalance How can you not get that? If a race is picked by all casuals (reducing its average MMR), then this itself is a form of imbalance - maybe they took this race because it looks uber-awesome, so a graphical imbalance, or it got tons of tutorials, so an information imbalance.
And no, this doesn't tell us if one race is stronger in a theoretical game-design-scenario, but no one claimed that to begin with.
|
OK I went back and re-read in detail your writeup of how your actual tool works. I had mistakenly believed that you were actually calculating your own Elo scores for players.
What you're reverse engineering is the adjusted point value. Trying to infer something from this about actual skill ratings has some problems:
1) You are assuming the MMR is Elo, when it's absolutely not. This is explained clearly in the UCI video.
2) There is a 1:1 conversion between MMR and adjusted point score, which are the units you're backing out with your tool. Seeing backed-out adjusted point scores is interesting, but that 1:1 conversion is not necessarily linear, and if it's not linear then you can't necessarily make assumptions about the distributions of the underlying MMR. I mean, you can't do that AT ALL. You don't know how that conversion works.
3) The more complex skill rating systems use the uncertainty value as well as the skill number to adjust a player's score. The MMR can move by a different amount than adjusted points over the short term, because the use of difference between MMR and adjusted points provides long term pressure for adjusted points to catch up with a changing MMR. So, what your tool is doing only works for relatively stable MMR numbers.
4) Your Monte Carlo simulation doesn't capture actual uncertainty of the underlying MMR. The fact that adjusted points track MMR with a lag (as I mentioned in 3) means that fully random walks of adjusted points don't capture what the real system will do in any case. It's very possible that the differences you see between races would be much smaller than typical MMR uncertainties (which you can't see or measure) yet very unlikely with your randomly-generated pseudo-matchups.
Bottom line is that while your data regarding league boundaries in terms of adjusted points makes some sense, analyzing this data for racial differences is simply impossible because there's not necessarily a definable relationship (in the absence of information we're missing) between adjusted points and win likelihood.
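Point 2 above can be illustrated with a toy example: a one-to-one but nonlinear mapping changes the shape of a distribution. The sketch below draws normally distributed "MMR" values and pushes them through an invented monotone function. Both the normal MMR distribution and `hypothetical_points_from_mmr` are assumptions made up for this illustration and are not the real conversion, which nobody in this thread claims to know.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def hypothetical_points_from_mmr(mmr):
    """INVENTED one-to-one, nonlinear mapping (illustration only)."""
    return 1000.0 * math.exp(mmr / 1000.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
mmr_values = [rng.gauss(1500.0, 400.0) for _ in range(20000)]
point_values = [hypothetical_points_from_mmr(v) for v in mmr_values]

print(f"skewness of MMR:    {skewness(mmr_values):+.3f}")
print(f"skewness of points: {skewness(point_values):+.3f}")
```

The input is symmetric while the mapped values come out clearly skewed, so statistics that assume a particular shape for one quantity do not automatically carry over to the other.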
|
On July 12 2012 20:13 skeldark wrote: I don't use any skill system! I didn't write a skill-system and blindly assume it is the same one Blizzard uses!
You can't make ANY statistical analysis of a skill system you don't know the details of. Different skill systems produce different distributions of player skill ratings. For example, if I took all the players and rated them from 1 to 1,000,000 or whatever I'd have a flat distribution. Elo produces a logistic distribution. Ideally the distribution is normal, but in the absence of real information you don't know that.
Edit: This is a small issue. You're probably not going too far wrong by assuming a normal distribution. The bigger problem in the analysis is that changes in adjusted points don't track MMR as fast as MMR moves, so you have no way to estimate or take into account the accuracy of individual MMR numbers. The short version is that adjusted points will SEEM more accurate than MMR values would because they change more slowly.
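A toy simulation of the lag effect described in the edit above: if a visible number only moves a fraction of the way toward a noisy hidden number each game, the visible number looks far steadier than the hidden one really is. The update rule here (exponential smoothing with a factor of 0.1) and the noise level are assumptions chosen for illustration, not Blizzard's actual mechanism.

```python
import random
import statistics


def simulate(num_games=2000, smoothing=0.1, seed=0):
    """Return (hidden MMR series, lagged points series) under an ASSUMED update rule."""
    rng = random.Random(seed)
    mmr_series, points_series = [], []
    points = 1500.0
    for _ in range(num_games):
        # Hidden rating: a fixed level plus game-to-game noise (assumption).
        mmr = 1500.0 + rng.gauss(0.0, 100.0)
        # Visible points close only part of the gap each game, so they lag.
        points += smoothing * (mmr - points)
        mmr_series.append(mmr)
        points_series.append(points)
    return mmr_series, points_series


mmr_series, points_series = simulate()
print(f"spread of hidden MMR:    {statistics.pstdev(mmr_series):.1f}")
print(f"spread of lagged points: {statistics.pstdev(points_series):.1f}")
```

In this sketch the lagged series has a much smaller spread than the hidden one, so an uncertainty estimate taken from the lagged series alone would be too optimistic.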
|
Can't believe this thread is still going on.
Problem statement: OP wants to measure e-peen size and compare between different races.
Data gathering: OP constructs an e-peen measuring tool that users directly insert on a voluntary basis. The user base is high-hormone individuals who are very interested to know how big their e-peen is. Therefore, these individuals are most likely already at a higher percentile of e-peen length compared to the general population.
Measurement: Users measure e-peen whenever they are playing by themselves. This play will be contested with another user. The winner will have the longer e-peen and vice versa, thus the true size of e-peen is estimated based on such repetition. Some e-peens have been observed to fluctuate in size by up to 1000 inches in a few days. How can it be? Can the true size of e-peen be so volatile? Should it not be stable? Seems like some measurements are taken while in a state of flaccidity.
If indeed the measurement needs multiple observations to settle on a true e-peen, then does that mean any single observation is unreliable and not credible? But we like e-peen, so more is better. Let's not care about that.
Methodology: The statistics earlier are based on the average over a long period of time before the individual has truly established his true e-peen, thus if there is any upward bias (likely, because they are all looking to gain the next level of e-peen recognition), the value will be severely underestimated. Not to mention those single observations from more outdated times. Later it was changed to the latest e-peen measures. Pssh.. we should ignore testing anyway, let alone use the correct test tool.
Summary: Humans have the shortest e-peens. Humanoid alien bits and bug tentacles are imbalanced in size.
|
This is pretty cool. I wonder how accurate it actually is and what Blizzard uses for their own methods
|
1) You are assuming the MMR is Elo, when it's absolutely not. This is explained clearly in the UCI video.
You know that the guy who found the video is the same guy who wrote the f-function for the adjusted points? not_that! And we have more sources on it than the video. We use the f-function and we can SEE it fits. We can prove it fits! What do you think we did in the last month? We validated the f-function and other parts of the back-engineering. We did not just come up with it. We looked at 100,000 games and analysed them for many months!
2) There is a 1:1 conversion between MMR and adjusted point score, which are the units you're backing out with your tool.
No, there is not! Adjusted points are one part of it, together with other values. No 1:1 ratio. You did not understand the f-function. Also, the f-function is only 10% of the work to find out the MMR. It's way more complicated than that.
3) The more complex skill rating systems use the uncertainty value as well as the skill number to adjust a player's score.
No. We thought so too, but we found no evidence for it in the data. It acts as predicted.
4) Your monte carlo simulation doesn't capture actual uncertainty of the underlying MMR.
It doesn't even have to. It doesn't need MMR.
I could come up with the color of my coffee instead of MMR. If it produces the result I published, then the race in SC2 affects the color of my coffee! The fact that you are still arguing about skill functions tells me that you did not understand what I'm doing here. Because it has nothing to do with skill-functions!
My MMR calculation doesn't prove the result! The result proves my MMR calculation! That's the main point you don't understand!
|
On July 12 2012 20:23 Lysenko wrote: What you're reverse engineering is the adjusted point value.
No, you are not giving him enough credit. He is reverse engineering the actual MMR value, and as far as I can tell he has succeeded. It is quite revolutionary work (and not very simple).
|
On July 12 2012 20:34 Mendelfist wrote:
On July 12 2012 20:23 Lysenko wrote: What you're reverse engineering is the adjusted point value.
No, you are not giving him enough credit. He is reverse engineering the actual MMR value, and as far as I can tell he has succeeded. It is quite revolutionary work (and not very simple).
What he's reverse engineered are MMR values mapped back into adjusted points and then mapped from there into an Elo-like point system. The problem is that the mapping between MMR and adjusted points may not behave well for the case where a player's not in equilibrium. That may not be a problem for a player with stable MMR, but across a large population many of the players won't be in equilibrium at any particular time, and the interesting information is in those players.
This difference may not affect the averages very much, but it definitely will affect an estimate of how likely a particular difference between scores is in random play. He's doing this Monte Carlo simulation to guess how likely those differences between races are, but his Monte Carlo simulation doesn't capture nonlinearity in the MMR -> adjusted points relationship when a player is NOT in equilibrium.
|
I like how protoss is the only race to show up at 3000+ mmr.
It's really hard to quantify, "Imbalance" because of how difficult it is to factor in individual player skill. I really appreciate your effort even if your sample size is somewhat small, but then again I can imagine how large a pain in the ass it is to collect that many replays.
|
On July 12 2012 20:42 Lysenko wrote:
On July 12 2012 20:34 Mendelfist wrote:
On July 12 2012 20:23 Lysenko wrote: What you're reverse engineering is the adjusted point value.
No, you are not giving him enough credit. He is reverse engineering the actual MMR value, and as far as I can tell he has succeeded. It is quite revolutionary work (and not very simple).
What he's reverse engineered are MMR values mapped back into adjusted points and then mapped from there into an Elo-like point system. The problem is that the mapping between MMR and adjusted points may not behave well for the case where a player's not in equilibrium. That may not be a problem for a player with stable MMR, but across a large population many of the players won't be in equilibrium at any particular time, and the interesting information is in those players.
You change the topic, but what you describe is not the case. I tried to explain that earlier. We searched for this because we thought exactly the same. The strange thing is: we did not find it. The players act as predicted without it! You try to argue that our MMR calculation is wrong. You can do so in the thread about the MMR calculation! But all my graphs and the data collected over the last month prove you wrong!
But this is offtopic and has NOTHING to do with what I did here. I don't know how else to explain it to you.
On July 12 2012 20:44 cydial wrote: I like how protoss is the only race to show up at 3000+ mmr.
It's really hard to quantify, "Imbalance" because of how difficult it is to factor in individual player skill. I really appreciate your effort even if your sample size is somewhat small, but then again I can imagine how large a pain in the ass it is to collect that many replays.
I wrote this program; the rest is automatic: http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
The very top players have nothing to do with the analysis. It's just that I only know a few top players' races. But these are 1-3 players and don't affect the result. I have data on many more, but not their races yet.
|