WHR is a Bayesian rating system, of which there have already been some examples here on TL. What WHR adds is taking into account that a player's skill changes with time: it keeps track of those changes and adjusts them as it gets new information (more game results).
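Concretely, WHR models each player's rating as a random walk in time (a Wiener process): the further apart two games are, the more the rating is allowed to have drifted between them. A minimal sketch of that assumption; the drift scale w is an illustrative value, not from any particular implementation:

    import math

    def rating_drift_sd(days_elapsed, w=15.0):
        # Wiener-process assumption: between two points in a player's
        # history the rating drifts randomly, so the standard deviation
        # of the change grows with the square root of elapsed time.
        return w * math.sqrt(days_elapsed)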
Looking at the paper, it seems like this algorithm is an incremental improvement in predictive power (all the rating systems compared were within 1% of each other), but not one that justifies the significant increase in calculation complexity. Although I really like that it doesn't assume skill level is a time-invariant figure like most modern rating algorithms.
Another concern I have is how much complexity and effort it takes to parse game results into a format readable by this program. It seems like putting together Polt's dataset wasn't a trivial task.
IMO WHR is the best rating system I've seen so far. I've long been thinking about implementing it myself. Do you hardcode the parameters, or does your program automatically optimize some of them (for example the rank-time correlation parameter)?
An interesting way to improve this might be introducing per-matchup ratings, of course with some force pushing them towards their average.
Another might be a map imbalance parameter that represents a boost to the player skill difference. Then you optimize the map imbalance parameters like all the other parameters in your iteration step.
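For illustration, here is one way such a map parameter could enter the win probability, sketched with an Elo-scaled logistic model (the 400 scale and the sign convention are assumptions for the sketch, not from the OP's code):

    def win_prob(r1, r2, map_bias=0.0):
        # Probability that player 1 wins. map_bias shifts the effective
        # rating difference; a positive value means the map favors
        # player 1 (e.g. their side of the matchup). The bias would be
        # fit jointly with the ratings, like the other parameters.
        return 1.0 / (1.0 + 10.0 ** (-(r1 - r2 + map_bias) / 400.0))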
On August 31 2011 19:03 Primadog wrote: Looking at the paper, it seems like this algorithm is an incremental improvement in predictive power (all the rating systems compared were within 1% of each other), but not one that justifies the significant increase in calculation complexity.
The input set for this was even KGS matches (Go games). The KGS ranking system is already pretty good, especially for players who don't improve rapidly or who play many games. I assume the differences become bigger when the sample size becomes smaller.
On August 31 2011 19:03 Primadog wrote: Another concern I have is how much complexity and effort it takes to parse game results into a format readable by this program. It seems like putting together Polt's dataset wasn't a trivial task.
It's the same effort you need for all ranking systems. You need a list of (Date, Player1, Player2, Result) tuples, or if you go with my suggestions, (Date, Map, Player1, Player2, Race1, Race2, Result) tuples. All of these should be available in TLPD.
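As a sketch of the richer format (field names and the example values are made up for illustration):

    from collections import namedtuple

    # One row per game; 'result' is 1 if player1 won, 0 otherwise.
    Game = namedtuple('Game', ['date', 'map', 'player1', 'player2',
                               'race1', 'race2', 'result'])

    games = [
        Game('2011-07-15', 'SomeMap', 'PlayerA', 'PlayerB', 'T', 'Z', 1),
    ]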
The list yoyoma posted is Polt's rating history, which is part of the output of the program.
On August 31 2011 20:28 Sina92 wrote: why are people so obsessed with rating players?
Because people kind of like to know who's the best at something? And single tournament results are not a reliable method of measurement, for sample-size reasons?
On August 31 2011 20:28 Sina92 wrote: why are people so obsessed with rating players?
Because it is fundamental for people who are either betting on stuff or running betting sites. As for me, I just like the mathematical challenge inside of it, how to calculate a player's "skill" just through his results.
On August 31 2011 19:33 MasterOfChaos wrote: IMO WHR is the best rating system I've seen so far. I've long been thinking about implementing it myself. Do you hardcode the parameters, or does your program automatically optimize some of them (for example the rank-time correlation parameter)?
An interesting way to improve this might be introducing per-matchup ratings, of course with some force pushing them towards their average.
Another might be a map imbalance parameter that represents a boost to the player skill difference. Then you optimize the map imbalance parameters like all the other parameters in your iteration step.
I just manually tuned the parameters by looking at the results and adjusting them. If you're interested in looking at the code, the key parameters are:

    PRIOR_WEIGHT = 2.0     # how strongly players are pulled toward the average skill of 2000
    LINK_STRENGTH = 500.0  # how fast a player's skill can change over time
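My reading is that these act as two Gaussian penalty terms in the log-posterior, roughly like the sketch below; the exact scaling (the 1000 divisor, the per-year normalization) is my assumption, not the program's actual code:

    def log_anchor(r, weight=2.0):
        # Gaussian pull toward the average of 2000; 'weight' plays the
        # role of PRIOR_WEIGHT. The scale of 1000 is illustrative.
        return -weight * ((r - 2000.0) / 1000.0) ** 2 / 2.0

    def log_link(r1, r2, dt_days, strength=500.0):
        # Wiener-process link between two rating points of one player:
        # the allowed drift variance grows with elapsed time (dt_days > 0);
        # 'strength' plays the role of LINK_STRENGTH.
        variance = (strength ** 2) * dt_days / 365.0
        return -((r2 - r1) ** 2) / (2.0 * variance)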
On August 31 2011 20:25 Toppp wrote: Seal is 11.... -_-
He hasn't even qualified for Code A or been to any tournaments... (except GSTL if you count that)
Yes, I noticed that too. What happened with Seal is that he played a few games in 2010 and then didn't have any results for almost a year. So when he started back in July 2011, his rating was very uncertain and therefore moved very quickly. And he has done very well in his games since July 2011, going 5-1. I will look into accounting for uncertainty. See below for a more detailed view of his results and how the algorithm reacts.
On August 31 2011 20:02 kenkou wrote: BitByBit at 97. Something is wrong. I'm guessing it doesn't take into account how long the player hasn't played?
Yes, that's about what happened. BitByBit made a deep run in an early GSL and then disappeared from major LANs (GSL, GSTL). If he's still playing, I assume he's not doing so well, but I don't have those results in here. Another way to deal with this would be similar to Seal, by accounting for uncertainty.
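One simple way to account for that, as a sketch: track an uncertainty per player that grows while they are inactive, and rank by a conservative estimate instead of the mean. The growth rate and the factor k here are illustrative assumptions:

    import math

    def conservative_rating(mu, sigma, days_inactive, drift_per_day=1.0, k=2.0):
        # Rank by mean rating minus k standard deviations; the variance
        # grows linearly while a player is inactive (random-walk drift),
        # so long-absent players sink down the list until they play again.
        variance = sigma ** 2 + drift_per_day * days_inactive
        return mu - k * math.sqrt(variance)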
Why?
Well it's just a fun hobby for me. Fellow ratings math nerds understand. ;-)
On September 01 2011 00:55 Montana[TK] wrote: July at 9, DongRaeGu at 17
makes no sense whatsoever whichever way you look at it.
Do you not grasp this concept?
I understand the concept, but a few recent wins against the likes of Hongun and Ensnare shouldn't count more than the months of utter dominance DRG displayed, especially considering he just all-killed Prime and came 3rd in MLG.
All I'm saying is I wanna see how the ranking system came to those conclusions and whether I'm overlooking anything.