|
I'm in the last stages of creating a mod that can be added to any map and will keep track of each player's Elo rating.
For those of you who do not know, Elo is the rating system used in chess and many other competitions (including League of Legends).
This is so that people in clans who already practice against each other in custom games can have a rating to compare themselves by. Simply looking at things like ladder ranking is not an effective way to compare players. This mod was originally a request by one of the best clans from SEA, but I decided to make it more general purpose so it could be used by many different clans/communities. In fact, many people have been requesting that Blizzard's public ladder be an Elo system since the very release of SC2.
How it works:
# Join the game with your opponent.
# You can create a new tournament and provide a password if you wish it to be private.
# Select which tournaments you would like to be rated in.
# Click on "compete" to activate a tournament or "opt out" to deactivate it.
# When either player clicks on "ready" the game will begin.
# Each player's score will be recorded based on win/loss and updated for the next game.
Once it has been set up how you like it in the first game, simply click on the "ready" button each time you play; you don't even have to think about it other than that. It will just keep tracking your Elo rating across all games and all maps with this mod in it, providing a useful and meaningful comparison of players in your clan/friends group/community.
I have included some pics to demonstrate the creation/selection of tournaments at the start of the game:
One player inviting the other to a tournament
Players click on "compete" to enter the tournament
Demonstration of me trying to cheat by editing my bank file to artificially increase my rating in "Global" (notice that it has been erased from my profile) and providing a false password to try and join the "Evil Geniuses" group.
Notice that in the "active tournament" list each tournament has the percentage chance for each player to win calculated by their ratings.
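For readers who want to see where such a percentage can come from, here is a minimal sketch in Python. It assumes the standard logistic Elo expectation with a 400-point scale; the post does not say exactly which formula the mod uses, so treat `expected_score` as an illustration rather than the mod's actual code.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win chance) of player A against player B.

    Assumption: the usual base-10 logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Hypothetical ratings, not taken from the screenshots.
    p = expected_score(1100, 1000)
    print(f"Player A: {p:.1%}  Player B: {1 - p:.1%}")
```

With these made-up numbers a 100-point lead works out to roughly a 64% chance to win, and the two players' chances always sum to 100%.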
NOTE: To demonstrate I just picked some well known names randomly from the EG/TL clans. I don't mean to imply anything about their relative skill levels.
Currently this will not work on screen resolutions < 1280x720. That will be one of the first things I will be fixing.
I am currently looking for people to help test the mod or to just provide general feedback. At the moment there is only 1 map released with this mod, which is "ELO Cloud Kingdom". Just search for Elo on the NA server and it should be listed. Thanks, and I look forward to hearing what you all think.
Thanks for your time. Turtles.
Elo details: In the near future more customization will be possible, but for the moment players start with a rating of 1000, K is set at 15, and there is a minimum limit set on ratings to prevent deflation. I will soon be putting in measures to stop inflation, as well as maybe giving the computer AI a fixed Elo rating for each difficulty level to help stabilize scores.
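As a rough sketch of what one rating update looks like with these settings (in Python, for illustration only): the starting rating of 1000 and K of 15 come from the post, while `RATING_FLOOR` is a made-up placeholder because the actual minimum is not stated. The update rule is the textbook Elo one and is an assumption about the mod.

```python
START_RATING = 1000  # from the post
K_FACTOR = 15        # from the post
RATING_FLOOR = 100   # hypothetical value; the post does not give the real minimum


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(winner: float, loser: float) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one game."""
    # The winner gains K * (actual - expected), where actual is 1 for a win.
    gain = K_FACTOR * (1.0 - expected_score(winner, loser))
    new_winner = winner + gain
    # The loser drops by the same amount but never below the floor.
    new_loser = max(RATING_FLOOR, loser - gain)
    return new_winner, new_loser


if __name__ == "__main__":
    print(update(START_RATING, START_RATING))  # two new players: (1007.5, 992.5)
```

Note that whenever the floor kicks in, the winner still gains points the loser did not lose, so total points in the pool grow. That is one way inflation can creep in under this kind of scheme.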
Anti-cheating measures: "A chain is only as strong as its weakest link." Currently all maps can be cracked, and any method of securing data in a bank file can be reverse engineered (there is nothing anyone but Blizzard can do about this). Players' Elo scores are protected with a one-way hash function so that some joe-schmoe can't simply go and edit them directly. However, I have no doubt that some bright cookie with WWAAAAYYYY too much time on his hands would be able to cheat and give himself a higher rating.
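To give an idea of the general technique, here is a small Python sketch of tamper detection with a keyed one-way hash. It is not the mod's actual scheme (which is not described, and which would be written in the map's own scripting language). The secret, the field layout, and the names `sign` and `load_rating` are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret baked into the map. As noted above, anyone who
# cracks the map could extract it, so this only stops casual editing.
SECRET = b"hypothetical-map-secret"


def sign(player: str, tournament: str, rating: int) -> str:
    """Compute a keyed SHA-256 tag over the stored fields."""
    message = f"{player}|{tournament}|{rating}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def load_rating(entry: dict, player: str, tournament: str):
    """Return the stored rating, or None if the entry fails verification."""
    expected = sign(player, tournament, entry["rating"])
    if hmac.compare_digest(expected, entry["tag"]):
        return entry["rating"]
    return None  # tampered entry: the caller can discard or reset it


if __name__ == "__main__":
    entry = {"rating": 1042, "tag": sign("Turtles", "Global", 1042)}
    tampered = dict(entry, rating=2500)  # rating edited, tag left unchanged
    print(load_rating(entry, "Turtles", "Global"))
    print(load_rating(tampered, "Turtles", "Global"))
```

An edited rating no longer matches its tag, so the entry is rejected, similar in spirit to the erased "Global" rating in the screenshot.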
|
The UI looks good.
Is Elo the best rating system for a game like Starcraft? Systems like TrueSkill or Glicko are maybe better. The main difference is that they use more than one value, so one can also see the estimated confidence (accuracy) in the rating.
(By the way it is Elo, not ELO.)
|
ELO is at least better than the current system. Glicko is ELO with deviation?
|
Is Elo the best rating system for a game like Starcraft? Systems like TrueSkill or Glicko are maybe better.
I'm glad you asked.
I will be putting a check box in the "create new tournament" dialog so you can choose Elo or Glicko (maybe other systems such as TrueSkill). I have had this intention from the beginning, so it will not be hard to do.
The hard bit is getting the dialogs set up so that the two players can interact with the lists of tournaments simultaneously, which I think I have mostly bug-free by now (or at least only small squishy bugs).
Elo is the most widely known algorithm and what was originally requested, so that's how I started it.
Glicko is ELO with deviation?
In a nutshell, yes.
By the way it is Elo, not ELO.
Thanks. Fixed.
|
About the Elo name: The system was devised by Arpad Elo; "Elo" is not an acronym.
I don't know the details of how you store the points. Maybe a solution could be to calculate the points for every implemented system, unless the player opts out.
I am no rating system expert, so the following should be taken with that in mind. If we want to have a meaningful skill value instead of just a number which is somehow intended to reflect skill, we should keep in mind that a win or loss streak maybe doesn't mean that much in Starcraft, yet most rating systems react to one quite sensitively.
The test of rating accuracy could be to convert the rating difference into a winning probability and check how good the prediction is. To reduce the noise, of course, a good number of games has to be considered.
But unless a ranking system (be it Elo or Glicko) allows predictions, I think the rating would be just a mathematical exercise with no application.
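One common way to score predictions like this is the Brier score: the mean squared difference between the predicted win probability and the actual result (1 for a win, 0 for a loss). The Python sketch below assumes the standard Elo expectation formula, and the game records are invented purely for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def brier_score(games) -> float:
    """Mean squared error of the predictions; lower is better.

    games: list of (rating_a, rating_b, a_won) tuples, with ratings
    taken from before each game was played.
    """
    if not games:
        raise ValueError("need at least one game")
    total = 0.0
    for rating_a, rating_b, a_won in games:
        outcome = 1.0 if a_won else 0.0
        total += (expected_score(rating_a, rating_b) - outcome) ** 2
    return total / len(games)


if __name__ == "__main__":
    # Invented sample data, not real results.
    games = [(1100, 1000, True), (1100, 1000, False), (950, 1200, False)]
    print(f"Brier score: {brier_score(games):.3f}")
```

Always predicting 50% gives a Brier score of exactly 0.25, so a rating system is only adding information if it scores below that over a good number of games.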
|
These are all bridges to be crossed when I get to them :D
I want to make sure that the basic system works well before expanding on it and it still needs a little work.
I would like to eventually accumulate data on how accurate the predictions are. Without getting into the tricky details I can think of ways to do this while preventing duplicates of data. But it would be a tricky task.
Elo assumes that a player's performance will be normally distributed, which might not be the case for SC2. If I were to guess, I would think player performance has a gradual slope on the low side and a steep slope on the high side, meaning that a player will usually play within a certain level of ability. On a good day they can get a slight edge and perform a bit better than normal, but if their play is off then they can play really poorly, at a much lower level than their normal play.
If I accumulate data that shows a different pattern then the algorithm could be adjusted. However, I think people trust systems like Elo and Glicko and would prefer them to some custom ranking scheme I created, even if it was more accurate.
|
I fully back the Turtles system.
Honestly, the Turtles system would be the most hilarious system for tracking rank.
"Hey iNcontroL, what's your Turtles rank?" "OMG, Fantasy just got more Turtles than Flash!" "You can't really compare SC2 Turtles to BW Turtles, that just doesn't make sense."
|
Obviously the turtles rank would be a percentage of how good you are compared to others, ranging from 0% to 99.9%, which would be the highest available score.
I would just set it so that I was the only person in the world who was 100% turtles.
j/k
Pretty sure existing ranking systems are plenty "good enough" for the task. They also have the advantage that a lot of people already know how they work, and they have been shown to be useful in other fields.
|
Hm I'm intrigued, gonna try this out tonight.
|
I'm not 100% comfortable with the higher-level theoretical implications of TrueSkill, but one thing about TrueSkill is that it was created to deal with games involving multiple teams, each team with one or more players. That is where the main advantage of TrueSkill lies for Microsoft, and TrueSkill isn't necessarily all that much better for head-to-head, 1v1 games. Further, since you don't have a matchmaking system implemented, the TrueSkill algorithm would be dealing with much larger discrepancies in skill and uncertainty than it is intended to, which I think may cause some wacky results (I'd have to double check this; I haven't looked over the math behind it in a few months).
|