"Deep Blue" for SC2? - Page 12
Forum Index > SC2 General
Account252508
3454 Posts
Caelestis
United States58 Posts
On November 23 2012 04:53 rathe wrote: That's really interesting. How do you know the solution exists, and how could you prove it was optimal in an SC2 context?

Starcraft is an example of an extensive-form game. The only questionable requirement the game might not meet is the well-ordering of players' actions, since both players act at the same time. But as people have figured out, you can simply break time down into a series of discrete time steps, where a player "passes" on a time step if he didn't make a move. Once you have formulated the game in this way, Nash proved that there exists an optimal strategy profile (one strategy per player) such that no rational player can unilaterally deviate from it and benefit. This optimum is what he called the Nash equilibrium. He won the Nobel prize for this proof because of its wide-sweeping consequences in many different types of games.

So what does this strategy look like? It is a rooted tree, where each level of the tree represents either an action taken by a player or a "pass", in which case he does nothing. The nodes represent game states; the edges represent the actions taken. Some nodes belong to you, others belong to your opponent. You know the game state of your own nodes, since you have perfect knowledge of your units and economy. Your opponent's game state is modeled as a probability distribution. At every point in the game where you face a decision based on imperfect information, your next possible states are likewise described by a probability distribution. So the AI makes a decision at every time step by taking a random sample from a probability distribution and executing the action based on that result. (This is why you cannot guarantee a 100% win rate, but you can guarantee the best win rate possible.)

To be more specific without being wrong: an edge in this game tree for SC might be "move a unit to position (x,y)" or "use X unit to build Y building at location Z".
After you do this, the edges coming out of this node are your opponent's actions (remember we discretized time precisely to allow this). He will either pass or not. Assuming he does not pass, his actions are unknown to you (if not scouted), so you enumerate all possible actions and assign them probabilities. From there, it is your turn again, and you expand the game tree like this until you reach a final node where you or your opponent wins. Given this complete tree, you can calculate the benefit of taking each action and take the action with the largest benefit for you.

A note about exploitation: the Nash strategy is immune to exploitation because no opposing strategy can gain against it in expectation. However, it is possible to "win by a larger margin" than the Nash strategy if you recognize the opportunity to exploit a specific opponent. When you do this, you open yourself up to exploitation because you deviate from the Nash strategy. But this is done routinely and to great effect, such as in poker, where winning isn't what counts, but rather how much you win by. To exploit someone, you simply input an a priori probability distribution rather than sampling from a uniform one. At a high level, this could mean "even though my opponent has 10 strategies he can use, I predict he will use strategy 1 more than strategy 2", or something like that. This is the precise mechanism known to many as "metagaming". As you can see, metagaming might help you exploit a particular strategy harder, but it also opens you up to exploitation. If people could actually figure out the Nash strategy for Starcraft, no one would metagame, since only winning and losing matter, not by how much.
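The backward-induction idea described above can be sketched in a few lines. This is a toy model, not anything resembling a real SC2 bot: node kinds, payoffs, and the 70/30 distributions below are all made up for illustration.

```python
# Backward induction over a tiny imperfect-information game tree.
# "ours" nodes are our decisions (take the best child); "chance" nodes
# model the opponent's unscouted action as a probability distribution.

def value(node):
    """Return the expected payoff of a node under optimal play by us."""
    kind = node["kind"]
    if kind == "terminal":
        return node["payoff"]
    if kind == "ours":
        # We pick the action (edge) with the highest expected payoff.
        return max(value(child) for child in node["children"])
    if kind == "chance":
        # Opponent's unknown action: weight each outcome by its probability.
        return sum(p * value(child) for p, child in node["children"])
    raise ValueError(kind)

# Toy example: we choose between "attack" and "expand"; the opponent's
# unknown response is a 70/30 distribution in each case.
tree = {
    "kind": "ours",
    "children": [
        {"kind": "chance", "children": [        # attack
            (0.7, {"kind": "terminal", "payoff": 1.0}),
            (0.3, {"kind": "terminal", "payoff": -1.0}),
        ]},
        {"kind": "chance", "children": [        # expand
            (0.7, {"kind": "terminal", "payoff": 0.5}),
            (0.3, {"kind": "terminal", "payoff": 0.5}),
        ]},
    ],
}
# Expanding's guaranteed 0.5 beats attacking's expected 0.7 - 0.3 = 0.4.
print(value(tree))
```

The real obstacle, as the posts below get into, is that the genuine SC2 tree is astronomically larger than this two-move toy.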
CruelZeratul
Germany4588 Posts
Caelestis
United States58 Posts
On November 23 2012 05:21 monkybone wrote: Theoretically it's not actually certain that a solution exists. It's possible to conceive of a game where any person who attacks necessarily loses, and damage can be dealt and healed in order to deny a draw.

That is not an extensive-form game, as there are no terminal nodes. Starcraft also doesn't work like that game, as it must be possible for a player to win. So yes, in the hypothetical game you described, there is no optimal solution since no one can win. But in all finite extensive-form games, including Starcraft, there *provably* exists an optimal solution. For the exact details of the proof, you should read Nash's equilibrium paper or at least the wiki.
Morphs
Netherlands645 Posts
Account252508
3454 Posts
Account252508
3454 Posts
Singularity
Sweden142 Posts
Caelestis
United States58 Posts
On November 23 2012 06:49 monkybone wrote: Considering you have a finite number of options in every unit of time, isn't it obvious that an optimal strategy exists?

No, it's not obvious. In fact, in Chess, if there were no special rule saying that a repeated series of positions forces a draw, an optimal strategy wouldn't exist. There are still only a finite number of moves you can make with each piece at each time step, but it is that special rule that lets Chess satisfy the requirements of a finite extensive-form game. Other examples of games with no optimal solution exist, since extensive-form games are not the only class of games. The key is that a finite number of actions must progress the game to an end state, and that a player must somehow be forced to keep making such moves until the game ends.
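To see how a repetition rule turns a potentially endless game into a finite one, here is a minimal sketch. The position names and the threshold of three are placeholders standing in for chess's actual repetition rule:

```python
from collections import Counter
import itertools

def play_until_repetition_draw(positions, limit=3):
    """Walk a (possibly infinite) stream of positions; declare a draw
    once any position has occurred `limit` times. With finitely many
    reachable positions, this rule guarantees the game terminates."""
    seen = Counter()
    for ply, pos in enumerate(positions, start=1):
        seen[pos] += 1
        if seen[pos] >= limit:
            return ("draw", ply)
    return ("ended", len(seen))

# Two players shuffling pieces back and forth forever: "posA" recurs at
# plies 1, 3, 5, so the rule forces a draw at ply 5 instead of looping.
endless = itertools.cycle(["posA", "posB"])
print(play_until_repetition_draw(endless))
```

Without the `limit` check, the loop over `endless` would never return, which mirrors the point above: the special rule is what supplies the terminal nodes.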
Caelestis
United States58 Posts
On November 23 2012 06:52 monkybone wrote: By the way, is it within the rules to do several things in the same increment of time? Like building a marine while moving a marine in the same instant. That's not possible (even theoretically) to do with a mouse and keyboard, so stuff like automaton micro might just not be possible, on grand scales at least.

I'm not exactly sure how the transformation works, but it's something like this: take a game where two players just make actions at arbitrary times. These actions, at some fine enough granularity, become ordered. Now view Starcraft as a turn-based game, where the turn switches between the players at each time interval. A player's play can then be viewed as a series of actions and passes.

I have no idea how Automaton works, but at least from the tank-dodging demo, I know it would be impossible for an AI to accomplish that feat in a real game, namely because, in general, you should have no idea which unit the enemy tank chooses to target (and instantly hits). You can definitely preempt projectiles, though, and I'm sure the optimal bot does lots of that, as even good players do (for example in blink vs. roach micro). =)
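The discretization transformation described above can be sketched concretely. Everything here (the time step, the action names, the "first action in the window wins" rule) is an illustrative assumption, not how the game engine actually works:

```python
def discretize(actions_p1, actions_p2, dt, horizon):
    """Turn two streams of (timestamp, action) pairs into an alternating
    turn sequence in which a player with no action inside a time step
    'passes'. Assumes at most one action per player per step."""
    turns = []
    steps = int(horizon / dt)
    for step in range(steps):
        t0, t1 = step * dt, (step + 1) * dt
        for name, stream in (("P1", actions_p1), ("P2", actions_p2)):
            acted = [a for (t, a) in stream if t0 <= t < t1]
            turns.append((name, acted[0] if acted else "pass"))
    return turns

# Two players acting at arbitrary real times become an ordered,
# turn-based sequence of actions and passes:
moves = discretize([(0.02, "build scv")], [(0.11, "move zergling")],
                   dt=0.1, horizon=0.2)
print(moves)
```

Once the game is in this alternating form, the standard extensive-form machinery (and the well-ordering requirement mentioned earlier in the thread) applies directly.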
Account252508
3454 Posts
Rassy
Netherlands2308 Posts
That picture doesn't really say everything, nor is it correct, though it's a nice picture to give some idea. The reason why computers can beat human players at Starcraft is that they have better mechanics (the marines avoiding the banelings is purely mechanical; the human knows he needs to split all his marines, he is just mechanically not able to do it), not that they have a better strategy! It only proves that the mechanics in SC are too hard for humans; it says little about the true depth of SC. SC will never be solved, as it is way more complex than Go, for example (in terms of calculations). That doesn't mean computers won't beat humans; they will, but mostly because of better mechanics, not strategy.
Matt_D_
Australia14 Posts
On November 23 2012 06:49 monkybone wrote: Considering you have a finite number of options in every unit of time, isn't it obvious that an optimal strategy exists?

Yes, an optimal strategy exists. There's a catch, however. This class of problems is known as NP (http://en.wikipedia.org/wiki/NP_(complexity)). There's no way of simplifying the problem; the only way to actually work out the "optimal" solution is to look at every possible solution and then work out which is the best. This is, however, not something we can call "timely" (see: postman problem, http://en.wikipedia.org/wiki/Postman_problem). You could pre-process every possible solution for every possible moment of time for every possible combination, but it should be obvious by now that this would be such a large data set that we'd be unable to store it even if we used all the atoms in the universe.

This is why you'll find that most AIs (and humans) use specific "break points" in their strategic thinking. We take our experiences (or our pre-processing) and create points where things commonly diverge. Think "oh, you're making reapers, so I'll leave some units near my mineral line", or "you're making brood lords, so I'll start viking production". Note that those are very, very high-level "concepts". We can call them goals.

Whether we can actually act on those goals is another problem entirely. Take, for example, "hey, move some units to my mineral line to protect against reapers". Well, do you actually HAVE units? If you don't, can you afford to build them? Do you need to build structures to build units? How does this affect the rest of your strategy? (This implies a feedback loop between goals and the mechanics of achieving goals, trickier than it sounds.) You, being human, do this on the fly, all the time. It's part of what being human is. We're fantastic at pattern matching, at historically evaluating experiences.
For computers, though (and really, computers are just really, really fast calculators), this is a pretty awkward thing to implement. A lot of what we do as humans is "fuzzy" to avoid the complexity of solving the actual problem (see cognitive biases (http://en.wikipedia.org/wiki/Cognitive_bias) and logical fallacies (http://en.wikipedia.org/wiki/Logical_fallacy)). We also get a lot of stuff utterly wrong because of this. Being aware of what your brain is doing is utterly fascinating. Anyway, I doubt this was what you were expecting as a response. AI is fun. I'd love to have a crack at writing some actual SC2 AI, but it's far, far harder than most people think.
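The "break points" plus "can I actually act on the goal?" feedback loop might look something like the following toy sketch. Every name in it (the scouting triggers, the marine/barracks prerequisites, the mineral cost) is a made-up stand-in; a real bot would track far more state:

```python
# Map scouted enemy tech to a high-level goal ("break point" reactions).
RULES = {
    "reapers": "defend mineral line",
    "brood lords": "start viking production",
}

def plan(scouted, state):
    """Return an ordered list of steps: sub-goals needed to satisfy the
    goal's prerequisites, then the goal itself. This is the feedback
    loop between goals and the mechanics of achieving them."""
    goal = RULES.get(scouted)
    if goal is None:
        return ["keep executing standard build"]
    steps = []
    if state["army"] == 0:             # no units to act with?
        if state["barracks"] == 0:     # can't even train any?
            steps.append("build barracks")
        if state["minerals"] < 50:     # can't afford one?
            steps.append("mine until 50 minerals")
        steps.append("train marine")
    steps.append(goal)
    return steps

print(plan("reapers", {"army": 0, "barracks": 0, "minerals": 10}))
print(plan("brood lords", {"army": 5, "barracks": 1, "minerals": 400}))
```

The point of the sketch is the shape, not the content: reacting at a handful of recognizable divergence points is vastly cheaper than searching the full game tree.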
GoldenH
1115 Posts
On November 23 2012 06:58 Caelestis wrote: No, it's not obvious. In fact, in Chess, if there were no special rule saying that a repeated series of positions forces a draw, an optimal strategy wouldn't exist. There are still only a finite number of moves you can make with each piece at each time step, but it is that special rule that lets Chess satisfy the requirements of a finite extensive-form game. Other examples of games with no optimal solution exist, since extensive-form games are not the only class of games. The key is that a finite number of actions must progress the game to an end state, and that a player must somehow be forced to keep making such moves until the game ends.

Such a rule does exist in Starcraft 2, though: after you have mined all the minerals on the map, the game triggers stalemate detection. The rules for drawing in Starcraft 2 and Chess are equally ridiculous, so I don't see a problem here.
infinity21
Canada6683 Posts
If there is an AI that has the decision-making skills of professional players, the micro difference alone will be massive enough to ensure that the AI wins almost every game.
Dekoth
United States527 Posts
On November 11 2012 13:09 TheGreenMachine wrote: this is automaton 2000 using zerglings to micro against siege tanks, it's really awesome

The problem is this isn't really micro. While it certainly is cool, and it was neat when it first came out, the AI has a distinctly unfair advantage here: it knows before the shot is fired which individual ling is targeted, and all units move away from it. A human player has no possible means of knowing which individual unit is targeted by the tank until the shot actually hits. So while neat to watch, it is extremely misleading. I wouldn't call this perfect micro, or even micro; it is just units instantly fleeing from the unit that is targeted.
Caelestis
United States58 Posts
On November 23 2012 09:10 Matt_D_ wrote: yes, an optimal strategy exists. there's a catch however. this class of problems is known as NP (http://en.wikipedia.org/wiki/NP_(complexity))

Finding the optimal strategy is not in NP; it is EXPTIME-complete. A polynomial-time verifier cannot check a candidate solution, because the solution is not even polynomial in size. I explained in a previous post why the size of the solution is exponential.
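A quick back-of-the-envelope calculation shows why the solution object itself is exponential. The branching factor and depth below are invented for illustration; the real numbers for SC2 would be far larger:

```python
def tree_leaves(branching, depth):
    """Leaves in a uniform game tree: branching ** depth."""
    return branching ** depth

# Even a drastically simplified abstraction (10 possible actions per
# turn, 300 turns) yields a strategy tree whose leaf count dwarfs the
# roughly 10**80 atoms in the observable universe.
leaves = tree_leaves(10, 300)
print(leaves > 10 ** 80)  # True
```

Since a full Nash strategy must specify a response at every node of such a tree, no polynomial-time verifier could even read it, which is the distinction between NP and EXPTIME being made above.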
Account252508
3454 Posts
Caelestis
United States58 Posts
On November 23 2012 11:25 monkybone wrote: Maybe a layman explanation of why polynomial time is so significantly preferable to exponential time would benefit the thread?

It's not very consequential either way. I just wanted to correct a factual mistake for those who do understand complexity theory. People know that this solution exists somewhere out there, but it is going to take MASSIVE computing power to find it. Whether it is in NP or EXPTIME is just a matter of how many more billions of years it will take using today's technology. =)
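For the layman explanation asked for above, here is a rough numerical comparison (the "billion operations per second" machine is a stand-in figure, not a benchmark of any real hardware):

```python
# Polynomial vs. exponential running time at a billion operations per
# second, for the same modest input size n = 100.
ops_per_second = 10 ** 9
n = 100

poly_seconds = n ** 3 / ops_per_second   # n^3 work: one millisecond
expo_seconds = 2 ** n / ops_per_second   # 2^n work: ~10**21 seconds

print(poly_seconds)   # 0.001
# The universe is on the order of 4 * 10**17 seconds old; 2**n on
# n = 100 takes over a thousand universe-ages on this machine.
print(expo_seconds > 1000 * 4 * 10 ** 17)
```

This is the practical gap the thread keeps circling: "in NP" vs. "EXPTIME-complete" is a distinction between different flavors of infeasible once inputs are Starcraft-sized.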
Incanus
Canada695 Posts
On November 23 2012 07:08 Caelestis wrote: I have no idea how Automaton works, but at least from the tank-dodging demo, I know it would be impossible for an AI to accomplish that feat in a real game, namely because, in general, you should have no idea which unit the enemy tank chooses to target (and instantly hits). You can definitely preempt projectiles, though, and I'm sure the optimal bot does lots of that, as even good players do (for example in blink vs. roach micro). =)

As long as there is no randomness involved, which I'm pretty sure there isn't, it's possible to simulate the targeting algorithm tanks use. Since tanks are stationary well ahead of time and you're in control of the other variables (the movement of the lings), you have all the information you need to dodge tank shots. Unless, of course, they're doing manual targeting.

On November 23 2012 10:09 Dekoth wrote: The problem is this isn't really micro. While it certainly is cool, and it was neat when it first came out, the AI has a distinctly unfair advantage here: it knows before the shot is fired which individual ling is targeted, and all units move away from it. A human player has no possible means of knowing which individual unit is targeted by the tank until the shot actually hits. So while neat to watch, it is extremely misleading. I wouldn't call this perfect micro, or even micro; it is just units instantly fleeing from the unit that is targeted.

The same could be said of human players. You have access to all the variables you need. Although it would be practically impossible to calculate fast enough for sure (not to mention execute), human players could do what they always do: make imperfect guesses.
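Incanus's point in code form: if the tank's target choice is a pure function of visible state, a bot can run the same function and move the predicted victim before the shot lands. "Closest unit first" is an assumed rule for illustration; the actual SC2 targeting logic may well differ:

```python
def predict_target(tank_pos, units):
    """Deterministic stand-in for the game's targeting algorithm:
    pick the unit nearest the tank. units: list of (name, (x, y))."""
    def dist2(u):
        return (u[1][0] - tank_pos[0]) ** 2 + (u[1][1] - tank_pos[1]) ** 2
    return min(units, key=dist2)[0]

def dodge(tank_pos, units, step=2):
    """Simulate the targeting rule ourselves, then move only the
    predicted victim directly away from the tank before the shot."""
    victim = predict_target(tank_pos, units)
    moved = []
    for name, (x, y) in units:
        if name == victim:
            x += step if x >= tank_pos[0] else -step
        moved.append((name, (x, y)))
    return victim, moved

units = [("ling1", (3, 0)), ("ling2", (6, 0))]
victim, after = dodge((0, 0), units)
print(victim, after)  # ling1 is closest, so it retreats from (3,0) to (5,0)
```

If the real targeting involves any hidden randomness (or manual targeting by a human), `predict_target` no longer matches the engine, which is exactly the caveat raised in the post above.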