Iterated Prisoner's Trilemma
up vote
18
down vote
favorite
CHALLENGE STATUS: OPEN
Comment, open a PR, or otherwise yell at me if I'm missing your bot.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round()
is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- You may submit up to five bots.
- Yes, you can implement handshake.
- Any response other than
C
,N
, orD
will be silently taken asN
. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 11:15 EST
20 bots, 400 games.
name | score/bot/round
----------------|----------------
PatternFinder | 6.408
EvaluaterBot | 5.807
Ensemble | 5.463
Nash2 | 5.400
LastOptimalBot | 5.283
FastGrudger | 5.236
Number6 | 5.180
WeightedAverage | 5.114
HistoricAverage | 5.049
OldTitForTat | 4.849
TitForTat | 4.641
Nash | 4.238
AllD | 4.092
Jade | 4.064
CopyCat | 4.008
RandomBot | 3.975
TatForTit | 3.715
AllC | 3.456
NeverCOOP | 3.241
AllN | 3.112
king-of-the-hill python
|
show 8 more comments
up vote
18
down vote
favorite
CHALLENGE STATUS: OPEN
Comment, open a PR, or otherwise yell at me if I'm missing your bot.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round()
is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- You may submit up to five bots.
- Yes, you can implement handshake.
- Any response other than
C
,N
, orD
will be silently taken asN
. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 11:15 EST
20 bots, 400 games.
name | score/bot/round
----------------|----------------
PatternFinder | 6.408
EvaluaterBot | 5.807
Ensemble | 5.463
Nash2 | 5.400
LastOptimalBot | 5.283
FastGrudger | 5.236
Number6 | 5.180
WeightedAverage | 5.114
HistoricAverage | 5.049
OldTitForTat | 4.849
TitForTat | 4.641
Nash | 4.238
AllD | 4.092
Jade | 4.064
CopyCat | 4.008
RandomBot | 3.975
TatForTit | 3.715
AllC | 3.456
NeverCOOP | 3.241
AllN | 3.112
king-of-the-hill python
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
2 days ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
2 days ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
2 days ago
2
If you wrap most of main.py inwhile len(botlist) > 1:
withbotlist.remove(lowest_scoring_bot)
at the bottom of the loop, you get an elimination tournament with interesting results.
– Sparr
2 days ago
1
Another version of this someday might pass the entire interaction history rather than just the last move. It doesn't change much although it simplifies user code slightly. But it would allow for extensions, such as noisy communication channels that clarify over time: "Really, a D, even though I've said C four times in a row? No, I didn't say D; what do you take me for? Oh, sorry, can we just forget that round?"
– Scott Sauyet
2 days ago
|
show 8 more comments
up vote
18
down vote
favorite
up vote
18
down vote
favorite
CHALLENGE STATUS: OPEN
Comment, open a PR, or otherwise yell at me if I'm missing your bot.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round()
is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- You may submit up to five bots.
- Yes, you can implement handshake.
- Any response other than
C
,N
, orD
will be silently taken asN
. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 11:15 EST
20 bots, 400 games.
name | score/bot/round
----------------|----------------
PatternFinder | 6.408
EvaluaterBot | 5.807
Ensemble | 5.463
Nash2 | 5.400
LastOptimalBot | 5.283
FastGrudger | 5.236
Number6 | 5.180
WeightedAverage | 5.114
HistoricAverage | 5.049
OldTitForTat | 4.849
TitForTat | 4.641
Nash | 4.238
AllD | 4.092
Jade | 4.064
CopyCat | 4.008
RandomBot | 3.975
TatForTit | 3.715
AllC | 3.456
NeverCOOP | 3.241
AllN | 3.112
king-of-the-hill python
CHALLENGE STATUS: OPEN
Comment, open a PR, or otherwise yell at me if I'm missing your bot.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round()
is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- You may submit up to five bots.
- Yes, you can implement handshake.
- Any response other than
C
,N
, orD
will be silently taken asN
. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 11:15 EST
20 bots, 400 games.
name | score/bot/round
----------------|----------------
PatternFinder | 6.408
EvaluaterBot | 5.807
Ensemble | 5.463
Nash2 | 5.400
LastOptimalBot | 5.283
FastGrudger | 5.236
Number6 | 5.180
WeightedAverage | 5.114
HistoricAverage | 5.049
OldTitForTat | 4.849
TitForTat | 4.641
Nash | 4.238
AllD | 4.092
Jade | 4.064
CopyCat | 4.008
RandomBot | 3.975
TatForTit | 3.715
AllC | 3.456
NeverCOOP | 3.241
AllN | 3.112
king-of-the-hill python
king-of-the-hill python
edited yesterday
asked 2 days ago
Blacksilver
425314
425314
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
2 days ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
2 days ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
2 days ago
2
If you wrap most of main.py inwhile len(botlist) > 1:
withbotlist.remove(lowest_scoring_bot)
at the bottom of the loop, you get an elimination tournament with interesting results.
– Sparr
2 days ago
1
Another version of this someday might pass the entire interaction history rather than just the last move. It doesn't change much although it simplifies user code slightly. But it would allow for extensions, such as noisy communication channels that clarify over time: "Really, a D, even though I've said C four times in a row? No, I didn't say D; what do you take me for? Oh, sorry, can we just forget that round?"
– Scott Sauyet
2 days ago
|
show 8 more comments
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
2 days ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
2 days ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
2 days ago
2
If you wrap most of main.py inwhile len(botlist) > 1:
withbotlist.remove(lowest_scoring_bot)
at the bottom of the loop, you get an elimination tournament with interesting results.
– Sparr
2 days ago
1
Another version of this someday might pass the entire interaction history rather than just the last move. It doesn't change much although it simplifies user code slightly. But it would allow for extensions, such as noisy communication channels that clarify over time: "Really, a D, even though I've said C four times in a row? No, I didn't say D; what do you take me for? Oh, sorry, can we just forget that round?"
– Scott Sauyet
2 days ago
1
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
2 days ago
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
2 days ago
1
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
2 days ago
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
2 days ago
1
1
Done. This was on the sandbox for like a month!
– Blacksilver
2 days ago
Done. This was on the sandbox for like a month!
– Blacksilver
2 days ago
2
2
If you wrap most of main.py in
while len(botlist) > 1:
with botlist.remove(lowest_scoring_bot)
at the bottom of the loop, you get an elimination tournament with interesting results.– Sparr
2 days ago
If you wrap most of main.py in
while len(botlist) > 1:
with botlist.remove(lowest_scoring_bot)
at the bottom of the loop, you get an elimination tournament with interesting results.– Sparr
2 days ago
1
1
Another version of this someday might pass the entire interaction history rather than just the last move. It doesn't change much although it simplifies user code slightly. But it would allow for extensions, such as noisy communication channels that clarify over time: "Really, a D, even though I've said C four times in a row? No, I didn't say D; what do you take me for? Oh, sorry, can we just forget that round?"
– Scott Sauyet
2 days ago
Another version of this someday might pass the entire interaction history rather than just the last move. It doesn't change much although it simplifies user code slightly. But it would allow for extensions, such as noisy communication channels that clarify over time: "Really, a D, even though I've said C four times in a row? No, I didn't say D; what do you take me for? Oh, sorry, can we just forget that round?"
– Scott Sauyet
2 days ago
|
show 8 more comments
17 Answers
17
active
oldest
votes
up vote
9
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
Yep, beats everything.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
add a comment |
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
add a comment |
up vote
6
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Little shorter:class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
add a comment |
up vote
5
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
2
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
1
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
add a comment |
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
add a comment |
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
add a comment |
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
What's the try/except for?
– Blacksilver
2 days ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):
in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
2 days ago
ahh, I see.None in "ND"
errors.
– Blacksilver
2 days ago
Becauseif last and last in "ND":
was too complicated?
– immibis
2 days ago
add a comment |
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
oof. Maybe T4T would do better asreturn last
.
– Blacksilver
2 days ago
I'd like that! If TitForTat werereturn last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
2 days ago
return last
would be a better T4T for this challenge, I think
– Sparr
2 days ago
Just tried -- theif(last): return last; else: return "C"
is worse.
– Blacksilver
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
add a comment |
up vote
3
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
add a comment |
up vote
2
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
add a comment |
up vote
2
down vote
Improved Dirichlet Dice
import random
class DirichletDice2:
def __init__(self):
self.alpha = dict(
C = {'C' : 1, 'N' : 1, 'D' : 1},
N = {'C' : 1, 'N' : 1, 'D' : 1},
D = {'C' : 1, 'N' : 1, 'D' : 1}
)
self.myLast = [None, None]
self.payoff = dict(
C = { "C": 0, "N": 3, "D": -5 },
N = { "C": -3, "N": 0, "D": 1 },
D = { "C": 5, "N": -1, "D": 0 }
)
def DirichletDraw(self, key):
alpha = self.alpha[key].values()
mu = [random.gammavariate(a,1) for a in alpha]
mu = [m / sum(mu) for m in mu]
return mu
def ExpectedPayoff(self, probs):
expectedPayoff = {}
for val in ['C','N','D']:
payoff = sum([p * v for p,v in zip(probs, self.payoff[val].values())])
expectedPayoff[val] = payoff
return expectedPayoff
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#draw probs for my opponent's roll from Dirichlet distribution and then return the optimal response
mu = self.DirichletDraw(self.myLast[0])
expectedPayoff = self.ExpectedPayoff(mu)
res = max(expectedPayoff, key=expectedPayoff.get)
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
This is an improved version of Dirichlet Dice. Instead of taking the expected multinomial distribution from the Dirichlet distribution, it draws a Multinomial distribution randomly from the Dirichlet distribution. Then, instead of drawing from the Multinomial and giving the optimal response to that, it gives the optimal expected response to the given Multinomial using the points. So the randomness has essentially been shifted from the Multinomial draw to the Dirichlet draw. Also, the priors are more flat now, to encourage exploration.
It's "improved" because it now accounts for the points system by giving the best expected value against the probabilities, while maintaining its randomness by drawing the probabilities themselves. Before, I tried simply doing the best expected payoff from the expected probabilities, but that did badly because it just got stuck, and didn't explore enough to update its dice. Also it was more predictable and exploitable.
Original submission:
Dirichlet Dice
import random
class DirichletDice:
def __init__(self):
self.alpha = dict(
C = {'C' : 2, 'N' : 3, 'D' : 1},
N = {'C' : 1, 'N' : 2, 'D' : 3},
D = {'C' : 3, 'N' : 1, 'D' : 2}
)
self.Response = {'C' : 'D', 'N' : 'C', 'D' : 'N'}
self.myLast = [None, None]
#expected value of the dirichlet distribution given by Alpha
def MultinomialDraw(self, key):
alpha = list(self.alpha[key].values())
probs = [x / sum(alpha) for x in alpha]
outcome = random.choices(['C','N','D'], weights=probs)[0]
return outcome
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#predict opponent's move based on my last move
predict = self.MultinomialDraw(self.myLast[0])
res = self.Response[predict]
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
Basically I'm assuming that the opponent's response to my last output is a multinomial variable (weighted dice), one for each of my outputs, so there's a dice for "C", one for "N", and one for "D". So if my last roll was, for example, a "N" then I roll the "N-dice" to guess what their response would be to my "N". I begin with a Dirichlet prior that assumes that my opponent is somewhat "smart" (more likely to play the one with the best payoff against my last roll, least likely to play the one with the worst payoff). I generate the "expected" Multinomial distribution from the appropriate Dirichlet prior (this is the expected value of the probability distribution over their dice weights). I roll the weighted dice of my last output, and respond with the one with the best payoff against that dice outcome.
Starting in the third round, I do a Bayesian update of the appropriate Dirichlet prior of my opponent's last response to the thing I played two rounds ago. I'm trying to iteratively learn their dice weightings.
I could have also simply picked the response with the best "expected" outcome once generating the dice, instead of simply rolling the dice and responding to the outcome. However, I wanted to keep the randomness in, so that my bot is less vulnerable to the ones that try to predict a pattern.
New contributor
add a comment |
up vote
2
down vote
Kevin
class Kevin:
def round(self, last):
return {"C":"N","N":"D","D":"C",None:"N"} [last]
Picks the worst choice. The worst bot made.
Useless
import random
class Useless:
def __init__(self):
self.lastLast = None
def round(self, last):
tempLastLast = self.lastLast
self.lastLast = last
if(last == "D" and tempLastLast == "N"):
return "C"
if(last == "D" and tempLastLast == "C"):
return "N"
if(last == "N" and tempLastLast == "D"):
return "C"
if(last == "N" and tempLastLast == "C"):
return "D"
if(last == "C" and tempLastLast == "D"):
return "N"
if(last == "C" and tempLastLast == "N"):
return "D"
return random.choice("CND")
It looks at the last two moves done by the opponent and picks the most not done else it picks something random. There is probably a better way of doing this.
New contributor
add a comment |
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
add a comment |
up vote
1
down vote
Tetragram
import itertools
class Tetragram:
def __init__(self):
self.history = {x: ['C'] for x in itertools.product('CND', repeat=4)}
self.theirs =
self.previous = None
def round(self, last):
if self.previous is not None and len(self.previous) == 4:
self.history[self.previous].append(last)
if last is not None:
self.theirs = (self.theirs + [last])[-3:]
if self.previous is not None and len(self.previous) == 4:
expected = random.choice(self.history[self.previous])
if expected == 'C':
choice = 'C'
elif expected == 'N':
choice = 'C'
else:
choice = 'N'
else:
choice = 'C'
self.previous = tuple(self.theirs + [choice])
return choice
Try to find a pattern in the opponent's moves, assuming they're also watching our last move.
add a comment |
up vote
1
down vote
Handshake
class HandshakeBot:
def __init__(self):
self.handshake_length = 4
self.handshake = ["N","N","C","D"]
while len(self.handshake) < self.handshake_length:
self.handshake *= 2
self.handshake = self.handshake[:self.handshake_length]
self.opp_hand =
self.friendly = None
def round(self, last):
if last:
if self.friendly == None:
# still trying to handshake
self.opp_hand.append(last)
if self.opp_hand[-1] != self.handshake[len(self.opp_hand)-1]:
self.friendly = False
return "D"
if len(self.opp_hand) == len(self.handshake):
self.friendly = True
return "C"
return self.handshake[len(self.opp_hand)]
elif self.friendly == True:
# successful handshake and continued cooperation
if last == "C":
return "C"
self.friendly = False
return "D"
else:
# failed handshake or abandoned cooperation
return "N" if last == "D" else ("D" if last == "C" else "C")
return self.handshake[0]
Recognizes when it's playing against itself, then cooperates. Otherwise mimics LastOptimalBot which seems like the best one-line strategy. Performs worse than LastOptimalBot, by an amount inversely proportional to the number of rounds. Obviously would do better if there were more copies of it in the field *cough**wink*.
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
add a comment |
up vote
1
down vote
ShiftingOptimalBot
class ShiftingOptimalBot:
def __init__(self):
# wins, draws, losses
self.history = [0,0,0]
self.lastMove = None
self.state = 0
def round(self, last):
if last == None:
self.lastMove = "C"
return self.lastMove
if last == self.lastMove:
self.history[1] += 1
elif (last == "C" and self.lastMove == "D") or (last == "D" and self.lastMove == "N") or (last == "N" and self.lastMove == "C"):
self.history[0] += 1
else:
self.history[2] += 1
if self.history[0] + 1 < self.history[2] or self.history[2] > 5:
self.state = (self.state + 1) % 3
self.history = [0,0,0]
if self.history[1] > self.history[0] + self.history[2] + 2:
self.state = (self.state + 2) % 3
self.history = [0,0,0]
if self.state == 0:
self.lastMove = "N" if last == "D" else ("D" if last == "C" else "C")
elif self.state == 1:
self.lastMove = last
else:
self.lastMove = "C" if last == "D" else ("N" if last == "C" else "D")
return self.lastMove
This bot uses LastOptimalBot's algorithm as long as it's winning. If the other bot starts predicting it, however, it will start playing whichever move its opponent played last (which is the move that beats the move that would beat LastOptimalBot). It cycles through simple transpositions of those algorithms as long as it continues to lose (or when it gets bored by drawing a lot).
Honestly, I'm surprised that LastOptimalBot is sitting in 5th as I post this. I'm fairly certain this will do better, assuming that I wrote this python correctly.
add a comment |
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.payoffsum = {"C":0, "N":0, "D":0}
def round(this, last):
if(last != None):
for x in this.payoffsum:
this.payoffsum[x] += HistoricAverage.PAYOFFS[last][x]
return max(this.payoffsum, key=this.payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
add a comment |
17 Answers
17
active
oldest
votes
17 Answers
17
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
9
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
Yep, beats everything.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
add a comment |
up vote
9
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
Yep, beats everything.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
add a comment |
up vote
9
down vote
up vote
9
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
edited 2 days ago
answered 2 days ago
Black Owl Kai
5217
5217
Yep, beats everything.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
add a comment |
Yep, beats everything.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
Yep, beats everything.
– Blacksilver
2 days ago
Yep, beats everything.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
2 days ago
add a comment |
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
add a comment |
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
add a comment |
up vote
7
down vote
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
edited 2 days ago
answered 2 days ago
Sparr
5,0081633
5,0081633
add a comment |
add a comment |
up vote
6
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Little shorter:class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
add a comment |
up vote
6
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Little shorter:class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
add a comment |
up vote
6
down vote
up vote
6
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
edited 2 days ago
Blacksilver
425314
425314
New contributor
answered 2 days ago
Omer Faruk Yatkin
614
614
New contributor
New contributor
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Little shorter:class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
add a comment |
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Little shorter:class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
1
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Thank you fixed it
– Omer Faruk Yatkin
2 days ago
Little shorter:
class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
Little shorter:
class NashEquilibrium: def round(self, _): a = random.random() for k, v in [(0.2, "C"), (0.6, "N"), (1, "D")]: if a <= k: return v
– Robert Grant
yesterday
add a comment |
up vote
5
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
2
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
1
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
add a comment |
up vote
5
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
2
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
1
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
add a comment |
up vote
5
down vote
up vote
5
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
edited 2 days ago
answered 2 days ago
Sparr
5,0081633
5,0081633
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
2
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
1
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
add a comment |
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
2
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
1
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
2 days ago
2
2
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
2 days ago
1
1
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
Maybe using a highly composite number (i.e., 12) would score better?
– Blacksilver
2 days ago
add a comment |
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
add a comment |
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
add a comment |
up vote
4
down vote
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
answered 2 days ago
Mnemonic
4,6351630
4,6351630
add a comment |
add a comment |
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
add a comment |
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
add a comment |
up vote
4
down vote
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
answered 2 days ago
Joshua
2,382918
2,382918
add a comment |
add a comment |
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
What's the try/except for?
– Blacksilver
2 days ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):
in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
2 days ago
ahh, I see.None in "ND"
errors.
– Blacksilver
2 days ago
Becauseif last and last in "ND":
was too complicated?
– immibis
2 days ago
add a comment |
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
What's the try/except for?
– Blacksilver
2 days ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):
in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
2 days ago
ahh, I see.None in "ND"
errors.
– Blacksilver
2 days ago
Becauseif last and last in "ND":
was too complicated?
– immibis
2 days ago
add a comment |
up vote
3
down vote
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
New contributor
answered 2 days ago
glietz
716
716
New contributor
New contributor
What's the try/except for?
– Blacksilver
2 days ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):
in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
2 days ago
ahh, I see.None in "ND"
errors.
– Blacksilver
2 days ago
Becauseif last and last in "ND":
was too complicated?
– immibis
2 days ago
add a comment |
What's the try/except for?
– Blacksilver
2 days ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):
in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
2 days ago
ahh, I see.None in "ND"
errors.
– Blacksilver
2 days ago
Becauseif last and last in "ND":
was too complicated?
– immibis
2 days ago
What's the try/except for?
– Blacksilver
2 days ago
What's the try/except for?
– Blacksilver
2 days ago
1
1
@Blacksilver I'd assume it serves the same purpose as the
if(last):
in your Grudger bot, detecting whether there was a previous round.– ETHproductions
2 days ago
@Blacksilver I'd assume it serves the same purpose as the
if(last):
in your Grudger bot, detecting whether there was a previous round.– ETHproductions
2 days ago
ahh, I see.
None in "ND"
errors.– Blacksilver
2 days ago
ahh, I see.
None in "ND"
errors.– Blacksilver
2 days ago
Because
if last and last in "ND":
was too complicated?– immibis
2 days ago
Because
if last and last in "ND":
was too complicated?– immibis
2 days ago
add a comment |
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
oof. Maybe T4T would do better asreturn last
.
– Blacksilver
2 days ago
I'd like that! If TitForTat werereturn last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
2 days ago
return last
would be a better T4T for this challenge, I think
– Sparr
2 days ago
Just tried -- theif(last): return last; else: return "C"
is worse.
– Blacksilver
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
add a comment |
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
oof. Maybe T4T would do better asreturn last
.
– Blacksilver
2 days ago
I'd like that! If TitForTat werereturn last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
2 days ago
return last
would be a better T4T for this challenge, I think
– Sparr
2 days ago
Just tried -- theif(last): return last; else: return "C"
is worse.
– Blacksilver
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
add a comment |
up vote
3
down vote
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
edited 2 days ago
answered 2 days ago
Spitemaster
3315
3315
oof. Maybe T4T would do better asreturn last
.
– Blacksilver
2 days ago
I'd like that! If TitForTat werereturn last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
2 days ago
return last
would be a better T4T for this challenge, I think
– Sparr
2 days ago
Just tried -- theif(last): return last; else: return "C"
is worse.
– Blacksilver
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
add a comment |
oof. Maybe T4T would do better asreturn last
.
– Blacksilver
2 days ago
I'd like that! If TitForTat werereturn last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
2 days ago
return last
would be a better T4T for this challenge, I think
– Sparr
2 days ago
Just tried -- theif(last): return last; else: return "C"
is worse.
– Blacksilver
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
oof. Maybe T4T would do better as
return last
.– Blacksilver
2 days ago
oof. Maybe T4T would do better as
return last
.– Blacksilver
2 days ago
I'd like that! If TitForTat were
return last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.– Spitemaster
2 days ago
I'd like that! If TitForTat were
return last
, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.– Spitemaster
2 days ago
return last
would be a better T4T for this challenge, I think– Sparr
2 days ago
return last
would be a better T4T for this challenge, I think– Sparr
2 days ago
Just tried -- the
if(last): return last; else: return "C"
is worse.– Blacksilver
2 days ago
Just tried -- the
if(last): return last; else: return "C"
is worse.– Blacksilver
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
2 days ago
add a comment |
up vote
3
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
add a comment |
up vote
3
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
add a comment |
up vote
3
down vote
up vote
3
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
edited 2 days ago
Blacksilver
425314
425314
New contributor
answered 2 days ago
MannerPots
312
312
New contributor
New contributor
add a comment |
add a comment |
up vote
2
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
add a comment |
up vote
2
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
add a comment |
up vote
2
down vote
up vote
2
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
edited 2 days ago
answered 2 days ago
Ray
1,438511
1,438511
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
add a comment |
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
The controller is ready, you didn't have to do all that...
– Blacksilver
2 days ago
1
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
2 days ago
add a comment |
up vote
2
down vote
Improved Dirichlet Dice
import random
class DirichletDice2:
def __init__(self):
self.alpha = dict(
C = {'C' : 1, 'N' : 1, 'D' : 1},
N = {'C' : 1, 'N' : 1, 'D' : 1},
D = {'C' : 1, 'N' : 1, 'D' : 1}
)
self.myLast = [None, None]
self.payoff = dict(
C = { "C": 0, "N": 3, "D": -5 },
N = { "C": -3, "N": 0, "D": 1 },
D = { "C": 5, "N": -1, "D": 0 }
)
def DirichletDraw(self, key):
alpha = self.alpha[key].values()
mu = [random.gammavariate(a,1) for a in alpha]
mu = [m / sum(mu) for m in mu]
return mu
def ExpectedPayoff(self, probs):
expectedPayoff = {}
for val in ['C','N','D']:
payoff = sum([p * v for p,v in zip(probs, self.payoff[val].values())])
expectedPayoff[val] = payoff
return expectedPayoff
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#draw probs for my opponent's roll from Dirichlet distribution and then return the optimal response
mu = self.DirichletDraw(self.myLast[0])
expectedPayoff = self.ExpectedPayoff(mu)
res = max(expectedPayoff, key=expectedPayoff.get)
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
This is an improved version of Dirichlet Dice. Instead of taking the expected multinomial distribution from the Dirichlet distribution, it draws a Multinomial distribution randomly from the Dirichlet distribution. Then, instead of drawing from the Multinomial and giving the optimal response to that, it gives the optimal expected response to the given Multinomial using the points. So the randomness has essentially been shifted from the Multinomial draw to the Dirichlet draw. Also, the priors are more flat now, to encourage exploration.
It's "improved" because it now accounts for the points system by giving the best expected value against the probabilities, while maintaining its randomness by drawing the probabilities themselves. Before, I tried simply doing the best expected payoff from the expected probabilities, but that did badly because it just got stuck, and didn't explore enough to update its dice. Also it was more predictable and exploitable.
Original submission:
Dirichlet Dice
import random
class DirichletDice:
def __init__(self):
self.alpha = dict(
C = {'C' : 2, 'N' : 3, 'D' : 1},
N = {'C' : 1, 'N' : 2, 'D' : 3},
D = {'C' : 3, 'N' : 1, 'D' : 2}
)
self.Response = {'C' : 'D', 'N' : 'C', 'D' : 'N'}
self.myLast = [None, None]
#expected value of the dirichlet distribution given by Alpha
def MultinomialDraw(self, key):
alpha = list(self.alpha[key].values())
probs = [x / sum(alpha) for x in alpha]
outcome = random.choices(['C','N','D'], weights=probs)[0]
return outcome
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#predict opponent's move based on my last move
predict = self.MultinomialDraw(self.myLast[0])
res = self.Response[predict]
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
Basically I'm assuming that the opponent's response to my last output is a multinomial variable (weighted dice), one for each of my outputs, so there's a dice for "C", one for "N", and one for "D". So if my last roll was, for example, a "N" then I roll the "N-dice" to guess what their response would be to my "N". I begin with a Dirichlet prior that assumes that my opponent is somewhat "smart" (more likely to play the one with the best payoff against my last roll, least likely to play the one with the worst payoff). I generate the "expected" Multinomial distribution from the appropriate Dirichlet prior (this is the expected value of the probability distribution over their dice weights). I roll the weighted dice of my last output, and respond with the one with the best payoff against that dice outcome.
Starting in the third round, I do a Bayesian update of the appropriate Dirichlet prior of my opponent's last response to the thing I played two rounds ago. I'm trying to iteratively learn their dice weightings.
I could have also simply picked the response with the best "expected" outcome once generating the dice, instead of simply rolling the dice and responding to the outcome. However, I wanted to keep the randomness in, so that my bot is less vulnerable to the ones that try to predict a pattern.
New contributor
add a comment |
up vote
2
down vote
Improved Dirichlet Dice
import random
class DirichletDice2:
def __init__(self):
self.alpha = dict(
C = {'C' : 1, 'N' : 1, 'D' : 1},
N = {'C' : 1, 'N' : 1, 'D' : 1},
D = {'C' : 1, 'N' : 1, 'D' : 1}
)
self.myLast = [None, None]
self.payoff = dict(
C = { "C": 0, "N": 3, "D": -5 },
N = { "C": -3, "N": 0, "D": 1 },
D = { "C": 5, "N": -1, "D": 0 }
)
def DirichletDraw(self, key):
alpha = self.alpha[key].values()
mu = [random.gammavariate(a,1) for a in alpha]
mu = [m / sum(mu) for m in mu]
return mu
def ExpectedPayoff(self, probs):
expectedPayoff = {}
for val in ['C','N','D']:
payoff = sum([p * v for p,v in zip(probs, self.payoff[val].values())])
expectedPayoff[val] = payoff
return expectedPayoff
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#draw probs for my opponent's roll from Dirichlet distribution and then return the optimal response
mu = self.DirichletDraw(self.myLast[0])
expectedPayoff = self.ExpectedPayoff(mu)
res = max(expectedPayoff, key=expectedPayoff.get)
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
This is an improved version of Dirichlet Dice. Instead of taking the expected multinomial distribution from the Dirichlet distribution, it draws a Multinomial distribution randomly from the Dirichlet distribution. Then, instead of drawing from the Multinomial and giving the optimal response to that, it gives the optimal expected response to the given Multinomial using the points. So the randomness has essentially been shifted from the Multinomial draw to the Dirichlet draw. Also, the priors are more flat now, to encourage exploration.
It's "improved" because it now accounts for the points system by giving the best expected value against the probabilities, while maintaining its randomness by drawing the probabilities themselves. Before, I tried simply doing the best expected payoff from the expected probabilities, but that did badly because it just got stuck, and didn't explore enough to update its dice. Also it was more predictable and exploitable.
Original submission:
Dirichlet Dice
import random
class DirichletDice:
def __init__(self):
self.alpha = dict(
C = {'C' : 2, 'N' : 3, 'D' : 1},
N = {'C' : 1, 'N' : 2, 'D' : 3},
D = {'C' : 3, 'N' : 1, 'D' : 2}
)
self.Response = {'C' : 'D', 'N' : 'C', 'D' : 'N'}
self.myLast = [None, None]
#expected value of the dirichlet distribution given by Alpha
def MultinomialDraw(self, key):
alpha = list(self.alpha[key].values())
probs = [x / sum(alpha) for x in alpha]
outcome = random.choices(['C','N','D'], weights=probs)[0]
return outcome
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#predict opponent's move based on my last move
predict = self.MultinomialDraw(self.myLast[0])
res = self.Response[predict]
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
Basically I'm assuming that the opponent's response to my last output is a multinomial variable (weighted dice), one for each of my outputs, so there's a dice for "C", one for "N", and one for "D". So if my last roll was, for example, a "N" then I roll the "N-dice" to guess what their response would be to my "N". I begin with a Dirichlet prior that assumes that my opponent is somewhat "smart" (more likely to play the one with the best payoff against my last roll, least likely to play the one with the worst payoff). I generate the "expected" Multinomial distribution from the appropriate Dirichlet prior (this is the expected value of the probability distribution over their dice weights). I roll the weighted dice of my last output, and respond with the one with the best payoff against that dice outcome.
Starting in the third round, I do a Bayesian update of the appropriate Dirichlet prior of my opponent's last response to the thing I played two rounds ago. I'm trying to iteratively learn their dice weightings.
I could have also simply picked the response with the best "expected" outcome once generating the dice, instead of simply rolling the dice and responding to the outcome. However, I wanted to keep the randomness in, so that my bot is less vulnerable to the ones that try to predict a pattern.
New contributor
add a comment |
up vote
2
down vote
up vote
2
down vote
Improved Dirichlet Dice
import random
class DirichletDice2:
def __init__(self):
self.alpha = dict(
C = {'C' : 1, 'N' : 1, 'D' : 1},
N = {'C' : 1, 'N' : 1, 'D' : 1},
D = {'C' : 1, 'N' : 1, 'D' : 1}
)
self.myLast = [None, None]
self.payoff = dict(
C = { "C": 0, "N": 3, "D": -5 },
N = { "C": -3, "N": 0, "D": 1 },
D = { "C": 5, "N": -1, "D": 0 }
)
def DirichletDraw(self, key):
alpha = self.alpha[key].values()
mu = [random.gammavariate(a,1) for a in alpha]
mu = [m / sum(mu) for m in mu]
return mu
def ExpectedPayoff(self, probs):
expectedPayoff = {}
for val in ['C','N','D']:
payoff = sum([p * v for p,v in zip(probs, self.payoff[val].values())])
expectedPayoff[val] = payoff
return expectedPayoff
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#draw probs for my opponent's roll from Dirichlet distribution and then return the optimal response
mu = self.DirichletDraw(self.myLast[0])
expectedPayoff = self.ExpectedPayoff(mu)
res = max(expectedPayoff, key=expectedPayoff.get)
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
This is an improved version of Dirichlet Dice. Instead of taking the expected multinomial distribution from the Dirichlet distribution, it draws a Multinomial distribution randomly from the Dirichlet distribution. Then, instead of drawing from the Multinomial and giving the optimal response to that, it gives the optimal expected response to the given Multinomial using the points. So the randomness has essentially been shifted from the Multinomial draw to the Dirichlet draw. Also, the priors are more flat now, to encourage exploration.
It's "improved" because it now accounts for the points system by giving the best expected value against the probabilities, while maintaining its randomness by drawing the probabilities themselves. Before, I tried simply doing the best expected payoff from the expected probabilities, but that did badly because it just got stuck, and didn't explore enough to update its dice. Also it was more predictable and exploitable.
Original submission:
Dirichlet Dice
import random
class DirichletDice:
def __init__(self):
self.alpha = dict(
C = {'C' : 2, 'N' : 3, 'D' : 1},
N = {'C' : 1, 'N' : 2, 'D' : 3},
D = {'C' : 3, 'N' : 1, 'D' : 2}
)
self.Response = {'C' : 'D', 'N' : 'C', 'D' : 'N'}
self.myLast = [None, None]
#expected value of the dirichlet distribution given by Alpha
def MultinomialDraw(self, key):
alpha = list(self.alpha[key].values())
probs = [x / sum(alpha) for x in alpha]
outcome = random.choices(['C','N','D'], weights=probs)[0]
return outcome
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#predict opponent's move based on my last move
predict = self.MultinomialDraw(self.myLast[0])
res = self.Response[predict]
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
Basically I'm assuming that the opponent's response to my last output is a multinomial variable (weighted dice), one for each of my outputs, so there's a dice for "C", one for "N", and one for "D". So if my last roll was, for example, a "N" then I roll the "N-dice" to guess what their response would be to my "N". I begin with a Dirichlet prior that assumes that my opponent is somewhat "smart" (more likely to play the one with the best payoff against my last roll, least likely to play the one with the worst payoff). I generate the "expected" Multinomial distribution from the appropriate Dirichlet prior (this is the expected value of the probability distribution over their dice weights). I roll the weighted dice of my last output, and respond with the one with the best payoff against that dice outcome.
Starting in the third round, I do a Bayesian update of the appropriate Dirichlet prior of my opponent's last response to the thing I played two rounds ago. I'm trying to iteratively learn their dice weightings.
I could have also simply picked the response with the best "expected" outcome once generating the dice, instead of simply rolling the dice and responding to the outcome. However, I wanted to keep the randomness in, so that my bot is less vulnerable to the ones that try to predict a pattern.
New contributor
Improved Dirichlet Dice
import random
class DirichletDice2:
def __init__(self):
self.alpha = dict(
C = {'C' : 1, 'N' : 1, 'D' : 1},
N = {'C' : 1, 'N' : 1, 'D' : 1},
D = {'C' : 1, 'N' : 1, 'D' : 1}
)
self.myLast = [None, None]
self.payoff = dict(
C = { "C": 0, "N": 3, "D": -5 },
N = { "C": -3, "N": 0, "D": 1 },
D = { "C": 5, "N": -1, "D": 0 }
)
def DirichletDraw(self, key):
alpha = self.alpha[key].values()
mu = [random.gammavariate(a,1) for a in alpha]
mu = [m / sum(mu) for m in mu]
return mu
def ExpectedPayoff(self, probs):
expectedPayoff = {}
for val in ['C','N','D']:
payoff = sum([p * v for p,v in zip(probs, self.payoff[val].values())])
expectedPayoff[val] = payoff
return expectedPayoff
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#draw probs for my opponent's roll from Dirichlet distribution and then return the optimal response
mu = self.DirichletDraw(self.myLast[0])
expectedPayoff = self.ExpectedPayoff(mu)
res = max(expectedPayoff, key=expectedPayoff.get)
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
This is an improved version of Dirichlet Dice. Instead of taking the expected multinomial distribution from the Dirichlet distribution, it draws a Multinomial distribution randomly from the Dirichlet distribution. Then, instead of drawing from the Multinomial and giving the optimal response to that, it gives the optimal expected response to the given Multinomial using the points. So the randomness has essentially been shifted from the Multinomial draw to the Dirichlet draw. Also, the priors are more flat now, to encourage exploration.
It's "improved" because it now accounts for the points system by giving the best expected value against the probabilities, while maintaining its randomness by drawing the probabilities themselves. Before, I tried simply doing the best expected payoff from the expected probabilities, but that did badly because it just got stuck, and didn't explore enough to update its dice. Also it was more predictable and exploitable.
Original submission:
Dirichlet Dice
import random
class DirichletDice:
def __init__(self):
self.alpha = dict(
C = {'C' : 2, 'N' : 3, 'D' : 1},
N = {'C' : 1, 'N' : 2, 'D' : 3},
D = {'C' : 3, 'N' : 1, 'D' : 2}
)
self.Response = {'C' : 'D', 'N' : 'C', 'D' : 'N'}
self.myLast = [None, None]
#expected value of the dirichlet distribution given by Alpha
def MultinomialDraw(self, key):
alpha = list(self.alpha[key].values())
probs = [x / sum(alpha) for x in alpha]
outcome = random.choices(['C','N','D'], weights=probs)[0]
return outcome
def round(self, last):
if last is None:
self.myLast[0] = 'D'
return 'D'
#update dice corresponding to opponent's last response to my
#outcome two turns ago
if self.myLast[1] is not None:
self.alpha[self.myLast[1]][last] += 1
#predict opponent's move based on my last move
predict = self.MultinomialDraw(self.myLast[0])
res = self.Response[predict]
#update myLast
self.myLast[1] = self.myLast[0]
self.myLast[0] = res
return res
Basically I'm assuming that the opponent's response to my last output is a multinomial variable (weighted dice), one for each of my outputs, so there's a dice for "C", one for "N", and one for "D". So if my last roll was, for example, a "N" then I roll the "N-dice" to guess what their response would be to my "N". I begin with a Dirichlet prior that assumes that my opponent is somewhat "smart" (more likely to play the one with the best payoff against my last roll, least likely to play the one with the worst payoff). I generate the "expected" Multinomial distribution from the appropriate Dirichlet prior (this is the expected value of the probability distribution over their dice weights). I roll the weighted dice of my last output, and respond with the one with the best payoff against that dice outcome.
Starting in the third round, I do a Bayesian update of the appropriate Dirichlet prior of my opponent's last response to the thing I played two rounds ago. I'm trying to iteratively learn their dice weightings.
I could have also simply picked the response with the best "expected" outcome once generating the dice, instead of simply rolling the dice and responding to the outcome. However, I wanted to keep the randomness in, so that my bot is less vulnerable to the ones that try to predict a pattern.
New contributor
edited 23 hours ago
New contributor
answered yesterday
Bridgeburners
1212
1212
New contributor
New contributor
add a comment |
add a comment |
up vote
2
down vote
Kevin
class Kevin:
def round(self, last):
return {"C":"N","N":"D","D":"C",None:"N"} [last]
Picks the worst choice. The worst bot made.
Useless
import random
class Useless:
def __init__(self):
self.lastLast = None
def round(self, last):
tempLastLast = self.lastLast
self.lastLast = last
if(last == "D" and tempLastLast == "N"):
return "C"
if(last == "D" and tempLastLast == "C"):
return "N"
if(last == "N" and tempLastLast == "D"):
return "C"
if(last == "N" and tempLastLast == "C"):
return "D"
if(last == "C" and tempLastLast == "D"):
return "N"
if(last == "C" and tempLastLast == "N"):
return "D"
return random.choice("CND")
It looks at the last two moves done by the opponent and picks the most not done else it picks something random. There is probably a better way of doing this.
New contributor
add a comment |
up vote
2
down vote
Kevin
class Kevin:
def round(self, last):
return {"C":"N","N":"D","D":"C",None:"N"} [last]
Picks the worst choice. The worst bot made.
Useless
import random
class Useless:
def __init__(self):
self.lastLast = None
def round(self, last):
tempLastLast = self.lastLast
self.lastLast = last
if(last == "D" and tempLastLast == "N"):
return "C"
if(last == "D" and tempLastLast == "C"):
return "N"
if(last == "N" and tempLastLast == "D"):
return "C"
if(last == "N" and tempLastLast == "C"):
return "D"
if(last == "C" and tempLastLast == "D"):
return "N"
if(last == "C" and tempLastLast == "N"):
return "D"
return random.choice("CND")
It looks at the last two moves done by the opponent and picks the most not done else it picks something random. There is probably a better way of doing this.
New contributor
add a comment |
up vote
2
down vote
up vote
2
down vote
Kevin
class Kevin:
def round(self, last):
return {"C":"N","N":"D","D":"C",None:"N"} [last]
Picks the worst choice. The worst bot made.
Useless
import random
class Useless:
def __init__(self):
self.lastLast = None
def round(self, last):
tempLastLast = self.lastLast
self.lastLast = last
if(last == "D" and tempLastLast == "N"):
return "C"
if(last == "D" and tempLastLast == "C"):
return "N"
if(last == "N" and tempLastLast == "D"):
return "C"
if(last == "N" and tempLastLast == "C"):
return "D"
if(last == "C" and tempLastLast == "D"):
return "N"
if(last == "C" and tempLastLast == "N"):
return "D"
return random.choice("CND")
It looks at the last two moves done by the opponent and picks the most not done else it picks something random. There is probably a better way of doing this.
New contributor
Kevin
class Kevin:
def round(self, last):
return {"C":"N","N":"D","D":"C",None:"N"} [last]
Picks the worst choice. The worst bot made.
Useless
import random
class Useless:
def __init__(self):
self.lastLast = None
def round(self, last):
tempLastLast = self.lastLast
self.lastLast = last
if(last == "D" and tempLastLast == "N"):
return "C"
if(last == "D" and tempLastLast == "C"):
return "N"
if(last == "N" and tempLastLast == "D"):
return "C"
if(last == "N" and tempLastLast == "C"):
return "D"
if(last == "C" and tempLastLast == "D"):
return "N"
if(last == "C" and tempLastLast == "N"):
return "D"
return random.choice("CND")
It looks at the last two moves done by the opponent and picks the most not done else it picks something random. There is probably a better way of doing this.
New contributor
edited 22 hours ago
New contributor
answered 23 hours ago
Link1J
192
192
New contributor
New contributor
add a comment |
add a comment |
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
add a comment |
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
add a comment |
up vote
1
down vote
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
answered 2 days ago
Sparr
5,0081633
5,0081633
add a comment |
add a comment |
up vote
1
down vote
Tetragram
import itertools
class Tetragram:
def __init__(self):
self.history = {x: ['C'] for x in itertools.product('CND', repeat=4)}
self.theirs =
self.previous = None
def round(self, last):
if self.previous is not None and len(self.previous) == 4:
self.history[self.previous].append(last)
if last is not None:
self.theirs = (self.theirs + [last])[-3:]
if self.previous is not None and len(self.previous) == 4:
expected = random.choice(self.history[self.previous])
if expected == 'C':
choice = 'C'
elif expected == 'N':
choice = 'C'
else:
choice = 'N'
else:
choice = 'C'
self.previous = tuple(self.theirs + [choice])
return choice
Try to find a pattern in the opponent's moves, assuming they're also watching our last move.
add a comment |
up vote
1
down vote
Tetragram
import itertools
class Tetragram:
def __init__(self):
self.history = {x: ['C'] for x in itertools.product('CND', repeat=4)}
self.theirs =
self.previous = None
def round(self, last):
if self.previous is not None and len(self.previous) == 4:
self.history[self.previous].append(last)
if last is not None:
self.theirs = (self.theirs + [last])[-3:]
if self.previous is not None and len(self.previous) == 4:
expected = random.choice(self.history[self.previous])
if expected == 'C':
choice = 'C'
elif expected == 'N':
choice = 'C'
else:
choice = 'N'
else:
choice = 'C'
self.previous = tuple(self.theirs + [choice])
return choice
Try to find a pattern in the opponent's moves, assuming they're also watching our last move.
add a comment |
up vote
1
down vote
up vote
1
down vote
Tetragram
import itertools
class Tetragram:
def __init__(self):
self.history = {x: ['C'] for x in itertools.product('CND', repeat=4)}
self.theirs =
self.previous = None
def round(self, last):
if self.previous is not None and len(self.previous) == 4:
self.history[self.previous].append(last)
if last is not None:
self.theirs = (self.theirs + [last])[-3:]
if self.previous is not None and len(self.previous) == 4:
expected = random.choice(self.history[self.previous])
if expected == 'C':
choice = 'C'
elif expected == 'N':
choice = 'C'
else:
choice = 'N'
else:
choice = 'C'
self.previous = tuple(self.theirs + [choice])
return choice
Try to find a pattern in the opponent's moves, assuming they're also watching our last move.
Tetragram
import itertools
class Tetragram:
def __init__(self):
self.history = {x: ['C'] for x in itertools.product('CND', repeat=4)}
self.theirs =
self.previous = None
def round(self, last):
if self.previous is not None and len(self.previous) == 4:
self.history[self.previous].append(last)
if last is not None:
self.theirs = (self.theirs + [last])[-3:]
if self.previous is not None and len(self.previous) == 4:
expected = random.choice(self.history[self.previous])
if expected == 'C':
choice = 'C'
elif expected == 'N':
choice = 'C'
else:
choice = 'N'
else:
choice = 'C'
self.previous = tuple(self.theirs + [choice])
return choice
Try to find a pattern in the opponent's moves, assuming they're also watching our last move.
answered yesterday
Mnemonic
4,6351630
4,6351630
add a comment |
add a comment |
up vote
1
down vote
Handshake
class HandshakeBot:
def __init__(self):
self.handshake_length = 4
self.handshake = ["N","N","C","D"]
while len(self.handshake) < self.handshake_length:
self.handshake *= 2
self.handshake = self.handshake[:self.handshake_length]
self.opp_hand =
self.friendly = None
def round(self, last):
if last:
if self.friendly == None:
# still trying to handshake
self.opp_hand.append(last)
if self.opp_hand[-1] != self.handshake[len(self.opp_hand)-1]:
self.friendly = False
return "D"
if len(self.opp_hand) == len(self.handshake):
self.friendly = True
return "C"
return self.handshake[len(self.opp_hand)]
elif self.friendly == True:
# successful handshake and continued cooperation
if last == "C":
return "C"
self.friendly = False
return "D"
else:
# failed handshake or abandoned cooperation
return "N" if last == "D" else ("D" if last == "C" else "C")
return self.handshake[0]
Recognizes when it's playing against itself, then cooperates. Otherwise mimics LastOptimalBot which seems like the best one-line strategy. Performs worse than LastOptimalBot, by an amount inversely proportional to the number of rounds. Obviously would do better if there were more copies of it in the field *cough**wink*.
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
add a comment |
up vote
1
down vote
Handshake
class HandshakeBot:
def __init__(self):
self.handshake_length = 4
self.handshake = ["N","N","C","D"]
while len(self.handshake) < self.handshake_length:
self.handshake *= 2
self.handshake = self.handshake[:self.handshake_length]
self.opp_hand =
self.friendly = None
def round(self, last):
if last:
if self.friendly == None:
# still trying to handshake
self.opp_hand.append(last)
if self.opp_hand[-1] != self.handshake[len(self.opp_hand)-1]:
self.friendly = False
return "D"
if len(self.opp_hand) == len(self.handshake):
self.friendly = True
return "C"
return self.handshake[len(self.opp_hand)]
elif self.friendly == True:
# successful handshake and continued cooperation
if last == "C":
return "C"
self.friendly = False
return "D"
else:
# failed handshake or abandoned cooperation
return "N" if last == "D" else ("D" if last == "C" else "C")
return self.handshake[0]
Recognizes when it's playing against itself, then cooperates. Otherwise mimics LastOptimalBot which seems like the best one-line strategy. Performs worse than LastOptimalBot, by an amount inversely proportional to the number of rounds. Obviously would do better if there were more copies of it in the field *cough**wink*.
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
add a comment |
up vote
1
down vote
up vote
1
down vote
Handshake
class HandshakeBot:
def __init__(self):
self.handshake_length = 4
self.handshake = ["N","N","C","D"]
while len(self.handshake) < self.handshake_length:
self.handshake *= 2
self.handshake = self.handshake[:self.handshake_length]
self.opp_hand =
self.friendly = None
def round(self, last):
if last:
if self.friendly == None:
# still trying to handshake
self.opp_hand.append(last)
if self.opp_hand[-1] != self.handshake[len(self.opp_hand)-1]:
self.friendly = False
return "D"
if len(self.opp_hand) == len(self.handshake):
self.friendly = True
return "C"
return self.handshake[len(self.opp_hand)]
elif self.friendly == True:
# successful handshake and continued cooperation
if last == "C":
return "C"
self.friendly = False
return "D"
else:
# failed handshake or abandoned cooperation
return "N" if last == "D" else ("D" if last == "C" else "C")
return self.handshake[0]
Recognizes when it's playing against itself, then cooperates. Otherwise mimics LastOptimalBot which seems like the best one-line strategy. Performs worse than LastOptimalBot, by an amount inversely proportional to the number of rounds. Obviously would do better if there were more copies of it in the field *cough**wink*.
Handshake
class HandshakeBot:
def __init__(self):
self.handshake_length = 4
self.handshake = ["N","N","C","D"]
while len(self.handshake) < self.handshake_length:
self.handshake *= 2
self.handshake = self.handshake[:self.handshake_length]
self.opp_hand =
self.friendly = None
def round(self, last):
if last:
if self.friendly == None:
# still trying to handshake
self.opp_hand.append(last)
if self.opp_hand[-1] != self.handshake[len(self.opp_hand)-1]:
self.friendly = False
return "D"
if len(self.opp_hand) == len(self.handshake):
self.friendly = True
return "C"
return self.handshake[len(self.opp_hand)]
elif self.friendly == True:
# successful handshake and continued cooperation
if last == "C":
return "C"
self.friendly = False
return "D"
else:
# failed handshake or abandoned cooperation
return "N" if last == "D" else ("D" if last == "C" else "C")
return self.handshake[0]
Recognizes when it's playing against itself, then cooperates. Otherwise mimics LastOptimalBot which seems like the best one-line strategy. Performs worse than LastOptimalBot, by an amount inversely proportional to the number of rounds. Obviously would do better if there were more copies of it in the field *cough**wink*.
answered yesterday
Sparr
5,0081633
5,0081633
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
add a comment |
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
Just submit a few clones that have different non-handshake behavior.
– Blacksilver
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
That seems exploit-y. I could submit one such clone for every simple behavior represented here.
– Sparr
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
I've added an extra clause saying that you can only submit max five bots.
– Blacksilver
yesterday
add a comment |
up vote
1
down vote
ShiftingOptimalBot
class ShiftingOptimalBot:
def __init__(self):
# wins, draws, losses
self.history = [0,0,0]
self.lastMove = None
self.state = 0
def round(self, last):
if last == None:
self.lastMove = "C"
return self.lastMove
if last == self.lastMove:
self.history[1] += 1
elif (last == "C" and self.lastMove == "D") or (last == "D" and self.lastMove == "N") or (last == "N" and self.lastMove == "C"):
self.history[0] += 1
else:
self.history[2] += 1
if self.history[0] + 1 < self.history[2] or self.history[2] > 5:
self.state = (self.state + 1) % 3
self.history = [0,0,0]
if self.history[1] > self.history[0] + self.history[2] + 2:
self.state = (self.state + 2) % 3
self.history = [0,0,0]
if self.state == 0:
self.lastMove = "N" if last == "D" else ("D" if last == "C" else "C")
elif self.state == 1:
self.lastMove = last
else:
self.lastMove = "C" if last == "D" else ("N" if last == "C" else "D")
return self.lastMove
This bot uses LastOptimalBot's algorithm as long as it's winning. If the other bot starts predicting it, however, it will start playing whichever move its opponent played last (which is the move that beats the move that would beat LastOptimalBot). It cycles through simple transpositions of those algorithms as long as it continues to lose (or when it gets bored by drawing a lot).
Honestly, I'm surprised that LastOptimalBot is sitting in 5th as I post this. I'm fairly certain this will do better, assuming that I wrote this python correctly.
add a comment |
up vote
1
down vote
ShiftingOptimalBot
class ShiftingOptimalBot:
def __init__(self):
# wins, draws, losses
self.history = [0,0,0]
self.lastMove = None
self.state = 0
def round(self, last):
if last == None:
self.lastMove = "C"
return self.lastMove
if last == self.lastMove:
self.history[1] += 1
elif (last == "C" and self.lastMove == "D") or (last == "D" and self.lastMove == "N") or (last == "N" and self.lastMove == "C"):
self.history[0] += 1
else:
self.history[2] += 1
if self.history[0] + 1 < self.history[2] or self.history[2] > 5:
self.state = (self.state + 1) % 3
self.history = [0,0,0]
if self.history[1] > self.history[0] + self.history[2] + 2:
self.state = (self.state + 2) % 3
self.history = [0,0,0]
if self.state == 0:
self.lastMove = "N" if last == "D" else ("D" if last == "C" else "C")
elif self.state == 1:
self.lastMove = last
else:
self.lastMove = "C" if last == "D" else ("N" if last == "C" else "D")
return self.lastMove
This bot uses LastOptimalBot's algorithm as long as it's winning. If the other bot starts predicting it, however, it will start playing whichever move its opponent played last (which is the move that beats the move that would beat LastOptimalBot). It cycles through simple transpositions of those algorithms as long as it continues to lose (or when it gets bored by drawing a lot).
Honestly, I'm surprised that LastOptimalBot is sitting in 5th as I post this. I'm fairly certain this will do better, assuming that I wrote this python correctly.
add a comment |
up vote
1
down vote
up vote
1
down vote
ShiftingOptimalBot
class ShiftingOptimalBot:
def __init__(self):
# wins, draws, losses
self.history = [0,0,0]
self.lastMove = None
self.state = 0
def round(self, last):
if last == None:
self.lastMove = "C"
return self.lastMove
if last == self.lastMove:
self.history[1] += 1
elif (last == "C" and self.lastMove == "D") or (last == "D" and self.lastMove == "N") or (last == "N" and self.lastMove == "C"):
self.history[0] += 1
else:
self.history[2] += 1
if self.history[0] + 1 < self.history[2] or self.history[2] > 5:
self.state = (self.state + 1) % 3
self.history = [0,0,0]
if self.history[1] > self.history[0] + self.history[2] + 2:
self.state = (self.state + 2) % 3
self.history = [0,0,0]
if self.state == 0:
self.lastMove = "N" if last == "D" else ("D" if last == "C" else "C")
elif self.state == 1:
self.lastMove = last
else:
self.lastMove = "C" if last == "D" else ("N" if last == "C" else "D")
return self.lastMove
This bot uses LastOptimalBot's algorithm as long as it's winning. If the other bot starts predicting it, however, it will start playing whichever move its opponent played last (which is the move that beats the move that would beat LastOptimalBot). It cycles through simple transpositions of those algorithms as long as it continues to lose (or when it gets bored by drawing a lot).
Honestly, I'm surprised that LastOptimalBot is sitting in 5th as I post this. I'm fairly certain this will do better, assuming that I wrote this python correctly.
ShiftingOptimalBot
class ShiftingOptimalBot:
def __init__(self):
# wins, draws, losses
self.history = [0,0,0]
self.lastMove = None
self.state = 0
def round(self, last):
if last == None:
self.lastMove = "C"
return self.lastMove
if last == self.lastMove:
self.history[1] += 1
elif (last == "C" and self.lastMove == "D") or (last == "D" and self.lastMove == "N") or (last == "N" and self.lastMove == "C"):
self.history[0] += 1
else:
self.history[2] += 1
if self.history[0] + 1 < self.history[2] or self.history[2] > 5:
self.state = (self.state + 1) % 3
self.history = [0,0,0]
if self.history[1] > self.history[0] + self.history[2] + 2:
self.state = (self.state + 2) % 3
self.history = [0,0,0]
if self.state == 0:
self.lastMove = "N" if last == "D" else ("D" if last == "C" else "C")
elif self.state == 1:
self.lastMove = last
else:
self.lastMove = "C" if last == "D" else ("N" if last == "C" else "D")
return self.lastMove
This bot uses LastOptimalBot's algorithm as long as it's winning. If the other bot starts predicting it, however, it will start playing whichever move its opponent played last (which is the move that beats the move that would beat LastOptimalBot). It cycles through simple transpositions of those algorithms as long as it continues to lose (or when it gets bored by drawing a lot).
Honestly, I'm surprised that LastOptimalBot is sitting in 5th as I post this. I'm fairly certain this will do better, assuming that I wrote this python correctly.
edited 22 hours ago
Blacksilver
425314
425314
answered yesterday
Spitemaster
3315
3315
add a comment |
add a comment |
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.payoffsum = {"C":0, "N":0, "D":0}
def round(this, last):
if(last != None):
for x in this.payoffsum:
this.payoffsum[x] += HistoricAverage.PAYOFFS[last][x]
return max(this.payoffsum, key=this.payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
add a comment |
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.payoffsum = {"C":0, "N":0, "D":0}
def round(this, last):
if(last != None):
for x in this.payoffsum:
this.payoffsum[x] += HistoricAverage.PAYOFFS[last][x]
return max(this.payoffsum, key=this.payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
add a comment |
up vote
1
down vote
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.payoffsum = {"C":0, "N":0, "D":0}
def round(this, last):
if(last != None):
for x in this.payoffsum:
this.payoffsum[x] += HistoricAverage.PAYOFFS[last][x]
return max(this.payoffsum, key=this.payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.payoffsum = {"C":0, "N":0, "D":0}
def round(this, last):
if(last != None):
for x in this.payoffsum:
this.payoffsum[x] += HistoricAverage.PAYOFFS[last][x]
return max(this.payoffsum, key=this.payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
edited 21 hours ago
answered 2 days ago
MegaTom
3,4421324
3,4421324
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
add a comment |
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
This could run faster if it didn't re-calculate the averages every round.
– Sparr
yesterday
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
@Sparr true. I edited it so it does now.
– MegaTom
21 hours ago
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodegolf.stackexchange.com%2fquestions%2f175794%2fiterated-prisoners-trilemma%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
2 days ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
2 days ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
2 days ago
2
If you wrap most of main.py in
while len(botlist) > 1:
withbotlist.remove(lowest_scoring_bot)
at the bottom of the loop, you get an elimination tournament with interesting results.– Sparr
2 days ago
1
Another version of this someday might pass the entire interaction history rather than just the last move. It doesn't change much although it simplifies user code slightly. But it would allow for extensions, such as noisy communication channels that clarify over time: "Really, a D, even though I've said C four times in a row? No, I didn't say D; what do you take me for? Oh, sorry, can we just forget that round?"
– Scott Sauyet
2 days ago