semi_uniform
Provides implementations of semi-uniform bandit strategies.
Semi-uniform strategies choose to either explore or exploit at each time step. When
exploring, a random arm is played. When exploiting, the arm with the greatest
estimated action value is played. `epsilon`, the chance of exploration, is
computed differently by each semi-uniform strategy.
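The explore-or-exploit decision described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name `choose_arm` and its parameters are hypothetical, and uniform random exploration with ties broken by lowest index is an assumption.

```python
import random


def choose_arm(estimates, eps, rng):
    """Pick an arm index: explore with probability eps, otherwise exploit.

    `estimates` holds the current estimated action value of each arm.
    All names here are illustrative and are not part of mabby's API.
    """
    if rng.random() < eps:
        # Explore: play an arm chosen uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: play the arm with the greatest estimated action value
    # (the lowest index wins ties).
    return max(range(len(estimates)), key=lambda i: estimates[i])


rng = random.Random(0)
estimates = [0.2, 0.7, 0.5]
greedy_choice = choose_arm(estimates, eps=0.0, rng=rng)   # never explores
random_choice = choose_arm(estimates, eps=1.0, rng=rng)   # always explores
```

With `eps=0.0` the call always returns the index of the best estimate, and with `eps=1.0` it ignores the estimates entirely.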
EpsilonFirstStrategy(eps)
Bases: SemiUniformStrategy
Epsilon-first bandit strategy.
The epsilon-first strategy has a pure exploration phase followed by a pure exploitation phase.
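One way to picture the two phases is as a step function over time. The sketch below assumes the total number of steps is known in advance and that the exploration phase covers the first `eps * total_steps` steps, rounded down. The function name and its `step` and `total_steps` parameters are hypothetical and are not taken from the library.

```python
def epsilon_first_effective_eps(eps, step, total_steps):
    """Return the exploration probability at a zero-based `step`.

    Assumed behaviour: explore with certainty during the first
    eps * total_steps steps, then exploit with certainty.
    """
    if not 0 <= eps <= 1:
        raise ValueError("eps must be between 0 and 1")
    explore_steps = int(eps * total_steps)  # length of the exploration phase
    return 1.0 if step < explore_steps else 0.0


# Exploration probability at each of 8 steps when a quarter of them explore.
schedule = [epsilon_first_effective_eps(0.25, t, 8) for t in range(8)]
```

Here the first two entries of `schedule` are `1.0` and the remaining six are `0.0`.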
Parameters:
Name | Type | Description | Default |
---|---|---|---|
eps | float | The ratio of exploration steps (must be between 0 and 1). | required |
Source code in mabby/strategies/semi_uniform.py
EpsilonGreedyStrategy(eps)
Bases: SemiUniformStrategy
Epsilon-greedy bandit strategy.
The epsilon-greedy strategy has a fixed chance of exploration every time step.
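To make the fixed exploration chance concrete, the following sketch runs an epsilon-greedy loop on a simulated Bernoulli bandit. It is an illustration under stated assumptions rather than mabby's code: `run_epsilon_greedy`, `arm_probs`, and the sample-average update of the estimates are all hypothetical choices made for this example.

```python
import random


def run_epsilon_greedy(arm_probs, eps, steps, seed=0):
    """Simulate epsilon-greedy play on Bernoulli arms (illustrative only).

    Returns the number of plays per arm and the final estimated values.
    """
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)
    estimates = [0.0] * len(arm_probs)
    for _ in range(steps):
        if rng.random() < eps:
            # Explore with the same fixed chance at every time step.
            arm = rng.randrange(len(arm_probs))
        else:
            # Exploit the arm with the greatest current estimate.
            arm = max(range(len(arm_probs)), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental sample-average update of the played arm's estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


counts, estimates = run_epsilon_greedy([0.1, 0.9], eps=0.1, steps=500)
```

In this sketch, setting `eps=0.0` keeps every play on the first arm, because the estimates start tied at zero and ties go to the lowest index. That is the failure mode a nonzero exploration chance is meant to avoid.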
Parameters:
Name | Type | Description | Default |
---|---|---|---|
eps | float | The chance of exploration (must be between 0 and 1). | required |
Source code in mabby/strategies/semi_uniform.py
RandomStrategy()
Bases: SemiUniformStrategy
Random bandit strategy.
The random strategy chooses arms at random, i.e., it explores with 100% chance.
Source code in mabby/strategies/semi_uniform.py
SemiUniformStrategy()
Bases: Strategy, ABC, EnforceOverrides
Base class for semi-uniform bandit strategies.
Every semi-uniform strategy must implement `effective_eps` to compute the chance of exploration at each time step.
Source code in mabby/strategies/semi_uniform.py
effective_eps()
abstractmethod
Returns the effective epsilon value.
The effective epsilon value is the probability that the bandit will explore rather than exploit at the current time step. Depending on the strategy, it may differ from the nominal epsilon value the strategy was configured with.
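The relationship between nominal and effective epsilon can be illustrated with a small class hierarchy. This is a sketch using only the standard library's `abc` module. The `Sketch*` class names are hypothetical stand-ins and do not reproduce the library's classes, constructors, or its `EnforceOverrides` base.

```python
from abc import ABC, abstractmethod


class SketchSemiUniform(ABC):
    """Hypothetical base class: subclasses must supply effective_eps."""

    @abstractmethod
    def effective_eps(self) -> float:
        """Return the probability of exploring at the current time step."""


class SketchRandom(SketchSemiUniform):
    """Always explores, so the effective epsilon is 1."""

    def effective_eps(self) -> float:
        return 1.0


class SketchEpsilonGreedy(SketchSemiUniform):
    """Effective epsilon equals the nominal value at every step."""

    def __init__(self, eps: float) -> None:
        if not 0 <= eps <= 1:
            raise ValueError("eps must be between 0 and 1")
        self.eps = eps

    def effective_eps(self) -> float:
        return self.eps


strategies = [SketchRandom(), SketchEpsilonGreedy(0.1)]
effective_values = [s.effective_eps() for s in strategies]
```

Because `effective_eps` is abstract, instantiating `SketchSemiUniform` directly raises `TypeError`, which mirrors the requirement that every semi-uniform strategy provide its own implementation.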
Source code in mabby/strategies/semi_uniform.py