strategy
Provides the Strategy class.
Strategy()
Bases: ABC, EnforceOverrides
Base class for a bandit strategy.
A strategy provides the computational logic for choosing which bandit arms to play and updating parameter estimates.
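To make the shape of this interface concrete, here is a minimal, self-contained sketch. It is not mabby's source: `StrategySketch` is a hypothetical stand-in that mirrors the members documented on this page, `GreedySketch` is an illustrative subclass, and `run_trial` is an assumed driver loop. Plain lists replace the NDArray properties, `random.Random` stands in for the NumPy Generator, and the EnforceOverrides base is omitted.

```python
import random
from abc import ABC, abstractmethod


class StrategySketch(ABC):
    """Hypothetical stand-in mirroring the documented Strategy interface."""

    @abstractmethod
    def __repr__(self): ...

    @abstractmethod
    def prime(self, k, steps): ...

    @abstractmethod
    def choose(self, rng): ...

    @abstractmethod
    def update(self, choice, reward, rng=None): ...

    @property
    @abstractmethod
    def Qs(self): ...

    @property
    @abstractmethod
    def Ns(self): ...


class GreedySketch(StrategySketch):
    """Illustrative subclass: always plays the arm with the highest estimate."""

    def __repr__(self):
        return "greedy (sketch)"

    def prime(self, k, steps):
        # Reset per-trial state: one estimate and one play count per arm.
        self._Qs = [0.0] * k
        self._Ns = [0] * k

    def choose(self, rng):
        # Exploit only; ties go to the lowest arm index.
        return max(range(len(self._Qs)), key=self._Qs.__getitem__)

    def update(self, choice, reward, rng=None):
        # Incremental sample average of the rewards seen for this arm.
        self._Ns[choice] += 1
        self._Qs[choice] += (reward - self._Qs[choice]) / self._Ns[choice]

    @property
    def Qs(self):
        return self._Qs

    @property
    def Ns(self):
        return self._Ns


def run_trial(strategy, true_means, steps, seed=0):
    """Assumed driver loop: prime once, then alternate choose and update."""
    rng = random.Random(seed)
    strategy.prime(len(true_means), steps)
    for _ in range(steps):
        arm = strategy.choose(rng)
        reward = rng.gauss(true_means[arm], 1.0)  # noisy reward for that arm
        strategy.update(arm, reward, rng)
    return strategy
```

Instantiating `StrategySketch` directly raises `TypeError` because its abstract members are unimplemented, which is the same mechanism the ABC base provides for the real class.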
Source code in mabby/strategies/strategy.py
Ns: NDArray[np.uint32] (property, abstractmethod)
The number of times each arm has been played.
Qs: NDArray[np.float64] (property, abstractmethod)
The current estimated action values for each arm.
__repr__() (abstractmethod)
Returns a string representation of the strategy.
agent(**kwargs)
Creates an agent following the strategy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | str | Parameters for initializing the agent (see | {} |
Returns:
Type | Description |
---|---|
Agent | The created agent with the strategy. |
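The factory pattern described here can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `AgentSketch` is a hypothetical stand-in for the real Agent class, and its `name` field is an assumed example keyword. The only behavior taken from this page is that `agent` accepts keyword arguments and returns an agent bound to the strategy.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AgentSketch:
    """Hypothetical agent: holds a strategy plus optional settings."""

    strategy: Any
    name: Optional[str] = None  # assumed keyword, for illustration only


class StrategyWithFactory:
    """Hypothetical strategy showing only the factory method."""

    def agent(self, **kwargs):
        # Forward the keyword arguments unchanged and bind this strategy.
        return AgentSketch(strategy=self, **kwargs)


strategy = StrategyWithFactory()
named = strategy.agent(name="baseline")
default = strategy.agent()
```

Forwarding `**kwargs` keeps the strategy unaware of the agent's constructor details, so new agent options would not require changes to the strategy.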
choose(rng) (abstractmethod)
Returns the next arm to play.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rng | Generator | A random number generator. | required |
Returns:
Type | Description |
---|---|
int | The index of the arm to play. |
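How a subclass picks the arm is left to the implementation. As one possible example, the sketch below uses epsilon-greedy selection, a common rule chosen here for illustration and not stated by this page. The function name and the `eps` parameter are assumptions. It calls only `rng.random()`, which both the standard library's `random.Random` (used here) and NumPy's Generator provide.

```python
import random


def choose_epsilon_greedy(Qs, eps, rng):
    """Return an arm index: explore with probability eps, otherwise exploit."""
    k = len(Qs)
    if rng.random() < eps:
        # Explore: pick an arm uniformly at random.
        # min() guards against floating-point rounding up to k.
        return min(int(rng.random() * k), k - 1)
    # Exploit: arm with the highest current estimate (lowest index on ties).
    return max(range(k), key=lambda i: Qs[i])


rng = random.Random(0)
greedy_arm = choose_epsilon_greedy([0.1, 0.5, 0.3], eps=0.0, rng=rng)
random_arms = [choose_epsilon_greedy([0.1, 0.5, 0.3], eps=1.0, rng=rng) for _ in range(20)]
```

With `eps=0.0` the function always returns the arm with the highest estimate; with `eps=1.0` every call explores.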
prime(k, steps) (abstractmethod)
Primes the strategy before running a trial.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
k | int | The number of bandit arms to choose from. | required |
steps | int | The number of steps the simulation will be run for. | required |
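What priming does is up to each subclass. A plausible pattern, sketched below as an assumption, is to reset all per-trial state so that one strategy object can be reused across trials. `PrimedStateSketch` and its attribute names are hypothetical, and plain lists stand in for the NDArray properties.

```python
class PrimedStateSketch:
    """Hypothetical strategy fragment showing only the priming step."""

    def prime(self, k, steps):
        # Remember the trial dimensions; a schedule could use `steps` later.
        self.k = k
        self.steps = steps
        # Start every arm with a zero estimate and zero plays.
        self.Qs = [0.0] * k
        self.Ns = [0] * k


strategy = PrimedStateSketch()
strategy.prime(k=3, steps=100)
strategy.Ns[0] = 5              # state accumulated during a first trial
strategy.prime(k=4, steps=10)   # priming again discards it
```

After the second call, the counts and estimates have length 4 and are all zero, so nothing leaks from the earlier trial.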
update(choice, reward, rng=None) (abstractmethod)
Updates internal parameter estimates based on reward observation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
choice | int | The most recent choice made. | required |
reward | float | The observed reward from the agent's most recent choice. | required |
rng | Generator \| None | A random number generator. | None |
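The update rule itself depends on the subclass. One common choice, shown here as an illustrative sketch and not as the library's rule, is the incremental sample average: after the n-th play of an arm, its estimate moves toward the new reward by a step of 1/n. The function name is hypothetical, and the optional `rng` argument is left out because this rule is deterministic.

```python
def update_sample_average(Qs, Ns, choice, reward):
    """Update the play count and running mean reward of the chosen arm in place."""
    Ns[choice] += 1
    # New mean = old mean + (reward - old mean) / number of plays.
    Qs[choice] += (reward - Qs[choice]) / Ns[choice]


Qs = [0.0, 0.0]
Ns = [0, 0]
for reward in (1.0, 2.0, 3.0):
    update_sample_average(Qs, Ns, choice=0, reward=reward)
```

After these three updates, arm 0 has been played three times and its estimate equals the mean of the observed rewards, while arm 1 is untouched.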