arms
Provides the Arm base class with some common reward distributions.
Arm(**kwargs)
Bases: ABC, EnforceOverrides
Base class for a bandit arm implementing a reward distribution.
An arm represents one of the decision choices available to the agent in a bandit problem. It has a hidden reward distribution and can be played by the agent to generate observable rewards.
Source code in mabby/arms.py
mean: float
property
abstractmethod
The mean reward of the arm.
Returns:
Type | Description |
---|---|
float | The computed mean of the arm's reward distribution. |
__repr__()
abstractmethod
Returns the string representation of the arm.
bandit(rng=None, seed=None, **kwargs)
classmethod
Creates a bandit with arms of the same reward distribution type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rng | Generator \| None | A random number generator. | None |
seed | int \| None | A seed for random number generation if | None |
**kwargs | list[float] | A dictionary where keys are arm parameter names and values are lists of parameter values for each arm. | {} |
Returns:
Type | Description |
---|---|
Bandit | A bandit with the specified arms. |
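The `**kwargs` convention described above (parameter names mapped to lists of per-arm values) can be illustrated with a small standalone sketch. The helper `expand_arm_params` is hypothetical and is not part of mabby; it only shows one plausible way such keyword lists could be split into one parameter dictionary per arm.

```python
def expand_arm_params(**kwargs: list[float]) -> list[dict[str, float]]:
    """Split per-parameter lists into one parameter dict per arm (hypothetical helper)."""
    names = list(kwargs)
    # zip(*values) groups the i-th value of every parameter list together.
    # zip stops at the shortest list, so unequal lengths are silently truncated.
    return [dict(zip(names, values)) for values in zip(*kwargs.values())]


# Two Gaussian-style arms described by parallel lists of loc and scale values.
params = expand_arm_params(loc=[0.0, 1.5], scale=[1.0, 0.5])
print(params)  # [{'loc': 0.0, 'scale': 1.0}, {'loc': 1.5, 'scale': 0.5}]
```

Under this reading, a call of the documented form `BernoulliArm.bandit(p=[0.2, 0.8])` would describe two arms, one per list entry. How mabby itself handles lists of unequal length is not stated here, so the truncation shown in the sketch is an assumption.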
play(rng)
abstractmethod
Plays the arm and samples a reward.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rng | Generator | A random number generator. | required |
Returns:
Type | Description |
---|---|
float | The sampled reward from the arm's reward distribution. |
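To make the abstract interface above concrete, the following is a minimal standalone sketch, not mabby's source code. `SketchArm` mirrors the three documented abstract members (`mean`, `play`, `__repr__`), `UniformArm` is a hypothetical subclass invented for illustration, and the standard library's `random.Random` stands in for the `Generator` type named in the tables (assumed here to be NumPy's).

```python
import random
from abc import ABC, abstractmethod


class SketchArm(ABC):
    """Stand-in mirroring the documented Arm interface (not mabby's class)."""

    @property
    @abstractmethod
    def mean(self) -> float:
        """The mean reward of the arm."""

    @abstractmethod
    def play(self, rng: random.Random) -> float:
        """Plays the arm and samples a reward."""

    @abstractmethod
    def __repr__(self) -> str:
        """Returns the string representation of the arm."""


class UniformArm(SketchArm):
    """Hypothetical arm with rewards drawn uniformly between low and high."""

    def __init__(self, low: float, high: float):
        self.low = low
        self.high = high

    @property
    def mean(self) -> float:
        # The mean of a uniform distribution is the midpoint of its range.
        return (self.low + self.high) / 2

    def play(self, rng: random.Random) -> float:
        return rng.uniform(self.low, self.high)

    def __repr__(self) -> str:
        return f"UniformArm(low={self.low}, high={self.high})"


arm = UniformArm(low=0.0, high=2.0)
rng = random.Random(0)  # seeded stdlib generator, standing in for Generator
rewards = [arm.play(rng) for _ in range(1000)]
print(arm, arm.mean, sum(rewards) / len(rewards))
```

Because every member of `SketchArm` is abstract, instantiating it directly raises `TypeError`; only subclasses that define all three members can be created. The agent sees only the values returned by `play`, while `mean` describes the hidden distribution.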
BernoulliArm(p)
Bases: Arm
Bandit arm with a Bernoulli reward distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
p | float | Parameter of the Bernoulli distribution. | required |
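As an illustration of what a Bernoulli reward distribution means for an arm, here is a standalone sketch that does not use mabby. `SketchBernoulliArm` is a hypothetical stand-in, the range check on `p` is an assumption rather than documented behavior, and `random.Random` replaces the `Generator` argument described for `play`.

```python
import random


class SketchBernoulliArm:
    """Stand-in Bernoulli arm: reward 1 with probability p, otherwise 0."""

    def __init__(self, p: float):
        # Assumed validation: a probability must lie in [0, 1].
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must lie in [0, 1]")
        self.p = p

    @property
    def mean(self) -> float:
        # The mean of a Bernoulli(p) distribution is p.
        return self.p

    def play(self, rng: random.Random) -> float:
        # rng.random() is uniform on [0, 1), so this is 1.0 with probability p.
        return 1.0 if rng.random() < self.p else 0.0


arm = SketchBernoulliArm(p=0.3)
rng = random.Random(42)
rewards = [arm.play(rng) for _ in range(10_000)]
empirical_mean = sum(rewards) / len(rewards)
print(arm.mean, empirical_mean)
```

With many plays, the empirical mean of the sampled rewards should approach `p`, which is the quantity a bandit agent is trying to estimate for each arm.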
GaussianArm(loc, scale)
Bases: Arm
Bandit arm with a Gaussian reward distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
loc | float | Mean ("center") of the Gaussian distribution. | required |
scale | float | Standard deviation of the Gaussian distribution. | required |
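The roles of `loc` and `scale` can be sketched the same way, again without relying on mabby. `SketchGaussianArm` is a hypothetical stand-in, the non-negativity check on `scale` is an assumption, and `random.Random.gauss` stands in for sampling from the `Generator` passed to `play`.

```python
import random


class SketchGaussianArm:
    """Stand-in Gaussian arm: rewards drawn from a normal distribution."""

    def __init__(self, loc: float, scale: float):
        # Assumed validation: a standard deviation cannot be negative.
        if scale < 0:
            raise ValueError("scale must be non-negative")
        self.loc = loc
        self.scale = scale

    @property
    def mean(self) -> float:
        # The mean of a Gaussian distribution is its center, loc.
        return self.loc

    def play(self, rng: random.Random) -> float:
        # gauss(mu, sigma) samples a normal with mean mu and std. deviation sigma.
        return rng.gauss(self.loc, self.scale)


arm = SketchGaussianArm(loc=1.0, scale=2.0)
rng = random.Random(7)
rewards = [arm.play(rng) for _ in range(10_000)]
empirical_mean = sum(rewards) / len(rewards)
print(arm.mean, empirical_mean)
```

A larger `scale` spreads individual rewards further from `loc`, so more plays are needed before the empirical mean settles near the true mean.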