mabby is a library for simulating multi-armed bandits (MABs), a classic resource-allocation problem studied in reinforcement learning. It lets users quickly yet flexibly define and run bandit simulations, with the ability to:
- choose from a wide range of classic bandit algorithms
- configure environments with custom arm spaces and reward distributions
- collect and visualize simulation metrics such as regret and optimality
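To make the underlying problem concrete, here is a minimal, self-contained sketch of the loop a bandit simulation runs: at each step an agent picks an arm, observes a stochastic reward, and updates its estimates. It is plain Python, independent of mabby's API, and uses an epsilon-greedy learner purely for illustration; mabby automates this loop and the bookkeeping around it.
import random

def run_epsilon_greedy(p=(0.3, 0.6), eps=0.2, steps=300, seed=0):
    """Play a two-armed Bernoulli bandit with an epsilon-greedy learner."""
    rng = random.Random(seed)
    counts = [0] * len(p)       # times each arm has been played
    estimates = [0.0] * len(p)  # running mean reward per arm
    total_reward = 0
    for _ in range(steps):
        if rng.random() < eps:                          # explore a random arm
            arm = rng.randrange(len(p))
        else:                                           # exploit the current best estimate
            arm = max(range(len(p)), key=lambda a: estimates[a])
        reward = 1 if rng.random() < p[arm] else 0      # Bernoulli reward
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean update
        total_reward += reward
    return estimates, total_reward

estimates, total_reward = run_epsilon_greedy()
print(estimates, total_reward)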
Installation
Prerequisites: Python 3.9+ and pip
Install mabby with pip:
pip install mabby
Basic Usage
The code example below demonstrates the basic steps of running a simulation with mabby. For more in-depth examples, please see the Usage Examples section of the mabby documentation.
import mabby as mb
# configure a bandit with two Bernoulli arms (success probabilities 0.3 and 0.6)
bandit = mb.BernoulliArm.bandit(p=[0.3, 0.6])
# configure bandit strategy
strategy = mb.strategies.EpsilonGreedyStrategy(eps=0.2)
# set up the simulation
simulation = mb.Simulation(bandit=bandit, strategies=[strategy])
# run 100 independent trials of 300 steps each
stats = simulation.run(trials=100, steps=300)
# plot regret statistics
stats.plot_regret()
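Because Simulation accepts a list of strategies, a single run can compare several of them on the same bandit. The sketch below is illustrative and reuses only the calls shown above, comparing two epsilon-greedy configurations; how the regret plot lays out the strategies, and the names of other strategies and metric plots (e.g. for optimality), should be checked against the API reference.
import mabby as mb

# same two-armed Bernoulli bandit as above
bandit = mb.BernoulliArm.bandit(p=[0.3, 0.6])

# two configurations of the same strategy, compared in one simulation
strategies = [
    mb.strategies.EpsilonGreedyStrategy(eps=0.1),
    mb.strategies.EpsilonGreedyStrategy(eps=0.2),
]

simulation = mb.Simulation(bandit=bandit, strategies=strategies)
stats = simulation.run(trials=100, steps=300)

# plot regret for the simulated strategies (layout depends on mabby's plotting defaults)
stats.plot_regret()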
Contributing
Please see CONTRIBUTING for more information.
License
This software is licensed under the Apache 2.0 license. Please see LICENSE for more information.