Multi-agent RL
==============

.. automodule:: marl.marl

Agents
======

Base Agent
----------

.. automodule:: marl.agent.agent

Q-value based model
-------------------

.. automodule:: marl.agent.q_agent

Policy Gradient based model
---------------------------

.. automodule:: marl.agent.pg_agent

Multi-agent Policy Gradient based model
---------------------------------------

.. automodule:: marl.agent.maac_agent

Experience
==========

Experience
----------

.. automodule:: marl.experience.experience

ReplayBuffer
------------

.. automodule:: marl.experience.replay_buffer

Exploration
===========

Exploration
-----------

.. automodule:: marl.exploration.expl_process

Eps-Greedy
----------

.. automodule:: marl.exploration.greedy

.. automodule:: marl.exploration.eps_greedy

Ornstein–Uhlenbeck Process
--------------------------

.. automodule:: marl.exploration.ou_noise

Policies
========

Base Policy
-----------

.. automodule:: marl.policy.policy

Several Policies
----------------

.. automodule:: marl.policy.policies

Models
======

Value and Q-Value array
-----------------------

.. automodule:: marl.model.qvalue

Neural network model
--------------------

.. automodule:: marl.model.nn.mlpnet