Drivers

Armen Kasparian edited this page Mar 28, 2024 · 8 revisions

This driver script orchestrates the training and inference processes for reinforcement learning (RL) models using the Gymnasium framework and custom environments. It is designed to be versatile, supporting various agents, environments, and training configurations.

Features

  • Flexible Environment Support: Compatible with standard Gymnasium environments, custom environments, and PACEs environments if available.
  • Configurable Agent and Buffer: Allows specification of the agent type and experience replay buffer settings through command-line arguments or configuration files.
  • Dynamic Logging and Model Saving: Provides detailed logging of the training process and saves model configurations and states to facilitate reproducibility and analysis. Logs and model checkpoints default to ./results in the directory where the command was run and can be visualized with TensorBoard.
  • Inference Mode: Supports running the agent in an inference-only mode to evaluate performance without further training.

Key Components

Command Line Arguments

The script accepts various command-line arguments to specify the training and environment settings:

  • --index: Index for tracking multiple runs.
  • --nepisodes: Number of episodes to run.
  • --nsteps: Maximum number of steps per episode.
  • --bsize: Experience replay buffer size.
  • --btype: Type of experience replay buffer.
  • --agent: Identifier for the RL agent to use.
  • --env: Identifier for the Gymnasium environment.
  • --logdir: Directory to save training logs and model states.
  • --inference: Flag to run the agent in inference mode only.
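The wiki does not show the parsing code itself, but the arguments above could be wired up with argparse roughly as follows. This is a hypothetical reconstruction: the defaults shown are illustrative, not the script's actual values.

```python
import argparse

def parse_args(argv=None):
    # Hypothetical reconstruction of the driver's CLI; defaults are illustrative.
    parser = argparse.ArgumentParser(description="RL training/inference driver")
    parser.add_argument("--index", type=int, default=0,
                        help="Index for tracking multiple runs")
    parser.add_argument("--nepisodes", type=int, default=100,
                        help="Number of episodes to run")
    parser.add_argument("--nsteps", type=int, default=1000,
                        help="Maximum number of steps per episode")
    parser.add_argument("--bsize", type=int, default=100000,
                        help="Experience replay buffer size")
    parser.add_argument("--btype", type=str, default="uniform",
                        help="Type of experience replay buffer")
    parser.add_argument("--agent", type=str, required=True,
                        help="Identifier for the RL agent")
    parser.add_argument("--env", type=str, required=True,
                        help="Identifier for the Gymnasium environment")
    parser.add_argument("--logdir", type=str, default="./results",
                        help="Directory for training logs and model states")
    # The usage examples below pass an explicit value (--inference True),
    # so a string-to-bool converter is used rather than a store_true flag.
    parser.add_argument("--inference", type=lambda s: s.lower() == "true",
                        default=False, help="Run in inference-only mode")
    return parser.parse_args(argv)
```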

Main Process Flow

  1. Initialization: Parses command-line arguments and sets up the logging directory based on the provided configuration.
  2. Environment Setup: Initializes the specified Gymnasium environment, including custom and PACEs environments if available.
  3. Agent Initialization: Creates the RL agent with the specified parameters, including the experience replay buffer configuration.
  4. Training Loop: Executes the training process for the specified number of episodes, logging episodic rewards and saving model checkpoints based on performance improvements.
  5. Inference Loop: If in inference mode, runs the agent through the environment without training to evaluate its performance.
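Steps 4 and 5 can be sketched as an episode loop over the standard Gymnasium reset/step API. The sketch below uses a stub environment so it is self-contained; StubEnv, run_training, and the policy callable are illustrative stand-ins, not names from the actual script.

```python
import random

class StubEnv:
    """Stand-in for a Gymnasium environment (same reset/step return shapes)."""
    def reset(self, seed=None):
        self.t = 0
        return 0.0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        reward = random.random()
        terminated = self.t >= 5
        # (observation, reward, terminated, truncated, info)
        return float(self.t), reward, terminated, False, {}

def run_training(env, nepisodes, nsteps, policy):
    """Sketch of the driver's training loop: per-episode reward logging and
    best-performance tracking (agent updates and checkpointing are elided)."""
    best_reward = float("-inf")
    history = []
    for ep in range(nepisodes):
        obs, _ = env.reset(seed=ep)
        ep_reward = 0.0
        for _ in range(nsteps):
            action = policy(obs)
            obs, reward, terminated, truncated, _ = env.step(action)
            ep_reward += reward
            if terminated or truncated:
                break
        history.append(ep_reward)
        if ep_reward > best_reward:
            # A real driver would save a model checkpoint here.
            best_reward = ep_reward
    return history, best_reward
```

The inference loop is the same skeleton with the agent's learning updates and checkpoint saving removed.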

Utility Functions

  • run_opt: The core function that encapsulates the training/inference workflow.
  • Logging and Configuration Saving: Throughout the script, extensive logging and model/configuration saving mechanisms are in place to ensure transparency and reproducibility.
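A minimal sketch of the configuration-saving step might look like the following; the function name and the config.json filename are assumptions for illustration, not the script's actual API.

```python
import json
import os

def save_run_config(logdir, config):
    """Write the run's settings to the log directory so the run can be
    reproduced later. Filename and JSON format are illustrative."""
    os.makedirs(logdir, exist_ok=True)
    path = os.path.join(logdir, "config.json")
    with open(path, "w") as f:
        json.dump(config, f, indent=2, sort_keys=True)
    return path
```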

Usage

To run the script, use the following command with the desired arguments:

python rl_driver_script.py --agent [AGENT_ID] --env [ENV_ID] --nepisodes [NUM_EPISODES] --logdir [LOG_DIRECTORY] --inference [INFERENCE_FLAG]

Inference Mode

In this mode, training is skipped and the model runs inference in the specified environment. Be sure to set the path of the model you would like to test in the selected agent's configuration file (the load_model parameter).

python rl_driver_script.py --agent [AGENT_ID] --env [ENV_ID] --nepisodes [NUM_EPISODES] --logdir [LOG_DIRECTORY] --inference True
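This page does not show the agent configuration schema, so as a purely hypothetical illustration, the relevant entry might look something like this (key layout and path depend on the actual agent config format):

```yaml
# Hypothetical agent configuration fragment; the real schema is defined by
# the selected agent's configuration file, not by this wiki page.
load_model: ./results/my_run/best_model.pt
```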
