
Environments

Armen Kasparian edited this page Mar 28, 2024 · 3 revisions

Creating Custom Gymnasium Environments

Creating custom environments in Gymnasium lets you design and test reinforcement learning algorithms on a wide variety of tasks. This guide outlines how to create one, using a 2D circle environment (Circle2D) as the running example: an agent moves within a 2-dimensional space and tries to reach a target distance from the origin. A custom environment subclasses gymnasium.Env and implements a small set of core methods and properties that define its dynamics, that is, how the agent acts on the environment and how the environment responds. Below is a breakdown of the essential functions and properties.

Core Functions

__init__(self, ...)

  • Purpose: Initializes the environment. This includes setting up any necessary parameters, the observation space, and the action space.
  • Parameters: Environment-specific parameters such as dimensions of the observation space, action space configurations, and any other initialization arguments.
  • Returns: None.

step(self, action)

  • Purpose: Advances the environment by one timestep using the action provided by the agent. It returns the next observation, the reward, two end-of-episode flags, and additional information.
  • Parameters:
    • action: An action provided by the agent; it should be an element of action_space.
  • Returns:
    • observation (object): The agent's observation of the environment after the action has been applied.
    • reward (float): The reward earned by taking the action.
    • terminated (bool): Whether the episode has ended because the agent reached a terminal state of the task, such as the finish. (The older Gym API called this flag done.)
    • truncated (bool): Whether the episode has been cut off from outside the task, typically by running out of steps.
    • info (dict): Contains auxiliary diagnostic information.

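reset(self, *, seed=None, options=None)

  • Purpose: Resets the environment to an initial state and returns an initial observation along with auxiliary information.
  • Parameters:
    • seed (int, optional): Seed for the environment's random number generator. Pass it on with super().reset(seed=seed).
    • options (dict, optional): Extra, environment-specific settings for the reset.
  • Returns:
    • observation (object): The initial observation.
    • info (dict): Contains auxiliary diagnostic information, as in step.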

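render(self)

  • Purpose: (Optional) Renders the environment, for example for human viewing.
  • Parameters: None. In Gymnasium the render mode is chosen once, through a render_mode argument to __init__, rather than on each call.
  • Returns: Varies depending on the render mode, for example None for "human", a NumPy array for "rgb_array", or a string for "ansi".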

close(self)

  • Purpose: (Optional) Performs any necessary cleanup.
  • Parameters: None.
  • Returns: None.

Properties

action_space

  • Purpose: Specifies the space of valid actions (e.g., spaces.Discrete, spaces.Box, etc.).
  • Type: gymnasium.spaces.Space instance.

observation_space

  • Purpose: Specifies the space of valid observations (e.g., spaces.Discrete, spaces.Box, etc.).
  • Type: gymnasium.spaces.Space instance.

Optional Methods

seed(self, seed=None)

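  • Purpose: Sets the seed for the environment's random number generator(s). This method belongs to the older Gym API; Gymnasium environments are seeded by passing seed to reset instead, so implement seed only when legacy code still calls it.
  • Parameters:
    • seed (int): The seed to use.
  • Returns: List of seeds used in the environment's random number generators.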

Custom Methods

  • Purpose: Implement any environment-specific methods needed for more complex environments.
  • Parameters: Varies depending on the method.
  • Returns: Varies depending on the method.
