Environments
Armen Kasparian edited this page Mar 28, 2024 · 3 revisions
Custom environments in Gymnasium make it possible to design and test reinforcement learning algorithms on a wide variety of tasks. This guide outlines how to create one, using a 2D Circle environment (Circle2D) as the example: an agent moves within a 2-dimensional space to reach a target distance from the origin. A custom environment implements a set of core methods and properties that define its dynamics, including how agents interact with it and how it responds. The essential methods and properties are described below.
__init__
- Purpose: Initializes the environment. This includes setting up any necessary parameters, the observation space, and the action space.
- Parameters: Environment-specific parameters such as the dimensions of the observation space, the action space configuration, and any other initialization arguments.
- Returns: None.
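To illustrate the constructor described above, here is a minimal sketch for the Circle2D example. It deliberately avoids importing Gymnasium so that it runs anywhere; in a real environment the class would subclass gymnasium.Env, and the two bounds tuples would be spaces.Box instances. The class name Circle2DSketch and every parameter (target_radius, max_step, bound, max_episode_steps) are assumptions made for this example, not the page author's implementation.

```python
class Circle2DSketch:
    """Hypothetical, dependency-free stand-in for a Circle2D constructor."""

    def __init__(self, target_radius=1.0, max_step=0.1, bound=2.0,
                 max_episode_steps=200):
        if not 0.0 < target_radius <= bound:
            raise ValueError("target_radius must lie in (0, bound]")
        if max_step <= 0.0:
            raise ValueError("max_step must be positive")

        # Environment-specific parameters (all assumed for this sketch).
        self.target_radius = target_radius
        self.max_episode_steps = max_episode_steps

        # Per-axis (low, high) limits. With Gymnasium these would be
        # spaces.Box instances assigned to observation_space and action_space.
        self.observation_bounds = (-bound, bound)
        self.action_bounds = (-max_step, max_step)

        # Mutable episode state, which a reset method would reinitialize.
        self.position = (0.0, 0.0)
        self.steps_taken = 0


env = Circle2DSketch(target_radius=1.5)
print(env.observation_bounds, env.action_bounds)
```

Validating the arguments in the constructor means a misconfigured environment fails immediately rather than partway through training.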
step
- Purpose: Advances the environment by one timestep using the action provided by the agent. It returns the next observation, the reward, whether the episode has finished or been truncated, and additional information.
- Parameters:
  - action: An action provided by the agent.
- Returns:
  - observation (object): The agent's observation of the environment after the action is applied.
  - reward (float): Amount of reward returned for the action just taken.
  - done (bool): Whether the episode has ended by reaching the finish. Gymnasium names this flag terminated.
  - truncated (bool): Whether the episode has ended by running out of steps.
  - info (dict): Auxiliary diagnostic information.
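The five return values can be made concrete with a small sketch of the step logic for the Circle2D example, written as a plain function so it runs without Gymnasium. The reward rule (negative distance error), the tolerance, and all names here (circle_step, steps_taken, tolerance, max_episode_steps) are assumptions for illustration; the actual environment may define them differently.

```python
import math


def circle_step(position, action, steps_taken, target_radius=1.0,
                tolerance=0.05, max_episode_steps=200):
    """Advance a hypothetical Circle2D state by one timestep."""
    x, y = position
    dx, dy = action
    new_position = (x + dx, y + dy)

    # How far the agent's distance from the origin is from the target distance.
    distance = math.hypot(*new_position)
    error = abs(distance - target_radius)

    reward = -error                   # assumed rule: smaller error is better
    terminated = error <= tolerance   # reached the finish ("done")
    # Ran out of steps without having reached the finish.
    truncated = not terminated and steps_taken + 1 >= max_episode_steps
    info = {"distance": distance, "steps_taken": steps_taken + 1}

    return new_position, reward, terminated, truncated, info


obs, reward, terminated, truncated, info = circle_step(
    (0.0, 0.0), (0.5, 0.0), steps_taken=0)
print(obs, reward, terminated, truncated)
```

Keeping the two end-of-episode flags separate lets a learning algorithm distinguish reaching the goal from simply hitting the step limit.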
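reset
- Purpose: Resets the environment to an initial state and returns an initial observation.
- Parameters: None required. Gymnasium's reset also accepts the optional keyword arguments seed and options.
- Returns:
  - observation (object): The initial observation. Gymnasium's reset returns it together with an info dict, as the pair (observation, info).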
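A reset along these lines might look like the following sketch, which returns a Gymnasium-style (observation, info) pair and accepts an optional seed. It uses only the standard library; the class name ResettableSketch, the bound parameter, and the rule of starting at a uniformly random point are assumptions for this example.

```python
import random


class ResettableSketch:
    """Hypothetical stand-in showing only the reset logic."""

    def __init__(self, bound=2.0):
        self.bound = bound
        self.rng = random.Random()
        self.position = (0.0, 0.0)
        self.steps_taken = 0

    def reset(self, seed=None):
        # Re-seed only when asked, so later resets continue the same stream.
        if seed is not None:
            self.rng = random.Random(seed)

        # Assumed start-state rule: a uniformly random point in the square.
        self.position = (self.rng.uniform(-self.bound, self.bound),
                         self.rng.uniform(-self.bound, self.bound))
        self.steps_taken = 0

        observation = self.position
        info = {}
        return observation, info   # Gymnasium-style pair


env = ResettableSketch()
first, _ = env.reset(seed=7)
second, _ = env.reset(seed=7)
print(first == second)   # same seed, same starting position
```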
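render
- Purpose: (Optional) Renders the environment for human viewing.
- Parameters:
  - mode (str): The mode to render with. In current Gymnasium releases the mode is instead fixed at construction through the render_mode argument, and render takes no arguments.
- Returns: Varies depending on the rendering mode.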
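As a rough sketch of how the return value can vary with the mode, the function below renders a Circle2D state as text. The function name render_circle and the output format are assumptions for this example; only the mode names human and ansi follow common Gym usage.

```python
import math


def render_circle(position, target_radius, mode="human"):
    """Describe a hypothetical Circle2D state as a line of text."""
    x, y = position
    text = (f"agent=({x:.2f}, {y:.2f}) "
            f"distance={math.hypot(x, y):.2f} target={target_radius:.2f}")

    if mode == "ansi":
        return text    # the caller receives the string
    if mode == "human":
        print(text)    # displayed directly, nothing returned
        return None
    raise ValueError(f"unsupported render mode: {mode!r}")


render_circle((3.0, 4.0), target_radius=1.0)
```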
close
- Purpose: (Optional) Performs any necessary cleanup.
- Parameters: None.
- Returns: None.
action_space
- Purpose: Specifies the space of valid actions (e.g., spaces.Discrete, spaces.Box, etc.).
- Type: gym.spaces instance.
observation_space
- Purpose: Specifies the space of valid observations (e.g., spaces.Discrete, spaces.Box, etc.).
- Type: gym.spaces instance.
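To show what "a space of valid actions or observations" means in practice, here is a dependency-free sketch of a bounded continuous space. BoxSketch is a hypothetical stand-in, not the library class; it mimics only the sample and contains operations that spaces.Box provides, and the bounds chosen for Circle2D are assumptions.

```python
import random


class BoxSketch:
    """Hypothetical stand-in for a bounded continuous space such as spaces.Box."""

    def __init__(self, low, high, size, seed=None):
        if low > high:
            raise ValueError("low must not exceed high")
        self.low, self.high, self.size = low, high, size
        self._rng = random.Random(seed)

    def sample(self):
        # Draw one valid element uniformly from the box.
        return tuple(self._rng.uniform(self.low, self.high)
                     for _ in range(self.size))

    def contains(self, value):
        # Valid if the length matches and every entry is within the limits.
        return (len(value) == self.size
                and all(self.low <= v <= self.high for v in value))


action_space = BoxSketch(low=-0.1, high=0.1, size=2, seed=0)
observation_space = BoxSketch(low=-2.0, high=2.0, size=2)

action = action_space.sample()
print(action_space.contains(action))       # a sampled action is always valid
print(action_space.contains((0.5, 0.0)))   # 0.5 exceeds the per-step limit
```

Declaring both spaces up front lets agents and wrappers check shapes and bounds before any step is taken.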
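seed
- Purpose: Sets the seed for the environment's random number generator(s). This method belongs to the legacy Gym API; in Gymnasium, seeding is done through reset(seed=...).
- Parameters:
  - seed (int): The seed to use.
- Returns: List of seeds used in the environment's random number generators.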
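A legacy-style seed method matching this description could be sketched as follows. It uses the standard library's random module rather than Gym's own seeding utilities, and the class name SeedableSketch and the rule for choosing a seed when none is given are assumptions for this example.

```python
import random


class SeedableSketch:
    """Hypothetical stand-in showing only a legacy-style seed method."""

    def __init__(self):
        self.rng = random.Random()

    def seed(self, seed=None):
        # Pick a seed when none is given so the run can be reproduced later.
        if seed is None:
            seed = random.SystemRandom().randrange(2**32)
        self.rng = random.Random(seed)
        return [seed]   # list of seeds in use, one per generator


env = SeedableSketch()
used = env.seed(42)
first_draw = env.rng.random()
env.seed(used[0])
print(used, env.rng.random() == first_draw)
```

Returning the seed actually used, even when it was chosen automatically, is what makes an unseeded run reproducible after the fact.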
Additional methods
- Purpose: Implements any environment-specific behavior needed by more complex environments.
- Parameters: Varies depending on the method.
- Returns: Varies depending on the method.