An API standard for reinforcement learning with a diverse collection of reference environments

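Gymnasium is a maintained fork of OpenAI's Gym library. Its interface is simple, Pythonic, and capable of representing general reinforcement learning problems, and it includes a compatibility wrapper for old Gym environments: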

import gymnasium as gym

# Initialise the environment
env = gym.make("LunarLander-v3", render_mode="human")

# Reset the environment to generate the first observation
observation, info = env.reset(seed=42)
for _ in range(1000):
    # this is where you would insert your policy
    action = env.action_space.sample()

    # step (transition) through the environment with the action
    # receiving the next observation, reward and if the episode has terminated or truncated
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended then we can reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()

env.close()
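
The loop above resets the environment whenever `terminated` or `truncated` is true. As a minimal sketch of how the two flags differ, the example below runs one episode against a toy environment and reports which flag ended it. `CountdownEnv` and `run_episode` are hypothetical names invented for this illustration and are not part of Gymnasium. The toy class only mimics the `reset`/`step` signatures used above so that the sketch runs without extra dependencies.

```python
class CountdownEnv:
    """Hypothetical toy environment mimicking the reset/step signatures above.

    Not a Gymnasium class: it exists only so the loop can run standalone.
    """

    def __init__(self, start=5, max_steps=10):
        self.start = start
        self.max_steps = max_steps
        self.state = start
        self.steps = 0

    def reset(self, seed=None):
        # The toy environment is deterministic, so the seed is accepted but unused.
        self.state = self.start
        self.steps = 0
        return self.state, {}

    def step(self, action):
        # Action 1 decrements the counter; action 0 leaves it unchanged.
        self.state -= action
        self.steps += 1
        # terminated: the task itself reached an end state (counter hit zero).
        terminated = self.state <= 0
        # truncated: the episode was cut off by the step limit instead.
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, truncated, {}


def run_episode(env, policy, seed=None):
    """Run one episode; return (total_reward, steps, reason_for_ending)."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    while True:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            reason = "terminated" if terminated else "truncated"
            return total_reward, steps, reason


if __name__ == "__main__":
    env = CountdownEnv(start=3, max_steps=10)
    print(run_episode(env, lambda obs: 1))  # always decrement
    print(run_episode(env, lambda obs: 0))  # never decrement
```

Because `run_episode` relies only on the `reset` and `step` return values shown in the example above, it should in principle also accept an environment created with `gym.make(...)`, given a policy that returns valid actions for that environment.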