machin.env¶

utils¶

machin.env.utils provides utilities to deal with various environments.

machin.env.utils.openai_gym.disable_view_window()[source]¶

wrappers¶

machin.env.wrappers provides parallel execution wrappers for various environments.

class machin.env.wrappers.base.ParallelWrapperBase(*_, **__)[source]¶

Bases: abc.ABC

Note

A parallel wrapper is designed to wrap environments of the same kind. The wrapped environments may have different parameters, but they must share the same action space and observation space.

abstract reset(idx=None)[source]¶

Reset all environments if idx is None, otherwise reset the environment(s) with the given index(es).

Parameters: idx (Union[int, List[int], None]) – Environment index(es) to be reset.
Returns: Initial observation of all environments. Format is unspecified.
Return type: Any
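The idx argument accepted by reset() and step() can be None, a single integer, or a list of integers. The sketch below shows one way such an argument could be expanded into an explicit list of indexes. The helper name normalize_idx and the range check are assumptions made for illustration and are not part of machin.

```python
from typing import List, Union


def normalize_idx(idx: Union[int, List[int], None], size: int) -> List[int]:
    """Expand an idx argument into an explicit list of environment indexes.

    Hypothetical helper, not part of machin.
    """
    if idx is None:
        # None selects every environment.
        selected = list(range(size))
    elif isinstance(idx, int):
        # A single integer selects exactly one environment.
        selected = [idx]
    else:
        selected = list(idx)
    for i in selected:
        # Assumption: out-of-range indexes are rejected early.
        if not 0 <= i < size:
            raise IndexError(f"environment index {i} out of range for {size} environments")
    return selected


print(normalize_idx(None, 3))    # [0, 1, 2]
print(normalize_idx(1, 3))       # [1]
print(normalize_idx([0, 2], 3))  # [0, 2]
```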

abstract step(action, idx=None)[source]¶

Let the specified environment(s) run one time step. The specified environments must be active, that is, they must not have reached a terminal state.

Parameters

action – Actions to take.
idx (Union[int, List[int], None]) – Environment index(es) to be run.

Returns

New states of environments.

Return type

Any

abstract seed(seed=None)[source]¶

Set seed(s) for all environment(s).

Parameters: seed (Union[int, List[int], None]) – A single integer seed for all environments, or a list of integers for each environment, or None for default seed.
Returns: New seed of each environment.
Return type: List[int]

abstract render(*args, **kwargs)[source]¶

Render all environments.

Return type: Any

abstract close()[source]¶

Close all environments.

Return type: Any

abstract active()[source]¶

Returns: Indexes of active environments.
Return type: List[int]

abstract size()[source]¶

Returns: Number of environments.
Return type: int

abstract property action_space¶: Returns: Action space descriptor.

abstract property observation_space¶: Returns: Observation space descriptor.

exception machin.env.wrappers.openai_gym.GymTerminationError[source]¶: Bases: Exception

class machin.env.wrappers.openai_gym.ParallelWrapperDummy(env_creators)[source]¶

Bases: machin.env.wrappers.base.ParallelWrapperBase

Dummy parallel wrapper for gym environments, implemented using for-loop.

For debugging purposes only.

Parameters: env_creators (List[Callable[[int], gym.core.Env]]) – List of gym environment creators used to create the environments. Each creator accepts an index as the id of its environment.
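As an illustration of the env_creators argument, the sketch below builds a list of creator callables, each taking an environment index. ToyEnv and make_creator are hypothetical stand-ins so that the example runs without gym or machin installed, and it is assumed that the index passed to a creator is its position in the list.

```python
from typing import Callable, List


class ToyEnv:
    """Hypothetical stand-in for a gym environment (not part of gym or machin)."""

    def __init__(self, env_id: int, max_steps: int):
        self.env_id = env_id
        self.max_steps = max_steps


def make_creator(max_steps: int) -> Callable[[int], ToyEnv]:
    # Binding max_steps through a factory function avoids the late-binding
    # pitfall of defining lambdas directly inside a loop.
    def creator(env_id: int) -> ToyEnv:
        return ToyEnv(env_id, max_steps)

    return creator


# One creator per environment; parameters may differ between environments.
env_creators: List[Callable[[int], ToyEnv]] = [
    make_creator(100 * (i + 1)) for i in range(3)
]

# Assumption: a wrapper calls each creator with that environment's index.
envs = [creator(i) for i, creator in enumerate(env_creators)]
print([(e.env_id, e.max_steps) for e in envs])  # [(0, 100), (1, 200), (2, 300)]
```

With the real library, a list built this way would be passed as ParallelWrapperDummy(env_creators), with each creator returning a gym.core.Env.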

reset(idx=None)[source]¶

Parameters: idx (Union[int, List[int]]) – Environment index(es) to be reset, default is all.
Returns: A list of gym states.
Return type: List[object]

step(action, idx=None)[source]¶

Let the specified environment(s) run one time step. The specified environments must be active, that is, they must not have reached a terminal state.

Parameters

action (Union[numpy.ndarray, List[Any]]) – Actions sent to each specified environment. The size of the first dimension must match the number of selected environments.
idx (Union[int, List[int]]) – Indexes of selected environments, default is all.

Returns

Observation, reward, terminal, and diagnostic info.

Return type

Tuple[List[object], List[float], List[bool], List[dict]]
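The step() contract described above (one action per selected environment, results returned as four parallel lists, only active environments may be stepped) can be sketched with a plain for-loop. This is a minimal illustration under stated assumptions, not the library's implementation: CountdownEnv, LoopWrapper, and TerminatedError are hypothetical stand-ins, and the documentation does not say when GymTerminationError is actually raised.

```python
from typing import Any, List, Optional, Tuple


class TerminatedError(Exception):
    """Hypothetical error for stepping an environment that already terminated."""


class CountdownEnv:
    """Toy environment that terminates after a fixed number of steps."""

    def __init__(self, steps: int):
        self.remaining = steps

    def step(self, action: Any) -> Tuple[int, float, bool, dict]:
        self.remaining -= 1
        return self.remaining, 1.0, self.remaining == 0, {}


class LoopWrapper:
    """For-loop wrapper sketch that tracks which environments are still active."""

    def __init__(self, envs: List[CountdownEnv]):
        self.envs = envs
        self.terminal = [False] * len(envs)

    def active(self) -> List[int]:
        return [i for i, done in enumerate(self.terminal) if not done]

    def step(self, action: List[Any], idx: Optional[List[int]] = None):
        idx = list(range(len(self.envs))) if idx is None else idx
        if len(action) != len(idx):
            raise ValueError("one action is required per selected environment")
        if any(self.terminal[i] for i in idx):
            raise TerminatedError("a selected environment has already terminated")
        obs, reward, terminal, info = [], [], [], []
        for i, act in zip(idx, action):
            o, r, t, extra = self.envs[i].step(act)
            obs.append(o)
            reward.append(r)
            terminal.append(t)
            info.append(extra)
            self.terminal[i] = t  # remember terminal state for active()
        return obs, reward, terminal, info


wrapper = LoopWrapper([CountdownEnv(1), CountdownEnv(2)])
obs, reward, terminal, info = wrapper.step([0, 0])
print(terminal)          # [True, False]
print(wrapper.active())  # [1]
obs, reward, terminal, info = wrapper.step([0], idx=[1])
print(wrapper.active())  # []
```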

seed(seed=None)[source]¶

Set seeds for all environments.

Parameters: seed (Union[int, List[int]]) – If seed is int, the same seed will be used for all environments. If seed is List[int], it must have the same size as the number of all environments. If seed is None, all environments will use the default seed.
Returns: The seeds actually used, as returned by each environment.
Return type: List[int]
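The three accepted forms of the seed argument can be reduced to one seed per environment. The sketch below shows one way to do that. The helper name expand_seeds is hypothetical, and representing the default seed as None per environment is an assumption.

```python
from typing import List, Optional, Union


def expand_seeds(seed: Union[int, List[int], None], size: int) -> List[Optional[int]]:
    """Return one seed per environment. Hypothetical helper, not part of machin."""
    if seed is None:
        # Assumption: None means each environment falls back to its default seed.
        return [None] * size
    if isinstance(seed, int):
        # A single integer is shared by all environments.
        return [seed] * size
    seeds = list(seed)
    if len(seeds) != size:
        raise ValueError(f"expected {size} seeds, got {len(seeds)}")
    return seeds


print(expand_seeds(None, 2))       # [None, None]
print(expand_seeds(7, 3))          # [7, 7, 7]
print(expand_seeds([1, 2, 3], 3))  # [1, 2, 3]
```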

render(idx=None, *_, **__)[source]¶

Render all/specified environments.

Parameters: idx (Union[int, List[int]]) – Indexes of selected environments, default is all.
Returns: A list of rendered frames, each of type np.ndarray and size (H, W, 3).
Return type: List[numpy.ndarray]

close()[source]¶

Close all environments.

Return type: None

active()[source]¶

Returns: Indexes of currently active environments.

Return type: List[int]

size()[source]¶

Returns: Number of environments.

Return type: int

property action_space¶: Returns: Action space descriptor.

property observation_space¶: Returns: Observation space descriptor.

class machin.env.wrappers.openai_gym.ParallelWrapperSubProc(env_creators)[source]¶

Bases: machin.env.wrappers.base.ParallelWrapperBase

Parallel wrapper based on sub processes.

Parameters: env_creators (List[Callable[[int], gym.core.Env]]) – List of gym environment creators used to create the environments on subprocess workers. Each creator accepts an index as the id of its environment.
Return type: None

reset(idx=None)[source]¶

Parameters: idx (Union[int, List[int]]) – Environment index(es) to be reset, default is all.
Returns: A list of gym states.
Return type: List[object]

step(action, idx=None)[source]¶

Let the specified environment(s) run one time step. The specified environments must be active, that is, they must not have reached a terminal state.

Parameters

action (Union[numpy.ndarray, List[Any]]) – Actions sent to each specified environment. The size of the first dimension must match the number of selected environments.
idx (Union[int, List[int]]) – Indexes of selected environments, default is all.

Returns

Observation, reward, terminal, and diagnostic info.

Return type

Tuple[List[object], List[float], List[bool], List[dict]]

seed(seed=None)[source]¶

Set seeds for all environments.

Parameters: seed (Union[int, List[int]]) – If seed is int, the same seed will be used for all environments. If seed is List[int], it must have the same size as the number of all environments. If seed is None, all environments will use the default seed.
Returns: The seeds actually used, as returned by each environment.
Return type: List[int]

render(idx=None, *args, **kwargs)[source]¶

Render all/specified environments.

Parameters: idx (Union[int, List[int]]) – Indexes of selected environments, default is all.
Returns: A list of rendered frames, each of type np.ndarray and size (H, W, 3).
Return type: List[numpy.ndarray]

close()[source]¶

Close all environments, including the wrapper.

Return type: None

active()[source]¶

Returns: Indexes of currently active environments.

Return type: List[int]

size()[source]¶

Returns: Number of environments.

Return type: int

property action_space¶: Returns: Action space descriptor.

property observation_space¶: Returns: Observation space descriptor.