machin.env

utils

machin.env.utils provides utilities to deal with various environments.

machin.env.utils.openai_gym.disable_view_window()[source]

wrappers

machin.env.wrappers provides parallel execution wrappers for various environments.

class machin.env.wrappers.base.ParallelWrapperBase(*_, **__)[source]

Bases: abc.ABC

Note

A parallel wrapper is designed to wrap environments of the same kind. The wrapped environments may have different parameters, but they must share the same action and observation spaces.

abstract reset(idx=None)[source]

Reset all environments if idx is None, otherwise reset the specific environment(s) with the given index(es).

Parameters

idx (Union[int, List[int], None]) – Environment index(es) to be reset.

Returns

Initial observation of all environments. Format is unspecified.

Return type

Any

abstract step(action, idx=None)[source]

Let specified environment(s) run one time step. Specified environments must be active and have not reached terminal states before.

Parameters
  • action – Actions to take.

  • idx (Union[int, List[int], None]) – Environment index(es) to be run.

Returns

New states of environments.

Return type

Any

abstract seed(seed=None)[source]

Set seed(s) for all environment(s).

Parameters

seed (Union[int, List[int], None]) – A single integer seed for all environments, or a list of integers for each environment, or None for default seed.

Returns

New seed of each environment.

Return type

List[int]

abstract render(*args, **kwargs)[source]

Render all environments.

Return type

Any

abstract close()[source]

Close all environments.

Return type

Any

abstract active()[source]
Returns

Indexes of active environments.

Return type

List[int]

abstract size()[source]
Returns

Number of environments.

Return type

int

abstract property action_space

Returns: Action space descriptor.

abstract property observation_space

Returns: Observation space descriptor.
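The contract described above (reset, then repeatedly step only the environments reported by active()) can be sketched as follows. This is a minimal illustration, not machin's implementation: CountdownEnv, LoopWrapper and rollout are hypothetical names, the toy environment stands in for a gym environment, and raising RuntimeError for a terminated environment is an assumption made only for this sketch.

```python
from typing import Any, Callable, List, Tuple, Union

IndexArg = Union[int, List[int], None]


class CountdownEnv:
    """Hypothetical toy environment that terminates after `horizon` steps."""

    def __init__(self, horizon: int):
        self.horizon = horizon
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: Any) -> Tuple[int, float, bool, dict]:
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon, {}


class LoopWrapper:
    """Stand-in exposing the documented method names (not machin's code)."""

    def __init__(self, env_creators: List[Callable[[int], CountdownEnv]]):
        # Each creator receives the index of the environment it creates.
        self.envs = [creator(i) for i, creator in enumerate(env_creators)]
        self.terminal = [False] * len(self.envs)

    def _select(self, idx: IndexArg) -> List[int]:
        # None selects every environment; an int selects a single one.
        if idx is None:
            return list(range(len(self.envs)))
        return [idx] if isinstance(idx, int) else list(idx)

    def reset(self, idx: IndexArg = None) -> List[int]:
        selected = self._select(idx)
        for i in selected:
            self.terminal[i] = False
        return [self.envs[i].reset() for i in selected]

    def step(self, action, idx: IndexArg = None):
        selected = self._select(idx)
        if len(action) != len(selected):
            raise ValueError("one action is required per selected environment")
        if any(self.terminal[i] for i in selected):
            # Assumption for this sketch: stepping a terminated env is an error.
            raise RuntimeError("cannot step an environment that has terminated")
        obs, rewards, terminals, infos = [], [], [], []
        for act, i in zip(action, selected):
            o, r, done, info = self.envs[i].step(act)
            self.terminal[i] = done
            obs.append(o)
            rewards.append(r)
            terminals.append(done)
            infos.append(info)
        return obs, rewards, terminals, infos

    def active(self) -> List[int]:
        return [i for i, done in enumerate(self.terminal) if not done]

    def size(self) -> int:
        return len(self.envs)


def rollout(wrapper) -> List[float]:
    """Step only the active environments until every one has terminated."""
    totals = [0.0] * wrapper.size()
    wrapper.reset()
    while wrapper.active():
        idx = wrapper.active()
        _, rewards, _, _ = wrapper.step([0] * len(idx), idx)
        for i, r in zip(idx, rewards):
            totals[i] += r
    return totals


# Three environments of the same kind with different horizons.
wrapper = LoopWrapper(
    [lambda _index, horizon=h: CountdownEnv(horizon) for h in (1, 2, 3)]
)
totals = rollout(wrapper)
```

Because rollout only calls reset, step, active and size, the same loop should in principle work with any object that follows the interface documented here.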

exception machin.env.wrappers.openai_gym.GymTerminationError[source]

Bases: Exception

class machin.env.wrappers.openai_gym.ParallelWrapperDummy(env_creators)[source]

Bases: machin.env.wrappers.base.ParallelWrapperBase

Dummy parallel wrapper for gym environments, implemented using for-loop.

For debugging purposes only.

Parameters

env_creators (List[Callable[[int], gym.core.Env]]) – List of gym environment creators used to create the environments; each creator accepts an index as its environment id.
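As a rough illustration of what a list of environment creators might look like, the sketch below builds creators that share one environment class but differ in a parameter. ToyEnv and make_creator are hypothetical names used in place of a real gym environment, and the final line only assumes that a wrapper calls each creator with its position in the list.

```python
from typing import Callable, List


class ToyEnv:
    """Hypothetical stand-in for a gym.Env; it only records its settings."""

    def __init__(self, env_id: int, max_steps: int):
        self.env_id = env_id
        self.max_steps = max_steps


def make_creator(max_steps: int) -> Callable[[int], ToyEnv]:
    # A factory function binds max_steps separately for every creator, which
    # avoids the late-binding pitfall of lambdas defined inside a loop.
    def creator(index: int) -> ToyEnv:
        return ToyEnv(env_id=index, max_steps=max_steps)

    return creator


env_creators: List[Callable[[int], ToyEnv]] = [
    make_creator(m) for m in (100, 200, 300)
]

# Assumed wrapper behaviour: call each creator with its index in the list.
envs = [creator(i) for i, creator in enumerate(env_creators)]
```

All three environments are of the same kind and differ only in max_steps, which matches the requirement that wrapped environments share their action and observation spaces.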

reset(idx=None)[source]

Parameters

idx (Union[int, List[int]]) – Indexes of the environments to reset, default is all.

Returns

A list of gym states.

Return type

List[object]

step(action, idx=None)[source]

Let specified environment(s) run one time step. Specified environments must be active and have not reached terminal states before.

Parameters
  • action (Union[numpy.ndarray, List[Any]]) – Actions sent to each specified environment. The size of the first dimension must match the number of selected environments.

  • idx (Union[int, List[int]]) – Indexes of selected environments, default is all.

Returns

Observation, reward, terminal, and diagnostic info.

Return type

Tuple[List[object], List[float], List[bool], List[dict]]
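The return type above is one list per field rather than one tuple per environment. A for-loop wrapper could produce that layout by transposing the per-environment step results, as in this sketch. The helper name collate is hypothetical, and this is not necessarily how machin builds its return value.

```python
from typing import Any, Dict, List, Tuple

StepResult = Tuple[Any, float, bool, Dict[str, Any]]


def collate(
    results: List[StepResult],
) -> Tuple[List[Any], List[float], List[bool], List[Dict[str, Any]]]:
    """Turn [(obs, reward, terminal, info), ...] into four parallel lists."""
    if not results:
        return [], [], [], []
    # zip(*results) transposes rows (one per environment) into columns.
    obs, rewards, terminals, infos = zip(*results)
    return list(obs), list(rewards), list(terminals), list(infos)


# Two made-up per-environment results, in the order the envs were selected.
per_env = [("s0", 1.0, False, {}), ("s1", 0.5, True, {"reason": "goal"})]
obs, rewards, terminals, infos = collate(per_env)
```

Entry i of each returned list belongs to the i-th selected environment, so the lists line up with the idx argument passed to step.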

seed(seed=None)[source]

Set seeds for all environments.

Parameters

seed (Union[int, List[int]]) – If seed is int, the same seed will be used for all environments. If seed is List[int], it must have the same size as the number of all environments. If seed is None, all environments will use the default seed.

Returns

The seeds actually used, as returned by each environment.

Return type

List[int]
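The three accepted forms of the seed argument can be normalized into one value per environment before seeding. The helper below is a sketch of that rule only: expand_seeds is a hypothetical name, and raising ValueError on a length mismatch is an assumption, since the text above only states that the sizes must match.

```python
from typing import List, Optional, Union


def expand_seeds(
    seed: Union[int, List[int], None], env_num: int
) -> List[Optional[int]]:
    """Return one seed (or None for the default seed) per environment."""
    if seed is None:
        # Every environment falls back to its default seed.
        return [None] * env_num
    if isinstance(seed, int):
        # A single integer is shared by all environments.
        return [seed] * env_num
    seeds = list(seed)
    if len(seeds) != env_num:
        raise ValueError(
            f"expected {env_num} seeds, got {len(seeds)}"
        )
    return seeds


shared = expand_seeds(7, 3)
per_env = expand_seeds([1, 2, 3], 3)
default = expand_seeds(None, 2)
```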

render(idx=None, *_, **__)[source]

Render all/specified environments.

Parameters

idx (Union[int, List[int]]) – Indexes of selected environments, default is all.

Returns

A list of rendered frames, of type np.ndarray and size (H, W, 3).

Return type

List[numpy.ndarray]

close()[source]

Close all environments.

Return type

None

active()[source]

Returns: Indexes of currently active environments.

Return type

List[int]

size()[source]

Returns: Number of environments.

Return type

int

property action_space

Returns: Action space descriptor.

property observation_space

Returns: Observation space descriptor.

class machin.env.wrappers.openai_gym.ParallelWrapperSubProc(env_creators)[source]

Bases: machin.env.wrappers.base.ParallelWrapperBase

Parallel wrapper based on sub processes.

Parameters

env_creators (List[Callable[[int], gym.core.Env]]) – List of gym environment creators used to create the environments on sub process workers; each creator accepts an index as its environment id.

Return type

None

reset(idx=None)[source]

Parameters

idx (Union[int, List[int]]) – Indexes of the environments to reset, default is all.

Returns

A list of gym states.

Return type

List[object]

step(action, idx=None)[source]

Let specified environment(s) run one time step. Specified environments must be active and have not reached terminal states before.

Parameters
  • action (Union[numpy.ndarray, List[Any]]) – Actions sent to each specified environment. The size of the first dimension must match the number of selected environments.

  • idx (Union[int, List[int]]) – Indexes of selected environments, default is all.

Returns

Observation, reward, terminal, and diagnostic info.

Return type

Tuple[List[object], List[float], List[bool], List[dict]]

seed(seed=None)[source]

Set seeds for all environments.

Parameters

seed (Union[int, List[int]]) – If seed is int, the same seed will be used for all environments. If seed is List[int], it must have the same size as the number of all environments. If seed is None, all environments will use the default seed.

Returns

The seeds actually used, as returned by each environment.

Return type

List[int]

render(idx=None, *args, **kwargs)[source]

Render all/specified environments.

Parameters

idx (Union[int, List[int]]) – Indexes of selected environments, default is all.

Returns

A list of rendered frames, of type np.ndarray and size (H, W, 3).

Return type

List[numpy.ndarray]

close()[source]

Close all environments, including the wrapper.

Return type

None

active()[source]

Returns: Indexes of currently active environments.

Return type

List[int]

size()[source]

Returns: Number of environments.

Return type

int

property action_space

Returns: Action space descriptor.

property observation_space

Returns: Observation space descriptor.