machin.env¶
utils¶
machin.env.utils
provides utilities to deal with various environments.
wrappers¶
machin.env.wrappers
provides parallel execution wrappers for various
environments.
-
class
machin.env.wrappers.base.
ParallelWrapperBase
(*_, **__)[source]¶ Bases:
abc.ABC
Note
A parallel wrapper is designed to wrap environments of the same kind. They may have different parameters, but must share the same action and observation spaces.
-
abstract
reset
(idx=None)[source]¶ Reset all environments if idx is None, otherwise reset the specific environment(s) with the given index(es).
- Parameters
idx (Union[int, List[int], None]) – Environment index(es) to be reset.
- Returns
Initial observation of all environments. Format is unspecified.
- Return type
Any
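The idx convention described above (None for all environments, an int for one, a list for several) can be normalized into an explicit list of indexes before use. The helper below is a minimal sketch of that idea; `normalize_idx` and `env_num` are hypothetical names chosen for illustration and are not part of machin.

```python
from typing import List, Optional, Union


def normalize_idx(idx: Optional[Union[int, List[int]]], env_num: int) -> List[int]:
    """Turn an ``idx`` argument into an explicit list of environment indexes.

    Hypothetical helper for illustration, not part of machin.
    """
    if idx is None:
        # None selects every environment.
        selected = list(range(env_num))
    elif isinstance(idx, int):
        # A single index selects one environment.
        selected = [idx]
    else:
        # Otherwise treat idx as a collection of indexes.
        selected = list(idx)
    for i in selected:
        if not 0 <= i < env_num:
            raise IndexError(f"environment index {i} is out of range [0, {env_num})")
    return selected
```

Normalizing once at the top of each method lets the rest of a wrapper work with a plain list regardless of how the caller specified the selection.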
-
abstract
step
(action, idx=None)[source]¶ Let specified environment(s) run one time step. Specified environments must be active and have not reached terminal states before.
- Parameters
action – Actions to take.
idx (Union[int, List[int], None]) – Environment index(es) to be run.
- Returns
New states of environments.
- Return type
Any
-
abstract
seed
(seed=None)[source]¶ Set seed(s) for all environment(s).
- Parameters
seed (Union[int, List[int], None]) – A single integer seed for all environments, or a list of integers for each environment, or None for default seed.
- Returns
New seed of each environment.
- Return type
List[int]
-
abstract property
action_space
¶ Returns: Action space descriptor.
-
abstract property
observation_space
¶ Returns: Observation space descriptor.
-
abstract
-
class
machin.env.wrappers.openai_gym.
ParallelWrapperDummy
(env_creators)[source]¶ Bases:
machin.env.wrappers.base.ParallelWrapperBase
Dummy parallel wrapper for gym environments, implemented using a for-loop.
For debugging purposes only.
- Parameters
env_creators (List[Callable[[int], gym.core.Env]]) – List of gym environment creators, used to create environments. Each creator accepts an index as its environment id.
-
reset
(idx=None)[source]¶
- Returns
A list of gym states.
- Parameters
idx (Union[int, List[int]]) – Environment index(es) to be reset.
- Return type
List[object]
-
step
(action, idx=None)[source]¶ Let specified environment(s) run one time step. Specified environments must be active and have not reached terminal states before.
- Parameters
action (Union[numpy.ndarray, List[Any]]) – Actions sent to each specified environment, the size of the first dimension must match the number of selected environments.
idx (Union[int, List[int]]) – Indexes of selected environments, default is all.
- Returns
Observation, reward, terminal, and diagnostic info.
- Return type
Tuple[List[object], List[float], List[bool], List[dict]]
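The for-loop approach and the step contract described above can be sketched without gym. The example below is an illustration under stated assumptions, not machin's implementation: `CounterEnv` is a toy stand-in for a gym environment using the four-value step result, and `ForLoopWrapper` is a hypothetical class. Tracking terminal flags per environment is one possible way to enforce the "must not have reached terminal states" requirement.

```python
from typing import Any, Callable, Dict, List, Tuple


class CounterEnv:
    """Toy stand-in for a gym environment (hypothetical, for illustration)."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def reset(self) -> int:
        self.count = 0
        return self.count

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        # Four-value result: observation, reward, terminal, info.
        self.count += action
        return self.count, float(action), self.count >= self.limit, {}


class ForLoopWrapper:
    """Steps the selected environments one after another in a for-loop."""

    def __init__(self, env_creators: List[Callable[[int], CounterEnv]]):
        # Each creator receives its position in the list as the environment id.
        self.envs = [creator(i) for i, creator in enumerate(env_creators)]
        self.terminal = [False] * len(self.envs)

    def _select(self, idx) -> List[int]:
        if idx is None:
            return list(range(len(self.envs)))
        return [idx] if isinstance(idx, int) else list(idx)

    def reset(self, idx=None) -> List[int]:
        obs = []
        for i in self._select(idx):
            obs.append(self.envs[i].reset())
            self.terminal[i] = False
        return obs

    def step(self, action, idx=None):
        selected = self._select(idx)
        if len(action) != len(selected):
            raise ValueError("first dimension of action must match the selected environments")
        obs, rewards, terminals, infos = [], [], [], []
        for act, i in zip(action, selected):
            if self.terminal[i]:
                raise RuntimeError(f"environment {i} has reached a terminal state")
            o, r, t, info = self.envs[i].step(act)
            self.terminal[i] = t
            obs.append(o)
            rewards.append(r)
            terminals.append(t)
            infos.append(info)
        return obs, rewards, terminals, infos


# Three environments with limits 2, 3 and 4, derived from the environment id.
creators = [lambda i: CounterEnv(limit=2 + i) for _ in range(3)]
wrapper = ForLoopWrapper(creators)
wrapper.reset()
obs, rewards, terminals, infos = wrapper.step([1, 1, 1])
```

Each element of the returned tuple is a list with one entry per selected environment, in the same order as the selection.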
-
seed
(seed=None)[source]¶ Set seeds for all environments.
- Parameters
seed (Union[int, List[int]]) – If seed is int, the same seed will be used for all environments. If seed is List[int], it must have the same size as the number of all environments. If seed is None, all environments will use the default seed.
- Returns
Actual used seed returned by all environments.
- Return type
List[int]
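The three accepted forms of seed can be expanded into one seed per environment before seeding. The function below is a minimal sketch of those rules; `expand_seed` and `env_num` are hypothetical names for illustration and are not part of machin.

```python
from typing import List, Optional, Union


def expand_seed(
    seed: Optional[Union[int, List[int]]], env_num: int
) -> List[Optional[int]]:
    """Expand a ``seed`` argument into one entry per environment.

    Hypothetical helper for illustration, not part of machin.
    """
    if seed is None:
        # None asks every environment to use its default seed.
        return [None] * env_num
    if isinstance(seed, int):
        # A single integer is shared by all environments.
        return [seed] * env_num
    seeds = list(seed)
    if len(seeds) != env_num:
        raise ValueError(
            f"expected {env_num} seeds, one per environment, got {len(seeds)}"
        )
    return seeds
```

Each entry of the result would then be passed to the corresponding environment's own seeding call.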
-
render
(idx=None, *_, **__)[source]¶ Render all/specified environments.
- Parameters
idx (Union[int, List[int]]) – Indexes of selected environments, default is all.
- Returns
A list of rendered frames, each of type np.ndarray and size (H, W, 3).
- Return type
List[numpy.ndarray]
-
property
action_space
¶ Returns: Action space descriptor.
-
property
observation_space
¶ Returns: Observation space descriptor.
-
class
machin.env.wrappers.openai_gym.
ParallelWrapperSubProc
(env_creators)[source]¶ Bases:
machin.env.wrappers.base.ParallelWrapperBase
Parallel wrapper based on sub processes.
- Parameters
env_creators (List[Callable[[int], gym.core.Env]]) – List of gym environment creators, used to create environments on sub process workers. Each creator accepts an index as its environment id.
- Return type
None
-
reset
(idx=None)[source]¶
- Returns
A list of gym states.
- Parameters
idx (Union[int, List[int]]) – Environment index(es) to be reset.
- Return type
List[object]
-
step
(action, idx=None)[source]¶ Let specified environment(s) run one time step. Specified environments must be active and have not reached terminal states before.
- Parameters
action (Union[numpy.ndarray, List[Any]]) – Actions sent to each specified environment, the size of the first dimension must match the number of selected environments.
idx (Union[int, List[int]]) – Indexes of selected environments, default is all.
- Returns
Observation, reward, terminal, and diagnostic info.
- Return type
Tuple[List[object], List[float], List[bool], List[dict]]
-
seed
(seed=None)[source]¶ Set seeds for all environments.
- Parameters
seed (Union[int, List[int]]) – If seed is int, the same seed will be used for all environments. If seed is List[int], it must have the same size as the number of all environments. If seed is None, all environments will use the default seed.
- Returns
Actual used seed returned by all environments.
- Return type
List[int]
-
render
(idx=None, *args, **kwargs)[source]¶ Render all/specified environments.
- Parameters
idx (Union[int, List[int]]) – Indexes of selected environments, default is all.
- Returns
A list of rendered frames, each of type np.ndarray and size (H, W, 3).
- Return type
List[numpy.ndarray]
-
property
action_space
¶ Returns: Action space descriptor.
-
property
observation_space
¶ Returns: Observation space descriptor.