machin.utils

checker

exception machin.utils.checker.CheckError[source]

Bases: Exception

machin.utils.checker.check_model(writer, model, input_check_hooks=(<function i_chk_nan>, <function i_chk_range>), output_check_hooks=(<function o_chk_nan>, <function o_chk_range>), param_check_hooks=(<function p_chk_nan>, <function p_chk_range>), input_check_interval=1, output_check_interval=1, param_check_interval=100, name='')[source]

Check model inputs, outputs and parameters using hooks. All check hooks (input, output and parameter) are executed in the forward pass.

An example:

model = nn.Linear(100, 100)
# writer is a tensorboardX SummaryWriter
check_model(writer, model)

# Continue to do whatever you like.
model(t.zeros([100]))

Note

Only leaf modules will be checked (such as nn.Linear and not some complex neural network modules made of several sub-modules). But you can manually control granularity.

Warning

Do not return a tuple from your forward() function if you have output check hooks, unless you specify a name for each output.

Hint

You may manually control the check granularity by using mark_as_atom_module().

You may specify a list of names for your module outputs by using mark_module_output(), so that the names passed to your output check hooks are not numbers.

Hint

For all three kinds of hooks, your hook needs to have the following signature:

hook(counter, writer, model, module, name, value)

where:

  • counter is the Counter; use Counter.get() to get the current pass number.

  • writer is the SummaryWriter from tensorboardX.

  • model is your model.

  • module is the module currently being checked.

  • name is the input/output/parameter name string. Input names are extracted from the forward signature of the module. Output names are numbers, or the names you have specified.

  • value is input/output/parameter value.
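As a minimal sketch of the signature above, the following hypothetical hook records which value was checked in which pass. `StubCounter`, `record_hook` and `checked` are illustrative names, not part of machin; the stub only imitates the `Counter.get()` call described above so that the block runs without PyTorch.

```python
class StubCounter:
    """Stand-in for machin's Counter; only get() is imitated here."""

    def __init__(self, value=0):
        self.value = value

    def get(self):
        return self.value


checked = []


def record_hook(counter, writer, model, module, name, value):
    """Hypothetical check hook: remember what was checked in which pass."""
    checked.append((counter.get(), type(module).__name__, name))


# Simulate one call the checker might make for an input named "x".
record_hook(StubCounter(3), None, object(), object(), "x", [0.0, 1.0])
print(checked)
```

A hook written this way could be passed to check_model() through one of the hook arguments, for example input_check_hooks=(record_hook,).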

Parameters
  • writer (tensorboardX.writer.SummaryWriter) – Tensorboard SummaryWriter used to log.

  • model (torch.nn.modules.module.Module) – Model to be checked.

  • input_check_hooks – A series of input check hooks.

  • output_check_hooks – A series of output check hooks.

  • param_check_hooks – A series of parameter check hooks.

  • input_check_interval – Interval (number of forward passes) of input checking.

  • output_check_interval – Interval (number of forward passes) of output checking.

  • param_check_interval – Interval (number of backward passes) of parameter checking.

  • name – Your model name.

Returns

A function f(), calling f() will deregister all check hooks.

machin.utils.checker.check_nan(tensor, name='')[source]

Check whether tensor has nan element.

Parameters
  • tensor (torch.Tensor) – Tensor to check

  • name – Name of tensor, will be printed in the error message.

Raises

RuntimeError if tensor has any nan element.

machin.utils.checker.check_shape(tensor, required_shape, name='')[source]

Check whether tensor has the specified shape.

Parameters
  • tensor (torch.Tensor) – Tensor to check.

  • required_shape (List[int]) – A list of int specifying shape of each dimension.

  • name – Name of tensor, will be printed in the error message.

Raises

RuntimeError if shape of the tensor doesn't match.

machin.utils.checker.i_chk_nan(_counter, _writer, _model, _module, input_name, input_val)[source]

Check whether there is any nan element in the input, if input is a tensor.

machin.utils.checker.i_chk_range(counter, writer, _model, _module, input_name, input_val)[source]

Compute min, max and mean value of the input, if input is a tensor.

machin.utils.checker.mark_as_atom_module(module)[source]

Mark module as an atom leaf module, so it can be checked.

machin.utils.checker.mark_module_output(module, output_names)[source]

Mark names of the module output. It will also tell checker about the number of outputs.

Parameters
  • module – Module to be marked.

  • output_names (List[str]) – Name of each output value.

machin.utils.checker.o_chk_nan(_counter, _writer, _model, _module, output_name, output_val)[source]

Check whether there is any nan element in the output, if output is a tensor.

machin.utils.checker.o_chk_range(counter, writer, _model, _module, output_name, output_val)[source]

Compute min, max and mean value of the output, if output is a tensor.

machin.utils.checker.p_chk_nan(counter, _writer, _model, _module, param_name, param_val)[source]

Check whether there is any nan element in the parameter.

machin.utils.checker.p_chk_range(counter, writer, _model, _module, param_name, param_val)[source]

Compute min, max and mean value of the parameter.

conf

class machin.utils.conf.Config(**configs)[source]

Bases: machin.utils.helper_classes.Object

machin.utils.conf.load_config_cmd(merge_conf=None)[source]

Get configs from the command line by using "--conf".

--conf a=b will set <Returned Config>.a = b

Example:

python3 test.py --conf device="cuda:1" \
                --conf some_dict={"some_key":1}
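How repeated `--conf a=b` pairs could be turned into config keys can be sketched with argparse. This is an illustration, not machin's parser: `parse_conf_args` is a hypothetical helper, and decoding values as JSON with a fallback to plain strings is an assumption.

```python
import argparse
import json


def parse_conf_args(argv):
    """Collect repeated --conf key=value options into a dict (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--conf", action="append", default=[])
    args, _unknown = parser.parse_known_args(argv)

    result = {}
    for item in args.conf:
        key, _, raw = item.partition("=")
        try:
            # Assumption: values are JSON, e.g. 1 or {"some_key": 1}.
            result[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Fall back to the raw string, e.g. cuda:1 after shell unquoting.
            result[key] = raw
    return result


conf = parse_conf_args(
    ["--conf", "device=cuda:1", "--conf", 'some_dict={"some_key":1}']
)
print(conf)
```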

Example:

from machin.utils.conf import Config, load_config_cmd
from machin.utils.save_env import SaveEnv

# set some config attributes
c = Config(
    model_save_int=100,
    root_dir="some_directory",
    restart_from_trial="2020_05_09_15_00_31",
)

# merge "--conf" command line arguments into c
c = load_config_cmd(c)

# restart_from_trial specifies the trial name in your root
# directory.
# If it is set, SaveEnv reuses the save environment of that trial.
# If not, SaveEnv creates a new one, with a config directory at
# ``<c.root_dir>/<trial_start_time>/config``

save_env = SaveEnv(c.root_dir, restart_from_trial=c.restart_from_trial)

Parameters

merge_conf (machin.utils.conf.Config) – Config to merge.

Return type

machin.utils.conf.Config

machin.utils.conf.load_config_file(json_file, merge_conf=None)[source]

Get configs from a json file.

Parameters
  • json_file – Path to the json file.

  • merge_conf (machin.utils.conf.Config) – Config to merge.

Returns

configuration

Return type

machin.utils.conf.Config

machin.utils.conf.merge_config(conf, merge)[source]

Merge a config object with a dictionary or another Config object. Keys in conf that also appear in merge are overwritten by the values from merge.

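Parameters
  • conf – Config object to merge into.

  • merge – A dictionary or a Config object whose keys overwrite the same keys in conf.

Return type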

machin.utils.conf.Config
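The overwrite rule of merge_config() can be illustrated with plain dictionaries. `merge_dicts` is a hypothetical stand-in, not machin's implementation, and returning a copy instead of modifying conf in place is an assumption of this sketch.

```python
def merge_dicts(conf, merge):
    """Return a copy of conf in which keys from merge take precedence."""
    merged = dict(conf)
    merged.update(merge)
    return merged


base = {"device": "cpu", "model_save_int": 100}
override = {"device": "cuda:1"}
merged = merge_dicts(base, override)
print(merged)
```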

machin.utils.conf.save_config(conf, json_file)[source]

Dump config object to a json file.

Parameters
  • conf – Config object to dump.

  • json_file – Path of the json file to write.

helper_classes

class machin.utils.helper_classes.Counter(start=0, step=1)[source]

Bases: object

count()[source]

Move the counter forward by step.

get()[source]

Get the internal number of counter.

reset()[source]

Reset the counter.

class machin.utils.helper_classes.Object(data=None, const_attrs=None)[source]

Bases: object

A generic object class that stores a dictionary internally. You can access and set its keys by accessing and setting attributes of the object.

data

Internal dictionary.

attr(item, value=None, change=False)[source]
call(*args, **kwargs)[source]

The implementation of Object.__call__. Override it to customize call behavior.
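The attribute-backed dictionary idea behind Object can be sketched as follows. `AttrDict` is a hypothetical stand-in, not machin's class; in particular, raising AttributeError for a missing key is an assumption of this sketch.

```python
class AttrDict:
    """Hypothetical stand-in: attribute access backed by an internal dict."""

    def __init__(self, data=None):
        # Bypass our own __setattr__ so that `data` is a real attribute.
        object.__setattr__(self, "data", dict(data or {}))

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for keys stored in data.
        try:
            return self.data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Every attribute assignment becomes a key in the internal dict.
        self.data[name] = value


cfg = AttrDict({"root_dir": "some_directory"})
cfg.model_save_int = 100
print(cfg.root_dir, cfg.data)
```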

class machin.utils.helper_classes.Switch(state=False)[source]

Bases: object

Parameters

state (bool) – Internal state, True for on, False for off.

flip()[source]

Invert the internal state.

get()[source]
Returns

state of switch.

Return type

bool

off()[source]

Set to off.

on()[source]

Set to on.

class machin.utils.helper_classes.Timer[source]

Bases: object

begin()[source]

Begin timing.

end()[source]
Returns

Current time difference since the last begin().

class machin.utils.helper_classes.Trigger(state=False)[source]

Bases: machin.utils.helper_classes.Switch

Parameters

state (bool) – Internal state, True for on, False for off.

get()[source]

Get the state of trigger, will also set trigger to off.

Returns

state of trigger.
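The difference between Switch and Trigger described above can be sketched with two small stand-in classes. `SketchSwitch` and `SketchTrigger` are hypothetical names that mimic the documented behavior; they are not machin's source.

```python
class SketchSwitch:
    """Stand-in for Switch: a boolean state that stays until changed."""

    def __init__(self, state=False):
        self._state = state

    def flip(self):
        self._state = not self._state

    def get(self):
        return self._state

    def on(self):
        self._state = True

    def off(self):
        self._state = False


class SketchTrigger(SketchSwitch):
    """Stand-in for Trigger: reading the state also sets it to off."""

    def get(self):
        state = self._state
        self._state = False
        return state


trigger = SketchTrigger()
trigger.on()
first = trigger.get()
second = trigger.get()
print(first, second)  # True False
```

A trigger of this kind suits one-shot events, where a flag should be consumed by the first reader.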

learning_rate

This module is the place for all learning rate functions. Currently, only manually changing the learning rate according to the global step is implemented.

machin.utils.learning_rate.gen_learning_rate_func(lr_map, logger=None)[source]

Example:

from torch.optim.lr_scheduler import LambdaLR
from machin.utils.learning_rate import gen_learning_rate_func

# 0 <= step < 200000, lr=1e-3, 200000 <= step, lr=3e-4
lr_func = gen_learning_rate_func([(0, 1e-3), (200000, 3e-4)])
# optimizer is your torch.optim optimizer
lr_sch = LambdaLR(optimizer, lr_func)

Parameters
  • lr_map (List[Tuple[int, float]]) – A 2d learning rate map. The first element of each row is the step, the second is the learning rate.

  • logger (logging.Logger) – A logger to log current learning rate

Returns

A learning rate generation function with signature lr_gen(step)->lr, which accepts an int and returns a float. Use it in your PyTorch lr scheduler.
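The step-to-rate lookup described by lr_map can be sketched with the standard bisect module. `make_lr_func` is a hypothetical re-implementation for illustration, not machin's code; it assumes the rows are sorted by step, and raising an error for steps before the first row is also an assumption.

```python
import bisect


def make_lr_func(lr_map):
    """Build lr(step) from a sorted list of (start_step, lr) rows."""
    starts = [start for start, _ in lr_map]
    rates = [lr for _, lr in lr_map]

    def lr_func(step):
        # Index of the last row whose start step is <= step.
        idx = bisect.bisect_right(starts, step) - 1
        if idx < 0:
            raise ValueError(f"step {step} is before the first row of lr_map")
        return rates[idx]

    return lr_func


lr_func = make_lr_func([(0, 1e-3), (200000, 3e-4)])
print(lr_func(0), lr_func(199999), lr_func(200000))
```

Keep in mind that PyTorch's LambdaLR multiplies the optimizer's initial learning rate by the value the function returns.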

logging

machin.utils.logging.default_logger

The default global logger.

TODO: maybe add logging utilities for distributed scenario?

class machin.utils.logging.FakeLogger[source]

Bases: object

critical(msg, *args, **kwargs)[source]
debug(msg, *args, **kwargs)[source]
error(msg, *args, **kwargs)[source]
exception(msg, *args, exc_info=True, **kwargs)[source]
info(msg, *args, **kwargs)[source]
log(level, msg, *args, **kwargs)[source]
setLevel(level)[source]
warn(msg, *args, **kwargs)[source]
warning(msg, *args, **kwargs)[source]

media

machin.utils.media.create_image(image, path, filename, extension='.png')[source]
Parameters
  • image (numpy.array) – A numpy array of shape (H, W, C) or (H, W), and with dtype = any float or any int. When a frame is float type, its value range should be [0, 1]. When a frame is integer type, its value range should be [0, 255].

  • path (str) – Directory to save the image.

  • filename (str) – File name.

  • extension (str) – File extension.

machin.utils.media.create_image_subproc(image, path, filename, extension='.png', daemon=True)[source]

Create image with a subprocess.

See also

create_image()

Note

If daemon is True, this function cannot be used in a daemonic subprocess.

Parameters
  • image (numpy.array) – A numpy array of shape (H, W, C) or (H, W), and with dtype = any float or any int. When a frame is float type, its value range should be [0, 1]. When a frame is integer type, its value range should be [0, 255].

  • path (str) – Directory to save the image.

  • filename (str) – File name.

  • extension (str) – File extension.

  • daemon (bool) – Whether to launch the saving process as a daemonic process.

Returns

A wait function which, once called, blocks until creation has finished.

machin.utils.media.create_video(frames, path, filename, extension='.gif', fps=15)[source]
Parameters
  • frames (List[numpy.array]) – A list of numpy arrays of shape (H, W, C) or (H, W), and with dtype = any float or any int. When a frame is float type, its value range should be [0, 1]. When a frame is integer type, its value range should be [0, 255].

  • path (str) – Directory to save the video.

  • filename (str) – File name.

  • extension (str) – File extension.

  • fps (int) – Frames per second.

machin.utils.media.create_video_subproc(frames, path, filename, extension='.gif', fps=15, daemon=True)[source]

Create video with a subprocess, since it takes a lot of time for moviepy to encode the video file.

See also

create_video()

Note

If daemon is True, this function cannot be used in a daemonic subprocess.

Parameters
  • frames (List[numpy.array]) – A list of numpy arrays of shape (H, W, C) or (H, W), and with dtype = any float or any int. When a frame is float type, its value range should be [0, 1]. When a frame is integer type, its value range should be [0, 255].

  • path (str) – Directory to save the video.

  • filename (str) – File name.

  • extension (str) – File extension.

  • fps (int) – Frames per second.

  • daemon (bool) – Whether to launch the saving process as a daemonic process.

Returns

A wait function which, once called, blocks until creation has finished.

machin.utils.media.show_image(image, show_normalized=True, pause_time=0.01, title='')[source]

Use matplotlib to show a single image. You may repeatedly call this method with the same title argument to show a video or a dynamically changing image.

Parameters
  • image (numpy.array) – A numpy array of shape (H, W, C) or (H, W), and with dtype = any float or any int. When a frame is float type, its value range should be [0, 1]. When a frame is integer type, its value range should be [0, 255].

  • show_normalized (bool) – Show normalized image alongside the original one.

  • pause_time (float) – Pause time between displaying current image and the next one.

  • title (str) – Title of the display window.

prepare

machin.utils.prepare.prep_clear_dirs(dirs)[source]
Parameters

dirs (Iterable[str]) – a list of directories to clear

machin.utils.prepare.prep_create_dirs(dirs)[source]

Note: will recursively create directories.

Parameters

dirs (Iterable[str]) – a list of directories to create if these directories are not found.

machin.utils.prepare.prep_load_model(model_dir, model_map, version=None, quiet=False, logger=None)[source]

Automatically find and load models.

Parameters
  • model_dir (str) – Directory to save models.

  • model_map (Dict[str, torch.nn.modules.module.Module]) – Model saving map.

  • version (int) – Version to load, if specified, otherwise automatically find the latest version.

  • quiet (bool) – Raise no error if no valid version could be found.

  • logger (Any) – Logger to use.

machin.utils.prepare.prep_load_state_dict(model, state_dict)[source]

Automatically load an already loaded state dictionary into the model.

Note

This function handles tensor device remapping.

Parameters
  • model (torch.nn.modules.module.Module) –

  • state_dict (Any) –

save_env

class machin.utils.save_env.SaveEnv(env_root, restart_from_trial=None, time_format='%Y_%m_%d_%H_%M_%S')[source]

Bases: object

Create the default environment for saving. It creates something like:

<your environment root>
├── config
├── log
│   ├── images
│   └── train_log
└── model

Parameters
  • env_root (str) – Root directory for all trials of the environment.

  • restart_from_trial (Optional[str]) – Instead of creating a new save environment for a new trial, use an existing save environment of an older trial. The old trial name should be in the format given by time_format.
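The directory layout above can be reproduced with pathlib, as a rough sketch. `create_trial_dirs` is a hypothetical helper, not part of machin; placing the tree under a sub-directory named after the trial start time is an assumption based on the time_format argument.

```python
import tempfile
import time
from pathlib import Path


def create_trial_dirs(env_root, time_format="%Y_%m_%d_%H_%M_%S"):
    """Create <env_root>/<trial time>/{config,log/images,log/train_log,model}."""
    trial_root = Path(env_root) / time.strftime(time_format)
    for sub in ("config", "log/images", "log/train_log", "model"):
        # parents=True also creates intermediate directories such as "log".
        (trial_root / sub).mkdir(parents=True, exist_ok=True)
    return trial_root


with tempfile.TemporaryDirectory() as env_root:
    trial_root = create_trial_dirs(env_root)
    created = sorted(
        p.relative_to(trial_root).as_posix()
        for p in trial_root.rglob("*")
        if p.is_dir()
    )
    print(created)
```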

clear_trial_config_dir()[source]
clear_trial_image_dir()[source]
clear_trial_model_dir()[source]
clear_trial_train_log_dir()[source]
create_dirs(dirs)[source]

Create additional directories in root.

Parameters

dirs (Iterable[str]) – Directories.

get_trial_config_dir()[source]
get_trial_image_dir()[source]
get_trial_model_dir()[source]
get_trial_root()[source]
get_trial_time()[source]
get_trial_train_log_dir()[source]
remove_trials_older_than(diff_day=0, diff_hour=1, diff_minute=0, diff_second=0)[source]

By default, this function removes all trials that started more than one hour before the current time.

Parameters
  • diff_day (int) – Difference in days.

  • diff_hour (int) – Difference in hours.

  • diff_minute (int) – Difference in minutes.

  • diff_second (int) – Difference in seconds.
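The age test implied by the parameters above can be sketched with datetime. `is_older_than` is a hypothetical helper, not machin's implementation; it assumes trial names follow the default time_format shown in the SaveEnv signature.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y_%m_%d_%H_%M_%S"


def is_older_than(trial_name, now, diff_day=0, diff_hour=1,
                  diff_minute=0, diff_second=0):
    """Return True if the trial started before now minus the given difference."""
    started = datetime.strptime(trial_name, TIME_FORMAT)
    threshold = now - timedelta(days=diff_day, hours=diff_hour,
                                minutes=diff_minute, seconds=diff_second)
    return started < threshold


now = datetime(2020, 5, 9, 17, 0, 0)
old = is_older_than("2020_05_09_15_00_31", now)     # started about 2 hours ago
recent = is_older_than("2020_05_09_16_30_00", now)  # started 30 minutes ago
print(old, recent)
```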

tensor_board

machin.utils.tensor_board.default_board

The default global board.

class machin.utils.tensor_board.TensorBoard[source]

Bases: object

Create a tensor board object.

writer

SummaryWriter of package tensorboardX.

init(*writer_args)[source]
is_inited()[source]

Returns: whether the board has been initialized with a writer.

Return type

bool

visualize

machin.utils.visualize.visualize_graph(final_tensor, visualize_dir='', exit_after_vis=True)[source]

Visualize a PyTorch flow graph.

Parameters
  • final_tensor – The last output tensor of the flow graph

  • visualize_dir – Directory to place the visualized files

  • exit_after_vis – Whether to exit the whole program after visualization.