machin.utils
checker
machin.utils.checker.check_model(writer, model, input_check_hooks=(<function i_chk_nan>, <function i_chk_range>), output_check_hooks=(<function o_chk_nan>, <function o_chk_range>), param_check_hooks=(<function p_chk_nan>, <function p_chk_range>), input_check_interval=1, output_check_interval=1, param_check_interval=100, name='')
Check model input, output and parameters using hooks. All check hooks (input, output and parameter) are executed in the forward pass.
An example:
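    import torch as t
    import torch.nn as nn
    from tensorboardX import SummaryWriter
    from machin.utils.checker import check_model

    model = nn.Linear(100, 100)
    check_model(SummaryWriter(), model)
    # Continue to do whatever you like.
    model(t.zeros([100]))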
Note
Only leaf modules will be checked (such as nn.Linear, and not complex neural network modules made of several sub-modules). But you can manually control granularity.
Warning
Do not output a tuple in your forward() function if you have output check hooks; otherwise you must specify names for each output.
Hint
You may manually control the check granularity by using mark_as_atom_module().
You may specify a list of names for your module outputs by using mark_module_output(), so that the names given to your output check hooks will not be numbers.
Hint
For all three kinds of hooks, your hook needs to have the following signature:
hook(counter, writer, model, module, name, value)
where:
counter is the Counter; you can use Counter.get() to get the current pass number.
writer is the SummaryWriter from tensorboardX.
model is your model.
module is the module currently being checked.
name is the input/output/parameter name string. For inputs, the detailed names are extracted from the module's forward signature. Output names will be numbers or the names you have specified.
value is the input/output/parameter value.
- Parameters
writer (tensorboardX.writer.SummaryWriter) – Tensorboard SummaryWriter used to log.
model (torch.nn.modules.module.Module) – Model to be checked.
input_check_hooks – A series of input check hooks.
output_check_hooks – A series of output check hooks.
param_check_hooks – A series of parameter check hooks.
input_check_interval – Interval (number of forward passes) of input checking.
output_check_interval – Interval (number of forward passes) of output checking.
param_check_interval – Interval (number of backward passes) of parameter checking.
name – Your model name.
- Returns
A function f(); calling f() will deregister all check hooks.
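A custom hook only has to match the signature described in the hint above. The following is a minimal sketch and not part of machin: `StubCounter` and the hook name `chk_abs_limit` are hypothetical, the writer, model and module arguments are passed as None, and plain Python lists stand in for tensors so that the snippet runs without PyTorch.

```python
import math


class StubCounter:
    """Hypothetical stand-in for the Counter that is passed to hooks."""

    def __init__(self, value=0):
        self._value = value

    def get(self):
        # Mirrors the documented Counter.get(): the current pass number.
        return self._value


def chk_abs_limit(counter, writer, model, module, name, value, limit=1e3):
    """Hypothetical check hook: raise if any element is nan or too large."""
    if not isinstance(value, (list, tuple)):
        # Like the built-in hooks, ignore values that are not tensor-like.
        return
    for element in value:
        if math.isnan(element) or abs(element) > limit:
            raise RuntimeError(
                f"pass {counter.get()}: bad element {element!r} in '{name}'"
            )


# Direct calls with the documented argument order.
chk_abs_limit(StubCounter(3), None, None, None, "input_0", [0.5, -2.0])
try:
    chk_abs_limit(StubCounter(4), None, None, None, "output_0",
                  [1.0, float("nan")])
    raised = False
except RuntimeError:
    raised = True
```

With machin itself, a hook of this shape would be passed in a tuple through input_check_hooks, output_check_hooks or param_check_hooks.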
machin.utils.checker.check_nan(tensor, name='')
Check whether tensor has any nan element.
- Parameters
tensor (torch.Tensor) – Tensor to check.
name – Name of tensor, will be printed in the error message.
- Raises
RuntimeError – If tensor has any nan element.
machin.utils.checker.check_shape(tensor, required_shape, name='')
Check whether tensor has the specified shape.
- Parameters
tensor (torch.Tensor) – Tensor to check.
required_shape (List[int]) – A list of int specifying the size of each dimension.
name – Name of tensor, will be printed in the error message.
- Raises
RuntimeError – If the shape of the tensor doesn't match.
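The contract of check_nan and check_shape (return silently on success, raise RuntimeError on failure) can be illustrated without PyTorch. The sketch below works on nested Python lists instead of tensors; `shape_of`, `flatten`, `check_shape_sketch` and `check_nan_sketch` are hypothetical names, not machin functions.

```python
import math


def shape_of(nested):
    """Return the shape of a regular nested list, e.g. [[1, 2]] -> [1, 2]."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return shape


def flatten(nested):
    """Yield every scalar element of a nested list."""
    if isinstance(nested, list):
        for item in nested:
            yield from flatten(item)
    else:
        yield nested


def check_shape_sketch(data, required_shape, name=""):
    """Raise RuntimeError if the shape of data differs from required_shape."""
    if shape_of(data) != list(required_shape):
        raise RuntimeError(
            f"'{name}' has shape {shape_of(data)}, "
            f"expected {list(required_shape)}"
        )


def check_nan_sketch(data, name=""):
    """Raise RuntimeError if any element of data is nan."""
    if any(math.isnan(x) for x in flatten(data)):
        raise RuntimeError(f"'{name}' contains nan")


batch = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
check_shape_sketch(batch, [2, 3], name="batch")  # passes silently
check_nan_sketch(batch, name="batch")            # passes silently
```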
machin.utils.checker.i_chk_nan(_counter, _writer, _model, _module, input_name, input_val)
Check whether there is any nan element in the input, if input is a tensor.
machin.utils.checker.i_chk_range(counter, writer, _model, _module, input_name, input_val)
Compute min, max and mean value of the input, if input is a tensor.
machin.utils.checker.mark_as_atom_module(module)
Mark module as an atom leaf module, so it can be checked.
machin.utils.checker.mark_module_output(module, output_names)
Mark names of the module output. It will also tell the checker about the number of outputs.
- Parameters
module – Module to be marked.
output_names (List[str]) – Name of each output value.
machin.utils.checker.o_chk_nan(_counter, _writer, _model, _module, output_name, output_val)
Check whether there is any nan element in the output, if output is a tensor.
machin.utils.checker.o_chk_range(counter, writer, _model, _module, output_name, output_val)
Compute min, max and mean value of the output, if output is a tensor.
conf
machin.utils.conf.load_config_cmd(merge_conf=None)
Get configs from the command line by using "--conf". --conf a=b will set <Returned Config>.a = b.
Example:
    python3 test.py --conf device="cuda:1" --conf some_dict={"some_key":1}
Example:
    from machin.utils.conf import Config, load_config_cmd
    from machin.utils.save_env import SaveEnv

    # Set some config attributes.
    c = Config(
        model_save_int=100,
        root_dir="some_directory",
        restart_from_trial="2020_05_09_15_00_31",
    )
    # Merge in any values given with --conf on the command line.
    c = load_config_cmd(c)

    # restart_from_trial specifies the trial name in your root directory.
    # If it is set, then the SaveEnv constructor will load arguments
    # from that trial record and overwrite.
    # If not, then the SaveEnv constructor will save configurations as:
    # <c.root_dir>/<trial_start_time>/config/config.json
    save_env = SaveEnv(c)
- Parameters
merge_conf (machin.utils.conf.Config) – Config to merge.
- Return type
machin.utils.conf.Config
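How repeated `--conf key=value` options can be collected is sketched below with the standard argparse module. This is an illustration only: `parse_conf_args` is a hypothetical helper, and decoding values as JSON with a fallback to plain strings is an assumption, not necessarily how machin interprets them.

```python
import argparse
import json


def parse_conf_args(argv):
    """Collect repeated ``--conf key=value`` options into a dict."""
    parser = argparse.ArgumentParser()
    # action="append" gathers every occurrence of --conf into a list.
    parser.add_argument("--conf", action="append", default=[])
    args, _unknown = parser.parse_known_args(argv)

    conf = {}
    for item in args.conf:
        key, _, raw = item.partition("=")
        try:
            # Assumption: values that parse as JSON become Python objects.
            conf[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything else is kept as a plain string.
            conf[key] = raw
    return conf


conf = parse_conf_args(
    ["--conf", "device=cuda:1", "--conf", 'some_dict={"some_key": 1}']
)
print(conf)  # {'device': 'cuda:1', 'some_dict': {'some_key': 1}}
```

Note that a shell strips unescaped double quotes before the program sees them, so a dictionary value usually has to be wrapped in single quotes on the command line.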
machin.utils.conf.load_config_file(json_file, merge_conf=None)
Get configs from a json file.
- Parameters
json_file (str) – Path to the json config file.
merge_conf (machin.utils.conf.Config) – Config to merge.
- Returns
configuration
- Return type
machin.utils.conf.Config
machin.utils.conf.merge_config(conf, merge)
Merge a config object with a dictionary or another Config object. Keys in conf that also appear in merge will be overwritten by the values from merge.
- Parameters
conf (machin.utils.conf.Config) – Config object to merge into.
merge (Union[dict, machin.utils.conf.Config]) – Dictionary or Config object whose keys take precedence.
- Return type
machin.utils.conf.Config
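The override rule described for merge_config can be shown on plain dictionaries. This is a sketch of the semantics only: `merge_plain` is a hypothetical helper that returns a new dict, and whether merge_config itself modifies conf in place is not stated here.

```python
def merge_plain(conf, merge):
    """Return a new dict in which keys from ``merge`` override ``conf``."""
    merged = dict(conf)
    merged.update(merge)
    return merged


base = {"device": "cpu", "model_save_int": 100}
override = {"device": "cuda:1"}
merged = merge_plain(base, override)
print(merged)  # {'device': 'cuda:1', 'model_save_int': 100}
```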
machin.utils.conf.save_config(conf, json_file)
Dump config object to a json file.
- Parameters
conf (machin.utils.conf.Config) – Config object to dump.
json_file (str) – Path to the json file.
helper_classes
class machin.utils.helper_classes.Object(data=None, const_attrs=None)
Bases: object
A generic object class which stores a dictionary internally. You can access and set its keys by accessing and setting attributes of the object.
data
Internal dictionary.
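The idea of an object whose attributes are backed by an internal dictionary can be sketched as follows. `AttrDict` is a hypothetical illustration, not the implementation of Object, and it ignores the const_attrs argument, whose behaviour is not described here.

```python
class AttrDict:
    """Hypothetical sketch of an object backed by an internal dictionary."""

    def __init__(self, data=None):
        # Bypass our own __setattr__ so that "data" is a real attribute.
        object.__setattr__(self, "data", dict(data or {}))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        data = object.__getattribute__(self, "data")
        try:
            return data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Every attribute assignment is stored as a dictionary key.
        self.data[name] = value


obj = AttrDict({"a": 1})
obj.b = 2
print(obj.a, obj.b, obj.data)  # 1 2 {'a': 1, 'b': 2}
```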
class machin.utils.helper_classes.Switch(state=False)
Bases: object
- Parameters
state (bool) – Internal state, True for on, False for off.
class machin.utils.helper_classes.Trigger(state=False)
Bases: machin.utils.helper_classes.Switch
- Parameters
state (bool) – Internal state, True for on, False for off.
learning_rate
This module is the place for all learning rate functions. Currently, only manual learning rate changing according to global steps is implemented.
machin.utils.learning_rate.gen_learning_rate_func(lr_map, logger=None)
Example:
    from torch.optim.lr_scheduler import LambdaLR
    from machin.utils.learning_rate import gen_learning_rate_func

    # 0 <= step < 200000, lr=1e-3, 200000 <= step, lr=3e-4
    lr_func = gen_learning_rate_func([(0, 1e-3), (200000, 3e-4)])
    # optimizer is an existing torch optimizer.
    lr_sch = LambdaLR(optimizer, lr_func)
- Parameters
lr_map (List[Tuple[int, float]]) – A 2d learning rate map. The first element of each row is the step, the second is the learning rate.
logger (logging.Logger) – A logger to log the current learning rate.
- Returns
A learning rate generation function with signature lr_gen(step)->lr, which accepts an int and returns a float. Use it in your pytorch lr scheduler.
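A step-to-learning-rate lookup of this kind can be sketched with the standard bisect module. `make_lr_func` is a hypothetical name and this is not the library's implementation; it only reproduces the mapping described by lr_map.

```python
import bisect


def make_lr_func(lr_map):
    """Build a step -> lr function from a list of (start_step, lr) rows."""
    rows = sorted(lr_map)
    starts = [step for step, _ in rows]
    rates = [lr for _, lr in rows]

    def lr_func(step):
        # Index of the last row whose start step is <= step.
        index = bisect.bisect_right(starts, step) - 1
        if index < 0:
            raise ValueError(f"step {step} is before the first row")
        return rates[index]

    return lr_func


lr_func = make_lr_func([(0, 1e-3), (200000, 3e-4)])
print(lr_func(0), lr_func(199999), lr_func(200000))  # 0.001 0.001 0.0003
```

Keep in mind that torch's LambdaLR multiplies the optimizer's initial learning rate by the value the function returns.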
logging
machin.utils.logging.default_logger
The default global logger.
TODO: maybe add logging utilities for distributed scenario?
media
machin.utils.media.create_image(image, path, filename, extension='.png')
- Parameters
image (numpy.array) – A numpy array of shape (H, W, C) or (H, W), with dtype = any float or any int. When the image is of float type, its value range should be [0, 1]. When the image is of integer type, its value range should be [0, 255].
path (str) – Directory to save the image.
filename (str) – File name.
extension (str) – File extension.
machin.utils.media.create_image_subproc(image, path, filename, extension='.png', daemon=True)
Create image with a subprocess.
See also
create_image()
Note
If daemon is true, then this function cannot be used in a daemonic subprocess.
- Parameters
image (numpy.array) – A numpy array of shape (H, W, C) or (H, W), with dtype = any float or any int. When the image is of float type, its value range should be [0, 1]. When the image is of integer type, its value range should be [0, 255].
path (str) – Directory to save the image.
filename (str) – File name.
extension (str) – File extension.
daemon (bool) – Whether to launch the saving process as a daemonic process.
- Returns
A wait function; once called, it blocks until creation has finished.
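The "start now, wait later" pattern behind the returned wait function can be sketched as below. To keep the example portable it uses a background thread rather than a subprocess, and `run_in_background` and `slow_save` are hypothetical names, so this shows the calling pattern only, not how machin launches its worker.

```python
import threading
import time


def run_in_background(task, *args, daemon=True):
    """Start ``task`` in a background thread and return a wait function."""
    worker = threading.Thread(target=task, args=args, daemon=daemon)
    worker.start()

    def wait():
        # Block until the background work has finished.
        worker.join()

    return wait


results = []


def slow_save(name):
    time.sleep(0.05)  # stands in for encoding and writing a file
    results.append(name)


wait = run_in_background(slow_save, "frame.png")
# ... other work can happen here while the file is being written ...
wait()
print(results)  # ['frame.png']
```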
machin.utils.media.create_video(frames, path, filename, extension='.gif', fps=15)
- Parameters
frames (List[numpy.array]) – A list of numpy arrays of shape (H, W, C) or (H, W), with dtype = any float or any int. When a frame is of float type, its value range should be [0, 1]. When a frame is of integer type, its value range should be [0, 255].
path (str) – Directory to save the video.
filename (str) – File name.
extension (str) – File extension.
fps (int) – Frames per second.
machin.utils.media.create_video_subproc(frames, path, filename, extension='.gif', fps=15, daemon=True)
Create video with a subprocess, since it takes a lot of time for moviepy to encode the video file.
See also
create_video()
Note
If daemon is true, then this function cannot be used in a daemonic subprocess.
- Parameters
frames (List[numpy.array]) – A list of numpy arrays of shape (H, W, C) or (H, W), with dtype = any float or any int. When a frame is of float type, its value range should be [0, 1]. When a frame is of integer type, its value range should be [0, 255].
path (str) – Directory to save the video.
filename (str) – File name.
extension (str) – File extension.
fps (int) – Frames per second.
daemon (bool) – Whether to launch the saving process as a daemonic process.
- Returns
A wait function; once called, it blocks until creation has finished.
machin.utils.media.show_image(image, show_normalized=True, pause_time=0.01, title='')
Use matplotlib to show a single image. You may repeatedly call this method with the same title argument to show a video or a dynamically changing image.
- Parameters
image (numpy.array) – A numpy array of shape (H, W, C) or (H, W), with dtype = any float or any int. When the image is of float type, its value range should be [0, 1]. When the image is of integer type, its value range should be [0, 255].
show_normalized (bool) – Show normalized image alongside the original one.
pause_time (float) – Pause time between displaying current image and the next one.
title (str) – Title of the display window.
prepare
machin.utils.prepare.prep_clear_dirs(dirs)
- Parameters
dirs (Iterable[str]) – A list of directories to clear.
machin.utils.prepare.prep_create_dirs(dirs)
Note: will recursively create directories.
- Parameters
dirs (Iterable[str]) – A list of directories to create if these directories are not found.
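What creating and clearing directories typically looks like with the standard library is sketched below. `create_dirs` and `clear_dirs` are hypothetical helpers, and keeping the directory itself while deleting its contents is an assumption about what "clear" means, not a statement about prep_clear_dirs.

```python
import os
import shutil
import tempfile


def create_dirs(dirs):
    """Create each directory, including missing parents."""
    for d in dirs:
        os.makedirs(d, exist_ok=True)


def clear_dirs(dirs):
    """Delete everything inside each directory but keep the directory."""
    for d in dirs:
        for entry in os.listdir(d):
            path = os.path.join(d, entry)
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)


root = tempfile.mkdtemp()
nested = os.path.join(root, "log", "images")
create_dirs([nested])
open(os.path.join(nested, "a.png"), "w").close()
clear_dirs([nested])
remaining = os.listdir(nested)
print(remaining)  # []
shutil.rmtree(root)  # clean up the temporary directory
```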
machin.utils.prepare.prep_load_model(model_dir, model_map, version=None, quiet=False, logger=None)
Automatically find and load models.
- Parameters
model_dir (str) – Directory to save models.
model_map (Dict[str, torch.nn.modules.module.Module]) – Model saving map.
version (int) – Version to load. If not specified, the latest version is found automatically.
quiet (bool) – Raise no error if no valid version could be found.
logger (Any) – Logger to use.
save_env
class machin.utils.save_env.SaveEnv(env_root, restart_from_trial=None, time_format='%Y_%m_%d_%H_%M_%S')
Bases: object
Create the default environment for saving. Creates something like:
    <your environment root>
    ├── config
    ├── log
    │   ├── images
    │   └── train_log
    └── model
- Parameters
env_root (str) – Root directory for all trials of the environment.
restart_from_trial (Optional[str]) – Instead of creating a new save environment for a new trial, use an existing save environment of an older trial. The old trial name should be in the format given by time_format.
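The directory tree above can be reproduced with a few standard-library calls. In this sketch `create_trial_dirs` is a hypothetical helper, and placing the tree under a per-trial directory named with time_format is an assumption based on the default time_format and the trial names used elsewhere in these docs, not a description of SaveEnv's code.

```python
import os
import tempfile
import time


def create_trial_dirs(env_root, time_format="%Y_%m_%d_%H_%M_%S"):
    """Create <env_root>/<trial start time>/ with the default sub-directories."""
    trial_name = time.strftime(time_format)
    trial_root = os.path.join(env_root, trial_name)
    for sub in ("config", "log/images", "log/train_log", "model"):
        os.makedirs(os.path.join(trial_root, *sub.split("/")), exist_ok=True)
    return trial_root


env_root = tempfile.mkdtemp()
trial_root = create_trial_dirs(env_root)
print(sorted(os.listdir(trial_root)))  # ['config', 'log', 'model']
```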
create_dirs(dirs)
Create additional directories in root.
- Parameters
dirs (Iterable[str]) – Directories.
remove_trials_older_than(diff_day=0, diff_hour=1, diff_minute=0, diff_second=0)
By default, this function removes all trials started more than one hour before the current time.
- Parameters
diff_day (int) – Difference in days.
diff_hour (int) – Difference in hours.
diff_minute (int) – Difference in minutes.
diff_second (int) – Difference in seconds.
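Selecting trials by age can be sketched by parsing the trial names with the time format and comparing against a timedelta. `trials_older_than` is a hypothetical helper that only returns the matching names and deletes nothing; the default format string is copied from the SaveEnv signature.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y_%m_%d_%H_%M_%S"


def trials_older_than(trial_names, now, diff_day=0, diff_hour=1,
                      diff_minute=0, diff_second=0):
    """Return the names of trials started more than the given span before now."""
    span = timedelta(days=diff_day, hours=diff_hour,
                     minutes=diff_minute, seconds=diff_second)
    old = []
    for name in trial_names:
        try:
            started = datetime.strptime(name, TIME_FORMAT)
        except ValueError:
            # Not a trial directory name; leave it alone.
            continue
        if now - started > span:
            old.append(name)
    return old


now = datetime(2020, 5, 9, 16, 0, 0)
names = ["2020_05_09_15_30_00", "2020_05_09_14_00_00", "notes"]
print(trials_older_than(names, now))  # ['2020_05_09_14_00_00']
```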
tensor_board
machin.utils.tensor_board.default_board
The default global board.
visualize
machin.utils.visualize.visualize_graph(final_tensor, visualize_dir='', exit_after_vis=True)
Visualize a pytorch flow graph.
- Parameters
final_tensor – The last output tensor of the flow graph
visualize_dir – Directory to place the visualized files
exit_after_vis – Whether to exit the whole program after visualization.