seutil package
Subpackages
Submodules
seutil.BashUtils module
- class seutil.BashUtils.BashUtils[source]
Bases:
object
Utility functions for running Bash commands.
- PRINT_LIMIT = 1000
- class RunResult(return_code, stdout, stderr)[source]
Bases:
tuple
- return_code: int
Alias for field number 0
- stderr: str
Alias for field number 2
- stdout: str
Alias for field number 1
- classmethod run(cmd: str, expected_return_code: Optional[int] = None, is_update_env: bool = False, timeout: Optional[float] = None) seutil.BashUtils.BashUtils.RunResult [source]
Runs a Bash command and returns its result.
- Parameters
cmd – the command to run.
expected_return_code – if set to an int, raises an exception when the return code does not match.
is_update_env – if true, the environment of this Python process (os.environ) is updated after cmd executes successfully (i.e., returns 0), to reflect the changes cmd may make to the environment variables. Note that this cannot change the environment of the process that invoked this Python process. It is useful because the updated environment is used by later BashUtils.run executions.
timeout – if not None, kills the process after timeout seconds and raises a TimeoutExpired exception.
- Returns
the run result, a named tuple with the fields return_code, stdout, and stderr.
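The effect of is_update_env can be illustrated with a standard-library sketch. This is not seutil's implementation: the helper name env_after is hypothetical, and the approach (dumping the shell's environment with env -0 into a temporary file after a successful command) is an assumption that requires bash and an env that supports -0.

```python
import os
import shlex
import subprocess
import tempfile
from typing import Dict, Optional


def env_after(cmd: str) -> Optional[Dict[str, str]]:
    """Run cmd in bash; on success, return the shell's environment afterwards.

    Hypothetical helper for illustration only. Returns None if cmd fails.
    """
    fd, dump_path = tempfile.mkstemp(suffix=".env")
    os.close(fd)
    try:
        # Dump the environment only when cmd exits with 0, then propagate its code.
        script = (
            f"{cmd}\n"
            "rc=$?\n"
            f"if [ $rc -eq 0 ]; then env -0 > {shlex.quote(dump_path)}; fi\n"
            "exit $rc"
        )
        completed = subprocess.run(["bash", "-c", script], capture_output=True, text=True)
        if completed.returncode != 0:
            return None
        with open(dump_path, "r", errors="replace", newline="") as f:
            entries = f.read().split("\0")
        # Each entry is NAME=VALUE; values may themselves contain "=".
        return dict(entry.split("=", 1) for entry in entries if "=" in entry)
    finally:
        os.remove(dump_path)
```

A caller could then apply the result with os.environ.update(new_env), which affects this Python process and its future child processes, but never the parent process.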
seutil.CliUtils module
- class seutil.CliUtils.StoreInDict(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]
Bases:
argparse.Action
- seutil.CliUtils.main(argv, actions: Dict[str, Callable], normalize_options: Optional[Callable[[Dict], Dict]] = None)[source]
Main function for command line option parsing, in the form of “THIS action options…”, where each option is in the form of “-name=value”.
- Parameters
argv – the command line inputs, without the name of the script (sys.argv[1:]).
actions – the mapping from action name to action function.
normalize_options – optional function to normalize options; by default, the identity function.
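The “action -name=value” convention can be sketched as follows. The function dispatch is hypothetical, and passing the parsed options to the action as keyword arguments is an assumption; how seutil.CliUtils.main actually invokes actions is not stated here.

```python
from typing import Callable, Dict, List


def dispatch(argv: List[str], actions: Dict[str, Callable[..., object]]) -> object:
    """Parse 'action -name=value ...' and call the matching action (illustrative only)."""
    if not argv:
        raise ValueError("no action given")
    action_name, raw_options = argv[0], argv[1:]
    if action_name not in actions:
        raise ValueError(f"unknown action: {action_name}")
    options: Dict[str, str] = {}
    for raw in raw_options:
        # Strip the leading dash(es); everything after the first '=' is the value.
        name, sep, value = raw.lstrip("-").partition("=")
        if not name or not sep:
            raise ValueError(f"malformed option: {raw}")
        options[name] = value
    # Assumption: options are handed to the action as keyword arguments.
    return actions[action_name](**options)
```

With this sketch, dispatch(sys.argv[1:], {"greet": greet}) would call greet(name="World") for the command line “greet -name=World”. All option values arrive as strings, which is one reason a normalization hook is useful.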
seutil.GitHubUtils module
- class seutil.GitHubUtils.GitHubUtils[source]
Bases:
object
- DEFAULT_ACCESS_TOKEN = None
- DEFAULT_GITHUB_OBJECT = None
- GITHUB_SEARCH_ITEMS_MAX = 1000
- T
alias of TypeVar(‘T’)
- classmethod ensure_github_api_call(call: Callable[[github.MainClass.Github], seutil.GitHubUtils.T], github: Optional[github.MainClass.Github] = None, max_retry_times: int = inf) seutil.GitHubUtils.T [source]
- logger = <Logger GitHubUtils (DEBUG)>
- classmethod search_repos(q: str = '', sort: str = 'stars', order: str = 'desc', is_allow_fork: bool = False, max_num_repos: int = 1000, github: Optional[github.MainClass.Github] = None, max_retry_times: int = inf, *_, **qualifiers) List[github.Repository.Repository] [source]
Searches the repos by querying GitHub API v3.
- Returns
a list of full names of the repos that match the query.
- classmethod search_repos_of_language(language: str, max_num_repos: int = inf, is_allow_fork: bool = False, max_retry_times: int = inf, strategies: Optional[List[str]] = None) List[github.Repository.Repository] [source]
Searches for all the repos of the language.
- Returns
a list of full names of matching repos.
- classmethod search_users(q: str = '', sort: str = 'repositories', order: str = 'desc', max_num_users: int = 1000, github: Optional[github.MainClass.Github] = None, max_retry_times: int = inf, *_, **qualifiers) List[github.NamedUser.NamedUser] [source]
Searches the users by querying GitHub API v3.
- Returns
a list of usernames (login) of the users that match the query.
seutil.IOUtils module
- class seutil.IOUtils.IOUtils[source]
Bases:
object
Utility functions for I/O.
- DEJSONFY_FUNC_NAME = 'dejsonfy'
- class Format(value)[source]
Bases:
enum.Enum
An enumeration.
- classmethod from_str(string: str) seutil.IOUtils.IOUtils.Format [source]
- json = (4,)
- jsonList = (5,)
- jsonNoSort = (3,)
- jsonPretty = (2,)
- pkl = (1,)
- txt = (0,)
- txtList = (6,)
- yaml = (7,)
- IO_FORMATS: Dict[seutil.IOUtils.IOUtils.Format, Dict] = defaultdict(<function IOUtils.<lambda>>, {<Format.pkl: (1,)>: {'mode': 'b', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}, <Format.jsonPretty: (2,)>: {'mode': 't', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}, <Format.jsonNoSort: (3,)>: {'mode': 't', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}, <Format.json: (4,)>: {'mode': 't', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}, <Format.yaml: (7,)>: {'mode': 't', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}, <Format.jsonList: (5,)>: {'mode': 't', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}, <Format.txtList: (6,)>: {'mode': 't', 'dumpf': <function IOUtils.<lambda>>, 'loadf': <function IOUtils.<lambda>>}})
- JSONFY_ATTR_FIELD_NAME = 'jsonfy_attr'
- JSONFY_FUNC_NAME = 'jsonfy'
- class cd(path: Union[str, pathlib.Path])[source]
Bases:
object
Change directory. Usage:
    with IOUtils.cd(path):
        <statements>
    # end with
Using a string path is supported for backward compatibility. Using pathlib.Path should be preferred.
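A directory-changing context manager of this kind can be sketched with the standard library. This is a minimal illustration, not seutil's code; the key point is restoring the previous working directory in a finally block so that it happens even when the body raises.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Union


@contextmanager
def cd(path: Union[str, Path]) -> Iterator[None]:
    """Change into path for the duration of the with block, then change back."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the body raised.
        os.chdir(previous)
```

Note that the working directory is process-wide state, so a context manager like this is not safe to use from multiple threads at once.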
- classmethod dejsonfy(data, clz: Optional[Union[Type, str]] = None)[source]
Turns a json-compatible data structure into an object of class clz. If clz is not given, the data is cast to dict or list if possible. Otherwise, the data is cast to the object through the following options (tried in order, if applicable):
1. DEJSONFY function, which takes the data as argument and returns an object; should have the name IOUtils.DEJSONFY_FUNC_NAME;
2. JSONFY_ATTR field, which is a dict of attribute name-type pairs, that will be extracted from the object to a dict; should have the name IOUtils.JSONFY_ATTR_FIELD_NAME.
- classmethod dump(file_path: Union[str, pathlib.Path], obj: object, fmt: Union[seutil.IOUtils.IOUtils.Format, str] = Format.jsonPretty, append: bool = False) None [source]
Saves an object to the file in the specified format. By default, the format is json pretty-print, and the existing content in the file is erased.
- Parameters
file_path – the file to save the object into.
obj – the object to save.
fmt – the format, one of IOUtils.Format.
append – if true, appends to the file instead of erasing its existing content.
- classmethod extend_json(file_name, data)[source]
Updates the json data file. The data should be list-like (i.e., support extend).
- classmethod jsonfy(obj)[source]
Turns an object into a json-compatible data structure. Json-compatible data can only contain list, dict (with str keys), str, int and float. An object of any other class is cast through the following options (tried in order, if applicable):
1. JSONFY function, which takes no argument and returns json-compatible data; should have the name IOUtils.JSONFY_FUNC_NAME;
2. JSONFY_ATTR field, which is a dict of attribute name-type pairs, that will be extracted from the object to a dict; should have the name IOUtils.JSONFY_ATTR_FIELD_NAME;
3. cast to a string.
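The fallback order described above can be sketched as a small recursive function. This is an illustration under assumptions, not seutil's implementation: the name to_json_compatible is hypothetical, and the handling of None, bool, and tuples is a choice made for this sketch only.

```python
from typing import Any

# The attribute names below mirror the constants documented for IOUtils.
JSONFY_FUNC_NAME = "jsonfy"
JSONFY_ATTR_FIELD_NAME = "jsonfy_attr"


def to_json_compatible(obj: Any) -> Any:
    """Convert obj into lists, dicts with str keys, str, int and float."""
    # Assumption: None (and bool, a subclass of int) pass through unchanged.
    if obj is None or isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, (list, tuple)):
        return [to_json_compatible(item) for item in obj]
    if isinstance(obj, dict):
        return {str(key): to_json_compatible(value) for key, value in obj.items()}
    # Option 1: the object provides its own conversion function.
    if callable(getattr(obj, JSONFY_FUNC_NAME, None)):
        return getattr(obj, JSONFY_FUNC_NAME)()
    # Option 2: the object lists which attributes to extract.
    attrs = getattr(obj, JSONFY_ATTR_FIELD_NAME, None)
    if attrs is not None:
        return {name: to_json_compatible(getattr(obj, name)) for name in attrs}
    # Option 3: fall back to the string form.
    return str(obj)
```

Trying the explicit function first lets a class take full control of its output, while the attribute table covers simple data holders without any extra code.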
- classmethod load(file_path: Union[str, pathlib.Path], fmt: Union[seutil.IOUtils.IOUtils.Format, str] = Format.jsonPretty) Any [source]
- classmethod load_json_stream(file_path: Union[str, pathlib.Path], fmt: Union[seutil.IOUtils.IOUtils.Format, str] = Format.jsonPretty)[source]
Reads a large json file containing a list of data iteratively. Returns a generator function.
- classmethod mk_dir(dirname, mode=511, is_remove_if_exists: bool = False, is_make_parent: bool = True)[source]
Makes the directory.
- Parameters
dirname – the name of the directory.
mode – the mode of the directory.
is_remove_if_exists – whether to remove the directory first if one with this name already exists.
is_make_parent – whether to create the parent directories if they do not exist.
- classmethod rm(path: pathlib.Path, ignore_non_exist: bool = True, force: bool = True)[source]
Removes the file/dir.
- Parameters
path – the path to the file/dir to remove.
ignore_non_exist – ignores the error if the file/dir does not exist.
force – force-removes the file even if it is protected, or the dir even if it is non-empty.
seutil.LoggingUtils module
- class seutil.LoggingUtils.LoggingUtils[source]
Bases:
object
- CRITICAL = 50
- DEBUG = 10
- ERROR = 40
- INFO = 20
- WARNING = 30
- default_handlers = []
- default_level = 30
- classmethod get_handler_console(stream=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>, level=30) logging.Handler [source]
- loggers = [<Logger GitHubUtils (DEBUG)>]
- logging_format = '[{relativeCreated:6.0f}{levelname[0]}]{name}: {message}'
- logging_format_detail = '[{asctime}|{relativeCreated:.3f}|{levelname:7}]{name}: {message} [@{filename}:{lineno}|{funcName}|pid {process}|tid {thread}]'
seutil.MiscUtils module
- seutil.MiscUtils.get_num_params(vocab_size, num_layers, num_neurons)[source]
Returns the number of trainable parameters of an LSTM.
- Parameters
vocab_size (int) – The vocabulary size
num_layers (int) – The number of layers in the LSTM
num_neurons (int) – The number of neurons / units per layer
- Returns
The number of trainable parameters
- Return type
int
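For orientation, the textbook parameter count of an LSTM layer can be written out as below. This is a sketch of the standard formula, not necessarily what get_num_params computes: the function names are hypothetical, the way vocab_size enters (for example through an embedding or output layer) is not covered, and frameworks that keep two bias vectors per gate will report a slightly larger number.

```python
def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Parameters of one LSTM layer with a single bias vector per gate."""
    # Four gates (input, forget, cell, output), each with an input weight
    # matrix, a recurrent weight matrix and a bias vector.
    return 4 * (hidden_size * input_size + hidden_size * hidden_size + hidden_size)


def stacked_lstm_params(input_size: int, num_layers: int, num_neurons: int) -> int:
    """Total for a stack: layer 1 reads the input, later layers read the previous layer."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    total = lstm_layer_params(input_size, num_neurons)
    total += (num_layers - 1) * lstm_layer_params(num_neurons, num_neurons)
    return total
```

For example, a single layer with input size 10 and 20 units has 4 * (20 * 10 + 20 * 20 + 20) = 2480 parameters under this formula.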
seutil.Stream module
- class seutil.Stream.Stream[source]
Bases:
object
Streams help manipulate sequences of objects.
- filter(predicate_func: Callable[[object], bool])[source]
Returns a stream consisting of the elements of this stream that match the given predicate.
- classmethod of(one_or_more_items)[source]
Gets a new stream from the item / items.
- Parameters
one_or_more_items – converted to a list with the builtin list function.
- classmethod of_dirs(dir_path: Union[str, pathlib.Path])[source]
Get a stream of the sub-directories under the directory.
- classmethod of_files(dir_path: Union[str, pathlib.Path])[source]
Get a stream of the files under the directory.
- sorted(key: typing.Callable[[str], object] = <function Stream.<lambda>>, reverse: bool = False)[source]
Sorts the list of files in the dataset.
- split(fraction_list: typing.List[float], count_func: typing.Callable[[str], float] = <function Stream.<lambda>>)[source]
Splits the dataset into parts whose sizes are specified by the fractions (assumed to sum up to 1). Splitting is done by finding the cutting points. If randomization is needed, call shuffle first.
- Parameters
count_func – customizes the number of data counts in each file.
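Splitting by cutting points can be sketched as follows. This is one plausible reading of the description, not the Stream.split implementation: the function name split_by_fractions is hypothetical, items are kept in order, and a small tolerance is used so that floating-point rounding of the cumulative fractions does not shift a cutting point.

```python
from itertools import accumulate
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_by_fractions(
    items: Sequence[T],
    fraction_list: List[float],
    count_func: Callable[[T], float] = lambda item: 1,
) -> List[List[T]]:
    """Split items, in order, into len(fraction_list) consecutive parts."""
    counts = [count_func(item) for item in items]
    total = sum(counts)
    # Cumulative count at which each part (except the last) should end.
    cut_points = [total * fraction for fraction in accumulate(fraction_list)][:-1]
    parts: List[List[T]] = [[] for _ in fraction_list]
    part_index, running = 0, 0.0
    for item, count in zip(items, counts):
        # Advance to the next part once the running count reaches the cut point.
        while part_index < len(cut_points) and running >= cut_points[part_index] - 1e-9:
            part_index += 1
        parts[part_index].append(item)
        running += count
    return parts
```

With the default count_func every item counts as 1, so the fractions apply to the number of items; a custom count_func lets the fractions apply to, say, the number of records inside each file instead.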
seutil.TimeUtils module
seutil.bash module
- exception seutil.bash.BashError(cmd: str, completed_process: subprocess.CompletedProcess, check_returncode: int)[source]
Bases:
RuntimeError
- seutil.bash.run(cmd: str, check_returncode: Optional[int] = None, warn_nonzero: bool = True, update_env: bool = False, update_env_clear_existing: bool = False, **kwargs) subprocess.CompletedProcess [source]
Run a bash command using subprocess.run. The command will be run using “bash -c”.
Some arguments’ default values are changed (but can be overridden with kwargs):
* capture_output=True, text=True: capture all stdout and stderr.
This function can check whether the return code matches a given value (subprocess only supports checking for non-zero values, but this function supports any value). In addition, this function warns about any non-zero return code if check_returncode is not set, to avoid silent failures; this behavior can be turned off via warn_nonzero=False.
This function can also try to update the environment variables in this process with the ones after running the command (if the command finished successfully). The sub shell’s environment is retrieved by writing the output of env into a temporary file.
- Parameters
cmd – the command to run
check_returncode – the return code to expect from the command
warn_nonzero – whether to warn about non-zero exit codes
update_env – whether to update the environment variables in this process
update_env_clear_existing – whether to clear existing environment variables before updating
kwargs – other arguments passed to subprocess.run
- Returns
the subprocess.CompletedProcess object, has stdout, stderr, returncode fields
- Raises
BashError if the command’s return code did not match check_returncode
- Raises
subprocess.TimeoutExpired if the command timed out
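The return-code handling described for this function can be sketched with subprocess.run alone. This is an illustration, not the seutil.bash source: run_bash and ReturnCodeError are hypothetical names, the environment update is left out, and the sketch assumes bash is available on the PATH.

```python
import subprocess
import warnings
from typing import Any, Optional


class ReturnCodeError(RuntimeError):
    """Raised when the return code differs from the expected one (hypothetical)."""


def run_bash(
    cmd: str,
    check_returncode: Optional[int] = None,
    warn_nonzero: bool = True,
    **kwargs: Any,
) -> subprocess.CompletedProcess:
    """Run cmd with 'bash -c', capturing text output unless kwargs say otherwise."""
    kwargs.setdefault("capture_output", True)
    kwargs.setdefault("text", True)
    completed = subprocess.run(["bash", "-c", cmd], **kwargs)
    if check_returncode is not None:
        # Any expected value is supported, not only zero.
        if completed.returncode != check_returncode:
            raise ReturnCodeError(
                f"{cmd!r} returned {completed.returncode}, expected {check_returncode}"
            )
    elif warn_nonzero and completed.returncode != 0:
        # Without an explicit expectation, at least make the failure visible.
        warnings.warn(f"{cmd!r} returned non-zero code {completed.returncode}")
    return completed
```

Because extra keyword arguments go straight to subprocess.run, a caller can pass timeout=10 and handle subprocess.TimeoutExpired in the usual way.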
seutil.debug module
seutil.io module
- exception seutil.io.DeserializationError(data, clz: Optional[Union[Type, str]], reason: str)[source]
Bases:
RuntimeError
- class seutil.io.Fmt(value)[source]
Bases:
seutil.io.FmtProperty, enum.Enum
An enumeration.
- binary: bool
Alias for field number 3
- exts: List[str]
Alias for field number 2
- json = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['json'], binary=False, line_mode=False, serialize=True)
- jsonFlexible = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['json'], binary=False, line_mode=False, serialize=True)
- jsonList = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['jsonl'], binary=False, line_mode=True, serialize=True)
- jsonNoSort = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['json'], binary=False, line_mode=False, serialize=True)
- jsonPretty = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['json'], binary=False, line_mode=False, serialize=True)
- line_mode: bool
Alias for field number 4
- pickle = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['pkl', 'pickle'], binary=True, line_mode=False, serialize=False)
- reader: Union[Callable[[io.IOBase], Any], Callable[[str], Any]]
Alias for field number 1
- serialize: bool
Alias for field number 5
- txt = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['txt'], binary=False, line_mode=False, serialize=False)
- txtList = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['txt'], binary=False, line_mode=True, serialize=False)
- writer: Union[Callable[[io.IOBase, Any], None], Callable[[Any], str]]
Alias for field number 0
- yaml = FmtProperty(writer=<function Fmt.<lambda>>, reader=<function Fmt.<lambda>>, exts=['yml', 'yaml'], binary=False, line_mode=False, serialize=True)
- class seutil.io.cd(path: Union[str, pathlib.Path])[source]
Bases:
object
Temporarily changes directory, for use in a with statement:
```
with cd(path):
    # cwd moved to path
    <statements>
```
- seutil.io.deserialize(data, clz: Optional[Union[Type, str]] = None, error: str = 'ignore')[source]
Deserializes some data (with only primitive types, list, dict) to an object with proper types.
- Parameters
data – the data to be deserialized.
clz – the targeted type of deserialization (or its name); if None, will return the data as-is.
error – what to do when the deserialization has a problem:
* raise: raise a DeserializationError.
* ignore (default): return the data as-is.
- Returns
the deserialized data.
- seutil.io.dump(path: Union[str, pathlib.Path], obj: object, fmt: Optional[seutil.io.Fmt] = None, serialization: Optional[bool] = None, parents: bool = True, append: bool = False, exists_ok: bool = True, serialization_fmt_aware: bool = True) None [source]
Saves an object to a file. The format is automatically inferred from the file name, if not otherwise specified. By default, serialization (i.e., converting to primitive types and data structures) is automatically performed for the formats that need it (e.g., json).
- Parameters
path – the path to save the file.
obj – the object to be saved.
fmt – the format of the file; if None (default), inferred from path.
serialization – whether or not to serialize the object before saving:
* True: always serialize;
* None (default): only serialize for the formats that need it;
* False: never serialize.
parents – what to do if parent directories of path do not exist:
* True (default): automatically create them;
* False: raise Exception.
append – whether to append to an existing file if any (default False).
exists_ok – what to do if path already exists and append is False:
* True (default): automatically overwrite it;
* False: raise Exception.
serialization_fmt_aware – let the serialization function be aware of the target format to fit its constraints (e.g., dictionaries in json format can only have str keys).
- seutil.io.load(path: Union[str, pathlib.Path], fmt: Optional[seutil.io.Fmt] = None, serialization: Optional[bool] = None, clz: Optional[Type] = None, error: str = 'ignore', iter_line: bool = False) Union[object, Iterator[object]] [source]
Loads an object from a file. The format is automatically inferred from the file name, if not otherwise specified. By default, if clz is given, deserialization (i.e., unpacking from primitive types and data structures) is automatically performed for the formats that need it (e.g., json).
- Parameters
path – the path to load the object.
fmt – the format of the file; if None (default), inferred from path.
serialization – whether or not to deserialize the object after loading:
* True: always deserialize;
* None (default): only deserialize for the formats that need it;
* False: never deserialize.
clz – the class to use for deserialization; if None (default), deserialization is a no-op.
error – what to do if deserialization fails:
* raise: raise a DeserializationError.
* ignore (default): return the data as-is.
iter_line – whether to iterate over the lines of the file instead of loading the whole file.
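Line-by-line iteration is most useful for line-oriented formats such as JSON Lines. The idea can be sketched with the standard library. The helper iter_json_lines is hypothetical and only shows the lazy-reading pattern, not how seutil.io.load implements iter_line.

```python
import json
from pathlib import Path
from typing import Any, Iterator, Union


def iter_json_lines(path: Union[str, Path]) -> Iterator[Any]:
    """Yield one decoded object per non-empty line, reading the file lazily."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Because this is a generator, the file stays open only while it is being consumed, and only one line is held in memory at a time.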
- seutil.io.mkdir(path: Union[str, pathlib.Path], parents: bool = True, fresh: bool = False)[source]
Creates a directory.
- Parameters
path – the path to the directory.
parents – if True, automatically creates parent directories; otherwise, raise error if any parent is missing.
fresh – if True and if the directory already exists, removes it before creating.
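The parents and fresh options map naturally onto pathlib and shutil. The sketch below is an assumption about the behavior described, with a hypothetical name (make_dir); in particular, silently reusing an existing directory when fresh is False is a choice made for this sketch.

```python
import shutil
from pathlib import Path
from typing import Union


def make_dir(path: Union[str, Path], parents: bool = True, fresh: bool = False) -> Path:
    """Create a directory, optionally wiping an existing one first."""
    path = Path(path)
    if fresh and path.is_dir():
        # Start from an empty directory by removing the old one and its contents.
        shutil.rmtree(path)
    # With parents=False, a missing parent raises FileNotFoundError.
    path.mkdir(parents=parents, exist_ok=True)
    return path
```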
- seutil.io.mktmp(prefix: Optional[str] = None, suffix: Optional[str] = None, separator: str = '-', dir: Optional[pathlib.Path] = None) pathlib.Path [source]
Makes a temp file. A wrapper for tempfile.mkstemp.
- seutil.io.mktmp_dir(prefix: Optional[str] = None, suffix: Optional[str] = None, separator: str = '-', dir: Optional[pathlib.Path] = None) pathlib.Path [source]
Makes a temp directory. A wrapper for tempfile.mkdtemp.
- seutil.io.rm(path: Union[str, pathlib.Path], missing_ok: bool = True, force: bool = True)[source]
Removes a file/directory.
- Parameters
path – the name of the file/directory.
missing_ok – (-f) ignores error if the file/directory does not exist.
force – (-rf) force-removes the directory even if it is not empty.
- seutil.io.rmdir(path: Union[str, pathlib.Path], missing_ok: bool = True, force: bool = True)[source]
Removes a directory.
- Parameters
path – the name of the directory.
missing_ok – (-f) ignores error if the directory does not exist.
force – (-f) force-removes the directory even if it is non-empty.
- seutil.io.serialize(obj: object, fmt: Optional[seutil.io.Fmt] = None) object [source]
Serializes an object into a data structure with only primitive types, list, and dict. If fmt is provided, its formatting constraints are taken into account. Supported fmts:
* json, jsonPretty, jsonNoSort, jsonList: dicts can only have str keys.
- Parameters
obj – the object to be serialized.
fmt – (optional) the target format.
- Returns
the serialized object.
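A format-aware conversion to primitives can be sketched as below. This is not seutil.io.serialize: to_primitives is a hypothetical function, objects are flattened through their __dict__ as a simplifying assumption, and the str_keys flag stands in for the json constraint that dictionary keys must be strings.

```python
import json
from typing import Any


def to_primitives(obj: Any, str_keys: bool = False) -> Any:
    """Reduce obj to primitives, lists and dicts; optionally force str dict keys."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if isinstance(obj, dict):
        return {
            (str(key) if str_keys else key): to_primitives(value, str_keys)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [to_primitives(item, str_keys) for item in obj]
    if hasattr(obj, "__dict__"):
        # Assumption: plain objects are represented by their instance attributes.
        return to_primitives(vars(obj), str_keys)
    raise TypeError(f"cannot convert object of type {type(obj).__name__}")
```

Converting keys up front means the result survives a json round trip unchanged, whereas non-str keys would either be rewritten by the encoder or rejected.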
seutil.log module
This module assists the logging standard library. The main functionality is:
maintain two frequently used handlers: a stderr handler and a file handler, both with rich and customizable formats.
setup method to attach them to the root logger.
get_logger method to quickly create a logger with customized level.
- seutil.log.get_logger(name: str, level: Union[int, str] = 0)[source]
Get a logger with specified name and level.
- seutil.log.setup(log_file: Optional[Union[str, pathlib.Path]] = None, level_stderr: Union[int, str] = 20, level_file: Union[int, str] = 10, fmt_stderr: str = '[{asctime}{levelname[0]}]{name}: {message}', datefmt_stderr: str = '%H:%M:%S', fmt_file: str = '[{asctime}|{relativeCreated:.3f}|{levelname:7}]{name}: {message} [@{filename}:{lineno}|{funcName}|pid {process}|tid {thread}]', datefmt_file: str = '%Y-%m-%d %H:%M:%S', clear_handlers: bool = True, **kwargs_file: dict)[source]
Setup the stderr and file handlers, and attach them to the root logger.
- Parameters
log_file – the log file to use; if None, no file handler is created (and any existing one would be removed)
level_stderr – the level filter of the stderr handler
level_file – the level filter of the file handler
fmt_stderr – the format of the stderr handler (with {} style)
fmt_file – the format of the file handler (with {} style)
clear_handlers – if True, remove all existing handlers of the root logger; otherwise, keep them as is
kwargs_file – other optional kwargs to the file handler (RotatingFileHandler)
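The two-handler arrangement can be reproduced with the logging standard library alone. The sketch below is a simplified assumption, not the seutil.log source: setup_logging is a hypothetical name, only a few of the documented options are modeled, and the file format string is shortened.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler
from pathlib import Path
from typing import Optional, Union


def setup_logging(
    log_file: Optional[Union[str, Path]] = None,
    level_stderr: int = logging.INFO,
    level_file: int = logging.DEBUG,
    clear_handlers: bool = True,
) -> logging.Logger:
    """Attach a stderr handler and, optionally, a file handler to the root logger."""
    root = logging.getLogger()
    if clear_handlers:
        for handler in list(root.handlers):
            root.removeHandler(handler)

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setLevel(level_stderr)
    stderr_handler.setFormatter(
        logging.Formatter("[{asctime}{levelname[0]}]{name}: {message}", datefmt="%H:%M:%S", style="{")
    )
    root.addHandler(stderr_handler)
    lowest = level_stderr

    if log_file is not None:
        file_handler = RotatingFileHandler(log_file)
        file_handler.setLevel(level_file)
        file_handler.setFormatter(
            logging.Formatter("[{asctime}|{levelname:7}]{name}: {message}", datefmt="%Y-%m-%d %H:%M:%S", style="{")
        )
        root.addHandler(file_handler)
        lowest = min(lowest, level_file)

    # The root logger must let through everything that any handler wants to see.
    root.setLevel(lowest)
    return root
```

Filtering on the handlers rather than on the logger is what allows the file to receive DEBUG records while stderr stays at INFO.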
seutil.pbar module
- class seutil.pbar.PBarManager(out: TextIO, switch_interval: float = 2.5)[source]
Bases:
object
- add(instance: seutil.pbar.tqdm_managed)[source]
- remove(instance: seutil.pbar.tqdm_managed)[source]
- seutil.pbar.tqdm
alias of seutil.pbar.tqdm_managed
- class seutil.pbar.tqdm_managed(*_, **__)[source]
Bases:
tqdm.std.tqdm
- display(*_, **__)[source]
Use self.sp to display msg in the specified pos.
Consider overloading this function when inheriting to use e.g.: self.some_frontend(**self.format_dict) instead of self.sp.
- Parameters
msg (str, optional) – What to display (default: repr(self)).
pos (int, optional) – Position to moveto (default: abs(self.pos)).
Module contents
The following classes are also exposed at the package top level. Their members and documentation are identical to the submodule entries above:
- class seutil.BashUtils: see seutil.BashUtils.BashUtils
- class seutil.GitHubUtils: see seutil.GitHubUtils.GitHubUtils
- class seutil.IOUtils: see seutil.IOUtils.IOUtils
- class seutil.LoggingUtils: see seutil.LoggingUtils.LoggingUtils
- class seutil.Stream: see seutil.Stream.Stream