My processinvokes module
You can find the source code here. It is available as part of my AlexBerUtils project.
You can install AlexBerUtils from PyPI:
python3 -m pip install -U alex-ber-utils
See here for a more detailed explanation of how to install it.
Brief Description
The main motivation for the processinvokes module is the following: you have a Python script that launches a long-running task in another process, something that takes time and may fail. You want to be able to monitor what is going on in the subprocess. The easiest way is to redirect the stdout and stderr of the subprocess to the built-in logging module. You can then save everything to a file, or write a filter so that the logger sends an e-mail when it sees ERROR. If you need something simpler, there is also a variant that bypasses logging and saves the stdout and stderr of the subprocess to a file-like object. Of course, you can also write your own variant.
The processinvokes module has one primary function: run_sub_process(). This function runs a subprocess and logs its output to the logger. It is a sophisticated decorator (wrapper) around subprocess.run(). It is useful when your subprocess runs for a long time and you are interested in receiving its stdout and stderr. By default, they are streamed to the log. You can easily customize this behavior; see the initConfig() function.
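To make the idea concrete, here is a minimal sketch that uses only the standard library. It is not the module's actual implementation; run_and_log is a hypothetical name, and the child command is just a Python one-liner chosen so the example runs anywhere.

```python
import logging
import subprocess
import sys


def run_and_log(cmd, logger, level=logging.INFO):
    """Run cmd and forward each output line to logger as it arrives."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,                 # decode bytes to str
        bufsize=1,                 # line-buffered on the parent side
    ) as proc:
        for line in proc.stdout:
            logger.log(level, line.rstrip("\n"))
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return proc.returncode


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    child = [sys.executable, "-c", "print('simulating run_sub_process')"]
    run_and_log(child, logging.getLogger("process_invoke_run"))
```

Because the lines go through a regular logger, any handler or filter configured for that logger (file, e-mail, and so on) applies to the subprocess output as well.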
Code Example
Put the files above in the same directory as the script above. Configuration files, such as config.yml, should also be in the same directory.
Note:
- It is not required to have multiple configuration files; one file may be enough.
- You can also have no configuration file at all and configure the logger handlers with the logging API.
Lines 1–8 contain some generic imports.
On line 3 we import the fixabscwd() function. See Making relative path to file to work.
On lines 10–11 we set initial values for logger and process_invokes_logger. The latter will be used as the destination for the stdout and stderr of the subprocess. Both will be initialized with real values later, after the configuration files have been parsed.
On lines 13–15 we extend the init_app_conf.conf helper class: I add some keys specific to this example to the existing ones.
On line 19 we have the process_invokes() function. It simulates running a subprocess with (almost) only default parameters.
On line 28 we have the process_file_pipe() function. It simulates running a subprocess with non-default parameters.
On line 44 we have the run() function. It extracts the application parameters and calls process_invokes() and process_file_pipe().
On line 53 we have the _log_config() function. It instantiates the (global) module-level logger and process_invokes_logger.
On line 69 we have the main() function, which receives an optional args parameter with a default value of None. If no explicit value is passed, then sys.argv will be used implicitly. This function can be called from another module.
On lines 87–88 we have a standard code snippet: if this module is executed as __main__ (and not imported into another module), then main() is called with args=None after all functions and (global) module-level attributes have been defined.
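The args=None convention can be sketched with argparse from the standard library. This is only an illustration of the pattern, not the example script itself; the --name option is hypothetical.

```python
import argparse


def main(args=None):
    """Entry point; when args is None, argparse falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--name", default="world")  # hypothetical option
    # parse_known_args() ignores options this sketch does not define.
    parsed, _unknown = parser.parse_known_args(args)
    return f"hello {parsed.name}"


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    print(main())
```

Another module (or a test) can call main(["--name", "Alex"]) directly, without touching sys.argv.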
On lines 70–72 I make a temporary logger initialization: all logs with level INFO and above are redirected to stderr, and all warnings are redirected to stderr as well. This stays in effect until we parse the log configuration and reinitialize the logger. It is done in order to see log output, especially warnings, early on. See Integrating Python’s logging and warnings packages for more details.
On line 74 we make relative paths to files work.
On lines 76–79 we parse the configuration files. See My major init_app_conf module for details.
On line 81 we pass the parsed configuration to the _log_config() function. There we pop the general.log part from the config dict and call logging.config.dictConfig() to reinitialize the logger. We initialize the (global) module-level attributes logger (the standard logger) and process_invokes_logger (the logger that receives the stdout and stderr of the subprocess). At the end we pretty-print the parsed configuration. The uncommented line uses ymlparsers.as_str() as a convenient way to print the logger configuration to the logger. The commented-out line uses the pprint module for pretty-printing (it is available in the standard Python library). It has a caveat, though.
Note: The actual type is OrderedDict and not dict, but it is mainly for historical reasons…
Side note: Before Python 3.6, the order in which key/value pairs were stored was undefined. In Python 3.6 insertion order was preserved, but this was stated to be an implementation detail of CPython (and the best practice was not to rely on this behavior). Starting from Python 3.7, dictionary order is guaranteed to be insertion order.
See https://stackoverflow.com/a/58676384/1137529 for the differences between dict and OrderedDict.
https://medium.com/analytics-vidhya/my-parser-module-429ed1457718
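Two of those differences are easy to see in a short sketch; the keys and values here are made up for the illustration.

```python
from collections import OrderedDict

plain_a = {"host": "localhost", "port": 8080}
plain_b = {"port": 8080, "host": "localhost"}
ordered_a = OrderedDict(plain_a)
ordered_b = OrderedDict(plain_b)

same_plain = plain_a == plain_b        # True: dict equality ignores order
same_ordered = ordered_a == ordered_b  # False: OrderedDict equality is order-sensitive
print(same_plain, same_ordered)

print(list(plain_a))                   # ['host', 'port']: insertion order (Python 3.7+)

ordered_a.move_to_end("host")          # reordering API that plain dict lacks
print(list(ordered_a))                 # ['port', 'host']
```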
pprint was written long before Python 3.6. In order to produce consistent results, it sorts dicts by key. I found this inappropriate: I want to see the configuration in exactly the same order as it is defined in the configuration file. Since Python 3.8, however, you can specify sort_dicts=False to disable this sorting. So, if you are using Python 3.8 or above, this will also work.
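The effect of sort_dicts is easy to check on a small dict (the configuration content here is made up; Python 3.8 or above is required):

```python
from pprint import pformat

config = {"general": {"log": {"version": 1}}, "app": {"name": "demo"}}

sorted_view = pformat(config)                     # keys sorted: 'app' comes first
ordered_view = pformat(config, sort_dicts=False)  # insertion order: 'general' first

print(sorted_view)
print(ordered_view)
```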
On line 65 we just remove general.log from the parsed result. You can remove this line if you want to keep the log configuration.
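The pop-and-reconfigure step can be sketched as follows. The layout of the parsed dict (a "general" section with a nested "log" key) and the configure_logging name are assumptions made for this illustration; only logging.config.dictConfig() is the real standard-library call.

```python
import logging
import logging.config

# Assumed shape of the parsed configuration (illustration only).
parsed = {
    "general": {
        "log": {
            "version": 1,
            "disable_existing_loggers": False,
            "handlers": {
                "console": {"class": "logging.StreamHandler", "level": "INFO"},
            },
            "root": {"handlers": ["console"], "level": "INFO"},
        },
        "app_name": "demo",  # hypothetical application key
    },
}


def configure_logging(conf):
    """Remove the log section from conf and apply it with dictConfig()."""
    log_conf = conf["general"].pop("log")
    logging.config.dictConfig(log_conf)
    return logging.getLogger("demo"), logging.getLogger("process_invoke_run")


logger, process_invokes_logger = configure_logging(parsed)
logger.info("remaining configuration: %s", parsed)
```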
On line 82 we call the run() function with our business logic.
On line 45 we log the “run()” message to the regular log.
On line 46 we just get our application configuration for use in the application code.
On line 48 we call the process_invokes() function for the first simulation of running a subprocess.
On line 4 we call the process_file_pipe() function for the second simulation of running a subprocess.
On line 20 we emit the log message process_invokes().
On lines 21–22 we have
exp_log_msg = "simulating run_sub_process"
process_invoke_run = f"echo '{exp_log_msg}'"
process_invoke_run is a quoted string that will be used later to echo exp_log_msg, emulating a long-running process that sends something to stdout.
On line 23 we have
cmd = shlex.split(process_invoke_run)
cmd contains process_invoke_run as a list of arguments. This form is suitable for use with the standard subprocess module.
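For this particular command, the split can be checked directly:

```python
import shlex

exp_log_msg = "simulating run_sub_process"
process_invoke_run = f"echo '{exp_log_msg}'"

cmd = shlex.split(process_invoke_run)
print(cmd)  # ['echo', 'simulating run_sub_process']
```

The single quotes are consumed by the shell-like parsing, so the whole message becomes one list element.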
Quote:
The shlex class makes it easy to write lexical analyzers for simple syntaxes resembling that of the Unix shell. This will often be useful for writing minilanguages, (for example, in run control files for Python applications) or for parsing quoted strings.
The shlex module defines the following functions:
shlex.split(s, comments=False, posix=True)
Split the string s using shell-like syntax. If comments is False (the default), the parsing of comments in the given string will be disabled (setting the commenters attribute of the shlex instance to the empty string). This function operates in POSIX mode by default, but uses non-POSIX mode if the posix argument is false.
https://docs.python.org/3/library/shlex.html
Another quote:
On POSIX, if args [cmd in our case] is a string, the string is interpreted as the name or path of the program to execute. However, this can only be done if not passing arguments to the program.
Note
It may not be obvious how to break a shell command into a sequence of arguments, especially in complex cases. shlex.split() can illustrate how to determine the correct tokenization for args:
>>> import shlex, subprocess
>>> command_line = input()
/bin/vikings -input eggs.txt -output "spam spam.txt" -cmd "echo '$MONEY'"
>>> args = shlex.split(command_line)
>>> print(args)
['/bin/vikings', '-input', 'eggs.txt', '-output', 'spam spam.txt', '-cmd', "echo '$MONEY'"]
>>> p = subprocess.Popen(args) # Success!
Note in particular that options (such as -input) and arguments (such as eggs.txt) that are separated by whitespace in the shell go in separate list elements, while arguments that need quoting or backslash escaping when used in the shell (such as filenames containing spaces or the echo command shown above) are single list elements.
https://docs.python.org/3/library/subprocess.html?highlight=subprocess#subprocess.Popen
On line 25 we have
process_invoke_cwd = _os.getcwd()
We store the current working directory in a variable.
On line 26 we have
processinvokes.run_sub_process(*cmd, **{'kwargs': {'cwd': process_invoke_cwd}})
We pass cmd, which is the echo command with the message “simulating run_sub_process”, as a sequence of arguments (see above). The “simulating run_sub_process” message will be sent to the stdout of the subprocess. The processinvokes module will pipe it into our Python process (specifically, to logging.getLogger('process_invoke_run'), because of processinvokes.initConfig(**{'default_log_name': 'process_invoke_run'}) at line 61).
On line 29 we emit the log message process_file_pipe().
On lines 30–31 we have
exp_log_msg = "simulating run_sub_process"
process_invoke_run = f"echo '{exp_log_msg}'"
process_invoke_run is a quoted string that will be used later to echo exp_log_msg, emulating a long-running process that sends something to stdout.
On line 32 we have
cmd = shlex.split(process_invoke_run)
cmd contains process_invoke_run as a list of arguments. This form is suitable for use with the standard subprocess module. See the explanation of shlex above.
On line 34 we have
process_invoke_cwd = _os.getcwd()
We store the current working directory in a variable.
On lines 35–36 we have:
These lines initialize the processinvokes module with the log name 'process_invoke_run' (the same as before) and with 'alexber.utils.processinvokes.FilePipe' as default_logpipe_cls.
This means that alexber.utils.processinvokes.FilePipe will be used instead of LogPipe (see details below).
On lines 38–40 we have
We pass cmd, which is the echo command with the message “simulating run_sub_process”, as a sequence of arguments (see above). The “simulating run_sub_process” message will be sent to the stdout of the subprocess. The processinvokes module will pipe it into our Python process (specifically, to logging.getLogger('process_invoke_run'), because of the initialization at lines 35–36).
Note: You can also pass log_name and logpipe_cls to run_sub_process without invoking initConfig.
The cwd that we supply will be forwarded to subprocess.run() and then to subprocess.Popen().
We provide the cwd of our main process to the subprocess in order to resolve the filename “my.log” relative to the working directory of the main process. This filename is provided as a kwargs parameter of logPipe: logPipe={'kwargs': {'fileName': "my.log"}}.
Now, let’s look at the initConfig() function first.
This function can optionally be called prior to any call to another function in this module. It is intended to be called in the MainThread. If running from the MainThread, this function is idempotent. It can be called with empty params.
- default_log_name: Optional. Name of the logger where the messages will be streamed to. Default value is: processinvokes.
- default_log_level: Optional. Log level to be used in the logger. Default value is: logging.INFO.
- default_logpipe_cls: Optional. Can be a class or a str. You can use your custom class for the logging, for example, FilePipe. Default value is: LogPipe.
Note: FilePipe is an example of how you can add your own custom class. While BasePipe looks complicated, you typically need to override only a few methods and you are done.
- default_log_subprocess_cls: Optional. Can be a class or a str. Default value is: LogSubProcessCall.
- executor: internally used to run the subprocess. Default values are max_workers: 1, thread_name_prefix: processinvokes.
This means that, by default:
- We use up to 1 worker.
- In log messages generated from the worker, processinvokes-xxx will be used as thread_name.
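The executor defaults have the same names as the parameters of concurrent.futures.ThreadPoolExecutor. Assuming the executor behaves like one (an assumption, not something confirmed here), the effect of the prefix on the worker's thread name can be sketched like this:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Same values as the defaults listed above.
executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="processinvokes")


def current_thread_name():
    """Return the name of the thread that executes this function."""
    return threading.current_thread().name


name = executor.submit(current_thread_name).result()
print(name)  # begins with "processinvokes", followed by a worker suffix
executor.shutdown()
```

A log format that includes %(threadName)s will show this name for messages emitted from the worker.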
run_sub_process() is the primary function of the processinvokes module.
It is a sophisticated decorator (wrapper) around subprocess.run(). It is useful when your subprocess runs for a long time and you are interested in receiving its stdout and stderr. By default, they are streamed to the log. You can easily customize this behavior; see the description of initConfig() above.
As I said, this function is a sophisticated decorator around subprocess.run() (which is itself a decorator around subprocess.Popen()). Note that some parameters that can be used in popenkwargs (cwd, for example) are listed in the Popen constructor. See the example usage above.
logPipe is a customizable object that essentially forwards output from the subprocess to the logger using logName and logLevel (see below).
See the description of initConfig() above for the default parameters.
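One common way to build such a forwarding object is an OS pipe plus a reader thread. The sketch below follows that recipe; SketchLogPipe and run_with_pipe are hypothetical names, and this is not the library's LogPipe or BasePipe.

```python
import logging
import os
import subprocess
import threading


class SketchLogPipe(threading.Thread):
    """Hypothetical stand-in: forwards everything written to it to a logger."""

    def __init__(self, log_name, log_level=logging.INFO):
        super().__init__(daemon=True)
        self.logger = logging.getLogger(log_name)
        self.log_level = log_level
        read_fd, self.write_fd = os.pipe()
        self.reader = os.fdopen(read_fd)  # text-mode file over the read end
        self.start()

    def fileno(self):
        # subprocess calls fileno() to get the descriptor the child writes to.
        return self.write_fd

    def run(self):
        # Background thread: log each line until every write end is closed.
        with self.reader:
            for line in self.reader:
                self.logger.log(self.log_level, line.rstrip("\n"))

    def close(self):
        os.close(self.write_fd)  # lets the reader thread see end-of-file
        self.join()


def run_with_pipe(cmd, log_name):
    """Run cmd with stdout and stderr forwarded to the named logger."""
    pipe = SketchLogPipe(log_name)
    try:
        return subprocess.run(cmd, stdout=pipe, stderr=subprocess.STDOUT, check=True)
    finally:
        pipe.close()
```

A file-based variant would only need a different run() body that writes each line to a file instead of a logger.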
The default parameters that the processinvokes module uses for the call to subprocess.run() are (they can be overridden in popenkwargs):
'stdout': logPipe,
'stderr': STDOUT,
'text': True,
'bufsize': 1,
'check': True
This means:
- logPipe is used as the subprocess’s standard output and standard error.
- stdin, stdout and stderr are decoded using the given encoding, or the system default otherwise.
- 1 is supplied as the buffering argument to the open() function when creating the stdin/stdout/stderr pipe file objects. Essentially, there is no OS-level buffering between the processes, provided that the call to write contains a newline character.
- If the exit code of the subprocess is non-zero, a CalledProcessError is raised.
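These defaults can be tried with plain subprocess.run(). In this sketch subprocess.PIPE stands in for logPipe so that the example needs nothing beyond the standard library; the child commands are Python one-liners chosen for portability.

```python
import subprocess
import sys

defaults = {
    "stdout": subprocess.PIPE,    # stand-in for logPipe in this sketch
    "stderr": subprocess.STDOUT,  # stderr goes to the same place as stdout
    "text": True,                 # output is decoded to str
    "bufsize": 1,
    "check": True,                # non-zero exit code raises CalledProcessError
}

ok = subprocess.run(
    [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"],
    **defaults,
)
print(ok.stdout)  # contains both lines; their relative order is not guaranteed

failed_code = None
try:
    subprocess.run([sys.executable, "-c", "raise SystemExit(3)"], **defaults)
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
print("exit code:", failed_code)  # exit code: 3
```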
Note: It is generally not advised to override them, but you can if you know what you are doing.
Parameters of the run_sub_process() function:
- args: will be passed as popenargs to the subprocess.run() method.
- roughly, kwargs['kwargs'] will be passed as popenkwargs to the subprocess.run() method.
- roughly, kwargs['logPipe']['cls'] or default_logpipe_cls (if the first one is empty) will be passed as logPipeCls to create logPipe. Can be a class or a str.
- roughly, kwargs['logPipe']['kwargs'] will be passed as kwargs to logPipeCls to create logPipe.
- roughly, kwargs['logSubprocess']['cls'] or default_log_subprocess_cls (if the first one is empty) will be passed as logSubProcessCls to create LogSubProcessCall. Can be a class or a str.
- roughly, kwargs['logSubprocess']['kwargs'] will be passed as kwargs to logSubprocess.
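The nested lookup described in the list above, together with the "class or str" rule, can be sketched roughly as follows. resolve_class and split_kwargs are hypothetical helpers written for this illustration (the library has its own importer module for this), and io.StringIO is used only as a stand-in class that is always importable.

```python
import importlib


def resolve_class(cls_or_name):
    """Accept a class or a dotted 'package.module.ClassName' string."""
    if isinstance(cls_or_name, str):
        module_name, _, class_name = cls_or_name.rpartition(".")
        return getattr(importlib.import_module(module_name), class_name)
    return cls_or_name


def split_kwargs(kwargs, default_logpipe_cls):
    """Hypothetical helper: pick apart the nested kwargs described above."""
    popen_kwargs = kwargs.get("kwargs", {})
    log_pipe = kwargs.get("logPipe", {})
    # Fall back to the default class when no 'cls' is given.
    pipe_cls = resolve_class(log_pipe.get("cls") or default_logpipe_cls)
    pipe_kwargs = log_pipe.get("kwargs", {})
    return popen_kwargs, pipe_cls, pipe_kwargs


popen_kwargs, pipe_cls, pipe_kwargs = split_kwargs(
    {
        "kwargs": {"cwd": "."},
        "logPipe": {"cls": "io.StringIO", "kwargs": {"fileName": "my.log"}},
    },
    default_logpipe_cls=dict,  # stand-in default for the sketch
)
print(popen_kwargs, pipe_cls, pipe_kwargs)
```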
See also:
- importer module or How to write easily customizable code?
- fixabscwd() function in mains module or Making relative path to file to work.
- fix_retry_env() function in mains module or Make path to file on Windows works on Linux.
- parser module or Description of one the oldest AlexBerUtils project
- ymlparsers module or Description of low-level API module for another modules
- init_app_conf module, or My major init_app_conf module
- deploys module, or My deploys module
- emails module, or My emails module
- processinvokes module, or My processinvokes module
- stdLogging module, or My stdLogging Module