Running a Python script as a systemd service

Daredevil

Hi there

I've got a Python script running as a systemd service (sorry, not sure of the correct terminology). It starts with the machine and keeps running. It's a processor- and GPU-intensive process when it gets used. It works by simply waiting for work; when it receives something, it does the work and then waits again. The Python code runs in a loop.

My question is: when it gets hit multiple times, is each hit a separate process, and will multiple hits run in parallel on the system? Or does it wait for one to complete before it starts another?

I've noticed that it is a little slower with consecutive hits, but it's not double the time.

Thanks
 


It definitely sounds like the type of application that should take advantage of multi-threading.
But without actually seeing the Python script and how it starts and manages its threads or processes, it's impossible to know how it operates, how it manages its workers, or whether it uses any parallelism at all.

If multiple processes were run in series, it would take a LOT longer than if they were run in parallel.

But depending on how many processor cores/threads are available on the machine, you may notice some slowdowns here and there, especially under a heavy workload.

For example, if only one of these tasks is running, all of the available processor/GPU cores can be dedicated to that one process, so it will complete faster. Whereas if you have more than one of these intensive processes running at once, the available cores/threads will be divided between the running processes, so it will take slightly longer to complete all of the running tasks.
The above assumes there is little in the way of thread management.

But you may have a more sophisticated script which manages a pool of worker threads, and perhaps, under a heavier workload with multiple hits, the script waits until a worker is free before starting a new job. Again, without seeing the script, it's impossible to know.
Also, as each of these workers is running an extremely intensive task, there could be other performance bottlenecks in the actual workload.

For example, after a certain number of these workers have been started, some of them may end up getting stalled whilst waiting for others to finish using certain system resources (reads/writes to disk, access to certain files, or access to other devices).
Yet again, without seeing the workload - it's difficult to try to predict where any performance bottlenecks might be.
 
Thanks for getting back to me.

I've not added any thread logic. The script runs in a loop and waits for work; when work arrives, it's handled within the loop, blocking it until the work is done. Once the work is done, the loop continues.

I'm using a loop because I'm not sure how else to keep it active and waiting.
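
To give a rough idea, the structure is something like this (heavily simplified; wait_for_work here is just a stand-in for however the work actually arrives in the real script):

import time

def wait_for_work():
    # Stand-in for however work actually arrives in the real script.
    return input("work> ")

def do_work(job):
    # Placeholder for the processor/GPU-intensive part.
    time.sleep(2)

while True:
    job = wait_for_work()  # blocks until something arrives
    do_work(job)           # nothing else happens until this finishes
    # then the loop goes back to waiting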

You made good points about each worker having full access to the hardware until it's done. The only problem I can see is the queue of work getting too long. That would take a few simultaneous hits, which is unlikely but can happen.

I guess it is possible to run the exact same script twice on the machine, both sitting in a loop waiting for work. Then, when one is busy, the other could pick up a second job, which might ensure that there are only ever two jobs working in parallel.
 
Again, I still don't know exactly how your script receives and processes its work. The approach you're using may or may not allow for parallelism; without seeing the full script, I couldn't tell you.

I'd recommend taking a look at the multiprocessing module in Python's standard library.

With multiprocessing.Pool, you can create a pool of worker processes (separate processes rather than threads, which matters for CPU-heavy work in Python). So, for example, if your machine has a maximum of 8 cores/threads, you might want to reserve 3 for the OS and have your script use the remaining 5 in a pool.
So you'd create a pool of 5 worker processes.
I imagine you'd need your main script to listen for hits/connections; when a hit/connection is received, it adds a job to a queue.
Then you just need a mechanism that starts the next job in the queue whenever a worker becomes available in the pool. That way, you could potentially have up to 5 child processes running simultaneously. If the queue has more than 5 jobs in it, the 6th job (and any subsequent jobs) would have to wait in the queue until one of the running workers has finished its task.
So a worker finishes its task and then gets the next one in the queue.
And if there's no work, the queue/pool just waits for work.
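
A very rough sketch of that idea, from memory and untested, might look something like this (wait_for_hit is a placeholder for however your script actually receives its work, and the 3-core reservation is just the example above):

import multiprocessing as mp
import time

def handle_job(payload):
    # Placeholder for the real processor/GPU-intensive work.
    time.sleep(2)
    return f"finished: {payload}"

def report(result):
    # Runs in the main process when a worker finishes a job.
    print(result)

def wait_for_hit():
    # Placeholder for however hits/connections actually arrive.
    return input("hit> ")

if __name__ == "__main__":
    # e.g. on an 8-thread machine, reserve 3 threads for the OS
    # and use the remaining 5 as workers.
    workers = max(1, mp.cpu_count() - 3)
    with mp.Pool(processes=workers) as pool:
        while True:
            payload = wait_for_hit()
            # apply_async queues the job; a free worker picks it up,
            # while the main loop immediately goes back to waiting.
            pool.apply_async(handle_job, (payload,), callback=report)

If all of the workers are busy, apply_async just leaves the job in the pool's internal queue until a worker frees up, which gives you the waiting behaviour described above.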

And obviously, if you have one of those monster machines with loads of processor cores/threads, then you can potentially create a much larger pool of workers.

I haven't done anything serious with Python for quite a while though; most of the last 18 years of my life have been spent working with C++. So treat the sketch above as a rough starting point rather than tested, working code. I could have a go at creating a proper version myself, but I don't really have time to do so ATM.
 
Thanks so much for your time. I'm new to this way of running things on Linux; I'll explore the multiprocessing module and see if there is a better way for me to implement it and make use of more than one thread or CPU at a time.

This project has been good fun, and I'm keen to improve it where I can.
 
