Hello all,
I am trying to get some basic code that uses the multiprocessing module to work in an ArcGIS Pro 3.1 toolbox, but it just won't play nice. The code:
import time
import arcpy
import multiprocessing as mp

def work_log(work_data):
    arcpy.AddMessage(" Process %s waiting %s seconds" % (work_data[0], work_data[1]))
    time.sleep(int(work_data[1]))
    arcpy.AddMessage(" Process %s Finished." % work_data[0])

def pool_handler(work):
    p = mp.Pool(2)
    p.map(work_log, work)

if __name__ == '__main__':
    work = (["A", 5], ["B", 2], ["C", 1], ["D", 3])
    pool_handler(work)
If I run this from an .atbx or .pyt toolbox I get the same error:
PicklingError: Can't pickle <function work_log at 0x00000134B287D430>: attribute lookup work_log on __main__ failed
I also get two new instances of Pro opening automatically if I run it from a .pyt toolbox, which I assume is due to the lack of a __main__ guard.
Does anyone have any advice on how to get something like this running in a Pro toolbox (preferably .pyt)?
I came across a similar question here https://community.esri.com/t5/python-questions/using-multiprocessing/m-p/1280108 but it has never been answered.
Thanks!
The multiprocessing module has a lot of caveats due to how Pro manages its Python environment. This link has a lot of advice and some sample scripts; it might be enough to get you started.
Not sure if it's possible within a .pyt due to the nature of the geoprocessing environment loading the .pyt classes and the __main__ requirement. As a script tool, this would be the script named MultiProc.py, referenced (not embedded) in the script tool. The first parameter is a value table: string, integer.
The messages won't display from the worker processes until they are done, so you can put them in the result dictionary and print them all at the end.
import time
import arcpy
import multiprocessing as mp
import os
import sys

# Run workers under pythonw.exe so new ArcGIS Pro instances are not launched
mp.set_executable(os.path.join(sys.exec_prefix, 'pythonw.exe'))

def work_log(work_data):
    res_dict = {'process': f"Process {work_data[0]} waiting {work_data[1]} seconds", 'result': ''}
    time.sleep(work_data[1])
    res_dict['result'] = f"Process {work_data[0]} Finished."
    return res_dict

if __name__ == "__main__":
    # Import this script as a module so the worker can be pickled by reference
    import MultiProc

    work_table = arcpy.GetParameterAsText(0)

    # Create a value table with 2 columns
    value_table = arcpy.ValueTable(2)

    # Set the values of the table with the contents of the first argument
    if work_table:
        value_table.loadFromString(work_table)
    else:
        for i in [["A", 5], ["B", 2], ["C", 1], ["D", 3]]:
            value_table.addRow(i)

    pairs = []
    # Loop through the rows of the value table
    for i in range(0, value_table.rowCount):
        pairs.append([value_table.getValue(i, 0), int(value_table.getValue(i, 1))])
    arcpy.AddMessage(pairs)

    with mp.Pool(2) as pool:
        jobs = [pool.apply_async(MultiProc.work_log, (pair, )) for pair in pairs]
        res = [j.get() for j in jobs]

    for r in res:
        arcpy.AddMessage(f"{r['process']} : {r['result']}")
Thanks StaticK. It's a shame Esri hasn't streamlined some of this via arcpy, especially considering how powerful it would be in a Python toolbox.