First sorry for my not perfect English.
My problem is simple to explain, I think.
result={}
list_tuple=[(float,float,float),(float,float,float),(float,float,float)...]#200k tuples
threshold=[float,float,float...] #max 1k values
for tuple in list_tuple:
for value in threeshold:
if max(tuple)>value and min(tuple)<value:
if value in result:
result[value].append(tuple)
else:
result[value]=[]
result[value].append(tuple)
list_tuple contains arround 200k tuples, i have to do this operation very fast(2/3 seconds max on a normal pc).
My first attemp was to do this in cython with prange() (so i could have benefits from the cython optimization and from the paralell execution), but the problem is (as always), GIL: in prange() i can manage lists and tuples using cython memviews, but i can't insert my result in a dict.
In cython i also tried using unordered_map of the c++ std, but now the problem is that i can't make a vector of array in c++ (that would the value of my dict).
The second problem is similar:
list_tuple=[((float,float),(float,float)),((float,float),(float,float))...]#200k tuples of tuples
result={list_tuple[0][0]:[]}
for tuple in list_tuple:
if tuple[0] in result:
result[tuple[0]].append(tuple)
else:
result[tuple[0]]=[]
Here i have also another problem,if a want to use prange() i have to use a custom hash function to use an array as key of a c++ unordered_map
As you can see my snippets are very simple to run in paralell.
I thought to try with numba, but probably will be the same because of GIL, and i prefer to use cython because i need a binary code (this library could be a part of a commercial software so only binary libraries are allowed).
In general i would like avoid c/c++ function, what i hope to find is a way to manage something like dicts/lists in parallel,with the cython performance, remaining as much as possible in the Python domain; but i'm open to every advice.
Thanks
Aucun commentaire:
Enregistrer un commentaire