Description
Hi,
I have a strange issue with an algorithm I'm writing.
Basically, the code I have written works well, and now I'm testing the input-size limit to find out when the cores/shared memory run out of memory.
So in this case I use a for loop (in CPython) to increase the size of my input data:
lineList=None
define_on_device(lineList)
myMat=None
define_on_device(myMat)
myVect=None
define_on_device(myVect)
myPattern1=None
define_on_device(myPattern1)
myPattern2=None
define_on_device(myPattern2)
@offload
def append(index,myID,myCore):
    from util import range
    import array
    from parallel import coreid
    if myCore==coreid():
        myMat[index][0]=myID
        for i in range(0,len(lineList)-1):
            myMat[index][i+1]=lineList[i]

@offload
def init_pattern(mySizeDataSet,myPattDim):
    import array
    import parallel
    from util import range
    if coreid()<=mySizeDataSet%(numcores()-1):
        myLength=(mySizeDataSet/(numcores()-1))+1
    else:
        myLength=mySizeDataSet/(numcores()-1)
    lineList=[0]*(myPattDim)
    myPattern1=[0]*(myPattDim)
    myPattern2=[0]*myPattDim
    myMat=array(myLength,len(lineList)+1)
    for i in range(0,myLength-1):
        for j in range(0,len(lineList)):
            myMat[i][j]=0

sizePatternStop=...
for sizePattern in range(1,sizePatternStop):
    ...
    input=list(itertools.chain.from_iterable([[numpy.random.uniform(size=sizePattern).tolist()] for i in range(0,256)]))
    input1=[]
    index=range(0,len(input))
    input1= zip(index,input)
    ...
    arrayListNode=input1
    indexNode,patternNode=zip(*arrayListNode)
    lenDataSetNode=len(patternNode)
    pattern_dim=len(patternNode[0])
    init_pattern(lenDataSetNode,pattern_dim)
    cnt=0
    for i in range(0,len(patternNode),NUM_CORES):
        last=len(patternNode)-i
        if last>=NUM_CORES:
            targetCores=NUM_CORES
        else:
            targetCores=last
        for cores in range(targetCores):
            copy_to_device("lineList",patternNode[i+cores],target=[cores])
            patternID=float(indexNode[i+cores])
            append(cnt,patternID,cores)
        cnt+=1
    ...
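For reference, the batching done by the host loop above can be sketched in plain Python, without any device calls. This is only an illustration: `batch_schedule` is a hypothetical helper, 16 is an assumed value for NUM_CORES, and it assumes `cnt` is incremented once per batch of cores rather than once per pattern.

```python
def batch_schedule(num_patterns, num_cores):
    """Return (row, core, pattern_index) triples in the order the host loop issues them."""
    schedule = []
    row = 0  # plays the role of `cnt` in the host loop
    for start in range(0, num_patterns, num_cores):
        remaining = num_patterns - start
        # The last batch may be smaller than num_cores
        target_cores = min(num_cores, remaining)
        for core in range(target_cores):
            schedule.append((row, core, start + core))
        row += 1
    return schedule


if __name__ == "__main__":
    # 256 patterns as in the issue; 16 cores is an assumption
    sched = batch_schedule(256, 16)
    print(len(sched), "transfers,", sched[-1][0] + 1, "rows used per core")
```

Each triple corresponds to one `copy_to_device` call followed by one `append` call, so the length of the schedule is the number of host-to-device transfers per value of sizePattern.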
The problem is that it starts and always completes a full iteration. But at some later iteration (not a particularly special one, in my opinion), "copy_to_device" has problems: either it blocks everything and I have to exit manually from the shell, or it gets stuck complaining that
Error from core 8: Too many array indexes in expression
or sometimes (in a multi-cluster setup with MPI) it says the core is out of memory for a given sizePattern. But if I then manually set that value as the starting one, it completes the full iteration and gives the same error on the next one.
I thought it could be wrong memory management as sizePattern increases: the old matrices (myMat and the others) may still be in memory from one cycle to the next and so exhaust the memory prematurely.
Could somebody take a look at this, please?
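To make the memory hypothesis easier to check, here is a rough back-of-the-envelope sketch of how much data init_pattern allocates on one core for a given sizePattern. It is plain Python, not device code. The helper names are hypothetical, `/` is assumed to behave as integer division on the device, and 4 bytes per value is an assumption; the real footprint depends on how the runtime stores values internally.

```python
def rows_for_core(core_id, dataset_size, num_cores):
    """Row count init_pattern allocates on one core (integer division assumed)."""
    per_core = dataset_size // (num_cores - 1)
    if core_id <= dataset_size % (num_cores - 1):
        return per_core + 1
    return per_core


def estimated_bytes(core_id, dataset_size, pattern_dim, num_cores, bytes_per_value=4):
    """Rough size of myMat plus the three pattern-sized lists on one core."""
    rows = rows_for_core(core_id, dataset_size, num_cores)
    mat_values = rows * (pattern_dim + 1)  # myMat has pattern_dim + 1 columns
    list_values = 3 * pattern_dim  # lineList, myPattern1, myPattern2
    return (mat_values + list_values) * bytes_per_value  # assumed value size


if __name__ == "__main__":
    # 256 patterns as in the issue; 16 cores is an assumption
    for size_pattern in (1, 50, 100, 200):
        print(size_pattern, estimated_bytes(0, 256, size_pattern, 16))
```

The per-iteration figure grows linearly with sizePattern. If the allocations from earlier iterations were not released, the relevant number would instead be the running sum over all sizePattern values tried so far, which would be consistent with the failure moving when the starting value is changed.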