numpy - Memory error in Python for an array of size 150 000
I am trying to find the difference between each pair of elements of two different numpy arrays (size: 1 x 150000). Here is the code I used:
    # Given the input numpy arrays a and b
    c = a - b.reshape((-1, 1))

    # For arrays a = np.array([1, 2, 7, 6]) and b = np.array([1, 2, 7, 6]):
    # c = array([[ 0,  1,  6,  5],
    #            [-1,  0,  5,  4],
    #            [-6, -5,  0, -1],
    #            [-5, -4,  1,  0]])
I understand why this code gives a memory error. How can I update the code so that I don't get the error?
I tried using itertools combinations_with_replacement, but I am still not able to get the desired results.
You're trying to create a 150000 x 150000 array. I'm not sure which dtype you used, but in the case of int32 (4 bytes per number), and neglecting the overhead of the array, you would try to allocate:
    >>> 150000 * 150000 * 4  # bytes
    90000000000

which translates to:

    >>> 150000 * 150000 * 4 / 1024 / 1024 / 1024  # gigabytes
    83.82
So if you don't have 84 GB of (free) RAM, you'll get a MemoryError for this operation.
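The same arithmetic can be wrapped in a small helper to check other sizes and dtypes before allocating anything. This is only a sketch: the function name `pairwise_nbytes` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def pairwise_nbytes(n_a, n_b, dtype=np.int32):
    """Bytes needed for the (n_b, n_a) result of a - b.reshape((-1, 1)),
    ignoring the small fixed overhead of the array object itself."""
    return n_a * n_b * np.dtype(dtype).itemsize

n = 150_000
for dt in (np.int32, np.int64, np.float64):
    gib = pairwise_nbytes(n, n, dt) / 1024**3
    print(np.dtype(dt).name, round(gib, 2), "GiB")
```

With an 8-byte dtype such as int64 or float64 the requirement doubles compared to int32.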
With itertools it's even worse, because you need a list that contains one pointer per element (on 64-bit computers that's 8 bytes), and, depending on your Python version and computer, each integer requires ~20-30 bytes:
    >>> import sys
    >>> sys.getsizeof(1)
    28
That would essentially lead to a RAM requirement of:
    >>> (28 + 8) * 150000 * 150000  # bytes
    810000000000
    >>> (28 + 8) * 150000 * 150000 / 1024 / 1024 / 1024  # gigabytes
    754.37
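The per-element cost can also be measured on a small sample rather than estimated. The sketch below compares a list of Python integers with an int32 NumPy array of the same length. The sample size and the variable names are chosen only for illustration, and the exact numbers depend on the Python version and platform.

```python
import sys
import numpy as np

n = 1000  # small stand-in for the 150000 * 150000 values in the question

# A list stores one pointer per element, and each element is a separate int object.
values = [10_000 + i for i in range(n)]
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# A NumPy int32 array stores the raw 4-byte values contiguously.
array_bytes = np.arange(n, dtype=np.int32).nbytes

print("list: ", list_bytes / n, "bytes per element")
print("array:", array_bytes / n, "bytes per element")
```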
If you have enough hard disk storage, you could try to calculate the distances from one point of the first array to all points in the other array, save them to disk, then go on to the next point in the first array, calculate its distances to all points in the second array, and so on. Depending on the way you store the values (for example if you save them in txt format), the stored data might take up even more space than the estimate above. This allows you to calculate all the distances, but you won't be able to keep all the values in RAM.
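A minimal sketch of this chunk-by-chunk idea, assuming a binary file backed by `np.memmap` rather than a txt file. The function name `differences_to_disk`, the file name, and the chunk size are hypothetical choices for illustration:

```python
import os
import tempfile
import numpy as np

def differences_to_disk(a, b, path, rows_per_chunk=1000):
    """Write c = a - b.reshape((-1, 1)) to `path` chunk by chunk.

    Only `rows_per_chunk` rows of the result are computed in RAM at a time.
    """
    out = np.memmap(path, dtype=a.dtype, mode="w+", shape=(b.size, a.size))
    for start in range(0, b.size, rows_per_chunk):
        stop = min(start + rows_per_chunk, b.size)
        # Broadcasting: (rows, 1) against (a.size,) gives (rows, a.size)
        out[start:stop] = a - b[start:stop].reshape((-1, 1))
    out.flush()
    return out

# Small demonstration; with 150000 int32 elements the file would be about 84 GB.
a = np.array([1, 2, 7, 6], dtype=np.int32)
b = np.array([1, 2, 7, 6], dtype=np.int32)
path = os.path.join(tempfile.mkdtemp(), "differences.dat")
c = differences_to_disk(a, b, path, rows_per_chunk=2)
print(np.asarray(c))
```

The result still occupies the full size on disk. Only the working set in RAM shrinks to one chunk.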
The most straightforward solution to a MemoryError is to buy more RAM (you can calculate how much you need based on the above numbers). If that's not an option, you need to change your approach.
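One possible change of approach, if only a summary of the differences is needed rather than the full matrix, is to process the rows in blocks and reduce each block immediately. The question does not say what the differences are used for, so the quantity computed here (the number of pairs whose absolute difference is within a tolerance) and the function name `count_close_pairs` are purely illustrative assumptions:

```python
import numpy as np

def count_close_pairs(a, b, tol, rows_per_chunk=1000):
    """Count pairs (i, j) with |a[j] - b[i]| <= tol without storing all differences."""
    total = 0
    for start in range(0, b.size, rows_per_chunk):
        # One block of rows of the full difference matrix
        block = a - b[start:start + rows_per_chunk].reshape((-1, 1))
        total += int(np.count_nonzero(np.abs(block) <= tol))
    return total

a = np.array([1, 2, 7, 6])
b = np.array([1, 2, 7, 6])
print(count_close_pairs(a, b, tol=1, rows_per_chunk=2))  # prints 8
```

Peak memory is then roughly `rows_per_chunk * a.size` elements per temporary array, regardless of how long `b` is.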