Post by TaliaPerkins on Jun 27, 2018 10:54:09 GMT -5
I have an application which would be most straightforwardly implemented as a 3D array of bytes of 1024 x 1024 x 10240 size.
Doesn't seem like a good fit for LB ?
It is possible an array of as little as 128 x 128 x 1280 would be useful; I'm not sure how best to do an interpolation between adjacent cells in the 3D array, though.
Speculation is welcome. I should add, I believe I can interface with SQLite and get a 10GB file?
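An interpolation scheme need not be exotic: the standard way to estimate values between adjacent cells of a 3D array is trilinear interpolation. A minimal sketch, in Python purely for illustration (the grid layout and the function name are my assumptions, not anything LB-specific):

```python
def trilerp(grid, x, y, z):
    """Trilinear interpolation at fractional coordinates (x, y, z)
    in a 3D array accessed as grid[i][j][k]."""
    i, j, k = int(x), int(y), int(z)
    fx, fy, fz = x - i, y - j, z - k
    v = lambda di, dj, dk: grid[i + di][j + dj][k + dk]
    # interpolate along x on each of the four cell edges...
    c00 = v(0, 0, 0) * (1 - fx) + v(1, 0, 0) * fx
    c10 = v(0, 1, 0) * (1 - fx) + v(1, 1, 0) * fx
    c01 = v(0, 0, 1) * (1 - fx) + v(1, 0, 1) * fx
    c11 = v(0, 1, 1) * (1 - fx) + v(1, 1, 1) * fx
    # ...then along y, then along z
    c0 = c00 * (1 - fy) + c10 * fy
    c1 = c01 * (1 - fy) + c11 * fy
    return c0 * (1 - fz) + c1 * fz

# a 2x2x2 cell holding the linear field f = x + 2y + 3z, which
# trilinear interpolation reproduces exactly
cell = [[[i + 2 * j + 3 * k for k in range(2)] for j in range(2)]
        for i in range(2)]
print(trilerp(cell, 0.25, 0.5, 0.75))  # 3.5
```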
Post by tsh73 on Jun 27, 2018 16:05:10 GMT -5
Hello tomdperkins
Lol www.youtube.com/watch?v=fMCDcKxMVGU

1024 x 1024 x 10240 is 10^10 NUMBERS, and each number takes some space (I think 8 bytes for a real number, maybe 4 for an integer?). So this amount of memory exceeds the 32-bit address space. I wonder what programming platform would be at ease with that (but it sure would be 64-bit).

Ok, now that's reasonable: 128 x 128 x 1280 is about 21 million numbers. For me, dim a(128 * 128 * 1280) works OK (but try to increase it 10x and it fails). So it really stores numbers in that array and reads them back:

N=128*128*1280
'N=N*10 'increasing array size 10x fails for me
dim a(N)    'memory usage goes from 6Mb to 88Mb
Nexp = 10
dim addr(Nexp)
'randomly using array
for i = 1 to Nexp
    addr=int(rnd(1)*N)
    print addr
    addr(i)=addr
    a(addr)=i
next
print "Reading it back"
for i = 1 to Nexp
    print i, addr(i), a(addr(i))
next

But that was a single-dimension array. Alas, LB supports only 1D and 2D arrays, so we have to convert 3 indexes to a single linear address:

print "now with simulated 3d array"
'sizes are
szX=128:szY=128:szZ=1280
'keep the sizes in an array so we can pick them by dimension number
sz(1)=szX
sz(2)=szY
sz(3)=szZ
N=szX*szY*szZ
print "N = ",N
dim a(N)
Nexp = 10
dim addr3(Nexp, 3)
'randomly using array
for i = 1 to Nexp
    for j = 1 to 3
        addr3(i,j)=int(rnd(1)*sz(j))
        print addr3(i,j);", ";
    next
    'x y z to linear address (z varies fastest)
    addr = (addr3(i,1)*szY+addr3(i,2))*szZ+addr3(i,3)
    'this could be made a function
    'if you don't mind slowing things down
    print "linear addr ";addr
    a(addr)=i
next
print "Reading it back"
for i = 1 to Nexp
    addr = (addr3(i,1)*szY+addr3(i,2))*szZ+addr3(i,3)
    print i, addr3(i,1);", ";addr3(i,2);", ";addr3(i,3);"   ", a(addr)
next
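For reference, the same index arithmetic wrapped as a pair of functions, in Python purely for illustration (the row-major convention, with z varying fastest, is one choice among several; the function names are mine):

```python
SZX, SZY, SZZ = 128, 128, 1280   # grid sizes from the post

def to_linear(x, y, z):
    # row-major order: z varies fastest, so each (x, y) column of
    # SZZ cells is contiguous in the 1D array
    return (x * SZY + y) * SZZ + z

def from_linear(addr):
    # inverse mapping, handy for debugging
    return addr // (SZY * SZZ), (addr // SZZ) % SZY, addr % SZZ

print(to_linear(0, 0, 0), to_linear(127, 127, 1279))  # 0 20971519
```

Note that the linear address must multiply by the sizes of the *later* dimensions (szY and szZ here); mixing up which size multiplies which index causes two different (x, y, z) triples to collide on the same address.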
The other problem is speed:

N=1000000
dim a(N)
t0=time$("ms")
print "N=",N
for i =1 to N
next
t1=time$("ms")
emptyLoopTime=t1-t0
print "emptyLoopTime" , emptyLoopTime

for i =1 to N
    a(i)=i
next
t2=time$("ms")
settingArrayTime = t2-t1
print "settingArrayTime", settingArrayTime
print "emptyLoopTime" , emptyLoopTime
That's for one million items. My machine takes 5 seconds for the empty loop and 7 seconds for the loop with assignment to the array. If your machine is newer you can get 2x faster (hardly faster than 3x). But you already want 20x more items, so please do some speed testing first. (I really think turning to SQLite - which will likely be on disk - will be MUCH slower than working in-memory with native arrays. Though I would really like to see that compared, say, for the same 1 000 000 numbers.) But maybe you are not actually going to *use* that many numbers? Like, using it to store some stuff, leaving most of the array unused? Then speed will be less of an issue.
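The in-memory vs SQLite comparison asked about above is easy to run in other environments; here is a rough Python version of the same benchmark (timings are machine-dependent, and the table and column names are made up):

```python
import sqlite3
import time
import itertools  # not used here, but handy for 3D index generation

N = 100_000  # scale up to taste

# 1) plain in-memory array assignment
t0 = time.perf_counter()
a = [0] * (N + 1)
for i in range(1, N + 1):
    a[i] = i
t1 = time.perf_counter()

# 2) the same values inserted into SQLite
conn = sqlite3.connect(":memory:")  # a file path works the same way
conn.execute("CREATE TABLE vals (i INTEGER, v INTEGER)")
with conn:  # one transaction: a single commit for all rows
    conn.executemany("INSERT INTO vals VALUES (?, ?)",
                     ((i, i) for i in range(1, N + 1)))
t2 = time.perf_counter()

print(f"array: {t1 - t0:.3f}s  sqlite: {t2 - t1:.3f}s")
```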
Post by tenochtitlanuk on Jun 27, 2018 16:16:25 GMT -5
Anatoly has explained and shown what I would have said. The 3 dimensions aren't the problem, it's the sheer size. That's an awful lot of data - if you can give us an idea of what it represents we might think of a way. Perhaps store a file of x,y data for every z value... it depends whether the huge speed penalty of opening and closing files is offset - perhaps you only need to interrogate/change within an x,y plane at a fixed value of z?
Post by TaliaPerkins on Jun 27, 2018 19:41:35 GMT -5
I have a Dell Studio XPS I bought many years ago specifically to do physical modeling on. It has 24GB* of RAM and a RAID of 120GB SSDs. Now I need to write my own modelling code; I've run out of 3rd-party code (that I can afford to pay for).
Application is CFD modelling of pulsed combustion.
I'm looking at Ch and Win32Forth as well.
*I went and checked, because at first I didn't remember whether it was 16 or 24GB; I haven't hit a limit there, so I never had reason to remember.
A (1 - (PI/4)) fraction of the array would be a null value never operated on, and so would any other part of the array with 0 for the value. The plan is to iteratively run the data array files until changes are below a small threshold for the differing physical values involved. I don't care if it takes a week to resolve. I need to find out how fiddly crap like the exact positioning of frustum angles on tube sections affects charge flow before I go welding up a flight article.
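The (1 - (PI/4)) figure is the fraction of a square cross-section that lies outside its inscribed circle - i.e. the corner cells of each x,y slab falling outside a circular tube bore. A quick check (Python, illustrative):

```python
import math

# fraction of a square cross-section outside the inscribed circle
null_fraction = 1 - math.pi / 4
print(f"{null_fraction:.4f}")  # 0.2146
```

So roughly 21.5% of every x,y slab is dead cells before any in-bore zeros are counted.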
Post by Rod on Jun 28, 2018 7:09:58 GMT -5
How many data elements will be non-zero? Perhaps only a tiny part of this vast array contains data? If so, I could envision processing and storing the >0 elements and losing the <=0 elements. You would need to store the x,y,z location and the value; no x,y,z location means 0.
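This scheme - store only the non-zero cells, keyed by their x,y,z location - is a hash-based sparse array. A minimal sketch in Python (the function names are mine):

```python
# sparse 3D array: only non-zero cells are stored; a missing key means 0
grid = {}

def set_cell(x, y, z, value):
    if value == 0:
        grid.pop((x, y, z), None)   # storing 0 just drops the entry
    else:
        grid[(x, y, z)] = value

def get_cell(x, y, z):
    return grid.get((x, y, z), 0)

set_cell(10, 20, 30, 7.5)
print(get_cell(10, 20, 30), get_cell(0, 0, 0), len(grid))  # 7.5 0 1
```

Memory then scales with the number of active cells rather than the full 1024 x 1024 x 10240 volume.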
Even on disc there will be a hard limit of 4GB, as this is the most that the file pointers (SEEK, EOF(), LOF(), etc.) can address.
Post by TaliaPerkins on Jun 28, 2018 8:34:27 GMT -5
Probably 50% will be active. I am told SQLite can have 10GB files; I don't know how EOF fits into that.
Post by meerkat on Jun 28, 2018 9:16:11 GMT -5
Just to give you some idea of the time to create a SQL file, I created database "testArray" with table: test ( a integer(7), b integer(7), c integer(7) )
I created 2,097,152 (128*128*128) records and executed some stupid calcs:
sqliteconnect #sql, "c:\rbp101\projects\a_project\data\testArray.db"
sql$ = "DELETE FROM test"
#sql execute(sql$)
t0 = time$("seconds")

for a = 1 to 128
    x$ = ""
    for b = 1 to 128
        for c = 1 to 128
            x$ = x$ + ",(";a;",";b;",";c;")"
        next c
    next b
    sql$ = "INSERT INTO test VALUES ";mid$(x$,2)
    #sql execute(sql$)
next a
t1 = time$("seconds")
print t0;" ";t1;" ";t1-t0;" ";(t1-t0)/60

t0 = time$("seconds")
sql$ = "SELECT avg(a) as aa, avg(b) as ab, avg(c) as ac, sum(a) as sa, sum(b) as sb, sum(c) as sc, avg(a / b * c) as aabc, avg(a * b % c) as rabc FROM test"
#sql execute(sql$)
t1 = time$("seconds")
print t0;" ";t1;" ";t1-t0;" ";(t1-t0)/60
end
Insert time was about 11 minutes:
3707645685 3707646327 642 10.6999993
Calc time was 3 seconds:
3707647006 3707647009 3 0.05
Even though it took time to create the data, you only need to do this once.
Also: if you use MySQL it will easily cut the time in half.
Hope this helps. BTW: this is written in RB only because the SQL commands are so simple, but it can easily be changed to LB.
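For comparison, the same experiment is a few lines in Python's built-in sqlite3 module (shown with a small 16^3 grid; the key speed trick on any platform is committing all inserts in one transaction, which is what the multi-row INSERT above achieves):

```python
import sqlite3
import time
import itertools

n = 16  # the post used 128; kept small here
conn = sqlite3.connect(":memory:")  # a file path works the same way
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER, c INTEGER)")

t0 = time.perf_counter()
with conn:  # one transaction: a single commit for all n**3 rows
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)",
                     itertools.product(range(1, n + 1), repeat=3))
t1 = time.perf_counter()

rows, avg_a = conn.execute(
    "SELECT count(*), avg(a) FROM test").fetchone()
print(rows, avg_a, f"{t1 - t0:.3f}s")  # 4096 8.5 ...
```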
Post by TaliaPerkins on Jun 28, 2018 9:27:47 GMT -5
Thank you. If that time scales linearly, I'd need approx 40 days to run a sim on a 1024 based grid, and that is longer than I had thought it might take.
Looking like (512^3)*10 is the highest resolution I can stand for a final work grid, and (256x256x256)*10 for "rough-ins". But knowing LB can get the interface to SQLite done as easily as that is invaluable.
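For what it's worth, the 40-day figure checks out: scaling meerkat's 11-minute insert time linearly by cell count gives about 39 days for the 1024-based grid. A quick back-of-envelope script (grid sizes are the ones named in the thread):

```python
base_cells = 128 ** 3   # meerkat's test grid
base_minutes = 11       # measured insert time for that grid

for dims in ((1024, 1024, 10240), (512, 512, 5120), (256, 256, 2560)):
    cells = dims[0] * dims[1] * dims[2]
    days = base_minutes * cells / base_cells / 60 / 24
    print(dims, round(days, 1), "days")
```

The two fallback grids come out at roughly 4.9 days and 0.6 days respectively under the same linear assumption.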
Post by meerkat on Sept 4, 2018 8:56:07 GMT -5
TaliaPerkins wrote: "Thank you. If that time scales linearly, I'd need approx 40 days to run a sim on a 1024 based grid, and that is longer than I had thought it might take."

Might want to try memSQL and see what it does. They claim 4X faster, and that would be 10 days, but you would have to test it with your data to see what it really does. www.memsql.com/product/