MATLAB - Sending data to workers
I am trying to create a piece of parallel code to speed up the processing of a large (a couple of hundred million rows) array. In order to parallelise this, I chopped my data into 8 (my number of cores) pieces and tried sending each worker 1 piece. Looking at my RAM usage, however, it seems each piece is sent to each worker, multiplying my RAM usage by 8. A minimum working example:
    a = 1:16;
    for ii = 1:8
        data{ii} = a(2*ii-1:2*ii);
    end

Now, when I send the data to the workers using parfor, it seems to send the full cell instead of just the desired piece:
    output = cell(1,8);
    parfor ii = 1:8
        output{ii} = data{ii};
    end

I actually use some function within the parfor loop, but this illustrates the case. Does MATLAB actually send the full cell data to each worker, and if so, how do I make it send only the desired piece?
In my personal experience, I found that using parfeval is better regarding memory usage than parfor. In addition, your problem seems to be divisible into smaller parts, so you can use parfeval to submit more, smaller jobs to the MATLAB workers.
Let's say you have workercnt MATLAB workers that are going to handle jobcnt jobs. Let data be a cell array of size jobcnt x 1, where each of its elements is the input for the function getoutput, which does the analysis on the data. The results are then stored in the cell array output of size jobcnt x 1.
In the following code, the jobs are assigned in the first for loop and the results are retrieved in the second while loop. The boolean variable donejobs indicates which jobs are done.
    poolobj = parpool(workercnt);
    jobcnt = length(data); % number of jobs
    output = cell(jobcnt,1);
    % submit one job per element of data
    for jobno = 1:jobcnt
        future(jobno) = parfeval(poolobj,@getoutput,...
            nargout('getoutput'),data{jobno});
    end
    donejobs = false(jobcnt,1);
    % collect the results in whatever order the jobs finish
    while ~all(donejobs)
        [idx,result] = fetchNext(future);
        output{idx} = result;
        donejobs(idx) = true;
    end

Also, you can take this approach one step further if you want to save more memory. After fetching the results of a done job, you can delete the corresponding member of future. The reason is that this object stores all the input and output data of the getoutput function, which is probably going to be huge. But you need to be careful, as deleting members of future results in an index shift.
The following is the code I wrote for this purpose.
    poolobj = parpool(workercnt);
    jobcnt = length(data); % number of jobs
    output = cell(jobcnt,1);
    % submit one job per element of data
    for jobno = 1:jobcnt
        future(jobno) = parfeval(poolobj,@getoutput,...
            nargout('getoutput'),data{jobno});
    end
    donejobs = false(jobcnt,1);
    while ~all(donejobs)
        [idx,result] = fetchNext(future);
        future(idx) = []; % remove the done future object
        oldidx = 0;
        % find the index offset and correct the index accordingly
        while oldidx ~= idx
            donejobsinidxrange = sum(donejobs((oldidx + 1):idx));
            oldidx = idx;
            idx = idx + donejobsinidxrange;
        end
        output{idx} = result;
        donejobs(idx) = true;
    end
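As a possible simplification of the index correction above, one could keep a vector of original job numbers next to future and delete from both at the same time. This is only a sketch under some assumptions: it needs the Parallel Computing Toolbox, the variable remaining is a name introduced here for illustration, and getoutput is replaced by a hypothetical stand-in (a plain sum) applied to the small example data from the question.

```matlab
% Hypothetical stand-in for the real analysis function.
getoutput = @(x) sum(x);

% Example input: 8 pieces of 2 elements each, as in the question.
a = 1:16;
jobcnt = 8;
data = cell(jobcnt,1);
for ii = 1:jobcnt
    data{ii} = a(2*ii-1:2*ii);
end

poolobj = gcp();              % use the current pool (may start one if none exists)
output = cell(jobcnt,1);

% Submit one job per piece; each call receives only its own piece.
for jobno = 1:jobcnt
    future(jobno) = parfeval(poolobj,getoutput,1,data{jobno});
end

% remaining(k) holds the original job number of future(k).
remaining = 1:jobcnt;
while ~isempty(remaining)
    [idx,result] = fetchNext(future);
    output{remaining(idx)} = result;  % map back to the original job number
    future(idx) = [];                 % drop the finished future to free memory
    remaining(idx) = [];              % keep both arrays aligned
end
```

Because remaining shrinks together with future, the index returned by fetchNext can be translated with a single lookup, and the donejobs bookkeeping is no longer needed.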