Python - For every point in a list, compute the mean distance to all other points


I have a numpy array points of shape [n,2] that contains the (x,y) coordinates of n points. I'd like to compute the mean distance from every point to all other points using an existing function (which we'll call cmp_dist and which I use as a black box).

First, a verbose solution in "normal" Python to illustrate what I want (written off the top of my head):

    mean_dist = []
    for i, (x0, y0) in enumerate(points):
        dist = []
        for j, (x1, y1) in enumerate(points):
            if i == j:
                continue  # skip the distance of a point to itself
            dist.append(cmp_dist(x0, y0, x1, y1))
        mean_dist.append(np.array(dist).mean())

I found a "better" solution using list comprehensions (assuming list comprehensions are usually better), which seems to work fine:

    mean_dist = [np.array([cmp_dist(x0, y0, x1, y1)
                           for j, (x1, y1) in enumerate(points) if not i == j]).mean()
                 for i, (x0, y0) in enumerate(points)]

However, I'm sure there's a better solution in pure numpy, ideally a function that applies an operation to every element using all other elements.

How can I write this code in pure numpy/scipy?

I tried to find something myself, but this is quite hard to google without knowing what such operations are called (my respective math classes were quite a while back).

Edit: this is not a duplicate of "Fastest pairwise distance metric in python".

The author of that question has a 1D array r and is satisfied with what scipy.spatial.distance.pdist(r, 'cityblock') returns (an array containing the distances between all points). However, pdist returns a flat array, that is, it is not clear which of the distances belong to which point (see my answer).

(Although, as explained in that answer, pdist is what I was ultimately looking for, it doesn't solve the problem as I've specified it in the question.)

Based on @ali_m's comment on the question ("Take a look at scipy.spatial.distance.pdist"), I found a "pure" numpy/scipy solution:

    from scipy.spatial.distance import cdist
    ...
    fct = lambda p0, p1: great_circle_distance(p0[0], p0[1], p1[0], p1[1])
    mean_dist = np.sort(cdist(points, points, fct))[:, 1:].mean(1)

That's definitely an improvement over my list comprehension "solution".

What I don't like about this, though, is that I have to sort and slice the array to remove the 0.0 values that result from computing the distance between identical points (so basically that's my way of removing the diagonal entries of the matrix I get back from cdist).
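One possible way to avoid the sort-and-slice step, sketched here under the assumption that the distance function returns exactly 0.0 for identical points: the diagonal then contributes nothing to each row sum, so dividing the row sums by n - 1 yields the mean over the other points. The function `toy_dist` and the sample `points` are hypothetical stand-ins for the black-box distance function and the real data.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical stand-in for the black-box distance function (plain Euclidean).
def toy_dist(p0, p1):
    return float(np.hypot(p0[0] - p1[0], p0[1] - p1[1]))

# Made-up sample data: the corners of a 3-4-5 right triangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
n = len(points)

# Full n x n distance matrix; the diagonal holds d(p, p) == 0.
dist_matrix = cdist(points, points, toy_dist)

# The zero diagonal adds nothing to the row sums, so divide by n - 1
# (the number of *other* points) instead of n.
mean_dist = dist_matrix.sum(axis=1) / (n - 1)
```

This skips the per-row sort, but it silently gives wrong results if the metric does not return 0 for a point compared with itself.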

Note two things about the cdist solution:

  • I'm using cdist, not pdist as suggested by @ali_m.
  • I'm getting back an array of the same size as points, which contains the mean distance from every point to all other points, as specified in the original question.

pdist unfortunately returns the pairwise distances in a flat array, that is, the values are no longer linked to the points they refer to, and that link is necessary for the problem as I've described it in the original question.
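If pdist is still preferred because it evaluates each pair only once, its flat result can be expanded back into a square matrix with scipy.spatial.distance.squareform, which restores the link between distances and points. A minimal sketch, assuming a symmetric metric; `toy_dist` and the sample `points` are hypothetical placeholders:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical stand-in for the black-box distance function (plain Euclidean).
def toy_dist(p0, p1):
    return float(np.hypot(p0[0] - p1[0], p0[1] - p1[1]))

# Made-up sample data: the corners of a 3-4-5 right triangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
n = len(points)

# Condensed form: one entry per unordered pair, n * (n - 1) / 2 values.
condensed = pdist(points, toy_dist)

# Square form: symmetric n x n matrix with zeros on the diagonal,
# so row i holds the distances from point i to every point.
square = squareform(condensed)

# Per-point mean over the other n - 1 points.
mean_dist = square.sum(axis=1) / (n - 1)
```

Row i of `square` belongs to `points[i]`, so `mean_dist[i]` is the mean distance for that point.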


However, since in the actual problem at hand I only need the mean over the means of all points (which I did not mention in the question), pdist serves me just fine:

    from scipy.spatial.distance import pdist
    ...
    fct = lambda p0, p1: great_circle_distance(p0[0], p0[1], p1[0], p1[1])
    mean_dist_overall = pdist(points, fct).mean()
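The reason the two quantities coincide, assuming a symmetric metric: every point has the same number of neighbours (n - 1), and every unordered pair is counted exactly twice among the per-point means, so the mean of the per-point means equals the mean over all unique pairs. A quick numerical check with a hypothetical `toy_dist` and random made-up points:

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist

# Hypothetical stand-in for the black-box distance function (plain Euclidean).
def toy_dist(p0, p1):
    return float(np.hypot(p0[0] - p1[0], p0[1] - p1[1]))

# Made-up sample data: six random points in the unit square.
rng = np.random.default_rng(0)
points = rng.random((6, 2))
n = len(points)

# Mean distance of every point to all other points, then the mean of those.
per_point_means = cdist(points, points, toy_dist).sum(axis=1) / (n - 1)
mean_of_means = per_point_means.mean()

# Mean over all unique pairs, straight from the condensed pdist output.
overall_mean = pdist(points, toy_dist).mean()

print(np.isclose(mean_of_means, overall_mean))
```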

Though this would surely be the definitive answer had I asked for the mean of the means, I purposely asked for the array of means for all points. Because I think there's still room for improvement in the cdist solution above, I won't accept this answer.

