python - sklearn issue: Found arrays with inconsistent numbers of samples when doing regression


This question seems to have been asked before, but I can't comment to ask for further clarification on the accepted answer, and I couldn't figure out the solution provided there.

I am trying to learn how to use sklearn with my own data. I have the annual % change in GDP for 2 different countries over the past 100 years. I am just trying to learn using a single variable for now. What I am trying to do is use sklearn to predict the GDP % change for country A given the percentage change in country B's GDP.

The problem is that I receive an error saying:

    ValueError: Found arrays with inconsistent numbers of samples: [  1 107]

Here is my code:

    import sklearn.linear_model as lm
    import numpy as np
    import scipy.stats as st
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates


    def bytespdate2num(fmt, encoding='utf-8'):
        # Function to convert bytes to a string for the dates.
        strconverter = mdates.strpdate2num(fmt)

        def bytesconverter(b):
            s = b.decode(encoding)
            return strconverter(s)
        return bytesconverter

    datacsv = open('combined_data.csv')

    comb_data = []

    for line in datacsv:
        comb_data.append(line)

    date, chngdpchange, ausgdpchange = np.loadtxt(
        comb_data, delimiter=',', unpack=True,
        converters={0: bytespdate2num('%d/%m/%Y')})

    chntrain = chngdpchange[:-1]
    chntest = chngdpchange[-1:]

    austrain = ausgdpchange[:-1]
    austest = ausgdpchange[-1:]

    regr = lm.LinearRegression()
    regr.fit(chntrain, austrain)

    print('Coefficients: \n', regr.coef_)

    print("Residual sum of squares: %.2f"
          % np.mean((regr.predict(chntest) - austest) ** 2))

    print('Variance score: %.2f' % regr.score(chntest, austest))

    plt.scatter(chntest, austest, color='black')
    plt.plot(chntest, regr.predict(chntest), color='blue')

    plt.xticks(())
    plt.yticks(())

    plt.show()

What am I doing wrong? I essentially tried to apply the sklearn tutorial (they used a diabetes data set) to my own simple data. My data just contains the date, country A's % change in GDP for that specific year, and country B's % change in GDP for that same year.

I tried the solutions here and here (basically trying to find out more about the solution in the first link), but I just receive the exact same error.

Here is the full traceback in case you want to see it:

    Traceback (most recent call last):
      File "d:\my stuff\dropbox\python\python projects\test regression\tester.py", line 34, in <module>
        regr.fit(chntrain, austrain)
      File "d:\programs\installed\python34\lib\site-packages\sklearn\linear_model\base.py", line 376, in fit
        y_numeric=True, multi_output=True)
      File "d:\programs\installed\python34\lib\site-packages\sklearn\utils\validation.py", line 454, in check_X_y
        check_consistent_length(X, y)
      File "d:\programs\installed\python34\lib\site-packages\sklearn\utils\validation.py", line 174, in check_consistent_length
        "%s" % str(uniques))
    ValueError: Found arrays with inconsistent numbers of samples: [  1 107]

In fit(X, y), the input parameter X is supposed to be a 2-D array with one row per sample and one column per feature. If X in your data is only one-dimensional, you can reshape it into a 2-D array like this: regr.fit(chntrain_x.reshape(len(chntrain_x), 1), chntrain_y)
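To make the reshape concrete, here is a minimal sketch of the fix applied to the variable names from the question. The data is synthetic (random numbers standing in for the two GDP series, since the asker's combined_data.csv is not available), and `X_train` / `X_test` are names introduced here purely for illustration. `reshape(-1, 1)` is equivalent to `reshape(len(arr), 1)`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumption: synthetic stand-ins for the two GDP series, not the real CSV data.
rng = np.random.default_rng(0)
chngdpchange = rng.normal(size=108)
ausgdpchange = 0.5 * chngdpchange + rng.normal(scale=0.1, size=108)

# Same split as in the question: hold out the last observation.
chntrain, chntest = chngdpchange[:-1], chngdpchange[-1:]
austrain, austest = ausgdpchange[:-1], ausgdpchange[-1:]

print(chntrain.shape)  # (107,) -> 1-D, which fit() does not accept as X

# Turn the 1-D feature vectors into 2-D arrays: one row per sample, one column.
X_train = chntrain.reshape(-1, 1)
X_test = chntest.reshape(-1, 1)
print(X_train.shape)  # (107, 1)

regr = LinearRegression()
regr.fit(X_train, austrain)   # y may stay 1-D
pred = regr.predict(X_test)   # X passed to predict() must be 2-D as well

print('Coefficients:', regr.coef_)
print('Prediction for the held-out year:', pred)
```

Note that the test array needs the same treatment as the training array, because `predict` and `score` expect the same 2-D layout as `fit`.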

