Issue
How can I fix the 'ValueError: x and y must be the same size' error?
The idea of the code is to apply a multivariate linear regression model to temperature and NO data from different sensors: train the model, see how the results correlate with each other, and evaluate the prediction as a whole.
from sklearn import linear_model
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Names of the files
filename = 'NORM_AC_HAE.csv'
file = 'NORM_NABEL_HAE_lev1.csv'
# Read the data
data=pd.read_csv(filename)
data_other=pd.read_csv(file)
col = ['Aircube.009.0.no.we.aux.ch6', 'Aircube.009.0.sht.temperature.ch1']
X = data.loc[:, col]
Y = data_other.loc[:,'NO.ppb']
# Fitting the Linear Regression to the training set
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, train_size = 0.6, random_state = np.random.seed(0))
mlr = LinearRegression()
mlr.fit(X_train, y_train)
# Visualization of the test set results
plt.figure(2)
plt.scatter(y_test, X_test)  # The ValueError is raised here
The error is:
Traceback (most recent call last):
File "C:\Users\andre\Desktop\UV\4o\TFG\EMPA\dataset_Mila\MLR_no_temp_hae_no.py", line 65, in <module>
plt.scatter(y_test, X_test)
File "C:\Users\andre\AppData\Local\Programs\Python\Python37-32\lib\site-packages\matplotlib\pyplot.py", line 2864, in scatter
is not None else {}), **kwargs)
File "C:\Users\andre\AppData\Local\Programs\Python\Python37-32\lib\site-packages\matplotlib\__init__.py", line 1810, in inner
return func(ax, *args, **kwargs)
File "C:\Users\andre\AppData\Local\Programs\Python\Python37-32\lib\site-packages\matplotlib\axes\_axes.py", line 4182, in scatter
raise ValueError("x and y must be the same size")
ValueError: x and y must be the same size
[Finished in 6.9s]
Solution
Here X_test has shape (36648, 2), i.e. 36648 rows and 2 columns.
Both data arguments in plt.scatter (here y_test and X_test) must be 1-dimensional arrays; from the docs:
x, y : array_like, shape (n, )
while here you attempt to pass a 2-dimensional matrix for X_test, hence the error of different size.
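To see the mismatch concretely, here is a minimal sketch using small NumPy arrays as hypothetical stand-ins for y_test and X_test (the names y_demo and X_demo are not from the original code). It assumes, as the error message suggests, that the check is on the total number of elements in each argument:

```python
import numpy as np

# Hypothetical stand-ins: 5 samples, 2 features (the real X_test has 36648 rows)
y_demo = np.zeros(5)
X_demo = np.zeros((5, 2))

print(y_demo.shape, y_demo.size)   # (5,) 5
print(X_demo.shape, X_demo.size)   # (5, 2) 10

# A single column of the matrix is 1-dimensional and matches the target's size
first_column = X_demo[:, 0]
print(first_column.shape, first_column.size)  # (5,) 5
```

The 2-column matrix holds twice as many elements as the target vector, while any single column holds exactly as many.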
You cannot get a scatter plot of a matrix with an array/vector; what you could do is produce two separate scatter plots, one for each column in your X_test:
plt.figure(2)
plt.scatter(y_test, X_test.iloc[:,0].values)
plt.figure(3)
plt.scatter(y_test, X_test.iloc[:,1].values)
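If there are more than a couple of feature columns, the same idea can be written as a loop that draws one panel per column. The following is only a sketch under stated assumptions: the data is synthetic, and the column names sensor_a and sensor_b and the target name are made up for illustration, not taken from the original CSV files.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to get a plot window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-ins for X_test and y_test (hypothetical data and names)
rng = np.random.default_rng(0)
X_test = pd.DataFrame({
    "sensor_a": rng.normal(size=50),
    "sensor_b": rng.normal(size=50),
})
y_test = pd.Series(rng.normal(size=50), name="target")

# One panel per feature column, each a 1-D vs 1-D scatter plot
fig, axes = plt.subplots(1, X_test.shape[1], figsize=(10, 4), squeeze=False)
for ax, column in zip(axes[0], X_test.columns):
    ax.scatter(y_test, X_test[column])
    ax.set_xlabel(y_test.name)
    ax.set_ylabel(column)
fig.tight_layout()
```

With an interactive backend, calling plt.show() afterwards displays the figure; each panel receives two 1-dimensional inputs of equal length, so the size check passes.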
Answered By - desertnaut