Issue
I am trying to fit a linear regression model. On macOS (M1), everything runs until the fit() call
in the last line.
import pandas as pd
import numpy as np
df=pd.read_csv('USA_Housing.csv')
column=df.columns
X=df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population', 'Address']]
y=df['Price']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)
from sklearn.linear_model import LinearRegression
lm=LinearRegression()
lm.fit(X_train,y_train) # this throws an error
Running it in PyCharm produces the following error:
Traceback (most recent call last):
  File "/Users/krit/PycharmProjects/PythonRefresh/main.py", line 21, in <module>
    lm.fit(X_train,y_train)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/base.py", line 1151, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/linear_model/_base.py", line 678, in fit
    X, y = self._validate_data(
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/base.py", line 621, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/utils/validation.py", line 1147, in check_X_y
    X = check_array(
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/utils/validation.py", line 917, in check_array
    array = _asarray_with_order(array, order=order, dtype=dtype, xp=xp)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/utils/_array_api.py", line 380, in _asarray_with_order
    array = numpy.asarray(array, order=order, dtype=dtype)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/pandas/core/generic.py", line 2084, in __array__
    arr = np.asarray(values, dtype=dtype)
ValueError: could not convert string to float: '1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264'
It works on Windows, but after I installed PyCharm on macOS it does not work. How can I fix it?
Solution
You are trying to perform linear regression on string data. Refer to this answer to a similar question. As the error clearly states:
ValueError: could not convert string to float: '1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264'
scikit-learn tries to convert this string to a floating-point number, which is not possible, and that is the cause of your error. The string comes from the Address column, which is included in X.
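To see which columns would trigger this conversion error, you can list the non-numeric columns before fitting. The following is a minimal sketch on a tiny made-up frame standing in for USA_Housing.csv; only the first address string is taken from the error message, and all other values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical two-row stand-in for USA_Housing.csv. All numbers are made up;
# only the first address string comes from the error message.
df = pd.DataFrame({
    "Avg. Area Income": [79545.46, 79248.64],
    "Area Population": [23086.80, 40173.07],
    "Address": [
        "1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264",
        "12 Example Road\nSampletown, NY 10001",
    ],
    "Price": [1059033.56, 1505890.91],
})

# Columns that are not numeric cannot be cast to float during fit().
non_numeric = df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

Any column reported here has to be dropped or encoded as numbers before it is passed to fit().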
SOLUTION
A very quick fix is to remove every column that contains string values, such as Address, from X.
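The quick fix could look roughly like the sketch below. It uses synthetic data in place of USA_Housing.csv (the values, the reduced column set, and the address strings are all made up), and keeps only numeric columns in X so that fit() no longer sees the Address strings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for USA_Housing.csv; every value here is made up.
rng = np.random.default_rng(0)
n = 50
df = pd.DataFrame({
    "Avg. Area Income": rng.normal(68000, 10000, n),
    "Avg. Area Number of Rooms": rng.normal(7, 1, n),
    "Address": [f"{i} Example St\nSampletown, XX 00000" for i in range(n)],
})
df["Price"] = 20 * df["Avg. Area Income"] + 100000 * df["Avg. Area Number of Rooms"]

# Drop the target, then keep only numeric columns (this removes 'Address').
X = df.drop(columns=["Price"]).select_dtypes(include="number")
y = df["Price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=101
)
lm = LinearRegression()
lm.fit(X_train, y_train)  # X holds only numeric columns, so no ValueError
print(list(X.columns))
```

In the original script, the equivalent change is simply to leave 'Address' out of the list of columns used to build X.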
Also, I don't think the full address of the house is required for a good prediction. I would either remove that column or use only part of it, such as "Shaw Lane Apt".
In short, either remove that column or convert it into numbers. Free advice: if you do want to use the address column, categorize it by area and apply one-hot encoding (though this will increase the complexity of your project).
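If you go the one-hot route, one possible way to categorize by area is sketched below. It assumes each address ends with a two-letter region code followed by a 5-digit ZIP code, as in the address from the error message; the other two addresses, the Region column name, and the regular expression are illustrative assumptions, not part of the original dataset description.

```python
import pandas as pd

df = pd.DataFrame({
    "Address": [
        "1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264",  # from the error message
        "12 Example Road\nSampletown, NY 10001",              # made up
        "USNS Example\nFPO AE 09386",                         # made up
    ],
})

# Assumed layout: a two-letter region code followed by a space and a 5-digit ZIP.
df["Region"] = df["Address"].str.extract(r"\b([A-Z]{2}) \d{5}", expand=False)

# One-hot encode the region and drop the raw address text.
encoded = pd.get_dummies(
    df.drop(columns=["Address"]), columns=["Region"], dtype=float
)
print(encoded.columns.tolist())
```

Each distinct region becomes its own numeric column, so the number of features grows with the number of regions in the data.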
Answered By - rr_goyal