Issue
I am trying my hand at scikit-learn. I have a very simple dataset of timestamps and gas concentrations in ppm. Fitting KMeans on the reading column raises the error below.
Error:
ValueError: Expected 2D array, got 1D array instead:
array=[396.4 394. 395.8 395.3 404.2 400.6 397.7 401.5 394.7 398.9 402.5 394.6
401.2 401. 399. 398.5 401.3 401.7 406.5 395.9 401.2 399.8 398.2 401.9
405.4 396.1 402.8 404.4 402.5 400.9 402.8 397.8 399.7 398.4 403.4 401.4
393.1].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
code:
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
data = pd.read_csv(r"myfilepath.csv")
print(data.shape)
kmeans = KMeans(n_clusters = 2, random_state = 0)
X = data['reading']
kmeans.fit(X)
#clusters = kmeans.fit_predict(data)
print(kmeans.cluster_centers_.shape)
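The error comes down to shape: selecting one column with single brackets gives a 1D Series, while scikit-learn estimators expect a 2D array of shape (n_samples, n_features). Here is a small sketch of the difference, using a made-up DataFrame in place of the CSV. The values and the timestamp column name are assumptions; only the reading column name comes from the code above.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; only the 'reading' column name is from the post.
data = pd.DataFrame({
    "timestamp": [0, 1, 2, 3],
    "reading": [396.4, 394.0, 395.8, 395.3],
})

one_d = data["reading"]                                # Series, 1D
two_d = data[["reading"]]                              # DataFrame, 2D with one column
reshaped = data["reading"].to_numpy().reshape(-1, 1)   # ndarray, 2D with one column

print(one_d.shape, two_d.shape, reshaped.shape)
```

Either of the last two forms has the (n_samples, 1) layout that the error message asks for.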
Solution
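I did some more digging and discovered that converting my DataFrame to a NumPy array and then slicing it with Python negative indexing fixed my problem. The converted array keeps all columns, so it is 2D, which is what KMeans.fit expects. Note that data[:-1] selects every row except the last one.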
updated code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # needed for the scatter plots below
from sklearn.cluster import KMeans
# CHANGES
data = pd.read_csv(r"myfilepath.csv").to_numpy()
print(data.shape)
kmeans = KMeans(n_clusters = 2, random_state = 0)
#CHANGES
X = data[:-1]
kmeans.fit(X)
#clusters = kmeans.fit_predict(data)
print(kmeans.cluster_centers_.shape)
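# color each point by its assigned cluster (cmap has no effect without c)
plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, s=50, cmap='viridis')
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], s=200, alpha=0.5)
plt.show()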
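If the goal is to cluster on the ppm readings alone, as in the original attempt, the reshape suggested by the error message should also work. This is a minimal sketch, assuming a hypothetical DataFrame in place of the CSV, with a few readings copied from the error output:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical data: a few readings copied from the error message above.
data = pd.DataFrame({"reading": [396.4, 394.0, 395.8, 404.2, 406.5, 405.4]})

# reshape(-1, 1) turns the 1D column into shape (n_samples, 1): one feature per row.
X = data["reading"].to_numpy().reshape(-1, 1)

kmeans = KMeans(n_clusters=2, random_state=0, n_init=10)
labels = kmeans.fit_predict(X)

print(X.shape)
print(kmeans.cluster_centers_.shape)
```

Unlike data[:-1], this keeps every row and clusters on a single feature, so the cluster centers have one column and a two-column scatter plot would need adjusting.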
Answered By - the_guy_who_posts