Issue
I want to classify features extracted from a CNN with the k-nearest neighbors classifier from sklearn.neighbors.KNeighborsClassifier. But when I use the predict() function on the test data, it gives a class different from the majority vote found via kneighbors(). I am using the following ResNet50 pretrained model to extract the features; it is a branch of a siamese network. Details of the siamese network can be found here.
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Input, GlobalAveragePooling2D
from tensorflow.keras.models import Model

def embedding_model():
    baseModel = ResNet50(weights="imagenet", include_top=False,
                         input_tensor=Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3)))
    # freeze the first 165 layers of the ResNet50 base
    for layer in baseModel.layers[:165]:
        layer.trainable = False
    headModel = baseModel.output
    headModel = GlobalAveragePooling2D()(headModel)
    model = Model(inputs=baseModel.input, outputs=headModel, name='embedding_model')
    return model
# get the embedding branch and its weights from the saved siamese model
embeddings_weights = siamese_test.get_layer('embedding_model').get_weights()
embeddings_branch = siamese_test.get_layer('embedding_model')

input_shape = (224, 224, 3)
input = Input(shape=input_shape)
x = embeddings_branch(input)

model = Model(input, x)
model.set_weights(embeddings_weights)

out_shape = model.layers[-1].output_shape
The model summary can be found here. I used the following function to extract features with the model.
def create_features(dataset, pre_model, out_shape, batchSize=16):
    # compute embeddings, then flatten each one into a 1-D feature vector
    features = pre_model.predict(dataset, batch_size=batchSize)
    features_flatten = features.reshape((features.shape[0], out_shape[1]))
    return features, features_flatten

train_features, train_features_flatten = create_features(x_train, model, out_shape, batchSize)
test_features, test_features_flatten = create_features(x_test, model, out_shape, batchSize)
Then I used a KNN classifier to predict on the test features:
from sklearn.neighbors import KNeighborsClassifier
KNN_classifier = KNeighborsClassifier(n_neighbors=3)
KNN_classifier.fit(train_features_flatten, y_train)
y_pred = KNN_classifier.predict(test_features_flatten)
I used the kneighbors() function to find the nearest neighbors' distances and their corresponding indices. But it gives me different results than the predictions.
import numpy as np
from collections import Counter

neighbors_dist, neighbors_index = KNN_classifier.kneighbors(test_features_flatten)

# replace each neighbor index with its actual class
data2 = np.zeros(neighbors_index.shape, dtype=object)
for i in range(neighbors_index.shape[0]):
    for j in range(neighbors_index.shape[1]):
        data2[i, j] = str(y_test[neighbors_index[i][j]])

# get the majority class among each sample's neighbors
majority_class = np.array([Counter(sorted(row, reverse=True)).most_common(1)[0][0]
                           for row in data2])
As you can see, the predicted class is not the same as the majority class for the first samples:
for i, pred in enumerate(y_pred):
    print(i, pred)

for i, c in enumerate(majority_class):
    print(i, c)
Predicted output for the first 11 samples:

0 corduroy
1 wool
2 wool
3 brown_bread
4 wood
5 corduroy
6 corduroy
7 corduroy
8 wool
9 wood
10 corduroy

Majority class for the first 11 samples:

0 corduroy
1 cork
2 cork
3 lettuce_leaf
4 linen
5 corduroy
6 wool
7 corduroy
8 brown_bread
9 linen
10 wool
Is there anything I am doing wrong? Any help would be appreciated. Thank you.
Solution
This is incorrect:

data2[i, j] = str(y_test[neighbors_index[i][j]])

The kneighbors method (and also predict) finds the nearest training points to the inputs, so you should reference y_train here.
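As a minimal sketch of the fix (assuming y_train is an array-like of class labels aligned with the rows of train_features_flatten), the majority vote recomputed against the training labels should then agree with predict():

import numpy as np
from collections import Counter

# indices of each test sample's nearest *training* points
neighbors_dist, neighbors_index = KNN_classifier.kneighbors(test_features_flatten)

# look the neighbor labels up in y_train, not y_test
neighbor_labels = np.array(y_train)[neighbors_index]

# majority vote over each row of neighbor labels
majority_class = np.array([Counter(row).most_common(1)[0][0]
                           for row in neighbor_labels])

# fraction of agreement with predict(); should be 1.0, apart from possible ties
print((majority_class == y_pred).mean())

Note that with n_neighbors=3 a three-way tie can still be broken differently by Counter than by scikit-learn, but the systematic mismatch above disappears once the lookup uses the training labels.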
Answered By - Ben Reiniger