Saturday, November 18, 2023

[FIXED] Found array with dim 3. check_pairwise_arrays expected <= 2

Labels: keras, numpy, python, scikit-learn, tensorflow

Issue

I have a text file, commands.txt, that contains a list of commands, and two Python files, train.py and main.py. train.py creates a model named commands_model.h5. When I enter a command, main.py should use that model to return the matching command from commands.txt. But when I run main.py, it shows this error:

Started
2023-09-06 13:44:05.970618: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Enter an command: check connection
1/1 [==============================] - 0s 327ms/step
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 47ms/step
Traceback (most recent call last):
  File "D:\Advanced Robot\commands\test.py", line 32, in <module>
    similarity_scores = cosine_similarity(user_input_embedding, command_embeddings)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\pairwise.py", line 1393, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\pairwise.py", line 163, in check_pairwise_arrays
    Y = check_array(
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\utils\validation.py", line 915, in check_array
    raise ValueError(
ValueError: Found array with dim 3. check_pairwise_arrays expected <= 2.

Here is my main.py,

print("Started")
import tensorflow as ten
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

tokenizer = Tokenizer()

loaded_model = ten.keras.models.load_model('commands_model.h5')

max_sequence_length = 100

inp = input("Enter an command: ")
user_input_sequence = tokenizer.texts_to_sequences([inp])
padded_user_input = pad_sequences(user_input_sequence, maxlen=max_sequence_length, padding='post', truncating='post')

user_input_embedding = loaded_model.predict(padded_user_input)

command_embeddings = []

with open('commands.txt', 'r') as file:
    commands = file.read().splitlines()

for command in commands:
    command_sequence = tokenizer.texts_to_sequences([command])
    paded_command = pad_sequences(command_sequence, maxlen=max_sequence_length, padding='post', truncating='post')
    command_embedding = loaded_model.predict(paded_command)
    command_embeddings.append(command_embedding)

command_embeddings = np.array(command_embeddings)
similarity_scores = cosine_similarity(user_input_embedding, command_embeddings)
most_similar_command_index = np.argmax(similarity_scores)
most_similar_command = commands[most_similar_command_index]

print("Most similar: ", most_similar_command)

And here is my train.py,

print("Started")

import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.preprocessing import LabelEncoder
print('Imported')
# Load the file that contains the list of commands
with open('commands.txt', 'r') as command_file:
    text_data = command_file.read().splitlines()

vocab_size = 1000
embedding_dim = 16
num_epochs = 500
batch_size = 32
labels = []
for word in text_data:
    labels.append(word)

lbl_encoder = LabelEncoder()
lbl_encoder.fit(labels)
labels = lbl_encoder.transform(labels)

# Tokenize the text
tokenizer = Tokenizer()
tokenizer.fit_on_texts(text_data)
sequences = tokenizer.texts_to_sequences(text_data)

# Pad sequences to make them the same length
max_sequence_length = 100
paded_sequences = pad_sequences(sequences, maxlen=max_sequence_length, padding='post', truncating='post')

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_sequence_length),
    tf.keras.layers.LSTM(units=64),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
print("Compiling....")
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
from sklearn.model_selection import train_test_split

x_train, x_val, y_train, y_val = train_test_split(paded_sequences, np.array(labels), test_size=0.2, random_state=42)
model.fit(x_train, y_train, epochs=num_epochs, batch_size=batch_size, validation_data=(x_val, y_val))
print("Model saving....")
model.save("commands_model.h5")
print("Model saved")

I couldn't work out which file contains the error. Did I create the model the wrong way, or is the error in main.py?

Solution

The error tells you that cosine_similarity() received an array with 3 dimensions but accepts at most 2. It is raised by this line:

similarity_scores = cosine_similarity(user_input_embedding, command_embeddings)

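cosine_similarity() expects both inputs to be 2D arrays of shape (n_samples, n_features). Given the model in train.py, each loaded_model.predict() call on a single padded sequence should return a 2D array of shape (1, 1), so user_input_embedding is already 2D. The problem is command_embeddings: np.array() stacks the list of (1, 1) arrays along a new axis, producing a 3D array of shape (number of commands, 1, 1). The traceback agrees, since the check fails on Y, the second argument.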
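The shape problem can be reproduced without loading the model. The sketch below uses random NumPy arrays as stand-ins for the loaded_model.predict() outputs. The (1, 1) shape of each stand-in is an assumption based on the Dense(1) output layer in train.py, not something taken from an actual run.

```python
import numpy as np

# Stand-ins for loaded_model.predict(...) results.
# Assumption: each call returns one row with one column, i.e. shape (1, 1).
rng = np.random.default_rng(0)
user_input_embedding = rng.random((1, 1))
command_embeddings = [rng.random((1, 1)) for _ in range(3)]

# np.array() stacks the list along a new leading axis,
# which adds a third dimension.
stacked = np.array(command_embeddings)
print(stacked.shape)  # (3, 1, 1)

# Concatenating along the existing first axis keeps the result 2D.
flat = np.concatenate(command_embeddings, axis=0)
print(flat.shape)  # (3, 1)
```

Printing .shape on the real arrays in main.py is a quick way to confirm whether they match this assumption.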

You could reshape the user input and each command embedding to 2D arrays of shape (1, -1) and compute the similarity one command at a time, like this:

similarity_scores = []

for command_embedding in command_embeddings:
    similarity = cosine_similarity(user_input_embedding.reshape(1, -1), command_embedding.reshape(1, -1))
    similarity_scores.append(similarity[0][0])

most_similar_command_index = np.argmax(similarity_scores)
most_similar_command = commands[most_similar_command_index]
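As an alternative to the loop, the command embeddings can be flattened into a single 2D matrix and compared in one cosine_similarity() call. This is only a sketch: most_similar_index is a hypothetical helper, and the 4-dimensional vectors are made-up example data rather than outputs of the model.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def most_similar_index(query_embedding, embeddings):
    """Return the index of the best-matching row and all similarity scores."""
    # One query row: shape (1, n_features).
    query_2d = np.asarray(query_embedding).reshape(1, -1)
    # One row per command: shape (n_commands, n_features).
    stacked = np.asarray(embeddings)
    matrix_2d = stacked.reshape(stacked.shape[0], -1)
    # Result has shape (1, n_commands); take the single row.
    scores = cosine_similarity(query_2d, matrix_2d)[0]
    return int(np.argmax(scores)), scores


# Hypothetical example embeddings, each shaped (1, 4) like a predict() result.
query = np.array([[1.0, 0.0, 0.0, 0.0]])
example_embeddings = [
    np.array([[0.0, 1.0, 0.0, 0.0]]),
    np.array([[2.0, 0.1, 0.0, 0.0]]),
    np.array([[0.0, 0.0, 3.0, 0.0]]),
]

index, scores = most_similar_index(query, example_embeddings)
print(index)  # 1
```

In main.py this would be called as most_similar_index(user_input_embedding, command_embeddings). One caveat: if each embedding has a single column, as the Dense(1) layer in train.py suggests, the cosine similarity between any two positive values is 1, so the scores may not distinguish between commands. That is worth checking separately from the shape error.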

Answered By - Ada

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
