Thursday, September 29, 2022

[FIXED] Is there any difference between the DNN model by keras and the DNN model by pytorch?

Tags: deep-learning, keras, python, pytorch

Issue

Here is my code for a DNN in PyTorch and in Keras. I train both on the same data but get quite different AUC results (the Keras version reaches 0.74 and the PyTorch version reaches 0.67). I have repeated the experiment many times and the gap remains. Is there any difference between the two models?

categorical_embed_sizes = [589806, 21225, 2565, 2686, 343, 344, 10, 2, 8, 8, 7, 7, 2, 2, 2, 17, 17, 17]

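# keras model
import tensorflow as tf
from tensorflow.keras.layers import (Input, Embedding, Reshape, concatenate,
                                     Dense, PReLU, BatchNormalization)
from tensorflow.keras.models import Model

# cat_len and cont_len (the numbers of categorical and continuous features)
# are defined elsewhere and are not shown in the question.
# One Input and one Embedding layer per categorical feature.
cat_input, embeds = [], []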
for i in range(cat_len):
    input_ = Input(shape=(1, ))
    cat_input.append(input_)
    nums = categorical_embed_sizes[i]
    embed = Embedding(nums, 8)(input_)
    embeds.append(embed)
cont_input = Input(shape=(cont_len,), name='cont_input', dtype='float32')
cont_input_r = Reshape((1, cont_len))(cont_input)

embeds.append(cont_input_r)

#Merge_L=concatenate([train_emb,trainnumber_emb,departstationname_emb,arrivestationname_emb,seatname_emb,orderofftime_emb,fromcityname_emb,tocityname_emb,daytype_emb,num_input_r])
Merge_L=concatenate(embeds, name='cat_1')
Merge_L=Dense(256,activation=None,name='dense_0')(Merge_L)
Merge_L=PReLU(name='merge_0')(Merge_L)
Merge_L=BatchNormalization(name='bn_0')(Merge_L)
Merge_L=Dense(128,activation=None,name='dense_1')(Merge_L)
Merge_L=PReLU(name='prelu_1')(Merge_L)
Merge_L=BatchNormalization(name='bn_1')(Merge_L)
Merge_L=Dense(64,activation=None,name='Dense_2')(Merge_L)
Merge_L=PReLU(name='prelu_2')(Merge_L)
Merge_L=BatchNormalization(name='bn_2')(Merge_L)
Merge_L=Dense(32,activation=None,name='Dense_3')(Merge_L)
Merge_L=PReLU(name='prelu_3')(Merge_L)
Merge_L=BatchNormalization(name='bn_3')(Merge_L)
Merge_L=Dense(16,activation=None,name='Dense_4')(Merge_L)
Merge_L=PReLU(name='prelu_4')(Merge_L)

predictions= Dense(1, activation='sigmoid', name='Dense_rs')(Merge_L)

predictions=Reshape((1,), name='pred')(predictions)

cat_input.append(cont_input)
model = Model(inputs=cat_input, outputs=predictions)

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.AUC()])

# torch model
import torch
import torch.nn as nn

class DNN(nn.Module):
    def __init__(self, categorical_length, categorical_embed_sizes, categorical_embed_dim, in_size):
        super(DNN, self).__init__()
        self.categorical_length = categorical_length
        self.categorical_embed_sizes = categorical_embed_sizes
        self.categorical_embed_dim = categorical_embed_dim
        self.in_size = in_size
        self.nn = torch.nn.Sequential(
            nn.Linear(self.in_size, 256),
            nn.PReLU(256),
            nn.BatchNorm1d(256),
            nn.Linear(256, 128),
            nn.PReLU(128),
            nn.BatchNorm1d(128),
            nn.Linear(128, 64),
            nn.PReLU(64),
            nn.BatchNorm1d(64),
            nn.Linear(64, 32),
            nn.PReLU(32),
            nn.BatchNorm1d(32),
            nn.Linear(32, 16),
            nn.PReLU(16)
        )
        self.out = torch.nn.Sequential(
            nn.Linear(16, 1),
            nn.Sigmoid()
        )
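        # A single embedding table shared by all categorical features.
        # nn.Embedding expects an int for num_embeddings, so categorical_embed_sizes
        # must be one number here, not the per-feature list used in the Keras model.
        self.embedding = nn.Embedding(self.categorical_embed_sizes, self.categorical_embed_dim)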

    def forward(self, x):
        x_categorical = x[:, :self.categorical_length].long()
        x_categorical = self.embedding(x_categorical).view(x_categorical.size(0), -1)
        x = torch.cat((x_categorical, x[:, self.categorical_length:]), dim=1)
        x = self.nn(x)
        out = self.out(x)
        return out

Solution

I finally found the real cause of the discrepancy. It has nothing to do with the model structure or parameters. The direct cause was that I passed the wrong input to sklearn's roc_auc_score function.

sklearn.metrics.roc_auc_score needs at least y_true and y_score. y_true holds the true labels of the dataset, and y_score holds the predicted probabilities of label 1 (for binary tasks).

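But when I used the PyTorch model's outputs to calculate the two metrics (accuracy and AUC), I first converted the outputs to 0/1 vectors. So my y_score was no longer a vector of probabilities but a vector of hard 0/1 labels.

That is what made the PyTorch AUC come out lower. AUC measures how well the scores rank positives above negatives, and hard labels throw away the ranking among samples that fall on the same side of the threshold.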
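
To make the effect concrete, here is a minimal sketch in pure Python. It does not call sklearn: pairwise_auc is a hypothetical helper that computes AUC as the fraction of (positive, negative) pairs ranked correctly, with ties counted as half, which is the quantity ROC AUC measures for binary labels. The labels and probabilities are made-up toy values, not data from the question.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs where the positive
    gets the higher score; tied pairs count as 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only).
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.20, 0.40, 0.45, 0.60, 0.90]   # sigmoid outputs of a model

# Thresholding is fine for accuracy, which needs hard labels...
y_hard = [1 if p >= 0.5 else 0 for p in y_prob]
accuracy = sum(int(t == h) for t, h in zip(y_true, y_hard)) / len(y_true)

# ...but AUC should be computed from the probabilities themselves.
auc_from_probs = pairwise_auc(y_true, y_prob)
auc_from_labels = pairwise_auc(y_true, y_hard)   # the mistake described above

print("accuracy:", accuracy)
print("AUC from probabilities:", round(auc_from_probs, 3))
print("AUC from 0/1 labels:", round(auc_from_labels, 3))
```

On this toy data the AUC from probabilities is about 0.78, while the AUC from thresholded labels drops to 0.5. With sklearn the same rule applies: pass the raw sigmoid outputs as y_score to roc_auc_score, and threshold only for accuracy.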

Answered By - Junming Liang

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
