Issue
I'm working with PyTorch now, and I'm missing one layer from Keras, tf.keras.layers.StringLookup,
which helped with processing string ids. Is there a workaround to do something similar in PyTorch?
An example of the functionality I'm looking for:
import tensorflow as tf

vocab = ["a", "b", "c", "d"]
data = tf.constant([["a", "c", "d"], ["d", "a", "b"]])
# Map each string to its integer index in the vocabulary
layer = tf.keras.layers.StringLookup(vocabulary=vocab)
layer(data)
Outputs:
<tf.Tensor: shape=(2, 3), dtype=int64, numpy=
array([[1, 3, 4],
       [4, 1, 2]])>
Solution
You can use the torchnlp package, which is installed from PyPI as pytorch-nlp:
pip install pytorch-nlp
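from torchnlp.encoders import LabelEncoder

data = ["a", "c", "d", "e", "d"]
# The encoder builds its vocabulary from the sample passed in; index 0 is reserved for unknown labels
encoder = LabelEncoder(data, reserved_labels=['unknown'], unknown_index=0)
encoded = encoder.batch_encode(data)
print(encoded)
Outputs:
tensor([1, 2, 3, 4, 3])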
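If you would rather not add a dependency, the same lookup can be approximated with a plain dictionary. The following is only a minimal sketch, not part of torchnlp or PyTorch: the names OOV_INDEX, build_table and lookup are made up for illustration, and it assumes you want unknown strings mapped to 0 and vocabulary entries numbered from 1, which matches the output in the question.

```python
# Assumption: out-of-vocabulary strings map to index 0, as in the question's output.
OOV_INDEX = 0


def build_table(vocab):
    """Map each vocabulary string to an integer id, starting right after OOV_INDEX."""
    return {token: i for i, token in enumerate(vocab, start=OOV_INDEX + 1)}


def lookup(table, rows):
    """Convert a 2-D list of strings to a 2-D list of integer ids."""
    return [[table.get(token, OOV_INDEX) for token in row] for row in rows]


vocab = ["a", "b", "c", "d"]
data = [["a", "c", "d"], ["d", "a", "b"]]

table = build_table(vocab)
ids = lookup(table, data)
print(ids)  # [[1, 3, 4], [4, 1, 2]]
```

The nested list can then be passed to torch.tensor(ids, dtype=torch.long) to get a tensor suitable for an embedding layer. Note that PyTorch tensors cannot hold strings, so the lookup has to happen on Python lists before the tensor is created.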
Answered By - Damir Devetak