Issue
I am trying to recreate something similar to sklearn.preprocessing.LabelEncoder. However, I do not want to use sklearn or pandas. I would like to use only numpy and the Python standard library. Here's what I would like to achieve:
import numpy as np
arr = np.array([['hi', 'there'],
                ['scott', 'james'],
                ['hi', 'scott'],
                ['please', 'there']])
# Output would look like
np.array([[0, 0],
          [1, 1],
          [0, 2],
          [2, 0]])
It would also be great to be able to map the result back, so that it looks exactly like the input again.
Solution
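Here's a simple comprehension using the return_inverse result from np.unique. Because np.unique sorts the values, the codes follow each column's sorted order rather than the order of first appearance shown in the question: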
arr = np.array([['hi', 'there'], ['scott', 'james'],
                ['hi', 'scott'], ['please', 'there']])
np.column_stack([np.unique(arr[:, i], return_inverse=True)[1] for i in range(arr.shape[1])])
array([[0, 2],
       [2, 0],
       [0, 1],
       [1, 2]], dtype=int64)
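To see why return_inverse does the job, it may help to look at a single column. The sketch below is only an illustration (the variable names col, uniques and codes are mine, not from the answer): np.unique returns the sorted distinct values plus, for each element, its position in that sorted array, and indexing the unique values with those positions gives the column back.

```python
import numpy as np

# First column of the example input.
col = np.array(['hi', 'scott', 'hi', 'please'])

# uniques holds the sorted distinct values; codes indexes into uniques.
uniques, codes = np.unique(col, return_inverse=True)

print(uniques)         # ['hi' 'please' 'scott']
print(codes)           # [0 2 0 1]
print(uniques[codes])  # ['hi' 'scott' 'hi' 'please']
```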
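Or applying along the axis (this relies on NumPy packing the unequal-length unique and inverse arrays into an object array, which newer NumPy versions may refuse to do):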
np.column_stack(np.apply_along_axis(np.unique, 0, arr, return_inverse=True)[1])
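The question also asks to map the codes back, and its expected output numbers values in order of first appearance. Neither is covered by the one-liners above, so here is a rough sketch of one way to do both. The helper names encode_columns and decode_columns and the first_seen flag are hypothetical, made up for this example. It assumes a 2-D array and keeps each column's unique values around for decoding.

```python
import numpy as np


def encode_columns(arr, first_seen=False):
    """Label-encode each column of a 2-D array independently.

    Returns (codes, uniques_per_col). With first_seen=True the codes
    follow order of first appearance instead of sorted order.
    """
    code_cols, uniques_per_col = [], []
    for i in range(arr.shape[1]):
        uniques, first_idx, inverse = np.unique(
            arr[:, i], return_index=True, return_inverse=True)
        if first_seen:
            # order[k] = position (in sorted uniques) of the k-th value seen.
            order = np.argsort(first_idx)
            # rank maps a sorted-unique position to its first-seen code.
            rank = np.empty_like(order)
            rank[order] = np.arange(order.size)
            uniques, inverse = uniques[order], rank[inverse]
        code_cols.append(inverse)
        uniques_per_col.append(uniques)
    return np.column_stack(code_cols), uniques_per_col


def decode_columns(codes, uniques_per_col):
    """Invert encode_columns by indexing each column's uniques with its codes."""
    return np.column_stack(
        [uniques[codes[:, i]] for i, uniques in enumerate(uniques_per_col)])


arr = np.array([['hi', 'there'], ['scott', 'james'],
                ['hi', 'scott'], ['please', 'there']])

codes, uniques_per_col = encode_columns(arr, first_seen=True)
print(codes)      # rows: [0 0], [1 1], [0 2], [2 0]

restored = decode_columns(codes, uniques_per_col)
print(restored)   # same strings as arr
```

With first_seen left at False, the codes should match the sorted-order output shown above. Decoding works the same way in both modes because the stored unique values are reordered together with the codes.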
Answered By - ALollz