Issue
I have a CSV file that I process with pandas. The column is called raw_value.
I want to retrieve the unique characters in this column.
x = df.raw_value.unique()
returns the unique rows. However, I want the set of all characters that appear anywhere in this column, for example: alphabet= 6 , 3 5 1 8 V O T R E A 2 . é è / :
raw_value
6,35
11,68
VOTRE
AVEL AR VRO
2292
questions.
nb
les
937,99
à
et
TTC
1
620
Echéance
vos
ROB21
Pièce
AGRIAL
désignation
des
taux
13s
2
par
le
mois,
32
21/07/2016
FR
au
0
téléphonique
BROYEUR
et
ST
TVA
de
des
ECHEANCIER
à
ne
lieu
481,67
N°0016
de
ministère
de
20/11/2015
Si
vous
59
cas
EUR
3.19
2
contrôle
assurances
BAS
et
4423873
renseignements
6104219
C9DECOMPTEDIVERS
6635
DE
10825
EDIT_1
All three solutions work perfectly. I chose the second one:
set(df.raw_value.apply(list).sum())
However, it returns some encoded characters. Is this related to encoding? How can I decode and display the real characters? Here is what it prints:
{' ',
'!',
'"',
'%',
'&',
"'",
'(',
')',
'*',
'+',
',',
'-',
'.',
'/',
'0',
'1',
'2',
'3',
'4',
'5',
'6',
'7',
'8',
'9',
':',
'=',
'>',
'?',
'@',
'_',
'a',
'b',
'c',
'd',
'e',
'f',
'g',
'h',
'i',
'j',
'k',
'l',
'm',
'n',
'o',
'p',
'q',
'r',
's',
't',
'u',
'v',
'w',
'x',
'y',
'z',
'\x82',
'\x87',
'\x94',
'\xa1',
'\xa7',
'\xaa',
'\xab',
'\xac',
'\xae',
'\xaf',
'\xb0',
'\xb4',
'\xb9',
'\xbb',
'\xc2',
'\xc3',
'\xe2'}
Solution
You can first convert each raw value to a list of characters, expand those lists into a DataFrame of characters, stack it, and get the unique elements.
df.applymap(list).raw_value.apply(pd.Series).stack().unique()
Out[620]: array(['6', ',', '3', ..., 'ô', 'D', 'M'], dtype=object)
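In recent pandas versions DataFrame.applymap is deprecated, so here is a minimal sketch of the same "one character per row, then unique()" idea using Series.explode instead. This is a swapped-in technique, not the code from the answer, and the small DataFrame is a hypothetical stand-in for the CSV.

```python
import pandas as pd

# Hypothetical sample standing in for the CSV column; None plays the role of a missing cell.
# dtype=object keeps plain Python strings in the column.
df = pd.DataFrame({"raw_value": ["6,35", "VOTRE", None]}, dtype=object)

# Drop missing cells, split each string into a list of characters,
# put one character on each row, then keep the distinct ones.
unique_chars = df.raw_value.dropna().apply(list).explode().unique()

print(unique_chars)
```

Dropping the missing cells first keeps list() from being called on a non-string value.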
You can also do this by converting each raw value to a list, concatenating the lists, and then taking the set of the result.
set(df.raw_value.apply(list).sum())
A simpler approach is to concatenate the raw values into one string and apply set to it directly, because iterating over a string yields its characters one by one.
set(df.raw_value.sum())
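The second and third approaches can be checked against each other on a small frame. The sketch below uses a hypothetical four-row sample rather than the real CSV, and adds a "".join variant (not part of the original answer) that assumes you want to skip missing cells and coerce non-string cells to text.

```python
import pandas as pd

# Hypothetical sample standing in for the CSV column.
# dtype=object keeps plain Python strings in the column.
df = pd.DataFrame(
    {"raw_value": ["6,35", "VOTRE", "Echéance", "21/07/2016"]},
    dtype=object,
)

# Second approach: one list of characters per row, lists concatenated by sum().
chars_from_lists = set(df.raw_value.apply(list).sum())

# Third approach: strings concatenated by sum(), then split into a set of characters.
chars_from_string = set(df.raw_value.sum())

# Variant (assumption): join in one pass, dropping missing cells and
# converting any non-string cell to its text form first.
chars_joined = set("".join(df.raw_value.dropna().astype(str)))

print(sorted(chars_joined))
```

On a column that contains only strings, all three sets are identical. The join variant mainly helps when the column is long or contains numbers that pandas parsed as non-strings.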
Note that the first approach will include NaN in the results, while the second and third approaches exclude it.
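On the encoded characters shown in the edit: entries such as '\xc3', '\xc2' and '\xe2' are typical first bytes of UTF-8 sequences, which suggests (this is an assumption, the answer does not confirm it) that the text is being split byte by byte or was decoded with a single-byte codec. A minimal sketch of the effect, using a hypothetical in-memory CSV and the encoding parameter of pd.read_csv:

```python
import io
import pandas as pd

# Hypothetical CSV content, encoded as UTF-8 bytes the way a file on disk might be.
csv_bytes = "raw_value\nEchéance\nPièce\n".encode("utf-8")

# Decoding with a single-byte codec turns each accented letter into two characters.
wrong = pd.read_csv(io.BytesIO(csv_bytes), encoding="latin-1")
wrong_chars = set("".join(wrong.raw_value))

# Decoding with the matching codec keeps each accented letter as one character.
right = pd.read_csv(io.BytesIO(csv_bytes), encoding="utf-8")
right_chars = set("".join(right.raw_value))

print(sorted(wrong_chars))
print(sorted(right_chars))
```

If the real file is UTF-8, passing encoding="utf-8" when reading it (and, on Python 2, working with unicode strings rather than byte strings) should make é and è appear as single characters in the set.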
Answered By - Allen Qin