Issue
I know you can pull a specific key from a dictionary object in pandas if you know the key's exact value, but what if you want to pull the values for the median key without knowing that value in advance (in this case, the author name)?
Example:
author name:   books:                             year:
fred           how to fish                        2010
               how to bike                        2012
               how to skate                       2009
bob            sam I am                           1990
george         white fang                         1980
               animals and I                      2000
ted            a guide to computer programming    1984
harry          the future queen                   1812
So I want to get the median author name. There are five authors, so the 3rd author (george) is the one I want, and I'd print all the data associated with him. Eventually I'd also want to print the number of books he's published (which is two). Do I have to convert the dictionary object back to a CSV file or something? Any tips or helpful tutorials on pandas dictionary objects would be great, thanks!
Solution
If you were talking about an integer/float column then you can just use the median method:
In [11]: df['year:'].median()
Out[11]: 1995.0
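For anyone who wants to reproduce this, here is a minimal sketch of how the frame might look. The construction is an assumption: the question doesn't show how the data was loaded, and I've assumed the blank author cells are filled in with the author from the row above.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data; blank author
# cells are assumed to belong to the author listed above them.
df = pd.DataFrame({
    "author name:": ["fred", "fred", "fred", "bob",
                     "george", "george", "ted", "harry"],
    "books:": ["how to fish", "how to bike", "how to skate", "sam I am",
               "white fang", "animals and I",
               "a guide to computer programming", "the future queen"],
    "year:": [2010, 2012, 2009, 1990, 1980, 2000, 1984, 1812],
})

# With 8 values the median is the mean of the two middle years
# (1990 and 2000), so the result is a float.
median_year = df["year:"].median()
print(median_year)  # 1995.0
```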
However, this isn't well-defined for a column of strings, at least using the normal definition*.
If you just want the "middle" item then just take that (I'm not sure what you want to do with a draw, i.e. an even number of rows, where there are two middle candidates...):
In [12]: df['author name:'].iloc[int(len(df) / 2.)]
Out[12]: 'george'
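Note: it actually is a draw in this case, since the frame has 8 rows: positions 3 ('bob') and 4 ('george') are both "middle" rows, and the expression above picks the second.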
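One possible way to deal with a draw is to return both middle candidates when the length is even. This is only a sketch, not the one right answer; the `authors` Series is a hypothetical reconstruction of the question's column, assuming every row has its author filled in.

```python
import pandas as pd

# Hypothetical author column, one entry per book row.
authors = pd.Series(
    ["fred", "fred", "fred", "bob", "george", "george", "ted", "harry"],
    name="author name:",
)

n = len(authors)
if n % 2 == 1:
    # Odd length: there is exactly one middle row.
    middle = [authors.iloc[n // 2]]
else:
    # Even length: keep both rows on either side of the midpoint.
    middle = [authors.iloc[n // 2 - 1], authors.iloc[n // 2]]

print(middle)  # ['bob', 'george']
```

What you do with the two candidates (pick the first, the second, or report both) depends on what "median" is supposed to mean for your data.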
Or you can get the unique names (in order of first appearance, so a name that shows up again later is ignored); again, you have to worry about draws when the number of names is even:
In [13]: names = df['author name:'].unique()
In [14]: names
Out[14]: array(['fred', 'bob', 'george', 'ted', 'harry'], dtype=object)
In [15]: names[int(len(names) / 2.)]
Out[15]: 'george'
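To cover the rest of the question (printing everything for that author and counting his books), something like the following should work. It's a minimal sketch, assuming the frame is built as below with the author name present on every row; if your author column has blanks for repeated authors you would need to fill those first.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data.
df = pd.DataFrame({
    "author name:": ["fred", "fred", "fred", "bob",
                     "george", "george", "ted", "harry"],
    "books:": ["how to fish", "how to bike", "how to skate", "sam I am",
               "white fang", "animals and I",
               "a guide to computer programming", "the future queen"],
    "year:": [2010, 2012, 2009, 1990, 1980, 2000, 1984, 1812],
})

# Unique names in order of first appearance, then the middle one.
names = df["author name:"].unique()
middle_author = names[len(names) // 2]

# Boolean mask selects every row belonging to that author.
author_rows = df[df["author name:"] == middle_author]
print(author_rows)

# Each row is one book, so the row count is the book count.
book_count = len(author_rows)
print(middle_author, book_count)  # george 2
```

No need to go back to a CSV file for any of this; the selection and the count both work directly on the DataFrame.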
* What's half way between 'bob' and 'george'?
Answered By - Andy Hayden