Issue
I am writing an application that generates a report. One part of the output shows how many matches there are between two columns (match = True, no match = False). To do that I use df.column.value_counts(), whose output is
False 2
True 1
Name: column, dtype: int64
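For reference, a minimal sketch that reproduces a column like this (the data values are hypothetical; only the True/False counts match the output above):
import pandas as pd

# Hypothetical comparison result: True where the two genotype columns match
df = pd.DataFrame({"column": [False, True, False]})

print(df["column"].value_counts())
# False    2
# True     1
# (the exact name/dtype line in the display varies by pandas version)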
So I was thinking of showing the result in the report with something like:
# This is a simplified example
print(" Position with same genotype: ", df.column.value_counts()[1])
print(" Position with different genotype: ", df.column.value_counts()[0])
I have noticed that the order of this output changes depending on whether False or True is found first in the table (I think I can solve this with sort). However, a bigger concern is what happens if no False or no True values are found in the table. In that case the output is:
# For example, no Falses
True 3
Name: column, dtype: int64
I was expecting this:
# For example, no Falses
True 3
False 0
Name: column, dtype: int64
How can I solve this to avoid future errors? It is quite unlikely to happen, since there will almost always be at least one False or True, but I want to be sure no error occurs: this application is going to be run many times with different samples, so in theory this situation can happen.
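Here is a minimal sketch of that edge case, assuming a column that contains only True values:
import pandas as pd

# Hypothetical sample where every position matches, so only True appears
counts = pd.Series([True, True, True], name="column").value_counts()

print(counts)  # only "True    3" is shown; there is no False entry

# Looking up the missing label directly raises a KeyError
try:
    print(counts.loc[False])
except KeyError:
    print("False is not in the value_counts index")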
Solution
Series.get
We can use the get method of the pandas Series and specify the default value as 0:
s = df['column'].value_counts()
print("Position with same genotype: ", s.get(True, 0))
print("Position with different genotype: ", s.get(False, 0))
Result
Position with same genotype: 1
Position with different genotype: 2
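The same call also covers the edge case from the question: when one of the categories is missing, get falls back to the default instead of raising an error. A minimal sketch, assuming a column that contains only True values:
import pandas as pd

# Hypothetical sample with no False values at all
s = pd.Series([True, True, True], name="column").value_counts()

print("Position with same genotype: ", s.get(True, 0))       # 3
print("Position with different genotype: ", s.get(False, 0))  # 0, no KeyError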
Answered By - Shubham Sharma