Issue
Given this minimal reproducible example:
import pandas as pd
df = pd.DataFrame({"A":[12, 4, 5, None, 1],
"B":[7, 2, 54, 3, None],
"C":[20, 16, 11, 3, 8],
"D":[14, 3, None, 2, 6]})
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']
df.index = index_
print(df)
# Option 1
result = df[['A', 'D']]
print(result)
# Option 2
result = df.loc[:, ['A', 'D']]
print(result)
What is the effect of using loc or not? The two results look identical.
I ask this in preparation for a more complex question in which I have been instructed to use loc.
Solution
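The difference is that the result of df[['A', 'D']] keeps a weak reference to df in its private _is_copy attribute, while the result of df.loc[:, ['A', 'D']] does not (here on pandas 2.1.2).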
result1 = df[['A', 'D']]
print(result1._is_copy)
#<weakref at 0x7f34261b69d0; to 'DataFrame' at 0x7f34260e9590>
result2 = df.loc[:, ['A', 'D']]
print(result2._is_copy)
# None
In both cases, this is not a view:
print(result1._is_view, result2._is_view)
# False False
This behavior has changed across pandas versions.
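To make "not a view" concrete, here is a minimal sketch that writes to both selections and then checks the original frame. The names by_brackets and by_loc are made up for this illustration, and the warnings filter is only there because, depending on the pandas version, the first write may emit SettingWithCopyWarning.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1], "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)

by_brackets = df[["A", "D"]]      # Option 1
by_loc = df.loc[:, ["A", "D"]]    # Option 2

# Both selections contain the same data (NaN in the same places counts as equal).
same_data = by_brackets.equals(by_loc)

with warnings.catch_warnings():
    # Silence a possible SettingWithCopyWarning; it is a warning, not an error.
    warnings.simplefilter("ignore")
    by_brackets.loc["Row_1", "A"] = 100.0
    by_loc.loc["Row_1", "A"] = 200.0

# The original frame is untouched by either write.
original_value = df.loc["Row_1", "A"]
print(same_data, original_value)
```

Here same_data is True and original_value is still 12.0, so for reading data the two options are interchangeable.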
Is this important?
It depends on what you want to do. In most cases, no.
In specific cases, however, the first approach can trigger a SettingWithCopyWarning:
result1 = df[['A', 'D']]
s1 = result1['A']
s1[:] = 1
# SettingWithCopyWarning:
# A value is trying to be set on a copy of a slice from a DataFrame
result2 = df.loc[:, ['A', 'D']]
s2 = result2['A']
s2[:] = 1
# no Warning
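If you prefer the df[[...]] syntax but want to avoid the warning, one common approach (not part of the comparison above, so treat it as a sketch) is to take an explicit copy with .copy(). The helper count_copy_warnings is hypothetical and only exists to count warnings in this example. It matches the warning by class name so that it also runs on pandas versions where Copy-on-Write is enabled and this warning is no longer emitted.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1], "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)


def count_copy_warnings(frame):
    """Repeat the write pattern from the example above and count
    how many SettingWithCopyWarning instances it raises."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        s = frame["A"]
        s[:] = 1
    return sum(w.category.__name__ == "SettingWithCopyWarning" for w in caught)


# An explicit copy makes the intent clear: this is an independent object.
explicit = df[["A", "D"]].copy()
n_warnings = count_copy_warnings(explicit)
print(n_warnings)
```

Whatever happens to the copy, df itself is never modified by this pattern.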
Answered By - mozway