Issue
As far as I know, using "=" to copy an object actually just creates another reference to the same object. So if I do
a = [1, 2, 3]
b = a
print(id(a), id(b), a is b)
my output is 2367729946880 2367729946880 True, which is fine and expected.
If I make copies of the list, they have different ids:
a = [1, 2, 3]
b = a.copy()
c = a.copy()
print(id(b), id(c), b is c)
Output: 2646790648192 2646790705984 False
So far so good. However, if I create the copies directly inside the print call, they unexpectedly have the same id:
a = [1, 2, 3]
print(id(a.copy()), id(a.copy()))
Output: 2209221063040 2209221063040
How does this happen?
I tried a bunch of different things, like:
- assigning the copies to variables on the same line, in case there is some one-line optimization, since Python is an interpreted language
a = [1, 2, 3]
b, c = a.copy(), a.copy()
print(id(b), id(c))
Output: 2545996280192 2545996337984
- passing the copies to a function to avoid using "="
def f(a, b):
    print(id(a), id(b))
c = [1, 2, 3]
f(c.copy(), c.copy())
Output: 1518673867136 1518673852736
- passing the copies to a function that uses *args, because as far as I know print() receives its arguments the same way:
def f(*args):
    print(id(args[0]), id(args[1]))
c = [1, 2, 3]
f(c.copy(), c.copy())
Output: 1764444352896 1764444338496
(the ids differ starting from the third least significant digit)
None of these reproduce the behaviour. Even comparing the copies with the "is" operator prints False:
a = [1, 2, 3]
print(a.copy() is a.copy())
Output: False
But comparing the ids with "==" still gives True:
a = [1, 2, 3]
print(id(a.copy()) == id(a.copy()))
Output: True
To sum up, I wonder:
- What causes this behaviour? It doesn't seem intended. Is it the result of some optimization?
- Can it lead to nasty unexpected bugs? Is there another way to get two copies with the same id?
Solution
id returns an integer that is unique for the lifetime of the object. Here, that id got re-used because the lifetimes of the objects did not overlap. In the expression:
print(id(a.copy()), id(a.copy()))
First, a.copy() is evaluated and creates a new list. That list gets passed to id, id returns an integer, and the list, which is no longer referenced, is immediately reclaimed (this is a CPython implementation detail). Then a.copy() is evaluated again, and its result is again passed to id. In CPython, id is the object's memory address, and the second list happens to be allocated where the first one was just freed, so id returns the same int. That is perfectly in line with the documented behavior of id. You can look at the disassembled bytecode to see exactly how this works:
>>> import dis
>>> dis.dis("print(id(a.copy()), id(a.copy()))")
  0           0 RESUME                   0
  1           2 PUSH_NULL
              4 LOAD_NAME                0 (print)
              6 PUSH_NULL
              8 LOAD_NAME                1 (id)
             10 LOAD_NAME                2 (a)
             12 LOAD_METHOD              3 (copy)
             34 PRECALL                  0
             38 CALL                     0
             48 PRECALL                  1
             52 CALL                     1
             62 PUSH_NULL
             64 LOAD_NAME                1 (id)
             66 LOAD_NAME                2 (a)
             68 LOAD_METHOD              3 (copy)
             90 PRECALL                  0
             94 CALL                     0
            104 PRECALL                  1
            108 CALL                     1
            118 PRECALL                  2
            122 CALL                     2
            132 RETURN_VALUE
Of course, the bytecode doesn't tell you when and where an object is reclaimed.
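One way to make those lifetimes visible is to log when an object is created and destroyed. The following is only a minimal sketch, assuming CPython's reference counting (other implementations may destroy objects later); Tracked, events and make_copy are hypothetical names introduced for illustration, not part of the original question.

```python
events = []  # hypothetical log of object lifetimes


class Tracked(list):
    """Hypothetical list subclass that records its creation and destruction."""

    def __init__(self, *args):
        super().__init__(*args)
        events.append("created")

    def __del__(self):
        # On CPython this runs as soon as the last reference disappears.
        events.append("destroyed")


def make_copy():
    # Stands in for a.copy() from the question: returns a fresh temporary.
    return Tracked([1, 2, 3])


first_id = id(make_copy())   # the temporary is dropped right after id() returns
second_id = id(make_copy())  # created only after the first one is gone

print(events)
print(first_id, second_id)   # often equal on CPython, but not guaranteed
```

On CPython the log should show each object being destroyed before the next one is created, which is exactly the situation in which an id may be handed out again.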
Another way to see similar behavior:
>>> for _ in range(3):
...     print(id([]))
...
4370212416
4370212416
4370212416
Those were all distinct list objects, but they were able to re-use the id because they have non-overlapping lifetimes.
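The same reasoning would explain why a.copy() is a.copy() printed False in the question: for "is" to compare the two copies, both must be alive at the same time, so they cannot share an id. A small sketch of that contrast (the variable names are only illustrative):

```python
a = [1, 2, 3]

# Both operands of "is" exist during the comparison, so their lifetimes
# overlap and they are necessarily two different objects.
same_object = a.copy() is a.copy()
print(same_object)  # False

# Holding references has the same effect: b and c are alive together,
# so their ids must differ.
b, c = a.copy(), a.copy()
print(id(b) != id(c))  # True
```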
So, this is all working as intended and documented.
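As for whether this can cause bugs: it can if ids are stored and compared after the objects they came from are gone, for example when using id(obj) as a dictionary key. A hedged sketch of the safe and the risky pattern follows; count_distinct_objects is a hypothetical helper, not a standard function.

```python
def count_distinct_objects(objects):
    """Count distinct objects by identity.

    `objects` holds a reference to every item, so all of them are alive
    at once and their ids are guaranteed to be distinct.
    """
    return len({id(obj) for obj in objects})


a = [1, 2, 3]

# Safe: the list `kept` keeps all three copies alive.
kept = [a.copy() for _ in range(3)]
print(count_distinct_objects(kept))  # 3

# Risky: each copy is discarded before the next one is made, so the
# collected ids are stale and may repeat (they often do on CPython).
stale_ids = [id(a.copy()) for _ in range(3)]
print(len(set(stale_ids)))  # anywhere from 1 to 3
```

The design point is simply that an id is only meaningful while a reference to its object is held.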
Answered By - juanpa.arrivillaga