Saturday, December 25, 2021

[FIXED] Numpy array indexing: view or copy - depends on scope?


Issue

Consider the following array manipulations:

import numpy as np

def f(x):
    x += 1

x = np.zeros(1)
f(x)       # changes `x`
f(x[0])    # doesn't change `x`
x[0] += 1  # changes `x`

Why does x[0] behave differently depending on whether += 1 happens inside or outside the function f?

Can I pass a part of the array to the function, such that the function modifies the original array?

Edit: If we considered = instead of +=, we would probably maintain the core of the question while getting rid of some irrelevant complexity.

Solution

The issue is not scope, since the only thing that depends on scope is the available names. All objects can be accessed in any scope that has a name for them. The issue is one of mutability vs immutability and understanding what operators do.

x is a mutable numpy array. f runs x += 1 directly on it. += is the operator that invokes in-place addition. In other words, it does x = x.__iadd__(1)^*. Notice the reassignment to x, which happens in the function. That is a feature of the in-place operators that allows them to operate on immutable objects. In this case, ndarray.__iadd__ is a true in-place operator which just returns x, and everything works as expected.
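A minimal sketch of the point about reassignment, assuming a hypothetical helper add_one that returns the object its local name ends up bound to (the f in the question returns nothing). For an array, += hands back the very same object; for a Python int, the local name is rebound to a new object and the caller's value is untouched.

```python
import numpy as np

def add_one(obj):
    """Hypothetical helper: apply += and return what the local name is bound to."""
    obj += 1
    return obj

arr = np.zeros(1)
result = add_one(arr)
same_array = result is arr      # ndarray.__iadd__ mutates and returns the same object

n = 5
result_int = add_one(n)         # int is immutable, so += builds a new int

print(same_array, arr[0], n, result_int)
```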

Now let's analyze f(x[0]) the same way. x[0] calls x.__getitem__(0)^*. When you pass in a scalar int index, numpy extracts the element and returns it as a standalone scalar, not as an array. The result is a NumPy scalar (numpy.float64 for this array; the exact type depends on your array's dtype) that behaves like the corresponding Python number. For ordinary numeric dtypes such as this one, that object is immutable. Once it has been extracted by __getitem__, the += operator in f rebinds the local name x in f to a new object, but the change is not seen outside the function, much less in the array. In this scenario, f has no reference to the array x, so no change is to be expected.
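The following sketch checks what integer indexing hands to the function. The variant of f used here returns its local name purely so the result can be inspected; that return value is an addition for illustration and is not part of the original f.

```python
import numpy as np

def f(x):
    x += 1
    return x   # returned only for inspection; the original f returns nothing

x = np.zeros(1)
elem = x[0]

elem_is_array = isinstance(elem, np.ndarray)    # an extracted element is not an array
elem_is_scalar = isinstance(elem, np.generic)   # it is a NumPy scalar

new = f(elem)   # the local name inside f is rebound to a new scalar

print(type(elem).__name__, elem_is_array, elem_is_scalar)
print(float(elem), float(new), float(x[0]))
```

Neither the extracted scalar nor the array changes; only the name inside f points at a new value.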

The example of x[0] += 1 is not the same as calling f(x[0]). It is equivalent to calling x.__setitem__(0, x.__getitem__(0).__iadd__(1))^*. The call to f did only the x.__getitem__(0).__iadd__(1) part, which returns a new object (for an immutable scalar, += amounts to ordinary addition), but it never writes the result back as __setitem__ does. The key is that [] = (__setitem__) in Python is an entirely different operator from [] (__getitem__) and = (assignment) used separately.
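To make the read-then-write-back sequence visible, here is a small sketch using a hypothetical container class, Logged, that records which dunder methods Python calls. It is plain Python rather than numpy, but the statement obj[0] += 1 is dispatched the same way for any container.

```python
class Logged:
    """Hypothetical container that records item access calls."""

    def __init__(self):
        self.data = [0]
        self.calls = []

    def __getitem__(self, index):
        self.calls.append("getitem")
        return self.data[index]

    def __setitem__(self, index, value):
        self.calls.append("setitem")
        self.data[index] = value

box = Logged()
box[0] += 1          # read the element, add, then write the result back
print(box.calls, box.data)

other = Logged()
value = other[0]     # what f(other[0]) receives: only the read happens
value += 1           # rebinds the local name, never calls __setitem__
print(other.calls, other.data)
```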

To make the second example, f(x[0]), work, you would have to pass in a mutable object that shares memory with the array. An integer index extracts a single scalar object, and an index that is itself an array or list makes a copy. However, a slice index returns a view that is mutable and tied to the original array memory. Therefore, you can do

f(x[0:1])  # changes `x`

In this case f does the following: x.__getitem__(slice(0, 1, None)).__iadd__(1). The key is that __getitem__ returns a mutable view into the original array, not an immutable int.
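A short check of the slice case, as a sketch: the array is given three elements here (an assumption made only so the untouched elements are visible), and the view's base attribute is used to confirm it refers back to x.

```python
import numpy as np

def f(x):
    x += 1

x = np.zeros(3)
view = x[0:1]
view_base_is_x = view.base is x   # the slice is a view whose base is the original array

f(view)                           # in-place add on the view writes into x's memory
print(view_base_is_x, x)
```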

To see why it is important not only that the object is mutable but also that it is a view into the original array, try f(x[[0]]). Indexing with a list produces an array, but it is a copy. x[[0]].__iadd__ will modify that copy in place, but the copy is never written back into the original, so the change will not propagate.
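The difference between a view and a copy can be checked with np.shares_memory. The sketch below again assumes a three-element array for readability, and also shows that the statement form x[[0]] += 1 does update x, because it ends with a __setitem__ call.

```python
import numpy as np

def f(x):
    x += 1

x = np.zeros(3)
slice_shares = np.shares_memory(x, x[0:1])   # slice index: view on x's memory
list_shares = np.shares_memory(x, x[[0]])    # list index: independent copy

picked = x[[0]]
f(picked)                                    # modifies only the copy
x_after_f = x.copy()                         # snapshot: x is still all zeros

x[[0]] += 1                                  # statement form writes back via __setitem__
print(slice_shares, list_shares, picked, x_after_f, x)
```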

^* This is an approximation. When invoked by an operator, dunder methods are actually called as type(x).__operator__(x, ...), not x.__operator__(...).

Answered By - Mad Physicist

This answer, collected from Stack Overflow and tested by PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
