Issue
I am evaluating a long summation by computing each term separately in a for-loop (Python 3.5 + NumPy 1.15.4). However, I obtained a surprising result when comparing manual term-by-term evaluation against the for-loop. See the MWE below.
S = sum(c_i * x^i) for i = 0..n
Main questions:
- Where does the difference in the outputs y1 and y2 originate from?
- How could I alter the code such that the for-loop yields the expected result (y1==y2)?
Comparing dy1 and dy2:
dy1:
[-1.76004137e-02 3.50290845e+01 1.50326037e+01 -7.25045852e+01
2.08908445e+02 -3.31104542e+02 2.98005855e+02 -1.53154111e+02
4.18203833e+01 -4.68961704e+00 0.00000000e+00]
dy2:
[-1.76004137e-02 3.50290845e+01 1.50326037e+01 -7.25045852e+01
-3.27960559e-01 -4.01636743e-04 2.26525295e-07 4.80637463e-10
1.93967535e-13 -1.93976497e-17 -0.00000000e+00]
dy1==dy2:
[ True True True True False False False False False False True]
Thanks!
MWE:
import numpy as np
coeff = np.array([
[ 0.000000000000E+00, -0.176004136860E-01],
[ 0.394501280250E-01, 0.389212049750E-01],
[ 0.236223735980E-04, 0.185587700320E-04],
[-0.328589067840E-06, -0.994575928740E-07],
[-0.499048287770E-08, 0.318409457190E-09],
[-0.675090591730E-10, -0.560728448890E-12],
[-0.574103274280E-12, 0.560750590590E-15],
[-0.310888728940E-14, -0.320207200030E-18],
[-0.104516093650E-16, 0.971511471520E-22],
[-0.198892668780E-19, -0.121047212750E-25],
[-0.163226974860E-22, 0.000000000000E+00]
]).T
c = coeff[1] # select appropriate coeffs
x = 900 # input
# manual calculation
y = c[0]*x**0 + c[1]*x**1 + c[2]*x**2 + c[3]*x**3 + c[4]*x**4 + \
c[5]*x**5 + c[6]*x**6 + c[7]*x**7 + c[8]*x**8 + c[9]*x**9 + c[10]*x**10
print('y:',y)
# calc terms individually
dy1 = np.zeros(c.size)
dy1[0] = c[0]*x**0
dy1[1] = c[1]*x**1
dy1[2] = c[2]*x**2
dy1[3] = c[3]*x**3
dy1[4] = c[4]*x**4
dy1[5] = c[5]*x**5
dy1[6] = c[6]*x**6
dy1[7] = c[7]*x**7
dy1[8] = c[8]*x**8
dy1[9] = c[9]*x**9
dy1[10] = c[10]*x**10
# calc terms in for loop
dy2 = np.zeros(len(c))
for i in np.arange(len(c)):
    dy2[i] = c[i]*x**i
# summation and print
y1 = np.sum(dy1)
print('y1:',y1)
y2 = np.sum(dy2)
print('y2:',y2)
Output:
y: 37.325915370853856
y1: 37.32591537085385
y2: -22.788859384118823
Solution
It seems that raising a Python int to the power of a NumPy integer (of a specific size) converts the result to a NumPy integer of that same size.
Example:
type(900**np.int32(10))
returns numpy.int32, and
type(900**np.int64(10))
returns numpy.int64.
From this Stack Overflow question it seems that while Python int values are arbitrary-precision, NumPy integers are fixed-size (the size is specified by the type, e.g. np.int32 or np.int64). So while Python's range function yields arbitrary-precision integers (Python int), np.arange returns integers of a specific NumPy type (inferred if not specified explicitly).
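You can see the difference by inspecting the element types directly. A quick sketch (note that np.arange's default integer dtype is platform-dependent, e.g. np.int32 on Windows and np.int64 on most Linux builds, so the exact NumPy type printed may vary):

```python
import numpy as np

# Elements of Python's range are arbitrary-precision ints ...
i_py = range(11)[10]
print(type(i_py))   # <class 'int'>

# ... while elements of np.arange carry a fixed-size NumPy dtype
i_np = np.arange(11)[10]
print(type(i_np))   # a fixed-size NumPy integer, e.g. numpy.int32 or numpy.int64

# A Python int exponent never overflows:
print(900 ** i_py)  # 348678440100000000000000000000
```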
Comparing Python integer math vs. NumPy integer math:
900**10
returns 348678440100000000000000000000, while
900**np.int32(10)
returns -871366656.
So you get integer overflow via the np.arange function, because the inferred NumPy integer dtype (here np.int32) is too small to store the resulting value.
Edit:
In this specific case, using np.arange(len(c), dtype=np.uint64)
seems to output the right values:
dy2 = np.zeros(len(c))
for i in np.arange(len(c), dtype=np.uint64):
    dy2[i] = c[i]*x**i
dy1 == dy2
Outputs:
array([ True, True, True, True, True, True, True, True, True,
True, True])
Note: accuracy might suffer with NumPy here, because mixing a Python int with np.uint64 promotes the result to float64 (int(900**np.uint64(10))
returns 348678440099999970966892445696,
which is less than 900**10
). So if exactness is important, I'd still opt for Python's built-in range
function.
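Putting that together, the simplest fix is to iterate with Python's range, so the exponent stays an arbitrary-precision Python int and x**i never overflows. A sketch using the same coefficients as the MWE (the second column of coeff):

```python
import numpy as np

# Second column of the MWE's coeff array
c = np.array([-0.176004136860E-01,  0.389212049750E-01,  0.185587700320E-04,
              -0.994575928740E-07,  0.318409457190E-09, -0.560728448890E-12,
               0.560750590590E-15, -0.320207200030E-18,  0.971511471520E-22,
              -0.121047212750E-25,  0.000000000000E+00])
x = 900

dy = np.zeros(len(c))
for i in range(len(c)):       # i is a Python int, so x**i cannot overflow
    dy[i] = c[i] * x ** i

print(np.sum(dy))  # 37.32591537085385, matching y1 from the MWE
```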
Answered By - dm2