Issue
I am evaluating a long summation by computing each term separately in a for-loop (Python 3.5 + NumPy 1.15.4). However, I obtained a surprising result when comparing manual term-by-term evaluation against the for-loop. See the MWE below.
S = sum(c_i * x^i) for i = 0..n
Main questions:
- Where does the difference in the outputs y1 and y2 originate from?
- How could I alter the code such that the for-loop yields the expected result (y1==y2)?
Comparing dy1 and dy2:
dy1:
[-1.76004137e-02 3.50290845e+01 1.50326037e+01 -7.25045852e+01
2.08908445e+02 -3.31104542e+02 2.98005855e+02 -1.53154111e+02
4.18203833e+01 -4.68961704e+00 0.00000000e+00]
dy2:
[-1.76004137e-02 3.50290845e+01 1.50326037e+01 -7.25045852e+01
-3.27960559e-01 -4.01636743e-04 2.26525295e-07 4.80637463e-10
1.93967535e-13 -1.93976497e-17 -0.00000000e+00]
dy1==dy2:
[ True True True True False False False False False False True]
Thanks!
MWE:
import numpy as np
coeff = np.array([
[ 0.000000000000E+00, -0.176004136860E-01],
[ 0.394501280250E-01, 0.389212049750E-01],
[ 0.236223735980E-04, 0.185587700320E-04],
[-0.328589067840E-06, -0.994575928740E-07],
[-0.499048287770E-08, 0.318409457190E-09],
[-0.675090591730E-10, -0.560728448890E-12],
[-0.574103274280E-12, 0.560750590590E-15],
[-0.310888728940E-14, -0.320207200030E-18],
[-0.104516093650E-16, 0.971511471520E-22],
[-0.198892668780E-19, -0.121047212750E-25],
[-0.163226974860E-22, 0.000000000000E+00]
]).T
c = coeff[1] # select appropriate coeffs
x = 900 # input
# manual calculation
y = c[0]*x**0 + c[1]*x**1 + c[2]*x**2 + c[3]*x**3 + c[4]*x**4 + \
c[5]*x**5 + c[6]*x**6 + c[7]*x**7 + c[8]*x**8 + c[9]*x**9 + c[10]*x**10
print('y:',y)
# calc terms individually
dy1 = np.zeros(c.size)
dy1[0] = c[0]*x**0
dy1[1] = c[1]*x**1
dy1[2] = c[2]*x**2
dy1[3] = c[3]*x**3
dy1[4] = c[4]*x**4
dy1[5] = c[5]*x**5
dy1[6] = c[6]*x**6
dy1[7] = c[7]*x**7
dy1[8] = c[8]*x**8
dy1[9] = c[9]*x**9
dy1[10] = c[10]*x**10
# calc terms in for loop
dy2 = np.zeros(len(c))
for i in np.arange(len(c)):
    dy2[i] = c[i]*x**i
# summation and print
y1 = np.sum(dy1)
print('y1:',y1)
y2 = np.sum(dy2)
print('y2:',y2)
Output:
y: 37.325915370853856
y1: 37.32591537085385
y2: -22.788859384118823
Solution
It seems that raising a Python int to the power of a NumPy integer (of a specific size) converts the result to a NumPy integer of that same size.
Example:
type(900**np.int32(10))
returns numpy.int32, and
type(900**np.int64(10))
returns numpy.int64.
From this Stack Overflow question it seems that while Python int values are arbitrary-precision, NumPy integers are fixed-size (the size is specified by the type, e.g. np.int32 or np.int64). So while Python's range function yields arbitrary-precision integers (Python int), np.arange returns integers of a specific NumPy type (inferred if not specified explicitly).
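You can see the difference by inspecting the element types directly. A quick sketch (note that np.arange's default integer dtype is platform-dependent, e.g. np.int32 on Windows and np.int64 on most Linux builds, so the exact NumPy type printed may vary):

```python
import numpy as np

# Elements of Python's range are arbitrary-precision ints ...
i_py = range(11)[10]
print(type(i_py))   # <class 'int'>

# ... while elements of np.arange carry a fixed-size NumPy dtype
i_np = np.arange(11)[10]
print(type(i_np))   # a fixed-size NumPy integer, e.g. numpy.int32 or numpy.int64

# A Python int exponent never overflows:
print(900 ** i_py)  # 348678440100000000000000000000
```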
Comparing Python integer math vs. NumPy integer math:
900**10
returns 348678440100000000000000000000, while
900**np.int32(10)
returns -871366656.
So you get integer overflow via the np.arange function, because the inferred NumPy integer dtype (here np.int32) is too small to store the resulting value.
Edit:
In this specific case, using np.arange(len(c), dtype=np.uint64)
seems to output the right values:
dy2 = np.zeros(len(c))
for i in np.arange(len(c), dtype=np.uint64):
    dy2[i] = c[i]*x**i
dy1 == dy2
Outputs:
array([ True, True, True, True, True, True, True, True, True,
True, True])
Note: accuracy might suffer with NumPy here, because mixing a Python int with np.uint64 promotes the result to float64 (int(900**np.uint64(10))
returns 348678440099999970966892445696,
which is less than 900**10
). So if exactness is important, I'd still opt for Python's built-in range
function.
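Putting that together, the simplest fix is to iterate with Python's range, so the exponent stays an arbitrary-precision Python int and x**i never overflows. A sketch using the same coefficients as the MWE (the second column of coeff):

```python
import numpy as np

# Second column of the MWE's coeff array
c = np.array([-0.176004136860E-01,  0.389212049750E-01,  0.185587700320E-04,
              -0.994575928740E-07,  0.318409457190E-09, -0.560728448890E-12,
               0.560750590590E-15, -0.320207200030E-18,  0.971511471520E-22,
              -0.121047212750E-25,  0.000000000000E+00])
x = 900

dy = np.zeros(len(c))
for i in range(len(c)):       # i is a Python int, so x**i cannot overflow
    dy[i] = c[i] * x ** i

print(np.sum(dy))  # 37.32591537085385, matching y1 from the MWE
```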
Answered By - dm2