Issue
I want to compute discrete difference of identity matrix. The code below use numpy and scipy.
import numpy as np
from scipy.sparse import identity
from scipy.sparse import csc_matrix
x = identity(4).toarray()
y = csc_matrix(np.diff(x, n=2))
print(y)
I would like to improve performance or memory usage. Since identity matrix produce many zeros, it would reduce memory usage to perform calculation in compressed sparse column(csc) format. However, np.diff() does not accept csc format, so converting between csc and normal format using csc_matrix would slow it down a bit.
Normal format
x = identity(4).toarray()
print(x)
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
csc format
x = identity(4)
print(x)
(0, 0) 1.0
(1, 1) 1.0
(2, 2) 1.0
(3, 3) 1.0
Thanks
Solution
Here is my hacky solution to get the sparse matrix as you want.
L
- the length of the original identity matrix,n
- the parameter ofnp.diff
.
In your question they are:
L = 4
n = 2
My code produces the same y
as your code, but without the conversions between csc and normal formats.
Your code:
from scipy.sparse import identity, csc_matrix
x = identity(L).toarray()
y = csc_matrix(np.diff(x, n=n))
My code:
from scipy.linalg import pascal
def get_data(n, L):
nums = pascal(n + 1, kind='lower')[-1].astype(float)
minuses_from = n % 2 + 1
nums[minuses_from : : 2] *= -1
return np.tile(nums, L - n)
data = get_data(n, L)
row_ind = (np.arange(n + 1) + np.arange(L - n).reshape(-1, 1)).flatten()
col_ind = np.repeat(np.arange(L - n), n + 1)
y = csc_matrix((data, (row_ind, col_ind)), shape=(L, L - n))
I have noticed that after applying np.diff
to the identity matrix n
times, the values of the columns are the binomial coefficients with their signs alternating. This is my variable data
.
Then I am just constructing the csc_matrix.
Answered By - Vladimir Fokow
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.