Issue
I have a list of n arrays. Each array has two columns: A) index values between 1 and 500, and B) measured values.
The A columns differ slightly between arrays (some index values are missing, others are extra).
I want to create a single large array where i) there is a single A (index) column containing all the index values, and ii) all the B (measured values) columns are aligned so that each value sits in the row of its original index value. Missing values would be filled with NaN or 0.
Array examples:
import numpy as np

#                  A    B
arr1 = np.array([[ 25,  64],
                 [ 45,  26]])
arr2 = np.array([[  8,  54],
                 [ 25,   2],
                 [ 45,  84],
                 [128,  22]])
arr3 = np.array([[ 17, 530],
                 [255,  25]])
Array of my dreams:
# A B
# arr1 arr2 arr3
dreamArr = array([[8, 0, 54, 0],
[17, 0, 0, 530],
[25, 64, 2, 0],
[45, 26, 84, 0],
[128, 0, 22, 0],
[255, 0, 0, 25]])
I tried creating an np.zeros() array and replaced the individual columns with small arrays and got stuck.
Then I tried getting all the A values upfront by np.vstack(), removed duplicates with np.unique, np.sort()ed them and got stuck again.
All input is much appreciated!
Solution
It's quite simple using pandas:
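import pandas as pd

arrs = [arr1, arr2, arr3]
out = (pd
    # index each array by its A column, then align them side by side on that index
    .concat([pd.DataFrame(a).set_index(0) for a in arrs], axis=1)
    # missing entries become NaN (making the columns float); fill with 0 and cast back to int
    .fillna(0).astype(int)
    .sort_index().reset_index().to_numpy()
)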
output:
array([[ 8, 0, 54, 0],
[ 17, 0, 0, 530],
[ 25, 64, 2, 0],
[ 45, 26, 84, 0],
[128, 0, 22, 0],
[255, 0, 0, 25]])
Answered By - mozway
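For completeness, the np.unique route tried in the question can also be finished in plain NumPy. The following is only a sketch and not part of the answer above: the helper name align_on_index and its fill parameter are made up for illustration, and it assumes each input array has no repeated index values in its A column (a repeated index would silently keep only the last value).

```python
import numpy as np

def align_on_index(arrs, fill=0):
    """Hypothetical helper: merge two-column arrays on their first (index) column."""
    # Sorted union of every index value that appears in any input array.
    index = np.unique(np.concatenate([a[:, 0] for a in arrs]))
    # One index column plus one value column per input, pre-filled with `fill`.
    dtype = np.result_type(fill, *arrs)
    out = np.full((index.size, len(arrs) + 1), fill, dtype=dtype)
    out[:, 0] = index
    for col, a in enumerate(arrs, start=1):
        # `index` is sorted and contains every value of a[:, 0],
        # so searchsorted returns the exact row of each index value.
        rows = np.searchsorted(index, a[:, 0])
        out[rows, col] = a[:, 1]
    return out

arr1 = np.array([[25, 64], [45, 26]])
arr2 = np.array([[8, 54], [25, 2], [45, 84], [128, 22]])
arr3 = np.array([[17, 530], [255, 25]])

dream = align_on_index([arr1, arr2, arr3])
print(dream)
```

Passing fill=np.nan instead of the default 0 yields a float array with NaN in the gaps, which keeps a missing measurement distinguishable from a measured 0.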