Issue
I have this dataframe:
df = pd.DataFrame([list("ABCDEFGHIJ")])
   0  1  2  3  4  5  6  7  8  9
0  A  B  C  D  E  F  G  H  I  J
I get an error when trying to reshape the dataframe/array:
np.reshape(df, (-1, 3))
ValueError: cannot reshape array of size 10 into shape (3)
I'm expecting this array (or a dataframe with the same shape) :
array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]], dtype=object)
Why can't NumPy infer the expected shape by filling in the missing values with nan?
Solution
Another possible solution, based on numpy.pad, which inserts the needed np.nan into the array:
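import numpy as np

n = 3                        # number of columns after the reshape
s = df.shape[1]              # number of values in the single-row dataframe
m = s // n + 1*(s % n != 0)  # number of rows needed, rounded up
np.pad(df.values.flatten(), (0, m*n - s),
       mode='constant', constant_values=np.nan).reshape(m, n)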
Explanation:
s // n is the integer division of the length of the original array by the number of columns after the reshape. Here s = df.shape[1] equals the length of the array because the dataframe has a single row.
s % n gives the remainder of the division s // n. For instance, if s = 9, then s // n is equal to 3 and s % n is equal to 0. However, if s = 10, s // n is equal to 3 and s % n is equal to 1. Thus, s % n != 0 is True. Consequently, 1*(s % n != 0) is equal to 1, which makes m = 3 + 1.
(0, m*n - s) gives the number of np.nan values to insert at the left of the array (0, in this case) and the number of np.nan values to insert at the right of the array (m*n - s).
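The arithmetic above can be checked with a few values of s. This is only a small sketch: the helper name rows_needed is made up for illustration and is not part of the answer.

```python
n = 3  # number of columns after the reshape

def rows_needed(s, n):
    # Hypothetical helper: same formula as in the answer.
    # One extra row is added when s is not a multiple of n.
    return s // n + 1 * (s % n != 0)

for s in (9, 10, 11, 12):
    m = rows_needed(s, n)
    pad = m * n - s  # number of np.nan values appended on the right
    print(f"s={s}: m={m}, pad={pad}")
```

The same row count can also be written as the ceiling division -(-s // n), which some readers may find easier to recognize.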
Output:
array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]], dtype=object)
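Since the question also accepts a dataframe with the same shape, the steps can be wrapped in a small function. This is a sketch under assumptions: the name pad_reshape and its fill parameter are invented here, and the input is cast to object dtype so that strings and np.nan can share one array.

```python
import numpy as np
import pandas as pd

def pad_reshape(values, n, fill=np.nan):
    # Hypothetical helper, not part of the original answer.
    # Flatten the input to 1-D, keeping object dtype for mixed str/nan content.
    flat = np.asarray(values, dtype=object).ravel()
    s = flat.size
    m = -(-s // n)  # ceiling division: number of rows needed
    # Append m*n - s fill values on the right, none on the left.
    padded = np.pad(flat, (0, m * n - s), mode="constant", constant_values=fill)
    return padded.reshape(m, n)

df = pd.DataFrame([list("ABCDEFGHIJ")])
out = pd.DataFrame(pad_reshape(df.to_numpy(), 3))
print(out)
```

Wrapping the result in pd.DataFrame is optional; pad_reshape itself returns a NumPy array like the one shown above.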
Answered By - PaulS