Issue
I am having trouble converting my DataFrame with to_panel. I'm performing preliminary operations on the data beforehand and am concerned that these may be contributing to the problem.
merge.head()
date catcode type di cid feccandid amount disposition bills
0 2005-12-31 G1100 24K D N00004045 H2MI11042 1500 support 1
1 2005-12-31 L1100 24K D N00004045 H2MI11042 8000 support 1
2 2005-12-31 L1100 24K D N00004155 H2MI02066 1000 oppose 1
3 2005-12-31 T1200 24K D N00004166 H4MI03045 3000 support 1
Then I form a pivot_table
mm = merge.pivot_table(index=['date', 'feccandid', 'disposition',
                              'bills', 'cid', 'di', 'type'],
                       columns='catcode', values='amount',
                       fill_value=0)
catcode A0000 A1000 A1100 A1200 A1300 A1400 A1500 A1600 A2000 A2300 ... T9100 T9400 X3700 X4000 X4100 X4110 X5000 X7000 Y0000 Z5200
date feccandid disposition bills cid di type
2005-12-31 H2MI02066 oppose 1 N00004155 D 24K 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H2MI11042 support 1 N00004045 D 24K 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H4MI03045 support 1 N00004166 D 24K 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
3 rows × 315 columns
I then reset the index:
mm = mm.reset_index()
mm.head()
catcode date feccandid disposition bills cid di type A0000 A1000 A1100 ... T9100 T9400 X3700 X4000 X4100 X4110 X5000 X7000 Y0000 Z5200
0 2005-12-31 H2MI02066 oppose 1 N00004155 D 24K 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1 2005-12-31 H2MI11042 support 1 N00004045 D 24K 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2 2005-12-31 H4MI03045 support 1 N00004166 D 24K 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
I then send to csv:
mm.to_csv('i.test', index=False)
Read from csv:
hh = pd.read_csv('i.test')
Set index:
hh.set_index(['date', 'feccandid']).head(3)
disposition bills cid di type A0000 A1000 A1100 A1200 A1300 ... T9100 T9400 X3700 X4000 X4100 X4110 X5000 X7000 Y0000 Z5200
date feccandid
2005-12-31 H2MI02066 oppose 1 N00004155 D 24K 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H2MI11042 support 1 N00004045 D 24K 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
H4MI03045 support 1 N00004166 D 24K 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
To panel:
hh.to_panel()
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-86-9358192e71a3> in <module>()
----> 1 hh.to_panel()
/home/jayaramdas/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in to_panel(self)
1210 if (not isinstance(self.index, MultiIndex) or # pragma: no cover
1211 len(self.index.levels) != 2):
-> 1212 raise NotImplementedError('Only 2-level MultiIndex are supported.')
1213
1214 if not self.index.is_unique:
NotImplementedError: Only 2-level MultiIndex are supported.
Any ideas, questions, or critiques?
Solution
set_index returns a new DataFrame rather than modifying hh in place, so your hh still doesn't have a MultiIndex as its index:
>>> hh.to_panel()
Traceback (most recent call last):
File "<ipython-input-4-9358192e71a3>", line 1, in <module>
hh.to_panel()
File "/home/dsm/sys/pys/3.5.1/lib/python3.5/site-packages/pandas/core/frame.py", line 1224, in to_panel
raise NotImplementedError('Only 2-level MultiIndex are supported.')
NotImplementedError: Only 2-level MultiIndex are supported.
>>> hh.set_index(["date", "feccandid"]).to_panel()
<class 'pandas.core.panel.Panel'>
Dimensions: 20 (items) x 1 (major_axis) x 3 (minor_axis)
Items axis: catcode to Z5200
Major_axis axis: 2005-12-31 to 2005-12-31
Minor_axis axis: H2MI02066 to H4MI03045
You could add inplace=True to the set_index call, but it's considered slightly more pandorable to just do hh = hh.set_index(...) instead.
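To see the difference in isolation, here is a minimal sketch. It uses a toy frame whose column names follow the question, but the amounts are illustrative values rather than the real data, and the name indexed is only there for the demonstration.

```python
import pandas as pd

# Toy frame: column names follow the question, values are illustrative only.
hh = pd.DataFrame({
    "date": ["2005-12-31"] * 3,
    "feccandid": ["H2MI02066", "H2MI11042", "H4MI03045"],
    "amount": [1000, 1500, 3000],
})

# set_index returns a new DataFrame; hh itself is left untouched.
indexed = hh.set_index(["date", "feccandid"])
print(type(hh.index).__name__)       # hh still has its default integer index
print(type(indexed.index).__name__)  # the returned frame has a MultiIndex

# Reassign to keep the result.
hh = hh.set_index(["date", "feccandid"])
print(hh.index.nlevels)              # number of index levels after reassignment
```

After the reassignment, hh carries the two-level index that to_panel was complaining about.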
Aside: I think Panels are being gradually deprecated in favour of the richer xarray N-d object, so you might want to consider installing xarray and then doing
>>> hh.to_xarray()
<xarray.Dataset>
Dimensions: (date: 1, feccandid: 3)
Coordinates:
* date (date) object '2005-12-31'
* feccandid (feccandid) object 'H2MI02066' 'H2MI11042' 'H4MI03045'
Data variables:
catcode (date, feccandid) int64 0 1 2
disposition (date, feccandid) object 'oppose' 'support' 'support'
bills (date, feccandid) int64 1 1 1
cid (date, feccandid) object 'N00004155' 'N00004045' 'N00004166'
di (date, feccandid) object 'D' 'D' 'D'
type (date, feccandid) object '24K' '24K' '24K'
[...]
instead and experimenting that way.
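As a rough, self-contained sketch of that route (assuming the xarray package is installed, and again using a toy frame with illustrative values rather than the real data), the index is set first and each index level becomes a dimension of the resulting Dataset:

```python
import pandas as pd  # DataFrame.to_xarray() also requires xarray to be installed

# Toy frame: column names follow the question, values are illustrative only.
hh = pd.DataFrame({
    "date": ["2005-12-31"] * 3,
    "feccandid": ["H2MI02066", "H2MI11042", "H4MI03045"],
    "disposition": ["oppose", "support", "support"],
    "bills": [1, 1, 1],
})

# Each level of the MultiIndex becomes a dimension of the Dataset.
ds = hh.set_index(["date", "feccandid"]).to_xarray()
print(ds.sizes)

# Label-based selection along both dimensions.
cell = ds["disposition"].sel(date="2005-12-31", feccandid="H2MI11042")
print(cell.item())
```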
Answered By - DSM