Issue
I can load a data set from scikit-learn using:
from sklearn import datasets
data = datasets.load_boston()
print(data)
What I'd like to do is write this data set to a flat file (.csv).
Using the open() function,
f = open('boston.txt', 'w')
f.write(str(data))
works, but includes the description of the data set.
I'm wondering if there is some way to generate a simple .csv with headers from this Bunch object so I can move it around and use it elsewhere.
Solution
data = datasets.load_boston()
returns a Bunch, which is a dictionary-like object. To write the data to a .csv file you need the actual values, data['data'], and the column names, data['feature_names']. You can use these to build a pandas DataFrame and then call to_csv() to write it to a file:
from sklearn import datasets
import pandas as pd
data = datasets.load_boston()
print(data)
df = pd.DataFrame(data=data['data'], columns=data['feature_names'])
df.to_csv('boston.txt', sep=',', index=False)
The output file boston.txt should look like this:
CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT
0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1.0,296.0,15.3,396.9,4.98
0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.9,9.14
0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03
...
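If pandas is not available, the same idea can be sketched with the standard-library csv module. This is only a minimal sketch, not part of the original answer: the helper name bunch_to_csv, its target_name parameter, and the file name sample.csv are hypothetical, and it assumes the object exposes 'data', 'feature_names' and (optionally) 'target' the way a Bunch does. Because load_boston() was removed in scikit-learn 1.2, the example uses a small hand-made dictionary as a stand-in for the Bunch; the feature values come from the first two rows shown above and the target values are illustrative.

```python
import csv


def bunch_to_csv(bunch, path, target_name=None):
    """Write a Bunch-like mapping to a CSV file with a header row.

    Hypothetical helper: expects bunch['data'] (rows of values) and
    bunch['feature_names']. If target_name is given, bunch['target']
    is appended as the last column under that name.
    """
    header = list(bunch['feature_names'])
    rows = [list(row) for row in bunch['data']]
    if target_name is not None:
        header.append(target_name)
        rows = [row + [t] for row, t in zip(rows, bunch['target'])]
    # newline='' lets the csv module control line endings itself
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


# Hand-made stand-in for a Bunch (assumption: same keys as a real one)
sample = {
    'feature_names': ['CRIM', 'ZN', 'INDUS'],
    'data': [[0.00632, 18.0, 2.31], [0.02731, 0.0, 7.07]],
    'target': [24.0, 21.6],  # illustrative target values
}
bunch_to_csv(sample, 'sample.csv', target_name='MEDV')
```

Passing a real Bunch from one of the other datasets.load_*() functions in place of sample should work the same way, as long as it provides feature_names. Leaving target_name as None writes only the feature columns, which matches the pandas version above.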
Answered By - Giorgos Myrianthous