Issue
I have an excel file where I have 2 columns: 'Name' and 'size'. The 'Name'
column has multiple file types, namely ".apk, .dat, .vdex, .ttc"
etc. But I only want to populate the files with the file extension ending with .apk. I do not want any other file type in the new excel file.
I have written the below code:
import pandas as pd
import json
def json_to_excel():
with open('installed-files.json') as jf:
data = json.load(jf)
df = pd.DataFrame(data)
new_df = df[df.columns.difference(['SHA256'])]
new_xl = new_df.to_excel('abc.xlsx')
return new_xl
def filter_apk(): `MODIFIED CODE`
old_xl = json_to_excel()
data = pd.read_excel(old_xl)
a = data[data["Name"].str.contains("\.apk")]
a.to_excel('zybg.xlsx')
Above program does following:
json_to_excel()
, takes a Json file, converts it to a .xlsx format and save.filter_apk()
is suppose to create multiple excel file based on the file extension present in "Name" column.
- 1st function is doing what I intend to.
- 2nd function is not doing anything. Neither its throwing any error. I have followed this weblink
Below are the few samples of the "name" column
/system/product/<Path_to>/abc.apk
/system/fonts/wwwr.ttc
/system/framework/framework.jar
/system/<Path_to>/icu.dat
/system/<Path_to>/Normal.apk
/system/<Path_to>/Tv.apk
How to get that working? Or is there a better way to achieve the objective?
Please suggest.
ERROR
raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'NoneType'>
Note:
I have all the files at the same location.
modified code:
import pandas as pd
import json
def json_to_excel():
with open('installed-files.json') as jf:
data = json.load(jf)
df = pd.DataFrame(data)
new_df = df[df.columns.difference(['SHA256'])]
new_df.to_excel('abc.xlsx')
def filter_apk():
json_to_excel()
old_xl = pd.read_excel('abc.xlsx')
data = pd.read_excel(old_xl)
a = data[data["Name"].str.contains("\.apk")]
a.to_excel('zybg.xlsx')
t = filter_apk()
print(t)
New error:
Traceback (most recent call last):
File "C:/Users/amitesh.sahay/PycharmProjects/work_allocation/TASKS/Jenkins.py", line 89, in <module>
t = filter_apk()
File "C:/Users/amitesh.sahay/PycharmProjects/work_allocation/TASKS/Jenkins.py", line 84, in filter_apk
data = pd.read_excel(old_xl)
File "C:\Users\amitesh.sahay\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\util\_decorators.py", line 296, in wrapper
return func(*args, **kwargs)
File "C:\Users\amitesh.sahay\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_base.py", line 304, in read_excel
io = ExcelFile(io, engine=engine)
File "C:\Users\amitesh.sahay\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_base.py", line 867, in __init__
self._reader = self._engines[engine](self._io)
File "C:\Users\amitesh.sahay\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_xlrd.py", line 22, in __init__
super().__init__(filepath_or_buffer)
File "C:\Users\amitesh.sahay\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_base.py", line 344, in __init__
filepath_or_buffer, _, _, _ = get_filepath_or_buffer(filepath_or_buffer)
File "C:\Users\amitesh.sahay\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\common.py", line 243, in get_filepath_or_buffer
raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'pandas.core.frame.DataFrame'>
Solution
There is a difference between your use-case and use-case shown in the weblink. You want to apply a single filter (apk files), whereas the example you saw had multiple filters which were to be applied one after another (multiple species).
This will do the trick.
def filter_apk():
old_xl = json_to_excel()
data = pd.read_excel(old_xl)
a = data[data["Name"].str.contains("\.apk")]
a.to_excel("<path_to_new_excel>\\new_excel_name.xlsx")
Answered By - Kunal Shah
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.