Thursday, May 12, 2022

[FIXED] How can I split the document path to the foldername and the document name in python?

May 12, 2022 pandas, python, url No comments

Issue

I need to split the document path to the foldername and the document name in python. It is a large dataframe including many rows.For the filename with no document name followed, just leave the document name column blank in the result. For example, I have a dataframe like the follows:

     no  filename
     1  \\apple\config.csv
     2  \\apple\fox.pdf
     3  \\orange\cat.xls
     4  \\banana\eggplant.pdf
     5  \\lucy
...

I expect the output shown as follows:

    foldername  documentname
    \\apple     config.csv
    \\apple     fox.pdf
    \\orange    cat.xls
    \\banana    eggplant.pdf
    \\lucy 
...

I have tried the following code,but it does not work.


    y={'Foldername':[],'Docname':[]}
    def splitnames(x):
        if "." in x:
            docname=os.path.basename(x)
            rm="\\"+docname
            newur=x.replace(rm,'')
        else:
            newur=x
            docname=""
        result=[newur,docname]
        y["Foldername"].append(result[0])
        y["Docname"].append(result[1])
        return y;

    dff=df$filename.apply(splitnames)

Thank you so much for the help!!

Solution

Not sure how you're getting the paths, but you could create some Pathlib objects and use some class methods to grab the file name and folder name.

from pathlib import Path

data = """ no  filename
     1  \\apple\\config.csv
     2  \\apple\\fox.pdf
     3  \\orange\\cat.xls
     4  \\banana\\eggplant.pdf
     5  \\lucy"""

df = pd.read_csv(StringIO(data),sep='\s+')
df['filename'] = df['filename'].apply(Path)


df['folder'] = df['filename'].apply(lambda x : x.parent if '.' in x.suffix else x)
df['document_name'] = df['filename'].apply(lambda x : x.name if '.' in x.suffix  else np.nan)


print(df)

   no              filename   folder document_name
0   1     \apple\config.csv   \apple    config.csv
1   2        \apple\fox.pdf   \apple       fox.pdf
2   3       \orange\cat.xls  \orange       cat.xls
3   4  \banana\eggplant.pdf  \banana  eggplant.pdf
4   5                 \lucy    \lucy           NaN

Answered By - Umar.H

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Thursday, May 12, 2022

[FIXED] How can I split the document path to the foldername and the document name in python?

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels