Thursday, June 30, 2022

[FIXED] How to properly line break long pandas lines?

Issue

I'm struggling to make my pandas data loading code look "good". I would like to adhere as much as possible to PEP 8, for example by keeping lines to at most 79 characters. But right now my lines are way too long because of the (unwieldy) way that pandas works. For example:

df_ndsi_feature = df_stations_date.loc[:, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T

df_ndvi_feature = df_stations_date.loc[:, df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T

ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)

As you can see they are very long, but I don't really know how to properly break them.

Solution

My advice is to use a formatter like black across the whole team and stop trying to format the code by hand. Formatting code consistently by hand is really hard, and it puts the cognitive load on the developer, who has to follow a lot of written rules, when there are tools that do the job well enough.

PEP 8 allows many optional ways of doing things, so it is hard to arrive at a common format even when everyone complies with the standard. That is, there are different ways of writing the same code that all meet the standard.

If you are using Jupyter as your IDE, you can try the jupyter-black plugin.

This is your code sample formatted with black:

df_ndsi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T

df_ndvi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")
]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T

ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)

Keep in mind that the same code can be formatted in other ways and still comply with PEP 8. For example:

df_ndsi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T

df_ndvi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")
]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T

ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
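Line breaks are not the only way to shorten these statements. The two blocks differ only in the regex, so naming the column mask and moving the repeated steps into a small helper also keeps every line well under the limit. The following is a minimal sketch under assumptions: `feature_matrix` is a hypothetical helper name, and the DataFrame contents are made up to stand in for `df_stations_date`.

```python
import numpy as np
import pandas as pd


def feature_matrix(df, pattern):
    """Return columns whose names fully match `pattern` as a float32 array.

    The result is transposed, as in the question's `.to_numpy(...).T`.
    """
    # Naming the boolean mask avoids one long nested expression.
    mask = df.columns.str.fullmatch(pattern)
    return df.loc[:, mask].to_numpy(dtype=np.float32).T


# Made-up stand-in for the question's df_stations_date.
df_stations_date = pd.DataFrame(
    {
        "station": ["a", "b", "c"],
        "ndsi_avg-0": [0.1, 0.2, 0.3],
        "ndsi_avg-1": [0.4, 0.5, 0.6],
        "ndvi_avg-0": [0.7, 0.8, 0.9],
        "ndvi_avg-1": [1.0, 1.1, 1.2],
    }
)

ndsi_feature = feature_matrix(df_stations_date, r"ndsi_avg-\d")
ndvi_feature = feature_matrix(df_stations_date, r"ndvi_avg-\d")
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)

print(ndsi_ndvi_feature.shape)  # (2, 6)
```

The raw string `r"ndsi_avg-\d"` is the same pattern as `"ndsi_avg-\\d"` in the question, just without the doubled backslash.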

You can also use a tool like flake8 (my preference is the wemake-python-styleguide plugin) to check for formatting and other issues.
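To give a feel for the kind of check such a linter runs, here is a toy line-length check in plain Python. It is only a sketch, not flake8's actual implementation: `find_long_lines` is a hypothetical name, and the default of 79 characters follows PEP 8's recommended limit. In practice you would let flake8 report this (its "line too long" error is E501) rather than write it yourself.

```python
def find_long_lines(source, max_length=79):
    """Return (line_number, length) pairs for lines over max_length."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > max_length
    ]


# The first statement from the question, split here only so that this
# example itself stays within the limit.
LONG_LINE = (
    "df_ndsi_feature = df_stations_date.loc["
    r':, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]'
)
SHORT_LINE = "ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T"

source = "\n".join([LONG_LINE, SHORT_LINE])
long_lines = find_long_lines(source)

for number, length in long_lines:
    print(f"line {number}: {length} characters")
```

Only the first line is reported; the second one already fits within the limit.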

Answered By - Francisco Puga

This answer was collected from Stack Overflow and tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
