Issue
I'm struggling to make my pandas data loading code look "good". I would like to adhere as closely as possible to PEP 8, for example with at most 80 characters per line, but right now my lines are far too long because of the (unwieldy) way pandas works. For example:
df_ndsi_feature = df_stations_date.loc[:, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[:, df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
As you can see they are very long, but I don't really know how to properly break them.
Solution
My advice is to adopt a formatter such as black across the whole team and stop formatting code by hand. Formatting consistently by hand is really hard, and it shifts the cognitive load onto the developer, who has to follow a long list of written rules when there are tools that do the job well enough. Note that black's default line length is 88 characters; pass --line-length 79 if you want to stay within PEP 8's limit.
PEP 8 leaves many formatting choices open, so it is hard to converge on a common format even when everyone complies with the standard: the same code can be written in several ways that all meet it.
If you use Jupyter as your IDE, you can try the jupyter-black plugin.
This is your code sample formatted with black:
df_ndsi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")
]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
Keep in mind that the same code can be formatted in other ways and still comply with PEP 8. For example:
df_ndsi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")
]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
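Re-wrapping is not the only way to stay under the limit. Since the two column selections differ only in their pattern, one option is to factor them into a small helper so that each call fits on one line. The following is only a sketch: `feature_matrix` is a hypothetical name, and the sample DataFrame is invented here to stand in for your `df_stations_date`.

```python
import numpy as np
import pandas as pd


def feature_matrix(df, pattern):
    """Return the columns whose names fully match `pattern`
    as a transposed float32 array (one row per matching column)."""
    mask = df.columns.str.fullmatch(pattern)
    return df.loc[:, mask].to_numpy(dtype=np.float32).T


# Hypothetical stand-in for the question's df_stations_date.
df_stations_date = pd.DataFrame(
    {
        "ndsi_avg-0": [0.1, 0.2, 0.3],
        "ndsi_avg-1": [0.4, 0.5, 0.6],
        "ndvi_avg-0": [1.1, 1.2, 1.3],
        "ndvi_avg-1": [1.4, 1.5, 1.6],
        "station": ["a", "b", "c"],
    }
)

ndsi_feature = feature_matrix(df_stations_date, r"ndsi_avg-\d")
ndvi_feature = feature_matrix(df_stations_date, r"ndvi_avg-\d")
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
```

Naming the boolean mask separately is what keeps the lines short. Whether a helper is worth it depends on how often the selection repeats in your real code.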
You can also use a tool like flake8 (I prefer to use it with the wemake-python-styleguide plugin) to check for formatting and other issues.
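To give an idea of what such a check does for line length, here is a minimal sketch of it in plain Python. It is not flake8's implementation. `long_lines` is a hypothetical helper, and the 79-character limit is PEP 8's value, which flake8 also uses by default for its E501 "line too long" warning.

```python
MAX_LINE_LENGTH = 79  # PEP 8's limit; also flake8's default for E501


def long_lines(source, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# The first two lines of the question's code, kept as raw text.
original = (
    r"df_ndsi_feature = df_stations_date.loc[:, "
    r'df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]'
    "\n"
    "ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T\n"
)

flagged = long_lines(original)
print(flagged)  # only the first line exceeds the limit
```

In practice you would run flake8 itself (for example in a pre-commit hook or CI job) rather than maintain a check like this.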
Answered By - Francisco Puga