Issue
My dataframe has 8 columns holding 4 different datasets, each paired with its own depth column; in other words, each dataset is a function of its depth. However, each depth column is sampled at a different interval. I'd like to reconcile all 4 datasets onto the same depth interval.
One method that came to mind is interpolation, but that seems complex.
Is there an easier way to put all these datasets on the same depth sampling, for example that of the first depth column (depth1)?
Here is my data and my code.
import pandas as pd

df = pd.read_csv(file, sep='\t')
df
    depth1   data1  depth2  data2  depth3   data3       depth4         data4
0    910.0  32.820     910   48.2  910.05  450.57   912.961414   -294.045478
1    910.1  33.610     911   48.2  910.20    1.14   922.966707   -447.780089
2    910.2  33.900     912   48.2  910.35    1.14   932.972000   -396.001844
3    910.3  34.190     913   48.4  910.50    1.43   942.976616   -391.830800
4    910.4  34.430     914   48.7  910.65    1.32   952.980427   -438.514022
5    910.5  34.670     915   48.9  910.80    1.54   962.984317   -679.421100
6    910.6  35.015     916   48.8  910.95   16.08   972.988514   -660.389044
7    910.7  35.360     917   49.0  911.10    8.16   982.993188   -671.841567
8    910.8  35.450     918   49.5  911.25    7.67   992.998200   -712.625933
9    910.9  35.540     919   49.4  911.40    8.86  1003.004001   -884.093533
10   911.0  35.825     920   49.5  911.55    8.70  1013.009802  -1124.780022
11   911.1  36.110     921   49.6  911.70    7.93  1023.015603  -1454.342144
Thank you so much
Solution
The method of choice remains reindex followed by interpolate, and it is not that complex. Here is an example for the channel "depth2", reindexed over the axis "depth1":
Craft the new index
Idx1 = pd.Index(df.set_index('depth1').index) # new index for data2
Idx2 = pd.Index(df.set_index('depth2').index) # original data2 index
# Merge indexes depth1 and depth2
NewIdx = pd.Index(Idx2.union(Idx1),
name = 'depth')
Output:
Index([910.0, 910.1, 910.2, 910.3, 910.4, 910.5, 910.6, 910.7, 910.8, 910.9,
911.0, 911.1, 912.0, 913.0, 914.0, 915.0, 916.0, 917.0, 918.0, 919.0,
920.0, 921.0],
dtype='float64', name='depth')
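Perform the reindexing
# method='index' weights the interpolation by the actual depth values;
# method='linear' would ignore the index and treat rows as equally spaced.
df2 = (df[['depth2', 'data2']]
       .set_index('depth2')           # the subset of df to be reindexed
       .reindex(NewIdx)               # reindexed, as a new dataframe
       .interpolate(method='index'))  # gaps filled by linear interpolation along depth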
Finally select interpolated data only
df2.loc[Idx1]
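The same union, reindex, interpolate steps can be wrapped in a small helper and applied to every depth/data pair. What follows is only a minimal sketch under stated assumptions: the helper name `resample_on_depth` is hypothetical, the sample values are made up for illustration (they are not the data from the question), and each depth column is assumed to be free of duplicates. It uses `method='index'` so the interpolation is weighted by depth, and `limit_area='inside'` so target depths outside a channel's sampled range stay NaN rather than being extrapolated.

```python
import pandas as pd


def resample_on_depth(df, depth_col, data_col, target_depth):
    """Linearly interpolate data_col (sampled at depth_col) onto target_depth.

    Hypothetical helper for illustration; assumes depth_col has no duplicates.
    """
    target = pd.Index(target_depth, dtype="float64", name="depth")
    # One channel as a Series indexed by its own depth
    series = (
        df[[depth_col, data_col]]
        .dropna()
        .astype("float64")
        .set_index(depth_col)[data_col]
    )
    # Union of original and target depths, sorted
    union = series.index.union(target)
    return (
        series.reindex(union)                               # NaN at new depths
        .interpolate(method="index", limit_area="inside")   # depth-weighted fill
        .reindex(target)                                    # keep target depths only
    )


# Made-up sample data, not the values from the question
df = pd.DataFrame({
    "depth1": [910.0, 910.5, 911.0, 911.5],
    "data1": [1.0, 2.0, 3.0, 4.0],
    "depth2": [910, 911, 912, 913],
    "data2": [10.0, 20.0, 30.0, 40.0],
})

pairs = [("depth1", "data1"), ("depth2", "data2")]
common = pd.DataFrame(
    {data: resample_on_depth(df, depth, data, df["depth1"]) for depth, data in pairs}
)
print(common)
```

With the question's dataframe, the list of pairs would simply be extended to depth3/data3 and depth4/data4. Note that a channel whose depths do not overlap the target axis at all will come back entirely NaN with these settings.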
This will plot the results:
import matplotlib.pyplot as plt

# Plot original and interpolated data separately:
_=plt.plot(df2.reindex(Idx2).index,
df2.reindex(Idx2).values,'+')
_=plt.plot(df2.reindex(Idx1).index,
df2.reindex(Idx1).values,'x')
_=plt.xlim(909.9, 912.1)
_=plt.xlabel('depth')
_=plt.ylabel('data')
plt.legend(['original data2','interpolated along depth1'])
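As a cross-check, and as a possibly simpler route, the same linear interpolation can be done with NumPy's `np.interp`, which takes the target depths, the original depths, and the original values directly. This is an alternative technique, not the reindex approach shown above. The sketch below uses made-up sample values and assumes the original depth column is strictly increasing, which `np.interp` requires; `left` and `right` are set to NaN so nothing is filled outside the sampled range.

```python
import numpy as np
import pandas as pd

# Made-up sample data, not the values from the question
df = pd.DataFrame({
    "depth1": [910.0, 910.5, 911.0, 911.5],
    "depth2": [910, 911, 912, 913],
    "data2": [10.0, 20.0, 30.0, 40.0],
})

target = df["depth1"].to_numpy(dtype=float)

# np.interp(x, xp, fp): evaluate the piecewise-linear curve (xp, fp) at x.
# xp must be increasing; left/right control values outside [xp[0], xp[-1]].
data2_on_depth1 = np.interp(
    target,
    df["depth2"].to_numpy(dtype=float),
    df["data2"].to_numpy(dtype=float),
    left=np.nan,
    right=np.nan,
)

result = pd.Series(data2_on_depth1, index=pd.Index(target, name="depth"), name="data2")
print(result)
```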
Answered By - OCa