Issue
Consider the simple example below, borrowed from "How to use the ccf() method in the statsmodels library?":
import pandas as pd
import numpy as np
import statsmodels.tsa.stattools as smt
import matplotlib.pyplot as plt
np.random.seed(123)
test = pd.DataFrame(np.random.randint(0,25,size=(79, 2)), columns=list('AB'))
I know how to create the forward and backward lags of the cross-correlation function (see the SO link above), but the issue is how to obtain a proper dataframe containing the correct lag order. I came up with the solution below.
backwards = smt.ccf(test['A'][::-1], test['B'][::-1], adjusted=False)[::-1]
forwards = smt.ccf(test['A'], test['B'], adjusted=False)
#note how we skip the first lag (at 0) because we have a duplicate with the backward values otherwise
a = pd.DataFrame({'lag': range(1, len(forwards)),
                  'value': forwards[1:]})
b = pd.DataFrame({'lag': [-i for i in list(range(0, len(forwards)))[::-1]],
                  'value': backwards})
full = pd.concat([a,b])
full.sort_values(by = 'lag', inplace = True)
full.set_index('lag').value.plot()
However, this seems to be a lot of code for something that is conceptually very simple (just appending two lists). Can this code be streamlined?
Thanks!
Solution
Well, you can try "just appending two lists":
# alternatively:
# cc = list(backwards) + list(forwards[1:])
cc = np.concatenate([backwards, forwards[1:]])
# the last element of backwards is lag 0, so shift by len(backwards) - 1
full = pd.DataFrame({'lag': np.arange(len(cc)) - (len(backwards) - 1),
                     'value': cc})
full.plot(x='lag')
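As a quick sanity check of the lag alignment, here is a minimal sketch with made-up stand-in arrays (the values are placeholders, not real ccf output). It assumes, as in the question, that backwards runs from the most negative lag up to lag 0 and that forwards starts at lag 0:

```python
import numpy as np

# Placeholder arrays standing in for the ccf results (hypothetical values).
# backwards is ordered from the most negative lag up to lag 0;
# forwards is ordered from lag 0 upwards.
backwards = np.array([0.1, 0.2, 0.9])   # lags -2, -1, 0
forwards = np.array([0.9, 0.4, 0.3])    # lags  0,  1, 2

# Drop the duplicate lag-0 value from forwards before joining.
cc = np.concatenate([backwards, forwards[1:]])

# Index len(backwards) - 1 of cc holds the lag-0 value, so shift by that amount.
lags = np.arange(len(cc)) - (len(backwards) - 1)

# Map each lag to its value to inspect the alignment.
by_lag = dict(zip(lags.tolist(), cc.tolist()))
```

With these stand-ins, by_lag[0] is the shared lag-0 value and the lags run symmetrically from -2 to 2.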
Also:
full = (pd.DataFrame({'value': np.concatenate([backwards, forwards[1:]])})
          # index len(backwards) - 1 holds the lag-0 value
          .assign(lag=lambda x: x.index - (len(backwards) - 1))
       )
Note: if all you want is to plot the two arrays, then this would do:
# backwards is ordered from the most negative lag up to lag 0
plt.plot(np.arange(1 - len(backwards), 1), backwards, c='C0')
plt.plot(forwards, c='C0')
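Another possible route, sketched here with NumPy only, is to compute every lag in one call with np.correlate in "full" mode instead of stitching two arrays together. The helper name full_ccf is hypothetical, and the normalization (demeaned series, divided by n times both standard deviations) is meant to mirror the adjusted=False case; that correspondence is an assumption and has not been checked against statsmodels here. Both series must have the same length.

```python
import numpy as np

def full_ccf(x, y):
    """Cross-correlation at every lag from -(n-1) to n-1 (unadjusted).

    Hypothetical helper; assumes x and y have the same length.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xo = x - x.mean()
    yo = y - y.mean()
    # mode="full" returns 2n - 1 values; index n - 1 corresponds to lag 0.
    values = np.correlate(xo, yo, mode="full") / (n * x.std() * y.std())
    lags = np.arange(-(n - 1), n)
    return lags, values

# Example data shaped like the question's (random integers, 79 observations).
rng = np.random.default_rng(123)
a = rng.integers(0, 25, size=79)
b = rng.integers(0, 25, size=79)
lags, values = full_ccf(a, b)
```

The result can be wrapped as pd.DataFrame({'lag': lags, 'value': values}) if a dataframe is needed. The value at lag 0 is the ordinary Pearson correlation of the two series, which gives a simple check that the lags are aligned.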
Answered By - Quang Hoang