Issue
I'm looking to convert daily data into weekly data. Here is the code I've used to achieve this
daily_data['Week_Number'] = pd.to_datetime(daily_data['candle_date']).dt.week
daily_data['Year'] = pd.to_datetime(daily_data['candle_date']).dt.year
df2 = daily_data.groupby(['Year', 'Week_Number']).agg({'open': 'first', 'high': 'max', 'low': 'min', 'close': 'last', 'volume': 'sum', 'market_cap': 'sum'})
Currently, the dataframe output looks as below -
open high low close volume market_cap
Year Week_Number
2020 31 11106.793367 12041.230145 10914.007709 11059.660924 86939673211 836299315108
32 11059.658520 11903.881608 11011.841384 11653.660942 125051146775 1483987715241
33 11665.874956 12047.515879 11199.052457 11906.236593 141819289223 1513036354035
34 11915.898402 12382.422676 11435.685834 11671.520767 136888268138 1533135548697
35 11668.211439 11806.669046 11183.114210 11704.963980 122232543594 1490089199926
36 11713.540300 12044.196936 9951.201578 10277.329333 161912442921 1434502733759
I'd like the output to have a column week_date
that shows the date of Monday of the week as the start date. Ex: Show 27-07-2020 in place of 31st week of 2020 and so on. It's this final piece that I'm stuck with really badly. Please could I request some help to achieve this.
**
- SOLUTION FOR THOSE WHO NEED
**
The entire function used to convert daily data to weekly below
def convert_dailydata_to_weeklydata(daily_data):
# Print function name
SupportMethods.print_func_name()
# Loop over the rows until a row with Monday as date is present
row_counter_start = 0
while True:
if datetime.weekday(daily_data['candle_date'][row_counter_start]) == 0:
break
row_counter_start += 1
# # Loop over the rows until a row with Sunday as date is present
# row_counter_end = len(daily_data.index) - 1
# while True:
# if datetime.weekday(daily_data['candle_date'][row_counter_end]) == 6:
# break
# row_counter_end -= 1
# print(daily_data)
# print(row_counter_end)
# Copy all rows after the first Monday row of data is reached
daily_data_temp = daily_data[row_counter_start:]
# Getting week number
daily_data_temp['Week_Number'] = pd.to_datetime(daily_data_temp['candle_date']).dt.week
# Getting year. Weeknum is common across years to we need to create unique index by using year and weeknum
daily_data_temp['Year'] = pd.to_datetime(daily_data_temp['candle_date']).dt.year
# Grouping based on required values
df = daily_data_temp.groupby(['Year', 'Week_Number']).agg(
{'open': 'first', 'high': 'max', 'low': 'min', 'close': 'last', 'volume': 'sum', 'market_cap': 'sum'})
# Reset index
df = df.reset_index()
# Create week date (start of week)
# The + "1" isfor the day of the week.Week numbers 0-6 with 0 being Sunday and 6 being Saturday.
df['week_date'] = pd.to_datetime(df['Year'].astype(str) + df['Week_Number'].astype(str) + "1", format='%G%V%w')
# Set indexes
df = df.set_index(['Year', 'Week_Number'])
# Re-order columns into a new dataframe
weekly_data = df[["week_date", "open", "high", "low", "close", "volume", "market_cap"]]
weekly_data = weekly_data.rename({'week_date': 'candle_date'}, axis=1)
# Drop index columns
weekly_data.reset_index(drop=True, inplace=True)
# Return data by dropping curent week's data
if datetime.weekday(weekly_data.head(-1)['candle_date']) != 0:
return weekly_data.head(-1)
else:
return weekly_data
Solution
Try using pd.to_datetime
on the 'Year' and 'Week_Number' columns with a format string for Year, Week of Year, and Day of Week ('%G%V%w'
):
df = df.reset_index()
df['week_date'] = pd.to_datetime(
df['Year'].astype(str) + df['Week_Number'].astype(str) + "1",
format='%G%V%w'
)
df = df.set_index(['Year', 'Week_Number'])
The + "1"
is for the day of the week. Week numbers 0-6 with 0 being Sunday and 6 being Saturday. (Ref. Format Codes)
df
:
open close week_date
Year Week_Number
2020 31 11106.793367 11059.660924 2020-07-27
32 11059.658520 11653.660942 2020-08-03
Answered By - Henry Ecker
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.