Issue
I need a bit of help with my function.
This is what I'm trying to do:
Build a predictive model that can give us the best guess at what the population growth rate in a given year might be. We will calculate the population growth rate as follows:
Growth_rate = (current_year_population - previous_year_population) / previous_year_population
As such, we can only calculate the growth rate from the year 1961 onwards.
Write a function that takes the population_df and a country_code as input and computes the population growth rate for a given country starting from the year 1961. This function must return a 2-D numpy array that contains the year and corresponding growth rate for the country.
Function Specifications:
Should take a population_df and country_code string as input and return a numpy array as output. The array should only have two columns containing the year and the population growth rate, in other words, it should have a shape (?, 2) where ? is the length of the data.
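As a quick sanity check of the formula, here is a minimal sketch that applies it to a single pair of years. The helper name growth_rate is hypothetical (it is not part of the assignment), and the two population figures are the 1960 and 1961 values for 'ABW' from the sample data used in the answer further down.

```python
def growth_rate(current_year_population, previous_year_population):
    """Relative change from the previous year to the current year."""
    # Hypothetical helper, for illustration only.
    return (current_year_population - previous_year_population) / previous_year_population

# ABW: 54211 people in 1960, 55438 people in 1961 (sample data from the answer)
rate_1961 = growth_rate(55438.0, 54211.0)
print(round(rate_1961, 5))  # 0.02263
```

This agrees with the first row of the expected output (2.263e-02 for 1961).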
My code (changeable):
def pop_growth_by_country_year(df,country_code):
    country_data = df.loc[country_code]
    for columnName, columnData in country_data.iteritems():
        country_data = ((country_data[columnData] - country_data[columnData-1]) / country_data[columnData-1])
    output = country_data.reset_index().to_numpy().reshape(-1, 2)
    return output
Function call (not changeable):
pop_growth_by_country_year(population_df,'ABW')
Expected output:
array([[ 1.961e+03, 2.263e-02],
[ 1.962e+03, 1.420e-02],
[ 1.963e+03, 8.360e-03],
[ 1.964e+03, 5.940e-03],
... ....
[ 2.015e+03, 5.260e-03],
[ 2.016e+03, 4.610e-03],
[ 2.017e+03, 4.220e-03]])
Solution
My input:
import pandas as pd

population_df = pd.DataFrame({
    '1960': {'ABW': 54211.0, 'AFG': 8996351.0, 'AGO': 5643182.0, 'ALB': 1608800.0, 'AND': 13411.0},
    '1961': {'ABW': 55438.0, 'AFG': 9166764.0, 'AGO': 5753024.0, 'ALB': 1659800.0, 'AND': 14375.0},
    '1962': {'ABW': 56225.0, 'AFG': 9345868.0, 'AGO': 5866061.0, 'ALB': 1711319.0, 'AND': 15370.0},
    '1963': {'ABW': 56695.0, 'AFG': 9533954.0, 'AGO': 5980417.0, 'ALB': 1762621.0, 'AND': 16412.0}
})
population_df
My solution:
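def pop_growth_by_country_year(df,country_code):
    # Row for the requested country: one population value per year column
    current_population = df.loc[country_code]
    # Shift by one position so each year lines up with the previous year's value
    previous_population = current_population.shift(1)
    growth = (current_population-previous_population)/previous_population
    # Drop the first year (NaN, no previous value) and return [year, growth] rows as floats
    return growth.dropna().reset_index().astype(float).values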
Output of pop_growth_by_country_year(population_df,'ABW')
array([[1.96100000e+03, 2.26337828e-02],
[1.96200000e+03, 1.41960388e-02],
[1.96300000e+03, 8.35927079e-03]])
Note that, since there is no previous population for the first year (1960 in this case), the growth rate for that year cannot be computed. The first row is therefore dropped, so len(output) = len(input) - 1.
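As an alternative, pandas has a built-in Series.pct_change() method that computes the same (current - previous) / previous ratio in one call. The sketch below is only an illustration, not the approach from the answer above: the function name pop_growth_pct_change is hypothetical, and the DataFrame is a trimmed copy of the sample input, assuming the same layout (country codes as the index, year strings as the columns).

```python
import pandas as pd

def pop_growth_pct_change(df, country_code):
    # Hypothetical variant of the function above, using pct_change().
    # Row for one country: the index holds year labels, the values are populations.
    population = df.loc[country_code]
    # pct_change() gives (current - previous) / previous for each entry;
    # the first entry has no predecessor and becomes NaN, so it is dropped.
    growth = population.pct_change().dropna()
    # Two columns: year (string label converted to float) and growth rate.
    return growth.reset_index().astype(float).to_numpy()

# Trimmed copy of the sample input (assumed layout: index = country code, columns = years)
population_df = pd.DataFrame({
    '1960': {'ABW': 54211.0, 'AFG': 8996351.0},
    '1961': {'ABW': 55438.0, 'AFG': 9166764.0},
    '1962': {'ABW': 56225.0, 'AFG': 9345868.0},
})

result = pop_growth_pct_change(population_df, 'ABW')
print(result.shape)  # (2, 2)
```

Both versions drop the first year for the same reason, so the resulting array has one row fewer than there are year columns.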
Answered By - Salvatore Daniele Bianco