Issue
I have this data:
Id | SpatialDimType | SpatialDim | TimeDim | Value | NumericValue |
---|---|---|---|---|---|
32256659 | COUNTRY | ATU | 2022 | No data | NaN |
32256661 | COUNTRY | AND | 2022 | No data | NaN |
32256658 | COUNTRY | BHS | 2022 | No data | NAN |
32256642 | COUNTRY | AUS | 2022 | No data | NAN |
I want to get a count for each SpatialDim value of the rows where NumericValue is NaN.
For example, the output should look like this:
SpatialDim | count | year |
---|---|---|
AND | 36 | 2022 |
AND | 21 | 2002 |
AUT | 89 | 2010 |
...
Solution
See "Get statistics for each group (such as count, mean, etc) using pandas GroupBy?"
- Filter for rows where NumericValue is NaN.
- Group the data by SpatialDim and TimeDim and find counts for unique pairs.
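import pandas as pd

df = pd.read_csv('./space.csv', sep=',')
# Keep only rows with a missing NumericValue, then count rows
# for each unique (SpatialDim, TimeDim) pair.
df = df[df['NumericValue'].isna()].groupby(
    by=['SpatialDim', 'TimeDim']
).size().reset_index(name='count')
print(df)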
Suppose ./space.csv contains the following:
Id,SpatialDimType,SpatialDim,TimeDim,Value,NumericValue,Low,High
32256659,COUNTRY,AND,2022,No data,,,
32256659,COUNTRY,AND,2022,No data,,,
32256659,COUNTRY,AND,2023,No data,,,
32256661,COUNTRY,ATG,2022,No data,,,
32256664,COUNTRY,AUS,2001,No data,,,
32256664,COUNTRY,AUS,2001,No data,,,
32256664,COUNTRY,AUS,2001,No data,,,
32256664,COUNTRY,AUS,2004,No data,,,
32256664,COUNTRY,AUS,2004,No data,,,
32256665,COUNTRY,AUT,2004,No data,,,
This is the frequency/counts DataFrame you get:
  SpatialDim  TimeDim  count
0        AND     2022      2
1        AND     2023      1
2        ATG     2022      1
3        AUS     2001      3
4        AUS     2004      2
5        AUT     2004      1
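To try this without creating a file, the same steps can be sketched with the sample rows inlined. This is only an illustration under stated assumptions: the names csv_text, missing and counts are made up for the example, and the data is the sample from above, not the real dataset.

```python
import io

import pandas as pd

# Sample rows from above, inlined so the example runs without ./space.csv.
csv_text = """Id,SpatialDimType,SpatialDim,TimeDim,Value,NumericValue,Low,High
32256659,COUNTRY,AND,2022,No data,,,
32256659,COUNTRY,AND,2022,No data,,,
32256659,COUNTRY,AND,2023,No data,,,
32256661,COUNTRY,ATG,2022,No data,,,
32256664,COUNTRY,AUS,2001,No data,,,
32256664,COUNTRY,AUS,2001,No data,,,
32256664,COUNTRY,AUS,2001,No data,,,
32256664,COUNTRY,AUS,2004,No data,,,
32256664,COUNTRY,AUS,2004,No data,,,
32256665,COUNTRY,AUT,2004,No data,,,
"""

df = pd.read_csv(io.StringIO(csv_text))

# Step 1: keep only the rows where NumericValue is missing.
missing = df[df["NumericValue"].isna()]

# Step 2: count rows for each unique (SpatialDim, TimeDim) pair.
counts = (
    missing.groupby(["SpatialDim", "TimeDim"])
    .size()
    .reset_index(name="count")
)

print(counts)
```

Keeping the filtered rows in a separate variable (missing) instead of overwriting df leaves the original data available, for example to compare the missing counts against the total rows per country.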
Answered By - hashir_k