Issue
I'm a Python beginner. I had inspired by some Python courses. This is the example CSV file below.
Name | Location | Number |
---|---|---|
Andrew Platt Andrew | A B C | 100 |
Steven Thunder Andrew | A B C | 50 |
Jeff England Steven | A B C | 30 |
Andrew England Jeff | A B C | 30 |
I want to get a result like that
['Andrew', 180
'Platt', 100
'Steven', 50
'Jeff', 60
'England', 60
'Andrew Platt', 100
'Platt Andrew', 100
'Steven Thunder', 50
'Thunder Andrew', 50
........]
Logic:
- One-word name, e.g. 'Andrew', as it shows rows 1, 2 and 4, so the result is 180 (100+50+30)
- Two-word name, e.g. 'Andrew Platt', as it shows row 1 only, so the result is 100
- Export result to a new CSV file
import csv
#from itertools import chain
#find one-word
filename=open('sample.csv', 'r')
file = csv.DictReader(filename)
one_word=[]
for col in file:
one_word.append(col['Name'].split()) #find one-word
print(one_word)
#list(chain.from_iterable(one_word)) #this is another code I learned
#get result
#find two-word
#get result
#combine
#sorted by value
#export to a new CSV file
My problem is how to get value, i.e. 180..., which means I need to match the word, then get 'Number' and sum them all?
Note: the location is useless, it's just a coding practice.
Updated:
Maybe make 2 lists, i.e. one-word and two-word, then zip
them
Solution
Looking at your expected result, I'm not sure how you get:
'Andrew Platt', 100
'Platt Andrew', 50
I see "Andrew Platt" and "Platt Andrew" in the first row, but both two-word combos should have the same value of 100, yes?
import csv
from collections import Counter
from itertools import combinations
from pprint import pprint
one_words = Counter()
two_words = Counter()
with open("input.csv", newline="") as f:
reader = csv.DictReader(f)
for row in reader:
items = row["Name"].split(" ")
# Unique one-word
for item in set(items):
one_words[item] += int(row["Number"])
for two_word in combinations(items, 2):
# Skip combos like [Andrew Andrew]
if len(set(two_word)) == 1:
continue
print(f"row is {type(row)}")
print(f"two_word is {type(two_word)}")
print(f"two_words is {type(two_words)}")
two_words[" ".join(two_word)] += int(row["Number"])
pprint(one_words)
pprint(two_words)
I got:
Counter({'Andrew': 180,
'Platt': 100,
'Steven': 80,
'England': 60,
'Jeff': 60,
'Thunder': 50})
Counter({'Andrew Platt': 100,
'Platt Andrew': 100,
'Steven Thunder': 50,
'Steven Andrew': 50,
'Thunder Andrew': 50,
'Jeff England': 30,
'Jeff Steven': 30,
'England Steven': 30,
'Andrew England': 30,
'Andrew Jeff': 30,
'England Jeff': 30})
My debug-print statements print:
row is <class 'dict'>
two_word is <class 'tuple'>
two_words is <class 'collections.Counter'>
Answered By - Zach Young
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.