Issue
Background: I have a scheduled job on Unix that sends me hundreds of emails overnight. Every morning I want to save the attachments of these emails, so I wrote a Python snippet to automate the process. However, each time I run the script, only about half of the emails in my target folder get processed. This is what I got from the log this morning:
2021-11-18 06:13:30,688 : INFO : utils.util : Before clean up. 335 items in total
2021-11-18 06:13:42,098 : INFO : utils.util : After clean up. 167 items remained
2021-11-18 06:14:17,968 : INFO : utils.util : Before clean up. 167 items in total
2021-11-18 06:14:25,660 : INFO : utils.util : After clean up. 83 items remained
2021-11-18 06:14:34,762 : INFO : utils.util : Before clean up. 83 items in total
2021-11-18 06:14:38,591 : INFO : utils.util : After clean up. 41 items remained
2021-11-18 06:14:47,633 : INFO : utils.util : Before clean up. 41 items in total
2021-11-18 06:14:49,745 : INFO : utils.util : After clean up. 20 items remained
2021-11-18 06:14:56,348 : INFO : utils.util : Before clean up. 20 items in total
2021-11-18 06:14:57,426 : INFO : utils.util : After clean up. 9 items remained
2021-11-18 06:15:15,807 : INFO : utils.util : Before clean up. 9 items in total
2021-11-18 06:15:16,260 : INFO : utils.util : After clean up. 4 items remained
2021-11-18 06:15:22,981 : INFO : utils.util : Before clean up. 4 items in total
2021-11-18 06:15:23,215 : INFO : utils.util : After clean up. 1 items remained
2021-11-18 06:15:36,117 : INFO : utils.util : Before clean up. 1 items in total
2021-11-18 06:15:36,164 : INFO : utils.util : After clean up. 0 items remained
335->167->83->41->20->9->4->1->0
Can you please give me some hints as to what the potential issue might be?
Here's my code:
import os
import win32com.client

def get_target_folder(folder: str):
    outlook = win32com.client.Dispatch('outlook.application')
    mapi = outlook.GetNamespace("MAPI")
    target = mapi
    dir = folder.split("\\")
    for d in dir:
        try:
            target = target.Folders(d)
        except:
            logger.error("Current folder path {}. The sub folder {} doesn't exist".format(target.FolderPath, d))
            target = None
            break
    return target

def save_job_status():
    mailfolder = config["OUTLOOK"]["JOBS"]
    keyword = config["OUTLOOK"]["KEYWORD"]
    destination = config["OUTLOOK"]["LOCALDATA"]
    criteria = f"@SQL=\"urn:schemas:httpmail:subject\" like '%{keyword}%'"
    folder = get_target_folder(folder=mailfolder)
    items = folder.items
    emails = items.restrict(criteria)
    logger.info("Before clean up. {} items in total".format(emails.count))
    for email in emails:
        try:
            attachments = email.attachments
            for attachment in attachments:
                filename = attachment.FileName
                attachment.SaveAsFile(os.path.join(destination, filename))
            email.Delete()
        except:
            logger.error("Can't operate on the email {}".format(email.Subject))
    items = folder.items
    logger.info("After clean up. {} items remained".format(items.count))
Because I have also set up Outlook rules, every mail that lands in the "JOBS" folder in my Outlook already satisfies the condition and should be processed. I added the "criteria" filter only as a safeguard against operating on the wrong emails.
Thank you in advance!
========================Solution Identified=======================
After reading @Dmitry Streblechenko's answer, I understood: emails is a collection, and I was modifying that collection while iterating over it.
The key change I made to resolve the issue is below:
count = emails.count
for i in range(count-1, -1, -1):
    print("{} {}".format(i, emails[i].subject))
    emails[i].Delete()
Please note that this accesses the items by index in reverse, i.e. from the largest index to the smallest, using a range. Because items are removed from the end of the collection first, the items that remain keep their original indexes and can still be reached.
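To see why a forward loop skips every other item, here is a minimal sketch that uses a plain Python list as a stand-in for the Outlook collection. The function names delete_forward and delete_reverse are made up for this illustration, and a list is only an approximation of how a COM collection behaves, so treat it as a model of the effect rather than of Outlook itself.

```python
def delete_forward(items):
    """Delete while iterating forward; returns how many items are left."""
    for item in items:
        # Removing the current item shifts every later item one slot left,
        # so the iterator's next position skips over one item.
        items.remove(item)
    return len(items)


def delete_reverse(items):
    """Delete by index from the last item to the first; returns items left."""
    for i in range(len(items) - 1, -1, -1):
        # Deleting from the end never moves the items that are still pending.
        del items[i]
    return len(items)


if __name__ == "__main__":
    print(delete_forward(list(range(335))))
    print(delete_reverse(list(range(335))))
```

In this model a forward pass over 335 items leaves 167, then 83, then 41, then 20, which lines up with the first few runs in the log above. The later runs in the log are slightly lower than the model predicts, so the stand-in does not capture everything.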
Solution
Do not use a for email in emails: loop if you are modifying the collection (by calling email.Delete()). Use a down loop from emails.Count down to 1.
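As a rough sketch of how that down loop could look when combined with the attachment saving from the question: the helper name save_attachments_and_delete is hypothetical, and the code assumes that emails is the restricted Outlook Items collection, accessed 1-based through Item(i), and that destination is an existing local folder.

```python
import os


def save_attachments_and_delete(emails, destination):
    """Save every attachment, then delete each mail, walking from last to first.

    Assumes `emails` exposes Count and a 1-based Item(i), as Outlook's
    Items collection does.
    """
    for i in range(emails.Count, 0, -1):
        email = emails.Item(i)
        for attachment in email.Attachments:
            # Build the target path from the attachment's own file name.
            path = os.path.join(destination, attachment.FileName)
            attachment.SaveAsFile(path)
        # Deleting item i leaves items 1..i-1 at their original positions.
        email.Delete()
```

Error handling and logging are left out here to keep the loop itself visible; the try/except and logger calls from the question can be wrapped around the loop body as before.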
Answered By - Dmitry Streblechenko