Issue
I have a reader that periodically reads data from a database and generates CSV files. I want to create the compressed file while reading from the database.
Currently I create the CSV file first and then compress it:
def create_csv_file(data):
    filename = time.strftime("%Y%m%d-%H%M%S") + ".csv"
    filename_zip = time.strftime("%Y%m%d-%H%M%S") + ".zip"
    try:
        with open(filename, "w") as f:
            writer = csv.writer(f)
            for row in data:
                writer.writerow(row)
            f.flush()
        with zipfile.ZipFile(filename_zip, 'w', zipfile.ZIP_DEFLATED) as myzip:
            myzip.write(filename, basename(filename))
    except Exception, e:
        print 'Error', e.message
I want to create the zip file directly, without the intermediate .csv file, and release the open file handle.
How can I do it?
Solution
Since the zipfile module offers no way to write a csv file incrementally (at least in Python 2 and in Python 3 before 3.6), you'll need to store all the CSV-formatted data somewhere. If the amount of data isn't really huge, memory is an obvious choice. @Davis Herring basically has the right idea, except that you need to use BytesIO in Python 2 and StringIO in Python 3 as an intermediate buffer, and then add the formatted results stored in the buffer to the final ZipFile you want created.
Here's the whole thing, in all its glory. Note that I've left some debugging code in, which you should be able to remove easily, since your original code is still there as comments. BTW, the two timestamps can differ, because you call time.strftime("%Y%m%d-%H%M%S") twice.
import csv
import io
from pprint import pprint
from random import randint, seed
import time
import zipfile
import sys

# csv wants a bytes buffer in Python 2 and a text buffer in Python 3.
InMemoryIO = getattr(io, 'BytesIO' if sys.version_info < (3,) else 'StringIO')

def create_csv_file(data):
    #filename = time.strftime("%Y%m%d-%H%M%S") + ".csv"
    #filename_zip = time.strftime("%Y%m%d-%H%M%S") + ".zip"
    # Use the same filenames every time for testing.
    filename = "compress_me.csv"
    filename_zip = filename + ".zip"

    with InMemoryIO() as buffer:
        csv.writer(buffer).writerows(data)  # Convert data to csv format.
        with zipfile.ZipFile(filename_zip, 'w', zipfile.ZIP_DEFLATED) as myzip:
            myzip.writestr(filename, buffer.getvalue())

# Generate some random values to put in the csv file.
seed(42)  # Makes the random numbers the same on every run, for testing.
data = [[randint(0, 100) for _ in range(10)] for _ in range(10)]
pprint(data)
create_csv_file(data)
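
For data too large to hold in memory, Python 3.6 and later offer another route: ZipFile.open() accepts mode="w" and returns a writable binary stream for a new archive member. What follows is a minimal sketch of that approach and is not part of the original answer. The function name create_zipped_csv and its stem parameter are made up for illustration. It also computes the timestamp only once, so the archive and its member always share a name.

```python
import csv
import io
import time
import zipfile


def create_zipped_csv(rows, stem=None):
    """Stream rows as CSV straight into a new zip archive; return its filename.

    Hypothetical helper for illustration; requires Python 3.6 or later.
    """
    # Compute the timestamp once so both names are guaranteed to match.
    if stem is None:
        stem = time.strftime("%Y%m%d-%H%M%S")
    zip_name = stem + ".zip"

    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as archive:
        # mode="w" yields a writable binary stream for the new member.
        with archive.open(stem + ".csv", mode="w") as member:
            # csv.writer emits text, so wrap the binary stream. newline=""
            # leaves the csv module's own line endings untouched.
            with io.TextIOWrapper(member, encoding="utf-8", newline="") as text:
                csv.writer(text).writerows(rows)
    return zip_name
```

Because writerows() accepts any iterable, rows can be a generator or a database cursor, so each row is compressed as it arrives and no .csv file is ever created on disk. Only one member of an archive can be open for writing at a time.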
Answered By - martineau