Issue
I tried the code below to compare two .py files, but it only gives me the trailing lines of the first file. For example, if file1 has 2014 lines and file2 has 2004 lines, it returns the last 10 lines of file1. That is not what I need: I need to extract the lines that are in file1 but not in file2.
import shutil
file1 = 'bot-proto7test.py'
file2 = 'bot-proto7.py'
with open(file1, 'r') as file1:
    with open(file2) as file2:
        with open("output.txt", "w") as out_file:
            file2.seek(0, 2)
            file1.seek(file2.tell())
            shutil.copyfileobj(file1, out_file)
Solution
You can use sets to do that:
with open(file1, 'r') as f:
    set1 = {*f.readlines()}
with open(file2, 'r') as f:
    set2 = {*f.readlines()}
print(set1 - set2)  # contains only the lines that are in the first file but not in the second
By the way, you can use a single with statement to open multiple files:
with open("f1.txt", "r") as f1, open("f2.txt", "r") as f2:
    set1, set2 = {*f1.readlines()}, {*f2.readlines()}
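To see what the set difference does and does not keep, here is a minimal sketch on in-memory data. The lists lines1 and lines2 are made-up stand-ins for the output of readlines() on the two files, not the asker's real content.

```python
# Hypothetical sample data standing in for f1.readlines() and f2.readlines().
lines1 = ["import os\n", "x = 1\n", "x = 1\n", "print(x)\n"]
lines2 = ["import os\n", "print(x)\n"]

# Set difference: every distinct line of lines1 that never appears in lines2.
only_in_first = set(lines1) - set(lines2)
print(only_in_first)  # {'x = 1\n'}
```

Note that the duplicated line shows up only once and that a set has no defined order, so this fits best when you only need to know which distinct lines are missing.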
If we want to preserve duplicate lines, we can use Counter:
from collections import Counter
with open(file1, 'r') as f:
    c = Counter(f.readlines())
# plain Counter subtraction won't work here if the first file contains more occurrences of a line than the second
res = Counter({k: v for k, v in c.items() if k not in set2})
print(list(res.elements()))
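The difference between plain Counter subtraction and the filtering used above is easier to see on a small example. This is only a sketch with made-up lines (lines1 and lines2 are assumptions, not taken from the real files).

```python
from collections import Counter

# Hypothetical sample data: "a" appears three times in the first file, once in the second.
lines1 = ["a\n", "a\n", "a\n", "b\n", "c\n"]
lines2 = ["a\n", "c\n"]

counts1 = Counter(lines1)
counts2 = Counter(lines2)

# Subtraction removes one occurrence per occurrence in the second file,
# so two copies of "a" survive even though "a" exists in the second file.
subtracted = counts1 - counts2
print(sorted(subtracted.elements()))  # ['a\n', 'a\n', 'b\n']

# Filtering drops every line that occurs in the second file at all,
# while keeping the duplicate count of the lines that remain.
filtered = Counter({k: v for k, v in counts1.items() if k not in counts2})
print(sorted(filtered.elements()))  # ['b\n']
```

Which of the two you want depends on the question you are asking: subtraction answers "which occurrences are surplus", filtering answers "which lines never appear in the other file".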
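Finally, if you want to preserve order as well, you need to use the original content:
with open(file1, 'r') as f:
    original = f.readlines()
# lines that occur in the first file but not in the second
diff = {*original} - set2
# keep only those lines, in their original order (duplicates included)
res = [el for el in original if el in diff]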
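Putting the pieces together, the original goal of writing the missing lines to an output file could be sketched as below. The function name write_missing_lines and the temporary file names are hypothetical, chosen only for illustration; the demo writes small throwaway files instead of touching the real scripts.

```python
import tempfile
from pathlib import Path


def write_missing_lines(path1, path2, out_path):
    """Write the lines of path1 that never occur in path2 to out_path, in order."""
    with open(path2, "r") as f2:
        seen = set(f2)  # iterating a text file yields its lines
    with open(path1, "r") as f1, open(out_path, "w") as out:
        missing = [line for line in f1 if line not in seen]
        out.writelines(missing)
    return missing


# Demo on throwaway files (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    p1 = Path(tmp) / "file1.py"
    p2 = Path(tmp) / "file2.py"
    out = Path(tmp) / "output.txt"
    p1.write_text("a = 1\nb = 2\na = 1\nc = 3\n")
    p2.write_text("b = 2\n")
    result = write_missing_lines(p1, p2, out)
    written = out.read_text()

print(result)  # ['a = 1\n', 'a = 1\n', 'c = 3\n']
```

Lines are compared as exact strings including the trailing newline, so a last line without a newline, or lines that differ only in whitespace, count as different. If that matters, you could strip each line before comparing.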
Answered By - kosciej16