Issue
I'm looking for a clean, Pythonic way to eliminate from the following list:
li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
all contiguous repeated elements (runs longer than one number) so as to obtain:
re = [0, 1, 2, 4, 3, 1]
Although I have working code, it feels un-Pythonic, and I am quite sure there must be a way (maybe some lesser-known itertools functions?) to achieve what I want far more concisely and elegantly.
Solution
Here is a version based on Karl's that doesn't require copies of the list (tmp, the slices, and the zipped list). izip is significantly faster than (Python 2) zip for large lists. chain is slightly slower than slicing but doesn't require a tmp object or copies of the list. islice plus making a tmp is a bit faster, but requires more memory and is less elegant.
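from itertools import izip, chain
# The last stream is padded too; otherwise the final element of li is never tested as y.
[y for x, y, z in izip(chain((None, None), li),
                       chain((None,), li),
                       chain(li, (None,))) if x != y != z]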
A timeit test shows it to be approximately twice as fast as Karl's or my fastest groupby version for short groups.
Make sure to use a value other than None (like object()) if your list can contain Nones.
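As a rough illustration of the sentinel advice, here is a minimal Python 3 sketch of the same idea. In Python 3, izip no longer exists and the built-in zip is already lazy. The function name drop_runs and the variable pad are hypothetical names chosen for this example; the "next item" stream is padded at the end so that the last element is also tested.

```python
from itertools import chain

def drop_runs(li):
    """Keep only the items of li that differ from both neighbours."""
    pad = object()  # unique sentinel: never equal to a real item, not even None
    return [y for x, y, z in zip(chain((pad, pad), li),  # previous item
                                 chain((pad,), li),      # current item
                                 chain(li, (pad,)))      # next item
            if x != y != z]

print(drop_runs([0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]))  # [0, 1, 2, 4, 3, 1]
print(drop_runs([None, None, 1, None]))                     # [1, None]
```

Like the version above, this walks li three times, so li has to be a sequence (or another re-iterable object) rather than a one-shot iterator.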
Use this version if you need it to work on an iterator / iterable that isn't a sequence, or your groups are long:
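from itertools import groupby
# As above, swap None for a unique sentinel such as object() if li can contain None.
[key for key, group in groupby(li)
 if (next(group) or True) and next(group, None) is None]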
timeit shows it's about ten times faster than the other version for 1,000-item groups.
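The groupby approach can also be written as a generator, which may be easier to read and works on one-shot iterators. This is only a sketch: drop_runs_iter and missing are hypothetical names, and a private sentinel is used instead of None so that lists containing None are handled.

```python
from itertools import groupby

def drop_runs_iter(iterable):
    """Yield each item that is not part of a run of equal neighbours."""
    missing = object()  # sentinel returned when a group has no second item
    for key, group in groupby(iterable):
        next(group)  # every group has at least one item
        if next(group, missing) is missing:
            yield key  # the group had exactly one item

gen = (n for n in [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0])
print(list(drop_runs_iter(gen)))  # [0, 1, 2, 4, 3, 1]
```

Only the first two items of each group are inspected, which is presumably why this style holds up better than counting the whole group when runs are long.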
Earlier, slow versions:
[key for key, group in groupby(li) if sum(1 for i in group) == 1]
[key for key, group in groupby(li) if len(tuple(group)) == 1]
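To check the timing claims on your own data, a small timeit harness along these lines might help. It is a Python 3 sketch, so zip stands in for izip; the function names, the compare helper, and the sample data are all assumptions made for this example, and the numbers will vary by machine and Python version.

```python
from itertools import chain, groupby
from timeit import timeit

def zip_chain(li):
    pad = object()  # sentinel padding for the neighbour streams
    return [y for x, y, z in zip(chain((pad, pad), li),
                                 chain((pad,), li),
                                 chain(li, (pad,))) if x != y != z]

def groupby_next(li):
    pad = object()  # sentinel meaning "no second item in this group"
    return [k for k, g in groupby(li)
            if (next(g), next(g, pad))[1] is pad]

def groupby_sum(li):
    # Earlier, slower variant: counts every item of every group.
    return [k for k, g in groupby(li) if sum(1 for _ in g) == 1]

def compare(li, number=100):
    """Return {function name: total seconds for `number` calls}."""
    funcs = (zip_chain, groupby_next, groupby_sum)
    return {f.__name__: timeit(lambda: f(li), number=number) for f in funcs}

short_groups = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0] * 100
long_groups = [i for i in range(20) for _ in range(1000)]  # 1,000-item groups

for label, data in (("short groups", short_groups), ("long groups", long_groups)):
    print(label, compare(data, number=20))
```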
Answered By - agf