[Python][Intermediate] Chapter 32. Advanced itertools Patterns

about_IT 2025. 5. 24. 00:03


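itertools is a collection of tools built for iteration. It lets you express many looping operations concisely and, because its functions return lazy iterators, with little memory overhead. This chapter covers advanced patterns and the functions that come up most often in practice.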

● Grouping data with groupby

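groupby() groups consecutive items that share the same key. To collect every item with a given key into a single group, sort the data by that same key first.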

from itertools import groupby

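data = ['apple', 'apricot', 'banana', 'blueberry', 'cherry']
# Sort by the same key that groupby will use (the first letter)
data.sort(key=lambda x: x[0])

# Each group is a lazy iterator, so convert it to a list before printing
for key, group in groupby(data, key=lambda x: x[0]):
    print(key, list(group))
# a ['apple', 'apricot']
# b ['banana', 'blueberry']
# c ['cherry']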

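If the data is not sorted by the grouping key, groupby starts a new group every time the key changes, so the same key can appear several times. Always sort with the same key function before grouping.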
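
To make the sorting requirement concrete, here is a small sketch that runs groupby on the same words in unsorted and sorted order. The `words` list and the `first_letter` helper are names made up for this illustration.

```python
from itertools import groupby

def first_letter(word):
    """Key function: group words by their first character."""
    return word[0]

# Hypothetical sample data, deliberately left unsorted
words = ['banana', 'apple', 'blueberry', 'apricot', 'cherry']

# Without sorting, a new group starts every time the key changes,
# so the keys 'a' and 'b' each appear twice.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]
print(unsorted_groups)

# Sorting by the same key first brings equal keys next to each other.
ordered = sorted(words, key=first_letter)
sorted_groups = [(k, list(g)) for k, g in groupby(ordered, key=first_letter)]
print(sorted_groups)
```

The unsorted run produces five groups, while the sorted run produces one group per letter.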

● Computing running values with accumulate

from itertools import accumulate
import operator

nums = [1, 2, 3, 4]
print(list(accumulate(nums)))  # [1, 3, 6, 10]
print(list(accumulate(nums, operator.mul)))  # [1, 2, 6, 24]

Besides sums, accumulate() accepts any two-argument function, such as a product, a running maximum (max), or a function you define yourself.
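
As a rough sketch of the options mentioned above, the example below uses max, a user-defined function, and the initial keyword (available from Python 3.8). The sample values and the `capped_add` helper are assumptions made up for illustration.

```python
from itertools import accumulate

# Hypothetical sample values, chosen only for illustration
readings = [3, 1, 4, 1, 5, 9, 2, 6]

# The built-in max as the two-argument function gives a running maximum.
running_max = list(accumulate(readings, max))
print(running_max)  # [3, 3, 4, 4, 5, 9, 9, 9]

# A user-defined function: a running total that never exceeds a cap of 10.
def capped_add(total, value):
    return min(total + value, 10)

capped = list(accumulate(readings, capped_add))
print(capped)  # [3, 4, 8, 9, 10, 10, 10, 10]

# initial= (Python 3.8+) seeds the output with a starting value.
balances = list(accumulate([50, -20, 30], initial=100))
print(balances)  # [100, 150, 130, 160]
```

Note that with initial the result has one more element than the input, because the starting value itself is emitted first.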

● combinations and combinations_with_replacement

from itertools import combinations, combinations_with_replacement

items = [1, 2, 3]
print(list(combinations(items, 2)))  # [(1, 2), (1, 3), (2, 3)]
print(list(combinations_with_replacement(items, 2)))  # [(1, 1), (1, 2), ...]

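Pick the function based on whether an item may be repeated within a combination: combinations() never reuses an element, while combinations_with_replacement() may.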
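
To see the difference side by side, the sketch below lists both results for a three-element input and checks the counts against the standard counting formulas using math.comb (Python 3.8+). The `letters` list is a made-up example.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

letters = ['a', 'b', 'c']  # hypothetical sample input
n, r = len(letters), 2

without_repeat = list(combinations(letters, r))
print(without_repeat)
# [('a', 'b'), ('a', 'c'), ('b', 'c')]

with_repeat = list(combinations_with_replacement(letters, r))
print(with_repeat)
# [('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]

# Counts: C(n, r) without repetition, C(n + r - 1, r) with repetition
print(len(without_repeat) == comb(n, r))          # True
print(len(with_repeat) == comb(n + r - 1, r))     # True
```

Allowing repetition adds the pairs where an element is paired with itself, which is why the second list is longer.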

● Duplicating an iterator with tee

tee() splits a single iterator into several independent iterators (two by default) that can each be traversed separately.

from itertools import tee

iter1, iter2 = tee([1, 2, 3])
print(list(iter1))  # [1, 2, 3]
print(list(iter2))  # [1, 2, 3]

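Caution: tee() buffers every item that one copy has consumed and another has not yet reached, so memory use grows when the copies drift far apart. Avoid making more copies than you need, and if one copy will consume most of the data before another starts, list() is usually the better choice. Also avoid using the original iterator after calling tee().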
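
The sketch below shows one common use of tee, pairing each item with its successor, and then what can go wrong when the original iterator is advanced after the split. The `pairwise_sketch` function name is an assumption for this example (newer Python versions also ship a built-in itertools.pairwise).

```python
from itertools import tee

def pairwise_sketch(iterable):
    """Yield (item, next_item) pairs by running two tee copies one step apart."""
    first, second = tee(iterable)
    next(second, None)  # move the second copy one item ahead
    return zip(first, second)

pairs = list(pairwise_sketch([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]

# Pitfall: advancing the original iterator after tee() hides items
# from the copies, because they all draw from that same source.
source = iter([1, 2, 3])
copy_a, copy_b = tee(source)
next(source)  # consumes 1 behind the copies' backs
remaining_a = list(copy_a)
remaining_b = list(copy_b)
print(remaining_a, remaining_b)  # [2, 3] [2, 3]
```

In the pairwise case the two copies stay only one item apart, so the internal buffer stays small.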

● filterfalse and compress

from itertools import filterfalse, compress

data = range(10)
print(list(filterfalse(lambda x: x % 2, data)))  # [0, 2, 4, 6, 8]

selectors = [1, 0, 1]
print(list(compress(['a', 'b', 'c'], selectors)))  # ['a', 'c']

filterfalse() returns only the items for which the predicate is false, and compress() keeps the items whose corresponding selector (mask) value is truthy.
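
As a minimal sketch of how these two fit together with the built-in filter, the example below splits numbers with one predicate and builds a compress mask from a second list. The `names`, `scores`, and `is_odd` names and the pass mark of 60 are assumptions made up for illustration.

```python
from itertools import compress, filterfalse

def is_odd(n):
    return n % 2 == 1

numbers = range(10)

# filter keeps items where the predicate is true, filterfalse keeps the rest.
odds = list(filter(is_odd, numbers))
evens = list(filterfalse(is_odd, numbers))
print(odds)   # [1, 3, 5, 7, 9]
print(evens)  # [0, 2, 4, 6, 8]

# compress with a mask computed from a parallel list (hypothetical data)
names = ['ann', 'bob', 'cho', 'dee']
scores = [82, 47, 91, 58]
passed = list(compress(names, (score >= 60 for score in scores)))
print(passed)  # ['ann', 'cho']

# compress stops as soon as either input runs out.
short_mask = list(compress('abcdef', [1, 0, 1]))
print(short_mask)  # ['a', 'c']
```

compress is handy when the condition depends on a separate sequence rather than on the items themselves.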

● Wrap-up

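itertools is a powerful toolkit that simplifies iteration and often improves performance. Because its functions work on iterators, they can handle complex operations while keeping memory use low, which makes them especially useful for large-scale data processing and algorithm problems.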


Attribution, NonCommercial, NoDerivatives