List comprehensions in Python

If you listen to some people, they say that in Python, everything is a dictionary. This may be true, but for me the most powerful thing in the language is its list processing capabilities.

Python’s syntax for lists is simple. Anything in between square brackets ([ and ]) is treated as a list. That list can then be easily iterated over with the simple for item in list: syntax:

list = [1, 4, 9, 16, 25]
for item in list:
    print item

However, there’s a really cunning thing you can do with lists called “List Comprehension.” This allows you to create a list from another list, applying an arbitrary set of transformation and filters. A simple example is [x*2 for x in otherlist], which creates a list containing all the values in otherlist, each value doubled.

The source list and the expression applied to each entry can be arbitrarily complex. For example, I can rewrite the code in my first example (printing a list of the squares of numbers 1–5) as:

for item in [x*x for x in range(1, 5)]:
    print item

Things get even more interesting when you use the if syntax. This allows arbitrary elements to be filtered from the source list. Take my example from HTML scraping. I grab a list of all <div> DOM node elements, and filter them by their class name in a single line:

rows = [div for div in tree.getElementsByTagName("div")\
        if div.getAttribute("class").strip() == "row"]

Another example; parsing a key=value style file and reading it into a dictionary, ignoring blank lines:

params = dict([param.split('=', 1) for param \
               in open('file').read().split('\n')
               if param])

List comprehension lets you write concise code without sacrificing readability. Well, once you know a little about them, that is.

Filed under: Coding
Posted at 14:10:00 GMT on 8th November 2007.

About Matt Godbolt

Matt Godbolt is a C++ developer living in Chicago. He works for Hudson River Trading on super fun but secret things. He is one half of the Two's Complement podcast. Follow him on Mastodon or Bluesky.