Do you enumerate?

# Junior
for index in range(len(population)):
    print(index)

# Senior
for index, _ in enumerate(population):
    print(index)

Simple is better than complex. Zen of Python. You know that, right?
Well, we all do, but there is less consensus when it comes to enumerate().

“What is enumerate()?” In a nutshell, it’s a function that returns something similar to a list of index-value pairs*. It is mostly used in for loops to iterate over a list and get each value together with its index. But what if you only need indexes?

*Enumerate returns an iterator. From the Python glossary: “[An Iterator is] an object representing a stream of data.”
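To make the footnote concrete, here is a minimal sketch of what "returns an iterator" means in practice. The population list and its contents are made up for illustration.

```python
population = ["Alice", "Bob", "Carol"]  # hypothetical sample data

pairs = enumerate(population)

# An enumerate object is an iterator: it hands out one pair at a time.
first_pair = next(pairs)       # (0, 'Alice')

# Whatever has not been consumed yet can still be collected into a list.
remaining_pairs = list(pairs)  # [(1, 'Bob'), (2, 'Carol')]

# Once exhausted, the iterator has nothing left to give.
leftover = list(pairs)         # []
```

That one-pass behaviour is why "similar to a list" is only an approximation: you can loop over it, but you cannot index it or rewind it.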

I need indexes only

I’m new to Python, let’s do it

Most of the time, when developers discover Python, they solve this issue with a combination of two functions, range() and len():

for i in range(len(population)):
    print(i)

Great, but many Pythoneers will then complain that your code is “non-pythonic”.

“What the fuck is non-pythonic? or pythonic?”

[Pythonic means] exploiting the features of the Python language to produce code that is clear, concise and maintainable.

James, Stack Overflow answer to “What does Pythonic mean?”

So, this means that you didn’t use the “Python” language “correctly”. Great, we have room for improvement (Yes, it’s a good thing!).

A pythonic approach

So, what is the pythonic approach to solve this issue (a big one, I know 😉)?

Well, let’s go back to the enumerate() function and see how it can handle your problem:

for index_value_pair in enumerate(population):
    print(index_value_pair[0])

But it’s ugly

“Ok, but I thought that the goal of “Pythonic” code was to be clear, concise and maintainable, this isn’t at all!”

No, it’s not. Because there are many things to consider when you want to produce clear and meaningful code. First, a pair of values (as a list or a tuple) can be “unpacked” by Python into two variables. How does it work?

Yes, but less with unpacking

Let’s see an example of unpacking:

>>> spam, eggs = ["ham", "bacon"]
>>> spam
'ham'
>>> eggs
'bacon'

As explained in the Official Python Tutorial about Tuples and Sequences:

The statement t = 12345, 54321, 'hello!' is an example of tuple packing: the values 12345, 54321 and 'hello!' are packed together in a tuple. The reverse operation is also possible:

>>> x, y, z = t

This is called, appropriately enough, sequence unpacking and works for any sequence on the right-hand side. Sequence unpacking requires that there are as many variables on the left side of the equals sign as there are elements in the sequence. Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.
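The "as many variables as elements" rule from the tutorial is easy to check for yourself. This is only a small sketch; the extra name toast is invented for the demonstration.

```python
pair = ("ham", "bacon")

# Same number of names and elements: unpacking succeeds.
spam, eggs = pair

# Three names for a two-element tuple: Python raises ValueError.
try:
    spam, eggs, toast = pair
except ValueError as error:
    unpack_error = str(error)  # explains that there are not enough values
```

This matters for enumerate() because each item it produces is always a pair, so two names on the left side always match.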

So, how to be more pythonic with our enumerate()? Use unpacking in the for loop of course (yes, it’s possible!):

for index, value in enumerate(population):
    print(index)

“Wow, it’s great!” Yes, but not enough.

Great but not enough for Pythoneers

Yes, that’s a huge improvement compared to accessing the first element of the pair to get the index. But it’s not enough if you follow the pythonic philosophy. Why? Because you don’t use the value variable, and nothing is explicit about that.

“But, how can I say that I don’t use something that I’m forced to receive, because if I don’t, Python gives me a pair of values?”

By using a Python convention: an underscore (“_”) instead of a proper variable name, to indicate that the variable is a placeholder for something you won’t use. Remember the Zen of Python: “Explicit is better than implicit”.

Let’s see how it looks now:

for index, _ in enumerate(population):
    print(index)

“Now, it’s perfect, except that… We use something that gives us more than we need, when range() and len() give exactly what we need. Doesn’t the Zen of Python also state that simple is better than complex?”

Is enumerate() Pythonic for index only?

Disclaimer: Here is my opinion, and some may disagree. But let’s discuss pros and cons.

One call vs two

Simple is better than complex, and one function call instead of two is de facto easier to read and interpret.

1 for enumerate(), 0 for range(len())

Garbage vs Perfect fit

The enumerate() function does return values we don’t need, whereas range(len()) provides the indexes and nothing else. Ok, one point for range(len()).

1 for enumerate(), 1 for range(len())

Beginner friendly

Yes, you need to read the documentation to understand how enumerate() works. Yes, many tutorials, schools, and training courses only present the range(len()) way of getting indexes. Yes, unpacking is not as well known as we would like. Fair point, range(len()) does win this round too.

1 for enumerate(), 2 for range(len())

Efficiency

Nope, I won’t address this point here. Why? Because premature optimization is the root of all evil! Maybe in a future article. No point change.

1 for enumerate(), 2 for range(len())

Does it work for everything?

This is the most complex part: is the solution able to handle every object that a for loop can handle?

Sequences and iterables

First, let’s see what kinds of objects a for loop accepts, by looking at the official documentation: The for statement is used to iterate over the elements of a sequence (such as a string, tuple or list) or other iterable object.

So for loops iterate over sequences and iterable objects. What is an iterable object? Again, official documentation: An object capable of returning its members one at a time.

“But, this is amazingly broad!” Yes, it is. Let me give you a great example:

Lists are sequences. Lists can’t be infinite, because our computers have a limited amount of memory, and can’t store an infinite amount of data at the same time.

But iterables don’t have to store any data, they just need to be able to give their values one at a time. So we can have infinite iterables, such as itertools.count(), itertools.cycle() and itertools.repeat(). You can even create custom infinite iterables yourself, we will see that in a future article.
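As a quick illustration, the three itertools functions mentioned above can each be sampled safely with itertools.islice, which stops after a fixed number of items. The arguments ("ab", "x", and the slice sizes) are arbitrary choices for this sketch.

```python
from itertools import count, cycle, islice, repeat

# Each of these would run forever in a bare for loop;
# islice() takes only the first few items.
first_counts = list(islice(count(), 3))        # [0, 1, 2]
first_cycled = list(islice(cycle("ab"), 5))    # ['a', 'b', 'a', 'b', 'a']
first_repeated = list(islice(repeat("x"), 3))  # ['x', 'x', 'x']
```

None of these objects holds its values in memory; each one computes the next value only when asked.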

Infinite length

“But… Infinite iterables cannot have a length, can they?”

No, they can’t. So range(len()) will crash with a TypeError. And unfortunately, an iterable doesn’t need to be infinite to make len() fail, as the underlying magic method* __len__ is not part of the iterable API**.

*https://docs.python.org/3/glossary.html#term-magic-method
**called the iterator protocol, consisting of the magic methods __iter__ and __next__ (an iterable itself only needs __iter__); documentation about protocols for each type here.
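Here is a minimal sketch of that failure, using a generator as the length-less iterable. The squares() function is a hypothetical example, not something from the standard library.

```python
from itertools import count


def squares(limit):
    """Hypothetical generator: a finite iterable with no __len__."""
    for number in range(limit):
        yield number * number


# range(len()) fails before the loop even starts.
try:
    for index in range(len(squares(3))):
        pass
except TypeError as error:
    len_error = str(error)  # generators have no len()

# enumerate() only needs the object to be iterable.
finite_indexes = [index for index, _ in enumerate(squares(3))]  # [0, 1, 2]

# It also copes with an infinite iterable, as long as we stop the loop.
infinite_indexes = []
for index, _ in enumerate(count(100)):
    if index == 3:
        break
    infinite_indexes.append(index)  # ends up as [0, 1, 2]
```

Note that the generator is finite here: len() fails because of the missing __len__ method, not because of the size.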

So, back to our topic, enumerate() can:

  • be used on both sequences and iterables,
  • be used on every object that could have been used in a for loop,

whereas range(len()) can’t, and will crash if you get an iterable without a length instead.

For me, this costs range(len()) 10 points, for a final score of 1 for enumerate(), -8 for range(len()).

Conclusion

I strongly believe that enumerate() should be used instead of range(len()) when you want to get only indexes, even after considering all of its downsides.

for index, _ in enumerate(population):
    print(index)
