Generators in Python 3
In this article I will give you an introduction to generators in Python 3. Generators are functions which produce a sequence of results instead of a single value.
A good example for uses of generators are calculations which require CPU (eventually for larger input values) and / or are endless fibonacci numbers or prime numbers. How do you determine what is the maximal number to generate?
Let's see a basic example for fibonacci numbers and prime generation -- first without generators.
def fibonacci(n): if n <= 1: return 1 return fibonacci(n-1) + fibonacci(n-2) def generate_fibonacci(n): fibonacci_numbers =  for i in range(n): fibonacci_numbers.append(fibonacci(i)) return fibonacci_numbers
The code example above generates you a list of fibonacci numbers with a recursive calculation. If we want to display the first 10 fibonacci numbers we could write something like this:
>>> generate_fibonacci(10) [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
If we raise the number to 40 it will take some time. On my machine it took around 54 seconds. Obviously this code has some performance bottleneck in it but let's stick with this code for this article.
The second example is prime generation / prime checking. Consider following code:
def is_prime(n): for i in range(2, n): if n % i == 0: return False return True def generate_primes(n): primes = [2, 3, 5] num = 6 while len(primes) < n: if is_prime(num): primes.append(num) num += 1 return primes
If we want to generate the first 10 primes we could call the generate_primes function like this:
>>> generate_primes(10) [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
I hope you can see the drawback: with increasing numbers you get increasing response times -- and in both cases you could have an infinite set of numbers which can be generated.
Now it is time to take a closer look at generators. What is the main difference between the solutions above and generators? Generators yield results after one is available instead of returning them when all the values are present. This is because generators return an iterator, a generator object. This won't reduce the time it takes to execute the function but you get results on demand. For example you want to sum up all the prime numbers up to 10000 (every prime which is smaller than 10000). In this case how many primes should the function generate_primes create? You will add a big enough number and iterate through the results and exit if the next prime is greater than 10000. But as we have seen previously with a large number the generation gets slower. We will solve this problem later in this article.
Another nice feature that comes with generators is when you print numbers to the console. The original version prints everything if it is done. Generators return each element on demand. So you can print the numbers of primes to the screen as they are calculated. The same goes for iterating: you can iterate through the results as they are generated, you do not have to wait until every element is processed.
Generators can be itareted in a for - in loop as ranges or any other iterables (lists, sets, maps). Or you can add them to a loop and call the next() function in your generator.
Generator functions are functions which return a generator. The examples above return a list -- which is not really what we want. But we can change them to generator functions by converting the return statements in the generate_* functions to yield statements:
def generate_fibonacci(n): for i in range(n): yield fibonacci(i) def generate_primes(n): primes = 0 num = 2 while primes < n: if is_prime(num): primes += 1 yield num num += 1
However this is only a limited solution because you have to provide an ending condition when the generator stops. I mentioned previously that generators are most of the time infinite -- but we limit the generators in the examples above to a given number. Let's fix this and remove the limitations:
def generate_fibonacci(): i = 0 while True: yield fibonacci(i) i += 1 def generate_primes(): num = 2 while True: if is_prime(num): yield num num += 1
As you can see, I simply removed the list which contained the resulting elements, removed the parameter from the argument list and added an endless loop to keep the calculation going. Now we have two generators which can go on as long as we want numbers.
Naturally there is a Pythonic way to create generators where you do not need an extra function to do the trick: you can write generator expressions too.
g = (p for p in generate_primes())
Now the variable g holds a generator expression for generating primes. There is no list generated in the background, the generate_primes() function is not called.
Now you can use the next() function to get the first element of the generator -- just what you can do with a generator function too:
This would print 2 to the console.
Something to notice
Good to keep in mind, that once you start using a generator and request a number from it there is no way you can go back on the items. There is no previous() function which will calculate you the previously calculated number.
To reset your generator you have to re-create it. There is no other way around.
To see the difference between the two solutions let's do a twist on the applications and calculate the sum of numbers (fibonacci or prime) which are smaller or equal to 10000:
sum_numbers = 0 for i in generate_fibonacci(40): if 10000 < i: break sum_numbers += i print(sum_numbers) sum_primes = 0 for i in generate_primes(5000): if 10000 < i: break sum_primes += i print(sum_primes)
The results of the original (not generator) applications if I run them on my computer are:
>>> python3 fibonacci.py 17710 duration: 50.830469369888306 seconds >>> python3 primes.py 5736396 duration: 10.034490585327148 seconds
If I run the examples after switching to generators I get the following results on my computer:
>>> python3 fibonacci.py 17710 duration: 0.0 seconds >>> python3 primes.py 5736396 duration: 0.4524550437927246 seconds
And this is because the functions do not calculate all the fibonacci / prime numbers up to the given range but stop after we reach 10000.
We have taken a look at generators in Python 3 through two examples which have shown how we can use them and utilize the lazyness of them to delay CPU-bound calculations to the time when they are demanded.
Generators are good if you want to generate a long (perhaps infinite) list of values where you perhaps do no need all the values right at the beginning of the usage.