Tuesday, 21 November 2017

Fun with Python generator expressions

In the past, I have rhapsodised over Python’s list comprehensions. I had also heard of its generators, but never looked at them seriously. Recently I have been thinking about a problem in stream programming, where I will need the generators’ lazy evaluation, so I have been looking at them in more detail. I discovered that (in Python 3 at least) a list comprehension is essentially syntactic sugar for a generator expression whose output is turned into a list. That is, [ <expr> for i in <iter> if <cond> ] behaves the same as list( <expr> for i in <iter> if <cond> ). Let’s look at this in more detail.

With a list comprehension, I can construct, say, a list of the first 10 square numbers.
squares = [n*n for n in range(1,11)]
print(squares)
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Since this is just syntactic sugar, it is the same as writing:
squares = list(n*n for n in range(1,11))
print(squares)
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
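What is the generator expression on its own? It is just a generator object; values are produced only when asked for, for example by calling next() (the exact address printed will vary):
squares = (n*n for n in range(1,11))
print(squares)
print(next(squares))
print(next(squares))
<generator object <genexpr> at 0x7f...>
1
4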
So far, so trivial. But the advantage of using the generator form is that it lazily iterates (it "generates") its items one at a time, as they are asked for. If you do not ask, it does not produce. This is useful if you don’t know how many items you want, so cannot specify the stop value of the range(). A generator can be unbounded.
from itertools import *
squares = (n*n for n in count(1))
for s in squares:
    print(s, end=", ")
    if s >= 100:
        print('...')
        break
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
Here count(), imported from itertools, is an iterable like range(), except that it does not have a stop value: it just keeps incrementing for as long as you keep asking it for more values.
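In fact, a for loop is essentially just calling next() on its iterable repeatedly; we can do the same by hand:
c = count(1)
print(next(c), next(c), next(c))
1 2 3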

The module itertools has many such useful functions you can use with potentially infinite generators. For example, we can islice() off the first few items of a potentially unbounded generator, to give a bounded generator that returns only those items. Here we slice off the first 10 values:
squares = (n*n for n in count(1))
firstfew = islice(squares,10)
print(list(firstfew))
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
This is fine if we know how many items we want. Sometimes instead we want all the items up to a particular value. We can use takewhile() for this, here to get the squares no greater than 150.
squares = (n*n for n in count(1))
firstfew = takewhile(lambda n: n<=150, squares)
print(list(firstfew))
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144]
Let’s use islice() and takewhile() to define a couple of "pretty print" functions:
# print the first n items of the generator, comma separated
def print_for(gen,n=10):
    print(*list(islice(gen,n)), sep=', ', end=", ...\n")

# print the generator up to value nmax, comma separated
def print_while(gen,nmax=250):
    print(*list(takewhile(lambda n: n<=nmax, gen)), sep=', ', end=", ...\n")

print_for(n*n for n in count(1))
print_while(n*n for n in count(1))
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, ...
squares = (n*n for n in count(1))
print_for(squares)

squares = (n*n for n in count(1))
print_while(squares)
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, ...
There are analogous functions dropwhile(), which drops items while its predicate is true (and then yields all the rest), and filterfalse(), which keeps only those items for which its predicate is false.
print_for(dropwhile(lambda n: n<=5, count(1)))  # drop items while they are <= 5
print_for(filterfalse(lambda n: n%3, count(1))) # keep only items divisible by 3
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
3, 6, 9, 12, 15, 18, 21, 24, 27, 30, ...
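For comparison, the built-in filter() (not from itertools) keeps only the items for which its predicate is true:
print_for(filter(lambda n: n%3 == 0, count(1)))  # keep items divisible by 3
3, 6, 9, 12, 15, 18, 21, 24, 27, 30, ...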

Restarting generators

I have been quite careful above to keep redefining squares. This is because once an item is consumed, it is gone. If I take a generator instance, like squares, and slice off the first ten items, then slice again, I get the next ten items. For example:
squares = (n*n for n in count(1))

print_for(squares)
print_for(squares)  # continues from next value of same generator

print_for(n*n for n in count(1))
print_for(n*n for n in count(1)) # restarts, with new generator
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
121, 144, 169, 196, 225, 256, 289, 324, 361, 400, ...
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
You do need to ensure that the generated items are actually consumed for this to happen. Consider:
squares = (n*n for n in count(1))
# slice off the first 10?
islice(squares, 10)
print_for(squares)  # maybe not what is expected
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
The islice() produced a generator. But since nothing accessed that generator, nothing actually got sliced off squares. Compare:
squares = (n*n for n in count(1))
# slice off and consume the first 10
list(islice(squares, 10))
print_for(squares)
121, 144, 169, 196, 225, 256, 289, 324, 361, 400, ...

tee() for two

Once it’s gone, it’s gone. But what if you want it back? You could always iterate again. But what if the items are expensive to compute? For example, you may be reading a large file, and don’t want to read it all again. Here tee() is useful for "remembering" earlier items. (Note: this doesn’t make $n$ copies of an iterator, rather it gives $n$ pointers into a single iterator.)
sq_ptr = tee((n*n for n in count(1)))

print_for(sq_ptr[0])
print_for(sq_ptr[0],5)
print_for(sq_ptr[1])
print_for(sq_ptr[1])
print_for(sq_ptr[1],5)
print_for(sq_ptr[0])
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
121, 144, 169, 196, 225, ...
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, ...
121, 144, 169, 196, 225, 256, 289, 324, 361, 400, ...
441, 484, 529, 576, 625, ...
256, 289, 324, 361, 400, 441, 484, 529, 576, 625, ...
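(One caveat: once tee() has been called, the original generator should not be used directly elsewhere, and tee() has to buffer every item between the slowest and the fastest pointer, so widely separated pointers can consume a lot of memory.)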
One application is building a "sliding window" over the iterated data. Here is an example for a window of size 3:
def triples(iterable):
    a, b, c = tee(iterable,3)   # three pointers into the same iterator
    next(b)                     # advance b past one item
    list(islice(c,2))           # advance c past two items
    return zip(a, b, c)         # tuples of three consecutive items

print_for(triples(count(1)))
print_for(triples('abcdefghijklmnopqrstuvwxyz'))
(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9), (8, 9, 10),
(9, 10, 11), (10, 11, 12), ...
('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e'), ('d', 'e', 'f'), ('e', 'f', 'g'), 
('f', 'g', 'h'), ('g', 'h', 'i'), ('h', 'i', 'j'), ('i', 'j', 'k'), ('j', 'k', 'l'), ...
and here it is for a user-defined window size:
def sliding_window(iterable,n=2):
    windows = tee(iterable,n)       # n pointers into the same iterator
    for i in range(n):
        list(islice(windows[i],i))  # advance the i'th pointer past i items
    return zip(*windows)            # tuples of n consecutive items

print_for(sliding_window(count(1),4))
print_for(sliding_window('abcdefghijklmnopqrstuvwxyz',3))
(1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 6), (4, 5, 6, 7), (5, 6, 7, 8), (6, 7, 8, 9), 
(7, 8, 9, 10), (8, 9, 10, 11), (9, 10, 11, 12), (10, 11, 12, 13), ...
('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e'), ('d', 'e', 'f'), ('e', 'f', 'g'), 
('f', 'g', 'h'), ('g', 'h', 'i'), ('h', 'i', 'j'), ('i', 'j', 'k'), ('j', 'k', 'l'), ...

Sums and products

The itertools module has a function accumulate() that takes an iterator, and returns a generator of running totals: each item is the sum of all the values seen up to that point. We could use this to implement a simple version of count() "the hard way", by using repeat(), which simply repeats its argument endlessly:
print_for(repeat(3))
print_for(accumulate(repeat(1)))
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
The triangle numbers are 1, 1+2, 1+2+3, ... We can use count() and accumulate() to generate these:
print_for(accumulate(count(1)))
1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...
accumulate() defaults to summing the items, but other functions can be used.
import operator
factorial = accumulate(count(1), operator.mul)
print_for(factorial)
1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, ...
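Any two-argument function can be used as the accumulator. For example (a small illustrative example), a running maximum:
print(list(accumulate([3, 1, 4, 1, 5, 9, 2, 6], max)))
[3, 3, 4, 4, 5, 9, 9, 9]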

Infinite series and products

We can use accumulate() to calculate infinite sums and products (or at least, to give us the first however many terms).

sum of reciprocal squares

$$ \sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6} $$
import math

recip_squares = (1/(n*n) for n in count(1))
print_for(accumulate(recip_squares))

# or all in one go, now with 30 terms:
print_for(accumulate(1/(n*n) for n in count(1)),30)

# it converges rather slowly
print('limit =', math.pi*math.pi/6)
1.0, 1.25, 1.3611111111111112, 1.4236111111111112, 1.4636111111111112, 1.4913888888888889,
1.511797052154195, 1.527422052154195, 1.5397677311665408, 1.5497677311665408, ...
1.0, 1.25, 1.3611111111111112, 1.4236111111111112, 1.4636111111111112, 1.4913888888888889,
1.511797052154195, 1.527422052154195, 1.5397677311665408, 1.5497677311665408, 
1.558032193976458, 1.5649766384209025, 1.5708937981842162, 1.5759958390005426, 
1.580440283444987, 1.584346533444987, 1.587806741057444, 1.5908931608105303, 
1.5936632439130234, 1.5961632439130233, 1.5984308176091684, 1.6004969333116477, 
1.6023872924798896, 1.6041234035910008, 1.6057234035910009, 1.6072026935318293, 
1.6085744356443121, 1.6098499458483937, 1.6110390064904865, 1.6121501176015975, ...
limit = 1.6449340668482264

sum of reciprocal powers of 2

$$ \sum_{n=1}^\infty \frac{1}{2^n} = 1 $$
print_for(accumulate(1/(2**n) for n in count(1)),20)
0.5, 0.75, 0.875, 0.9375, 0.96875, 0.984375, 0.9921875, 0.99609375, 0.998046875, 
0.9990234375, 0.99951171875, 0.999755859375, 0.9998779296875, 0.99993896484375, 
0.999969482421875, 0.9999847412109375, 0.9999923706054688, 0.9999961853027344, 
0.9999980926513672, 0.9999990463256836, ...

factorial

$$ \prod_{i=1}^n i = n! $$
print_for(accumulate((n for n in count(1)), operator.mul))
1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, ...

a product for $\pi$

$$ \prod_{n=1}^\infty \left( \frac{4n^2}{4n^2-1} \right) = \frac{\pi}{2} $$
print_for(accumulate((1/(1-1/(4*n**2)) for n in count(1)), operator.mul),20)

# it also converges rather slowly
print('limit =', math.pi/2)
1.3333333333333333, 1.422222222222222, 1.4628571428571429, 1.4860770975056687, 
1.5010879772784533, 1.51158509600068, 1.5193368144417092, 1.525294998027755, 
1.5300172735634447, 1.533851903321749, 1.5370275801402207, 1.539700671583943, 
1.5419817096159192, 1.5439510349155563, 1.5456684442981097, 1.5471793616434646, 
1.5485189108743247, 1.549714678373069, 1.5507886317191353, 1.5517584807696163, ...
limit = 1.5707963267948966
This is all very neat and nifty, and we have barely scratched the surface of what can be done. But that's probably (more than) enough for now.


Saturday, 18 November 2017

jumping Jehoshaphat!

I for one welcome our new jumping robot overlords.





Monday, 13 November 2017

Inspired by Nature

Hard copies of our new book have just arrived!  Now to send them off to each of the chapter authors (and one to Julian Miller, of course!).

Happy Birthday, Julian!

Here are the bibliographic details, and the preface, of this Festschrift:

Susan Stepney, Andrew Adamatzky (eds).  Inspired by Nature: Essays Presented to Julian F. Miller on the Occasion of his 60th Birthday, Springer, 2018
This book is a tribute to Julian Francis Miller’s breadth of ideas and achievements in computer science, evolutionary algorithms and genetic programming, electronics, unconventional computing, artificial chemistry, and theoretical biology. Well-known for both Cartesian Genetic Programming and evolution in materio, Julian has further interests from quantum computing to artificial chemistries. He has over 200 refereed publications; here, we highlight just a few of his major accomplishments.

Julian started his life in science as a mathematical physicist, working on the interaction of solitons in various nonlinear partial differential equations, such as the sine-Gordon equation and the modified Korteweg-de Vries equation. He entered classical computer science with his paper on synthesis and optimisation of networks implemented with universal logic modules. Julian’s interest in optimisation led him to genetic algorithms, which he employed for the optimisation of field-programmable arrays, Reed-Muller logical functions, and finite-state machines, and for evolving combinatorial logic circuits and non-uniform cellular automata.

Julian combined his interests in physics and computer science in work on a constant-complexity algorithm for solving Boolean satisfiability problems on quantum computers, and a quantum algorithm for finding multiple matches. Julian’s ideas in optimisation of circuits and quantum computing are reflected in Younes’ Chapter “Using Reed-Muller Expansions in the Synthesis and Optimization of Boolean Quantum Circuits”.

Julian’s interest in combining natural processes and computation expanded from physics to include the exciting world of biological processes, such as evolution and morphogenesis. He used principles of morphogenesis to evolve computing circuits and programs. These aspects of Julian’s work are reflected in Chapters “Evolvable Hardware Challenges: Past, Present and the Path to a Promising Future” by Haddow and Tyrell, “Artificial Development” by Kuyucu et al., and Banzhaf’s “Some Remarks on Code Evolution with Genetic Programming”.

In 2000, Julian, together with Peter Thomson, presented a fully developed concept of Cartesian Genetic Programming (CGP). There, a program is genetically represented as a directed graph, including automatically defined functions and self-modifying operators. This approach has become very popular, because it allows the discovery of efficient solutions across a wide range of mathematical problems and algorithms. Several chapters of the book manifest the success of CGP in diverse application areas: “Designing Digital Systems Using Cartesian Genetic Programming and VHDL” by Henson et al.; “Breaking the Stereotypical Dogma of Artificial Neural Networks with Cartesian Genetic Programming” by Khan and Ahmad; “Approximate Computing: An Old Job for Cartesian Genetic Programming?” by Sekanina; “Medical Applications of Cartesian Genetic Programming” by Smith and Lones; “Multi-step Ahead Forecasting Using Cartesian Genetic Programming” by Dzalbs and Kalganova; “Cartesian Genetic Programming for Control Engineering” by Clarke; “Bridging the Gap Between Evolvable Hardware and Industry Using Cartesian Genetic Programming” by Vasicek; “Combining Local and Global Search: A Multi-objective Evolutionary Algorithm for Cartesian Genetic Programming” by Kaufmann and Platzner.

In 2001, Miller and Hartman published “Untidy evolution: Evolving messy gates for fault tolerance”. Their idea of exploiting “messiness” to achieve “optimality”—“natural evolution is, par excellence, an algorithm that exploits the physical properties of materials”—gave birth to a new field of unconventional computing: evolution in materio. The evolution in materio approach has proved very successful in discovering logical circuits in liquid crystals, disordered ensembles of carbon nanotubes (Chapter “Evolution in Nanomaterio: The NASCENCE Project” by Broersma), slime mould (Chapter “Discovering Boolean Gates in Slime Mould” by Harding et al.), living plants (Chapter “Computers from Plants We Never Made: Speculations” by Adamatzky et al.), and reaction-diffusion chemical systems (“Chemical Computing Through Simulated Evolution” by Bull et al.).

Julian’s inspiration from nature has not neglected the realm of chemistry: he has exploited chemical ideas in the development of a novel form of artificial chemistry, used to explore emergent complexity. Chapter “Sub-Symbolic Artificial Chemistries” by Faulkner et al. formalises this approach.

The book will be a pleasure to explore for readers from all walks of life, from undergraduate students to university professors, from mathematicians, computer scientists, and engineers to chemists and biologists.


Sunday, 12 November 2017

film review: Blade Runner 2049

The IMDB plot summary for the original 1982 Blade Runner starts: “In the futuristic year of 2019, Los Angeles has become a dark and depressing metropolis, filled with urban decay.” It is now thirty years later: the entire planet has undergone ecological collapse, and is a wasteland populated with Wallace’s synthetic protein farms to feed the millions living in the cities. (Are there also carb, fibre, and vitamin farms?) The urban decay has also worsened since the mysterious Blackout that wiped or corrupted most digital data, and the city is filled with those who cannot afford to migrate to a better life off-world. Despite the bankruptcy of the Tyrell corporation after the previous trouble with replicants, Wallace [Jared Leto] has started up a new line of more obedient replicants. Police agent KD6-3.7 [Ryan Gosling] is one of these new replicants, a Blade Runner hunting down the earlier, dangerous models. His latest mission finds an old replicant living on a protein farm, but what he discovers there leads to an explosive revelation that could destroy humanity. He needs to find Deckard [Harrison Ford] to solve the mystery.

[image: K walks away from shot]
The new film is long (2hr 44mins) and has a rather dream-like quality. There are a few action scenes, but this is mostly K uncovering dark secrets, moving slowly through huge bleak strangely-lit exteriors, or densely-detailed strangely-lit interiors. This is surprisingly effective at generating a mood of quiet desperation. Gosling plays the part of a seemingly unemotional and obedient replicant very well: without cracking a facial expression, he nevertheless manages to convey the loneliness and misery of being a despised un-person. And Wallace, as both original saviour and now potential destroyer of humanity, is an interesting villain.

[image: Joshi and Luv]
Does it pass the Bechdel test? Well, K’s boss police Lieutenant Joshi [Robin Wright] and Wallace’s spooky replicant enforcer Luv [Sylvia Hoeks] do talk about recent events at the forensics lab, but that’s the only brief scene I can recall; the short conversation between prostitute Mariette [Mackenzie Davis] and sex hologram Joi [Ana de Armas] is about K. And, of course, a major female character is “refrigerated” before the film even starts.

While watching, the film is very engaging, pulling you along with the plot, despite its slow unfolding. It also manages a great plot twist that subverts a lot of the initial setup. It is only afterwards that some of the more ridiculous elements become clear. Why is Wallace blind? He owns replicant technology; surely he can make himself some eyes? The Blackout doesn’t seem to be a drastic event integrated into the past history, so much as a device to make tracking down details of earlier replicants more difficult. The deeper systemic effect of a digital purge is handled more effectively in Dark Angel. Given replicants are so dangerous and difficult to spot, why is their only identifying mark a serial number in one eye? Why not just give them, say, bright blue skin, and tattoo their serial number across their foreheads? The one scene that did jar at the time was the comment that “no two humans have the same DNA”; I guess they’ve never heard of identical twins? Yet we were supposed to think these two were twins, despite them being a boy and a girl.

[image: car in dark city]
However, for all these niggles (which occur in essentially any SF), the good more than outweighs the bad, and I enjoyed the film. With the evocation of a decaying future, and the underlying questions about what makes someone human, this makes a good sequel to the original masterpiece.




For all my film reviews, see my main website.

Saturday, 11 November 2017

sequestering carbon, several books at a time LXXVII

The latest batch (although there are some covert deliveries of books being smuggled into hidden parts of the house, in preparation for slightly-post-solstice celebrations):




Thursday, 9 November 2017

why not use trees?

BBC news headline: Plant captures CO2 out of the air.  Well, yes, that’s what they do!

Oh, okay.  So it’s not quite as snappy as the apocryphal newspaper headline Shell found on beach, but it requires the same sort of anti-default processing to understand.





Wednesday, 8 November 2017

pendulous apples

Best use of “golden delicious” apples ever.
High school physics teacher shows his awesome home made marble tracks



