
Yet another mindlessly crazy idea

I was in the middle of writing a DSL for unit testing a somewhat complex piece of code when I thought of extending Python syntax with (no-op) keywords such as just, even, only, but, again etc. This would allow one to write intention-oriented code:

expect_condition(some_param=3, other_param=4)
expect_condition(some_param=now 10, other_param=still 4)
expect_condition(some_param=again 3, other_param=still 4)
but not expect_condition(some_param=20, other_param=20)
and not even expect_condition(some_param=200, other_param=200)

…or even some ad-hoc extensible syntax that would allow one to add such literate markers anywhere in the code. This should be easy to add to the (CPython or PyPy) parser.

The next level would be metaprogramming assertions: if you have other_param=4 and later other_param=still 5, a custom analyser would be able to detect the flaw. Call it an Intention Mismatch Detector!

Could be a potential start to a magic but (arguably) Pythonic alternative to Ruby’s more uncontrolled DSL writing power.
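For what it's worth, the markers can be approximated in today's Python as no-op identity functions, so the annotations document intent without changing behaviour. A sketch; `expect_condition` here is a made-up stand-in, not a real API:

```python
# Each marker is just the identity function: a no-op that documents intent.
now = still = again = lambda x: x

def expect_condition(some_param, other_param):
    # hypothetical test helper standing in for the DSL's real assertion
    assert some_param < 100 and other_param < 100

expect_condition(some_param=3, other_param=4)
expect_condition(some_param=now(10), other_param=still(4))
expect_condition(some_param=again(3), other_param=still(4))
```

The parentheses make it clunkier than the imagined `still 4` syntax, but no parser changes are needed.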

Specialized Python list/generator comprehension syntax

One of the most common use cases for list comprehensions is to get an attribute of each item in a sequence, or simply to call a method on each item:

names = [person.name for person in people]

I normally reduce that, by eliminating the semantic duplication/redundancies, to:

names = [x.name for x in people]

where x serves a similar purpose to pronouns in natural language (i.e. we say “for all people, get the name of each one” as opposed to “…of each person”, under normal circumstances, although this might not be the best example).

Another common thing to do is either call a method or access several attributes (or a combination of these):

bmi_values = [x.compute_bmi() for x in people]
tabular_data = [(x.name, x.age) for x in people]

Now, as far as I know, Python is supposed to be a really readable language, and its code is also known to read like natural language. This led me to the following crazy idea: special-case syntactic sugar for the cases above:

names = [get name for people]
# `get` is a keyword and the expression maps to
# [__tmpvar__.name for __tmpvar__ in people]

and

tabular_data = [(get name, get age) for people]

and

bmi_values = [each.compute_bmi() in people]
# `each` is a keyword and the expression maps to
# [__tmpvar__.compute_bmi() for __tmpvar__ in people]

or even

bmi_values = [.compute_bmi() for people]

and

names = [.name for people]

and

tabular_data = [(.name, .age) for people]

…to cover all cases with one simple syntax. The syntax isn’t really outrageous when considering `from . import stuff` and `from .stuff import Thing`.

…probably too crazy an idea to ever become reality, but maybe Guido will add at least some sugar on top of the list comprehension/generator expression syntax in some far-future version of Python. I personally don’t like repetitive boilerplate such as that found in [x.name for x in people], where I have to resort to a technical device like x to emulate pronouns, instead of the language having first-class support for forming sentences with pronouns (or subclauses with no pronouns at all).
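Incidentally, the standard library’s operator module already gets partway there: attrgetter and methodcaller name the attribute or method once, with no explicit loop variable. The Person class below is just for illustration:

```python
from operator import attrgetter, methodcaller

class Person(object):
    def __init__(self, name, age):
        self.name, self.age = name, age
    def compute_bmi(self):
        return 22.0  # dummy value for illustration

people = [Person("Ann", 30), Person("Bob", 40)]

names = list(map(attrgetter("name"), people))
# attrgetter with several names yields tuples, like (x.name, x.age)
tabular_data = list(map(attrgetter("name", "age"), people))
bmi_values = list(map(methodcaller("compute_bmi"), people))
```

It’s still not sentence-like, but it does eliminate the repeated pronoun.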

Sharing Validations Between Test and App Code

So I have several requirements:

1) I don’t want to duplicate (data/input) validations between test and app code at all.
2) I want the same validation logic to raise a ValueError or a TypeError etc. in app code instead of an AssertionError.
3) I want the same validation logic to raise AssertionErrors in test code, regardless of whether Python is run with -O.

The solution is as follows.

Validation used by non-test code; raises normal exceptions:

import re

# illustrative pattern (assumed; the original doesn't show it):
# matches 'http://<host>:<port>' and captures the port
_VALID_ADDR = re.compile(r'^http://[^:/]+:(\d+)$')

def _validate_addr(addr):
    # call from app code
    m = _VALID_ADDR.match(addr)
    if not m:
        raise ValueError("Addresses should be in the format 'http://<host>:<port>': %s" % (addr,))
    port = int(m.group(1))
    if not (0 <= port <= 65535):
        raise ValueError("Ports should be in the range 0-65535: %d" % (port,))

Test wrapper around the above validator that converts “normal” exceptions into AssertionErrors with the message of the original exception:

def _assert_valid_addr(addr):
    # call from test code
    try:
        _validate_addr(addr)
    except ValueError as e:
        raise AssertionError(str(e))  # str(e) works on both Python 2 and 3

As this code does not rely on the assert keyword at all, all validations remain active even under python -O, which is what we want.

If optimisable asserts are desired, modify the _assert_... as follows:

def _valid_addr(addr):  # not _assert_valid_... nor _validate_...
    # call from app code and don't forget to prefix with `assert`
    try:
        _validate_addr(addr)
    except ValueError as e:
        raise AssertionError(str(e))
    return True

and then prefix any calls to that function with assert in your non-test code:

# you still get the same decent assertion messages, and because the function returns `True`, the outer `assert` itself won't fail:
assert _valid_addr(some_var) 

In the above snippet, the entire line, and thus the function call, will be optimised away by Python when run with the -O flag.
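Putting it together, here is a minimal self-contained sketch of the pattern, with a trivial port-only validator standing in for _validate_addr:

```python
def _validate_port(port):
    # app-facing validator: raises a "normal" exception type
    if not (0 <= port <= 65535):
        raise ValueError("Ports should be in the range 0-65535: %d" % (port,))

def _assert_valid_port(port):
    # test-facing wrapper: same logic, but fails with AssertionError
    try:
        _validate_port(port)
    except ValueError as e:
        raise AssertionError(str(e))

# app code gets a ValueError...
try:
    _validate_port(70000)
except ValueError as e:
    app_error = type(e).__name__

# ...while test code gets an AssertionError carrying the same message
try:
    _assert_valid_port(70000)
except AssertionError as e:
    test_error = type(e).__name__
```

One validator, two exception contracts, no duplicated logic.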

Attribute deletion inconsistency in Python

This inconsistency exists on both Python 2 and 3 (tested on 2.7 and 3.2):

class C(object):
    x = None

c = C()

assert hasattr(c, 'x')
del c.x  # AttributeError: 'C' object attribute 'x' is read-only

however:

class C(object):
    x = None

    def __init__(self):
        self.x = 1


c = C()

assert hasattr(c, 'x')
del c.x  # OK

assert hasattr(c, 'x')
del c.x  # AttributeError: x

So depending on whether the attribute has previously existed on the instance or not, del gives different error messages.

A @coroutine decorator for Twisted

UPDATE: the approach described in this post is now packaged as http://pypi.python.org/pypi/txcoroutine. In addition, it provides memory-efficient tail recursion.

Twisted has a nice and useful decorator, @inlineCallbacks, which allows one to write asynchronous code while avoiding a callback jungle. If you don’t already know it, I suggest you check it out. However, while @inlineCallbacks solves many problems and effectively provides a simple coroutine implementation on top of Python generators (see the PDFs at http://www.dabeaz.com/generators/ and http://www.dabeaz.com/coroutines/), it doesn’t really provide the full coroutine feature set. I’ll explain.

Coroutines are meant to resemble lightweight (green) threads with fully deterministic (cooperative) multitasking; I won’t delve further into that here. However, the (current) implementation of @inlineCallbacks only provides very primitive control over how your coroutines behave. It basically assumes that you never want to control the execution of your coroutines from the outside, so it’s impossible to (generically) stop, suspend or resume a coroutine once you’ve created it. A generator-producing function wrapped with @inlineCallbacks simply returns a Deferred instance that is fired when the coroutine exits.

On closer inspection, though, you will notice that a Deferred instance has the methods cancel, pause and unpause, so one might think these correspond to stopping, suspending and resuming the coroutine, respectively. However, calling those methods only affects how the Deferred itself behaves and has no effect whatsoever on what is happening inside the coroutine. For example, calling .cancel() on the Deferred does not stop the coroutine from executing but simply ignores its outcome; .pause() and .unpause() temporarily ignore and un-ignore the result of the coroutine, but still, nothing inside it is affected.

Then I realised how easy it would be to change that by writing a modified version of inlineCallbacks and calling it coroutine (for both compatibility and semantic correctness, because it’s not only about callbacks anymore). Deferred‘s constructor takes an optional canceller argument which is invoked when the Deferred is .cancel()-ed, so changing the following line in inlineCallbacks:

return _inlineCallbacks(None, gen, Deferred())

to

return _inlineCallbacks(None, gen, Deferred(canceller=lambda _: gen.close()))

adds stopping support for coroutines. Now, one can produce code such as:

@coroutine
def some_process():
    try:
        while True:
            msg = yield get_message()
            process_message(msg)
    finally:  # could use `except GeneratorExit` but `finally` is more illustrative
        print("coroutine stopped, cleaning up")

proc = some_process()
proc.cancel()
# ==> coroutine stopped, cleaning up
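This works because generator.close(), a plain Python feature requiring no Twisted at all, raises GeneratorExit at the suspended yield, so any enclosing finally blocks run. A minimal sketch (using a list instead of print so the effect is observable):

```python
cleaned_up = []

def some_process():
    try:
        while True:
            yield  # stands in for `yield get_message()`
    finally:
        # runs when close() injects GeneratorExit at the yield point
        cleaned_up.append(True)

gen = some_process()
next(gen)    # advance to the first yield
gen.close()  # GeneratorExit is raised inside; the finally clause runs
```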

Moving on. Adding suspending and resuming support, and support for automatically cancelling the inner Deferred, was trickier: there is no way to hook into the pausing and unpausing of a Deferred. So I got creative and subclassed Deferred.

class Coroutine(Deferred):
    # this is something like chaining, but firing of the other deferred does not cause this deferred to fire.
    # also, we manually unchain and rechain as the coroutine yields new Deferreds.
    cancelling = False
    depends_on = None

    def pause(self):
        self.depends_on.pause()
        return Deferred.pause(self)

    def unpause(self):
        self.depends_on.unpause()
        return Deferred.unpause(self)

    def cancel(self):
        # to signal _inlineCallbacks to not fire self.errback with CancelledError;
        # otherwise we'd have to call `Deferred.cancel(self)` immediately, but it
        # would be semantically unnice if, by the time the coroutine is told to do
        # its clean-up routine, the inner Deferred hadn't yet actually been cancelled.
        self.cancelling = True

        # this errback is added as the last one, so anybody else who is already listening for CancelledError
        # will still get it.
        swallow_cancelled_error = lambda f: f.trap(CancelledError)

        self.depends_on.addErrback(swallow_cancelled_error)
        self.depends_on.cancel()

        self.addErrback(swallow_cancelled_error)
        Deferred.cancel(self)

Then it was only a matter of changing:

return _inlineCallbacks(None, gen, Deferred(canceller=lambda _: gen.close()))

to:

return _inlineCallbacks(None, gen, Coroutine(canceller=lambda _: gen.close()))

And keeping the Deferred’s depends_on field up to date in _inlineCallbacks where the generator magic is happening:

 if isinstance(result, Deferred):
+    deferred.depends_on = result

There was one more trick to cancellation support, though. We don’t want a CancelledError to be sent to our main Deferred when we’re cancelling the inner one, not least because then we wouldn’t even be able to cancel the main one, as its canceller would already have been called. So, right at the start of the while loop inside _inlineCallbacks, we make the following amendment:

 if isFailure:
+    if deferred.cancelling:  # must be that CancelledError that we want to ignore
+        return

That’s it. Now you have stopping, suspending and resuming support for coroutines in Twisted. Here’s a full code example:

from __future__ import print_function

from twisted.internet import reactor
from twisted.internet.defer import Deferred

def get_message():
    d = Deferred(canceller=lambda _: (
        print("cancelled getting a message"),
        heavylifting.cancel(),
    ))
    print("getting a message...")
    heavylifting = reactor.callLater(1.0, d.callback, 'dummy-message')
    return d

@coroutine
def some_process():
    try:
        while True:
            msg = yield get_message()
            print("processing message: %s" % (msg,))
    finally:  # could use `except GeneratorExit` but `finally` is more illustrative
        print("coroutine stopped, cleaning up")

def main():
    proc = some_process()
    reactor.callLater(3, proc.cancel)

reactor.callWhenRunning(main)
reactor.run()
# ==> getting a message...
# ==> processing message: dummy-message
# ==> getting a message...
# ==> processing message: dummy-message
# ...
# ==> cancelled getting a message
# ==> coroutine stopped, cleaning up

Useful git command aliases

alias g='git'

alias ga='g add'
alias gb='g branch'
alias gc='g checkout'
alias gcb='gc -b'
alias gd='g diff'
alias gds='g diff --staged'
alias gf='g fetch'
alias gl='g log --pretty=format:"%Cblue%h%Creset%x09%an%x09 %ar%x09%s" --graph'
alias gp='g push'
alias gr='g rebase origin/master master'
alias gs='g status'
alias gst='g stash'
alias gu='gl trunk..master'
alias gw='gl master..origin/master'

alias d='gd'
alias s='gs'

Scala views are Zope adapters

Compare:

implicit def listToSet[T](xs: GenList[T]): Set[T] =
  new Set[T] {
    def include(x: T): Set[T] =
      xs prepend x
    def contains(x: T): boolean =
      !xs.isEmpty && (xs.head == x || (xs.tail contains x))
  }

vs:

from zope.interface import implements
from zope.component import adapts

class ListToSet(object):
    implements(ISet)
    adapts(GenList)

    def __init__(self, genlist):
        self.genlist = genlist

    def include(self, x):
        return self.genlist.prepend(x)

    def contains(self, x):
        # parenthesised to match the Scala version: an empty list contains nothing
        return bool(self.genlist) and (self.genlist.head == x or ISet(self.genlist.tail).contains(x))