Suggestion for "optimised Cython" #5

@honnibal

There are lots of ways to write Cython. I normally suggest writing the optimised function with a pure C interface, declared "nogil". The nogil declaration tells the Cython compiler that the function can be called without holding the GIL, so the function body cannot create or manipulate Python objects. This gives you much better compiler errors: anything that would need a Python object is rejected at compile time, so the compiler doesn't have to guess whether something could be Python trickery. Writing this way gives quite nice code:

# cython: infer_types=True

cdef double std_dev(const double* arr, size_t size) nogil:
    # Population standard deviation of the first `size` values; assumes size > 0.
    cdef double mean = 0.0
    for pval in arr[:size]:
        mean += pval
    mean /= size
    cdef double sum_sq = 0.
    for pval in arr[:size]:
        sum_sq += (pval-mean)**2
    return (sum_sq / size)**0.5

The only weird syntax is the for pval in arr[:size] loop, which iterates over the first size elements behind the pointer. You could just as easily do:

for i in range(size):
    pval = arr[i]

But looping over the value is pretty convenient.

Incidentally, working with Cython nogil functions can give nicer code than the equivalent Python. The reason is that in Python, we become so scared of the function call overhead that we're reluctant to break things up. In Cython and C this isn't true, because a call between cdef functions is a plain C call, so we might prefer the following:

# cython: infer_types=True

cdef double std_dev(const double* arr, size_t size) nogil:
    # mean is inferred as double from get_mean's return type (infer_types=True).
    mean = get_mean(arr, size)
    cdef double sum_sq = 0.
    for pval in arr[:size]:
        sum_sq += (pval-mean)**2
    return (sum_sq / size)**0.5

cdef double get_mean(const double* arr, size_t size) nogil:
    # Arithmetic mean of the first `size` values; assumes size > 0.
    cdef double mean = 0.0
    for pval in arr[:size]:
        mean += pval
    return mean / size
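
For checking the output of either Cython version, a pure-Python reference with the same split can be handy. This is only a sketch, assuming the input is a plain, non-empty list of floats. The function names mirror the Cython ones above, and the comparison against statistics.pstdev is my addition: it is the standard library's population standard deviation, which divides by size just as the code above does.

```python
import math
import statistics


def get_mean(values):
    """Arithmetic mean of a non-empty sequence of floats."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def std_dev(values):
    """Population standard deviation: divides by len(values), not len(values) - 1."""
    mean = get_mean(values)
    sum_sq = 0.0
    for value in values:
        sum_sq += (value - mean) ** 2
    return (sum_sq / len(values)) ** 0.5


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    # Cross-check against the standard library's population standard deviation.
    assert math.isclose(std_dev(data), statistics.pstdev(data))
    print(std_dev(data))  # prints 2.0
```

Feeding the same values to the Cython std_dev through whatever wrapper you expose should give a matching result, up to floating-point rounding.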
