
This post and work on HipHip are done in part by Leon Barrett, who is visiting on a ‘sprintbatical’ from fellow Clojure shop The Climate Corporation, and Emil Flakk. This post is cross-posted to The Climate Corporation’s blog.

Climate Corp. shares with us the challenge of developing new algorithms and making them work at large scale. This combination of research and engineering takes many cycles of algorithm and model design, including rapid implementation to test on lots of real data. In contrast to most engineering prototyping, such numeric modeling demands high performance out of the gate, since correctness cannot be judged on just a little bit of data. In this post, we introduce our open-source array-processing library HipHip, which combines Clojure’s expressiveness with the fastest math Java has to offer.

Computational Functional Programming

As we’ve outlined in past blog posts, we love Clojure (and functional programming generally) because it allows us to quickly build performant, reusable tools. However, many of the data abstractions that make Clojure code so reusable don’t yield fast numerical computation. For instance, consider the dot product operation, which is the core performance bottleneck in most machine learning algorithms. Using Clojure’s built-in sequence operations, it would be natural to write the dense dot product with a map and a reduce.

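A minimal sketch of that version, where ws and xs are any sequences of numbers:

    (defn dot-product [ws xs]
      ;; Pairwise products of weights and inputs, summed up.
      (reduce + (map * ws xs)))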

The above code succinctly expresses the idea that we form the sequence of element-wise products between ws and xs, and then we sum those entries. But its performance is a disaster, especially if you’re considering real-world machine learning use where each prediction could require thousands of dot products. The core of the slowness arises from the fact that intermediate operations produce sequences of ‘boxed’ Java Double objects. All arithmetic operations on these boxed objects are significantly slower than on their primitive counterparts. This implementation also creates an unnecessary intermediate sequence (the result of the map), rather than just summing the numbers directly.
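To see the cost for yourself, you can benchmark the naive version with Criterium, the same library HipHip uses for its own benchmarks (the array size here is just an illustration):

    (require '[criterium.core :as crit])

    (let [ws (double-array (range 10000))
          xs (double-array (range 10000))]
      ;; Every product and every addition goes through boxed Doubles.
      (crit/quick-bench (reduce + (map * ws xs))))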

Clojure’s built-in array macros vs. plain-old Java

Clojure comes with built-in array-processing macros (amap and areduce) that extend Clojure’s sequence operations to arrays and allow for primitive arithmetic. Here’s what dot-product looks like using those macros.

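A sketch of the areduce version; note the type hints and explicit indexing it takes to keep the arithmetic primitive:

    (defn dot-product [^doubles ws ^doubles xs]
      ;; areduce loops an index i over xs, threading an accumulator ret.
      (areduce xs i ret 0.0
               (+ ret (* (aget ws i) (aget xs i)))))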

The performance here is acceptable, but the code itself is pretty convoluted. Arguably, it isn’t even much of an improvement over the Java version.
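For comparison, a sketch of the plain Java loop:

    public static double dotProduct(double[] ws, double[] xs) {
      double result = 0.0;
      // An indexed loop over primitive doubles: fast, but verbose.
      for (int i = 0; i < ws.length; i++) {
        result += ws[i] * xs[i];
      }
      return result;
    }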

Why not just use Java for math bottlenecks?


Since the dot product is a crucial bottleneck for machine learning, why not just suck it up and use the Java version? Clojure makes Java interop easy, so you can take the core bottleneck of your system and make it fast.

The problem is that computational systems typically don’t have a single bottleneck. Instead, any function that performs array operations (on doubles, longs, floats, and so on) over each item of data can cause an unacceptable slowdown, even during the research prototyping stage. Since these sorts of array operations make up the bulk of our computational code, our research engineers would mostly be stuck writing Java. We need the expressiveness of Clojure and the speed of Java primitives.

Introducing the HipHip array processing library


We’re happy to announce HipHip, a Clojure array-processing library. Here’s what a dot product looks like in HipHip.
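A sketch using the hiphip.double namespace, aliased here as dbl; asum binds a variable to each element of the corresponding array and sums the expression:

    (require '[hiphip.double :as dbl])

    (defn dot-product [ws xs]
      ;; w and x walk ws and xs in lockstep, as primitive doubles.
      (dbl/asum [w ws x xs] (* w x)))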

All the arithmetic operations are performed on primitive doubles, which allows the speed to match Java’s. We get a clean, functional representation of the operation without worrying about type hints.

HipHip provides namespaces for working on arrays of each of the four main primitive math types (double, float, long, and int) without type hints. Each namespace includes the following (a few examples are sketched after the list):

  • Versions of the clojure.core array functions (aget, aset, aclone, and so on) that do not require type hints.
  • A family of macros for efficiently iterating over one or more arrays with a common binding syntax, including amake, doarr, afill!, and our own versions of amap and areduce.
  • Common ‘mathy’ operations, like summing over an array and selecting a maximal element (or many).
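A quick sketch of a few of these with hiphip.double (the exact signatures are in the readme):

    (require '[hiphip.double :as dbl])

    (def xs (double-array [1.0 5.0 3.0]))

    ;; clojure.core-style array fns, no ^doubles hints required:
    (dbl/aget xs 1)                 ;=> 5.0
    (dbl/aset xs 1 2.0)
    (dbl/alength xs)                ;=> 3

    ;; Iteration macros sharing the common binding syntax:
    (dbl/amake [i 5] (* 2.0 i))     ; new 5-element array of 2i
    (dbl/amap [x xs] (inc x))       ; new array, one result per element
    (dbl/afill! [x xs] (* 10.0 x))  ; mutates xs in place

    ;; 'Mathy' operations:
    (dbl/asum xs)                   ; sum of all elements
    (dbl/amax xs)                   ; maximal element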

You might find some of these operations useful, even if your data isn’t already in arrays. For example, a function like the one sketched below can find the top 1000 elements of a vector xs by score-fn about 20 times faster than (take 1000 (reverse (sort-by score-fn xs))) on 100k elements.
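One way to write it is to score everything into a primitive array, sort a clone of the scores to find the cutoff, and then keep the winners; a minimal sketch:

    (require '[hiphip.double :as dbl])

    (defn top-k [k score-fn xs]
      (let [v      (vec xs)
            scores (dbl/amake [i (count v)] (double (score-fn (v i))))
            ^doubles sorted (aclone scores)]
        (java.util.Arrays/sort sorted)
        ;; The k-th largest score is the cutoff for membership in the top k.
        (let [cutoff (aget sorted (- (alength sorted) k))]
          (->> v
               (filter #(>= (double (score-fn %)) cutoff))
               (sort-by score-fn >)
               (take k)))))

Scoring, cloning, and sorting all happen on unboxed doubles, which is where the speedup over the boxed sort-by version comes from.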

HipHip also provides a generic namespace, hiphip.array, with variants of the iteration macros that work on all (hinted) array types, including mixing and matching a variety of array types in the same operation.
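For instance, a sketch of mixing a double array and a long array; we’re assuming the generic macros reuse the per-type binding syntax, with ^doubles/^longs hints supplying the element types (see the readme for the exact syntax):

    (require '[hiphip.array :as arr])

    (defn add-arrays [^doubles ds ^longs ls]
      ;; Element-wise sum of a double array and a long array.
      (arr/amap [d ds l ls] (+ d l)))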

HipHip uses Criterium for benchmarking and runs its benchmarks as tests, so we can be pretty confident that most of our double array operations run 0-50% slower than Java (we’re not quite there yet for other array types; see the ‘Known Issues’ section of the readme for details). The readme also describes a critical option for fast array math if you’re running under Leiningen.
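The usual culprit is Leiningen’s default JVM options, which favor fast startup over JIT optimization; replacing them with an empty vector in project.clj restores full HotSpot compilation. A sketch (coordinates and versions here are illustrative):

    (defproject example "0.1.0-SNAPSHOT"
      :dependencies [[org.clojure/clojure "1.5.1"]
                     [prismatic/hiphip "0.1.0"]]
      ;; Leiningen passes -XX:TieredStopAtLevel=1 by default, which caps
      ;; JIT optimization; ^:replace clears that for fast array math.
      :jvm-opts ^:replace [])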

Check it out and let us know what you think in the Hacker News thread.