My Kraków adventure. Day one of EuroClojure 2014.

My notes on multi-armed-bandit optimisation strategies make a lot of sense. To me.
The beautiful view from the conference centre.

Last week I attended EuroClojure 2014. It was a truly fantastic conference in the beautiful city of Kraków. While the big conferences in the US attract thousands of participants, this one was rather cosy with some 300 participants. As a very welcome side effect, the conference was single-track, so I missed none of the great talks.

If you do not know Clojure yet, let me start with a very short primer: Clojure is a modern, functional programming language targeting the Java virtual machine. It is a Lisp dialect, designed for concurrency, performance and code that is easy to understand and thus easy to reason about. One of the most outstanding features of Clojure is its immutable, persistent data structures, built directly into the language. With ClojureScript there is also a version of Clojure that targets JavaScript rather than the JVM as a runtime.

Having built otto.de in Java, we always find modern programming languages that also target the JVM interesting. We already have Scala in use and are quite satisfied with it. Given that many aspects of a web application are functional in nature, I am sure a language like Clojure will be the best tool for quite a few of our future challenges.

So here is what I learned from all the great talks on day one. A follow-up on day two will come shortly. I will update this post when videos or slides are published. If you need more detail, you should have a look at the amazingly detailed notes by Philip Potter.

The opener was a talk by Fergal Byrne. He told us about Clortex, a machine intelligence project he is working on that tries to resemble the structure of the human brain’s neocortex as closely as possible. The general strategy is to first study the brain and then build machine learning software based on it. Thus he started his talk with some interesting insights into how the neocortex in humans and other mammals is structured. There are several hierarchical layers of neurons, and each layer works roughly the same way. Neurons aggregate signals from hierarchically lower layers, and their output is aggregated again in the higher layers. Thus the neocortex works with sequences of sequences of neural activation patterns. If you know a little Clojure, you might guess that it really shines in mapping such a problem to code. Fergal’s work is based on practical and theoretical groundwork by Jeff Hawkins (of Palm Pilot fame), who in 2013 released NuPIC as open source, software with the same intentions but built with C++ and Python. While the talk was on a super interesting general topic, there were also some interesting lessons on software development. As a developer I found it interesting how Fergal strictly classifies every part of his software as either core or integration, with core being as minimal as possible. Fergal is currently working on a book on machine learning which you can find on Leanpub.

Next came a talk by Logan Campbell, who shared his experience of introducing Clojure at the Australia Post Digital MailBox. One million lines of Java code, with maintenance outsourced offshore, were to be replaced. An interesting part was how he convinced his fellow developers that Clojure could be a better solution than Scala. One objection, supposedly quite common, was the lack of a type system. It is indeed the case that Clojure does not come with type checking, but this is also a perfect example of the power of a Lisp like Clojure. While other languages have to be built around typing as a core feature, Clojure does not. The core.typed library adds type checking if and only if you need it. Logan showed different implementations for asynchronous request handling that he had evaluated and worked out their syntactical and functional differences. In the end Logan gave one very practical tip: if you want to introduce Clojure in your organization, do not go for the big bang release. Go live with something small first. It will be much simpler to solve problems and address concerns on a smaller scale.

Tommy Hall’s talk on escaping the DSL hell by using parentheses (i.e. Clojure) all the way down was equally enlightening and hilarious. It started with a little shock for me, as I was not aware just how badly Puppet fails at namespacing. With this example and those of other configuration management tools, Tommy demonstrated that it is usually not the best of ideas to invent your own language: a) you might not really need to, and b) it will likely be rubbish. So why not c) look into Clojure first? Tommy then discussed Geomlab, which aims to teach functional programming to children. It has a syntax roughly similar to that of Haskell, and it does the job quite well. But once the kids have explored all of Geomlab’s features, they are stuck. To solve other problems, real world problems even, they have to learn another language. Tommy showed how Geomlab is easily reimplemented in ClojureScript, a language which allows the kids to keep investigating and to solve problems outside the DSL’s scope. See the demo on cljsfiddle (press the little play button). Tommy concluded with how the same fundamental problem (and solution) applies to many if not all other DSLs.

Mathieu Gauthron showcased jvm-breakglass. It lets you easily (by means of an extra Maven dependency, that is) deploy an nREPL server with your JVM application. Once you have connected with any off-the-shelf nREPL client, you can analyze and even modify the state of your application using Clojure commands in that REPL. While that is of course every operations department’s worst nightmare, it is also more powerful than most other troubleshooting methods from the classical Java toolbox like debugging, JMX or plain log messages. Among the things you can do to your application are accessing public and private members, listing and rewiring Spring beans and many others. I cannot imagine using this in production myself, but especially if your software is hard to deploy, jvm-breakglass might be a lifesaver one day.

Gary Crawford began his talk on sentiment analysis of the twittersphere with an absorbing recitation from Leiningen vs. the Ants. He then introduced his project on disease prediction using Twitter messages as an input. It was interesting to see how much research went into standardised questionnaires like PANAS. PANAS-t, an adaptation of it, can be used to classify the sentiment of the authors of tweets. The core of Gary’s talk was about strategies and techniques to make big data manageable. Tweets are preprocessed and mapped to bitmaps, which are then much cheaper to operate on. Geospatial lookup can be done by mapping geo coordinates to pixels in a precoloured image. Views with different temporal granularity can be implemented efficiently using multiple keys in Redis. Two things Gary has not yet achieved: predicting the result of the Scottish independence referendum and correctly pronouncing „Datensparsamkeit“. Anyway, he had a very good point about the latter. Gary’s slides are on SlideShare.
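
To illustrate two of these tricks, here is a minimal sketch. It is written in Python rather than Clojure and is not Gary's code. The tiny region "image", the equirectangular projection, the key naming scheme and all function names are my own assumptions, and a plain dict stands in for Redis.

```python
from datetime import datetime


def geo_to_pixel(lat, lon, width, height):
    """Map latitude/longitude to (x, y) in an equirectangular world image."""
    x = int((lon + 180.0) / 360.0 * width)
    y = int((90.0 - lat) / 180.0 * height)
    # Clamp the edge cases lon == 180 and lat == -90 into the image.
    return min(x, width - 1), min(y, height - 1)


# Hypothetical 4x2 "image": each pixel holds a region code instead of a colour.
REGION_IMAGE = [
    ["NW", "NW", "NE", "NE"],
    ["SW", "SW", "SE", "SE"],
]


def region_for(lat, lon):
    """Geospatial lookup is just reading the value of one pixel."""
    x, y = geo_to_pixel(lat, lon, width=4, height=2)
    return REGION_IMAGE[y][x]


def time_keys(metric, ts):
    """One counter key per temporal granularity (year, month, day, hour)."""
    return [
        f"{metric}:{ts:%Y}",
        f"{metric}:{ts:%Y-%m}",
        f"{metric}:{ts:%Y-%m-%d}",
        f"{metric}:{ts:%Y-%m-%dT%H}",
    ]


def record(counters, metric, ts):
    """Increment every granularity; with Redis this could be one INCR per key."""
    for key in time_keys(metric, ts):
        counters[key] = counters.get(key, 0) + 1


counters = {}
record(counters, "sad", datetime(2014, 6, 26, 9, 30))
print(region_for(50.06, 19.94))   # Kraków lands in the "NE" pixel
print(sorted(counters))
```

The design choice in both cases is to pay at write or preprocessing time so that a later query is a single cheap lookup: one pixel read instead of a polygon test, one key read instead of a scan over all tweets.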

Next was Paul Ingles, who presented on multi-armed bandit optimisation in Clojure. Product optimisation cycles are usually too long, complex and slow. Using traditional A/B testing, many organizations do not manage to conduct more than maybe a dozen tests a year, which is not a good number at all. With multi-armed bandit optimisation, the application under development has a number of options (e.g. for the sorting of content). It gets feedback about the success of each of them (e.g. clicks on that content) and presents the best option to the majority of users. All the while it still uses the other options to probe their current success. Paul first showed the epsilon-greedy algorithm and then Thompson sampling, which is in many respects superior to the former. As in many of the other talks, the implementation in Clojure was pretty straightforward. Paul even built a little video portal called Notflix, which determined both the sorting of videos and the choice of the best thumbnail using multi-armed bandit optimisation.
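
To make the two strategies a bit more concrete, here is a minimal sketch. It is in Python rather than the Clojure Paul showed, and it is not his implementation. The class names, the simulate helper and the click rates are made up for illustration, and rewards are assumed to be simple click / no click.

```python
import random


class EpsilonGreedy:
    """Show the best-known option, but explore a random one with probability epsilon."""

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.pulls = {arm: 0 for arm in arms}
        self.rewards = {arm: 0 for arm in arms}

    def _mean(self, arm):
        return self.rewards[arm] / self.pulls[arm] if self.pulls[arm] else 0.0

    def choose(self):
        arms = list(self.pulls)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(arms)      # explore
        return max(arms, key=self._mean)      # exploit

    def update(self, arm, reward):
        self.pulls[arm] += 1
        self.rewards[arm] += reward


class ThompsonSampling:
    """Keep a Beta(clicks + 1, misses + 1) belief per option and sample from it."""

    def __init__(self, arms, rng=None):
        self.rng = rng or random.Random()
        self.clicks = {arm: 0 for arm in arms}
        self.misses = {arm: 0 for arm in arms}

    def choose(self):
        # Draw one plausible click rate per option and show the highest draw.
        draws = {
            arm: self.rng.betavariate(self.clicks[arm] + 1, self.misses[arm] + 1)
            for arm in self.clicks
        }
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        if reward:
            self.clicks[arm] += 1
        else:
            self.misses[arm] += 1


def simulate(bandit, click_rates, rounds, rng):
    """Serve `rounds` users and count how often each option was shown."""
    shown = {arm: 0 for arm in click_rates}
    for _ in range(rounds):
        arm = bandit.choose()
        reward = 1 if rng.random() < click_rates[arm] else 0
        bandit.update(arm, reward)
        shown[arm] += 1
    return shown


rng = random.Random(42)
rates = {"newest-first": 0.1, "most-viewed-first": 0.6}  # hypothetical click rates
print(simulate(EpsilonGreedy(rates, rng=rng), rates, 2000, rng))
print(simulate(ThompsonSampling(rates, rng=rng), rates, 2000, rng))
```

In both runs the better option should end up being shown to most users. The difference is in how the exploring happens: epsilon-greedy keeps spending a fixed share of traffic on random options, while Thompson sampling explores less and less as its beliefs about the click rates sharpen.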

Tommi Reiman from Finland started out with some philosophical discussion about sausages. If I got it right, the best sausages come from either Kraków, Poland or Tampere, Finland. Tommi showed how the schema library is used to declaratively describe data formats with simple maps. The description can then be used for validation, transformation and coercion of data. Tommi went on to show how such a description can be used to generate documentation for the API of your software. In particular, he presented ring-swagger and compojure-api, which conveniently embed this into ring and compojure respectively. I found this talk interesting because it showed how the work you invest in the everyday problem of sanitizing user input can be reused for automatic self-documentation of your software. Tommi’s slides are on SlideShare.
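
The idea of describing data with a plain map and reusing that description can be sketched in a few lines. Note that this is Python and deliberately not the API of the schema library, ring-swagger or compojure-api. The SAUSAGE schema and the validate, coerce and describe functions are hypothetical names I made up for illustration.

```python
# A data format described declaratively as a plain map of key -> expected type.
SAUSAGE = {
    "name": str,
    "origin": {"city": str, "country": str},
    "length_cm": int,
}


def validate(schema, data):
    """Return a map of errors; an empty map means the data matches the schema."""
    errors = {}
    for key, expected in schema.items():
        if key not in data:
            errors[key] = "missing"
        elif isinstance(expected, dict):
            if not isinstance(data[key], dict):
                errors[key] = "expected a map"
            else:
                nested = validate(expected, data[key])
                if nested:
                    errors[key] = nested
        elif not isinstance(data[key], expected):
            errors[key] = f"expected {expected.__name__}"
    for key in data:
        if key not in schema:
            errors[key] = "unexpected key"
    return errors


def coerce(schema, data):
    """Convert string input (e.g. from a query string) to the numeric types named in the schema."""
    result = {}
    for key, value in data.items():
        expected = schema.get(key)
        if isinstance(expected, dict) and isinstance(value, dict):
            result[key] = coerce(expected, value)
        elif expected in (int, float) and isinstance(value, str):
            try:
                result[key] = expected(value)
            except ValueError:
                result[key] = value  # leave it as-is; validation will report it
        else:
            result[key] = value
    return result


def describe(schema):
    """Derive a documentation-friendly description from the very same schema."""
    return {
        key: describe(t) if isinstance(t, dict) else t.__name__
        for key, t in schema.items()
    }


raw = {"name": "Krakowska", "origin": {"city": "Kraków", "country": "Poland"}, "length_cm": "30"}
print(validate(SAUSAGE, raw))                   # the string "30" is not an int yet
print(validate(SAUSAGE, coerce(SAUSAGE, raw)))  # after coercion there are no errors
print(describe(SAUSAGE))
```

The point is that the schema is ordinary data, so validation, coercion and documentation are just three different functions walking the same map.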

Renzo Borgatti followed with an intense look behind the curtains of Clojure itself. Renzo showed code from monsters like RT.java, Compiler.java and LispReader.java. It was impressive to see how many lines of code make up the implementation of Clojure itself. It required some massive scrolling by Renzo to show them all. The look back in history was also interesting. I had not known that Clojure was first implemented in Common Lisp and at that time compiled to both Java and C#. You can find Renzo’s slides on GitHub.

Meta-eX live coding music

Last but not least was the keynote by Rich Hickey. The surprise topic was: core.async channels! Channels are a nifty way of communicating between concurrent processes. A very similar concept had previously been implemented in the Go language. Rich went into quite some detail about how channels are implemented and how much thought was put into avoiding all but the most granular mutexes. One remarkable feature of the alt! function is the guarantee that a piece of data is sent over or received from exactly one channel. The alt! function is atomic without requiring any channel-wide or even cross-channel locking. This is a minor but effective improvement over the implementation in Go.
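
For readers who have not used channels, here is a very rough sketch of what "exactly one channel" means from the caller's side. It is Python, not Clojure, and it says nothing about how core.async implements alt! internally. It uses the standard library's queue.Queue (which does lock internally) as stand-in channels, it never blocks, and it assumes a single consumer. The alt function below is my own hypothetical helper.

```python
import queue
import random


def alt(channels, rng=random):
    """Take one value from exactly one of the channels that currently has data.

    Returns (value, channel), or (None, None) if no channel is ready.
    """
    candidates = list(channels)
    rng.shuffle(candidates)  # no fixed priority among ready channels
    for ch in candidates:
        try:
            # The first successful take ends the operation,
            # so every other channel is left untouched.
            return ch.get_nowait(), ch
        except queue.Empty:
            continue
    return None, None


clicks, views = queue.Queue(), queue.Queue()
clicks.put("click-1")
views.put("view-1")

value, source = alt([clicks, views])
print(value)                            # either "click-1" or "view-1"
print(clicks.qsize() + views.qsize())   # the other value is still waiting
```

The hard part that Rich talked about is giving this guarantee when many processes are blocked on many channels at once; the sketch sidesteps all of that.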

The day ended with an appearance by Meta-eX. The duo does live performances of electronic music. The music is composed with Clojure. Live. From a text editor. They deliver quite an impressive show and are yet another example of the wide spectrum of possible applications of Clojure. Find more info, music and videos on their homepage.

All in all, day one was a lot of fun and there were a lot of great talks. My summary of day two will follow next week.