Calling R from Clojure with rojure

In this short post I will show how to call R from Clojure using the rojure library.

Requirements

First, we need to add rojure to the dependencies of our Clojure project:

[rojure "0.2.0"]
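For a Leiningen project, the dependency vector goes into the `:dependencies` list of `project.clj`. The following is only a sketch: the project name `rojure-demo` and the Clojure version are placeholders I made up for illustration, and only the rojure coordinate comes from this post.

```clojure
;; project.clj
;; "rojure-demo" and the Clojure version are hypothetical placeholders;
;; only the [rojure "0.2.0"] coordinate is taken from this post.
(defproject rojure-demo "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [rojure "0.2.0"]])
```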

Second, we need to start an R process on our machine, install the “Rserve” package (if it is not installed yet), and load it.

# install.packages("Rserve")
library(Rserve)
Rserve()

This starts a server process that the rojure library can connect to in order to send R data and code for execution.

Examples

Evaluating R code

This is the simplest usage. We can evaluate arbitrary R code, given as a string:

(use '(rojure core))
(use '[clojure.core.matrix :as m])
(def r (get-r))
(println (r-eval r "sessionInfo()"))
## {R.version {platform [x86_64-pc-linux-gnu], arch [x86_64], os [linux-gnu], system [x86_64, linux-gnu], status [], major [3], minor [4.1], year [2017], month [06], day [30], svn rev [72865], language [R], version.string [R version 3.4.1 (2017-06-30)], nickname [Single Candle]}, platform [x86_64-pc-linux-gnu (64-bit)], locale [LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C], running [Arch Linux], basePkgs [stats graphics grDevices utils datasets methods base], loadedOnly {compiler {Package [compiler], Version [3.4.1], Priority [base], Title [The R Compiler Package], Author [Luke Tierney <luke-tierney@uiowa.edu>], Maintainer [R Core Team <R-core@r-project.org>], Description [Byte code compiler for R.], License [Part of R 3.4.1], Built [R 3.4.1; ; 2017-09-08 12:26:18 UTC; unix]}}, matprod [default], BLAS [/usr/lib/libblas_nehalemp-r0.2.19.so], LAPACK [/usr/lib/liblapack.so.3.7.1]}

Clojure -> R

Clojure objects can be set in the R session. This works for:

  • sequences / vectors
  • core.matrix dataset
  • core.matrix matrix
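As a minimal sketch of the first case, a plain Clojure vector can be written into the R session with `r-set!` and then used from R code passed to `r-eval`. It assumes an Rserve process is already running, as described in the requirements, and the variable name `xs` is just an illustrative choice.

```clojure
(use '(rojure core))

;; Assumes an Rserve process is already running on this machine.
(def r (get-r))

;; Copy a Clojure vector into the R session under the (arbitrary) name "xs".
(r-set! r "xs" [1.0 2.0 3.0 4.0])

;; Evaluate R code that refers to the variable we just set.
(def xs-mean (r-eval r "mean(xs)"))
(println xs-mean)
```

The same pattern should apply to core.matrix datasets and matrices, with the Clojure object passed as the third argument of `r-set!`.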

R -> Clojure

Getting any data.frame from R returns it as a core.matrix dataset.

(use '(rojure core))
(use '[clojure.core.matrix :as m]
     '[clojure.core.matrix.dataset :as ds])
(def r (get-r))
(def iris (r-get r "iris"))
(println "shape: " (m/shape iris))
(println "column names: " (ds/column-names iris))
(println "class: " (class iris))
(println "First 5 of Sepal.Length" (take 5 (ds/column iris "Sepal.Length")))
## shape:  [150 5]
## column names:  [Sepal.Length Sepal.Width Petal.Length Petal.Width Species]
## class:  clojure.core.matrix.impl.dataset.DataSet
## First 5 of Sepal.Length (5.1 4.9 4.7 4.6 5.0)

The same works for matrices, which become data structures on which the functions from clojure.core.matrix can be called.

(use '(rojure core))
(use '[clojure.core.matrix :as m])
(def r (get-r))
(def m (r-eval r "diag(1, 4, 4)"))
(println "shape: " (m/shape m))
(println "class " (class m))
(println (m/pm m))
## shape:  [4 4]
## class  clojure.lang.PersistentVector
## [[1.000 0.000 0.000 0.000]
##  [0.000 1.000 0.000 0.000]
##  [0.000 0.000 1.000 0.000]
##  [0.000 0.000 0.000 1.000]]
## nil

The concrete implementation of the matrix class that gets created depends on the current default, which can be changed by calling “m/set-current-implementation”.
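A minimal sketch of switching the implementation before fetching a matrix could look like this. It assumes the vectorz-clj library (which provides the `:vectorz` implementation for core.matrix) has been added to the project; that dependency is not part of the setup described in this post. The name `mat` is an arbitrary choice.

```clojure
(use '(rojure core))
(use '[clojure.core.matrix :as m])

;; Assumption: vectorz-clj is on the classpath, otherwise :vectorz cannot be loaded.
(m/set-current-implementation :vectorz)

;; Assumes an Rserve process is already running on this machine.
(def r (get-r))

;; Fetch the same 4x4 identity matrix as before and inspect its class.
(def mat (r-eval r "diag(1, 4, 4)"))
(println "shape: " (m/shape mat))
(println "class " (class mat))
```

If the conversion follows the current default as described, the printed class should differ from the PersistentVector shown earlier, while the shape stays the same.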

Using r-transform to train/predict a classifier

In the following example I call the quick start code of the R package mlr from Clojure to train a classifier.

The “iris” data comes from “incanter/get-dataset”, gets written into the R session, and then we call the mlr R functions from Clojure.

I demonstrate two versions of the same R code, using either “with-r-eval” or “r-transform”.

Using with-r-eval

Here the R code is given as a sequence of strings.

(use '(rojure core))
(use '[clojure.core.matrix :as m])
(require '[incanter.core :as ic])
(require '[incanter.datasets :as id])

(def r (get-r))
(def iris-ds (id/get-dataset :iris))
; change col names, the mlr package does not like ":" in variable names
(def iris-ds (ic/rename-cols {:Sepal.Length "Sepal.Length"
                             :Sepal.Width "Sepal.Width"
                             :Petal.Length "Petal.Length"
                             :Petal.Width "Petal.Width"
                             :Species "Species"
                              } iris-ds))

(r-set! r "iris" iris-ds)

(println (with-r-eval r
  "library(mlr)"
  "task = makeClassifTask(data = iris, target = 'Species')"
  "lrn = makeLearner('classif.lda')"
  "n = nrow(iris)"
  "train.set = sample(n, size = 2/3*n)"
  "test.set = setdiff(1:n, train.set)"
  "model = train(lrn, task, subset = train.set)"
  "pred = predict(model, task = task, subset = test.set)"
  "performance(pred, measures = list(mmce, acc))"))
## [0.02 0.98]

Using r-transform

The same can be achieved by calling r-transform and giving it an R script to execute.

(use '(rojure core))
(use '[clojure.core.matrix :as m])
(require '[incanter.core :as ic])
(require '[incanter.datasets :as id])

(def r (get-r))
(def iris-ds (id/get-dataset :iris))
; change col names, the mlr package does not like ":" in variable names
(def iris-ds (ic/rename-cols {:Sepal.Length "Sepal.Length"
                             :Sepal.Width "Sepal.Width"
                             :Petal.Length "Petal.Length"
                             :Petal.Width "Petal.Width"
                             :Species "Species"
                              } iris-ds))

        
(println (r-transform r iris-ds "mlr.R"))
## [0.04 0.96]

The R code in ‘mlr.R’ looks like this (notice the use of ‘in_’ and ‘out_’ for the input and output data.frames):

library(mlr)
task = makeClassifTask(data = in_, target = "Species")
lrn = makeLearner("classif.lda")
n = nrow(in_)
train.set = sample(n, size = 2/3*n)
test.set = setdiff(1:n, train.set)
model = train(lrn, task, subset = train.set)
pred = predict(model, task = task, subset = test.set)
out_ <- performance(pred, measures = list(mmce, acc))

Transforming S3 classes

S3 objects returned by R functions are converted to Clojure maps.

(use '(rojure core))
(use '[clojure.core.matrix :as m])
(def r (get-r))
(println (r-eval r "sessionInfo()"))
## {R.version {platform [x86_64-pc-linux-gnu], arch [x86_64], os [linux-gnu], system [x86_64, linux-gnu], status [], major [3], minor [4.1], year [2017], month [06], day [30], svn rev [72865], language [R], version.string [R version 3.4.1 (2017-06-30)], nickname [Single Candle]}, platform [x86_64-pc-linux-gnu (64-bit)], locale [LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C], running [Arch Linux], basePkgs [stats graphics grDevices utils datasets methods base], loadedOnly {compiler {Package [compiler], Version [3.4.1], Priority [base], Title [The R Compiler Package], Author [Luke Tierney <luke-tierney@uiowa.edu>], Maintainer [R Core Team <R-core@r-project.org>], Description [Byte code compiler for R.], License [Part of R 3.4.1], Built [R 3.4.1; ; 2017-09-08 12:26:18 UTC; unix]}}, matprod [default], BLAS [/usr/lib/libblas_nehalemp-r0.2.19.so], LAPACK [/usr/lib/liblapack.so.3.7.1]}

S3 objects are converted to maps of maps, for example an object of class “lm” produced by the R function “lm”.

(use '(rojure core))
(def r (get-r))
(println (keys (r-eval r "lm(Fertility ~ . , data = swiss)")))

(println (get (r-eval r "lm(Fertility ~ . , data = swiss)")
     "coefficients"))
## (coefficients residuals effects rank fitted.values assign qr df.residual xlevels call terms model)
## [66.91518167896871 -0.1721139709414554 -0.2580082398347245 -0.8709400629394237 0.10411533074376746 1.0770481406909869]

Anecdote

This blog post is written in R Markdown and contains executable code blocks written in Clojure, which in turn call R. So R calls Clojure, which calls R …