Using R in Rails

In the era of big data, using the right tools for statistical analysis is essential. A well-written R script does a great job of computing statistics from a data set.

First, it's important to understand what R is. You can read up on it on the R Project website, but in short it's a language and environment for statistical computing and graphics. Imagine taking tons of data and plotting and/or analyzing it. Below we'll talk about how we took this awesome tool and integrated it into a Ruby on Rails site for a client of ours.

Integrating an R script into a Rails app is fairly easy. You’ll first need to make sure R is installed on the server where the Rails app will be running. You can download it from the R Project website. Once R is installed, the ‘rinruby’ gem makes it possible to run a script written in R from within a Rails app. Here’s how...

First, install the ‘rinruby’ gem and add it to your Gemfile. Then require rinruby in the controller that needs it.

# Gemfile
gem 'rinruby'
# main_controller.rb
require 'rinruby'

To keep our code base organized, we'll place the R script in a helper. You can also reference an external file. After including the helper in the controller, we need to create a new RinRuby connection. Then we can evaluate single line or multiline R commands.

# Multiline R command with rinruby in sample_helper.rb
def r_script
  r = RinRuby.new # establish a new RinRuby connection
  r.eval <<-EOF
    # multiline R script
    # goes here
  EOF
  # single-line R command with rinruby
  r.eval 'R command here'

In order to perform data analysis, we need to load data for the R script to analyze. Below is an example of a basic matrix. It can be loaded from an external file, commonly a CSV, or by building the data frame from variables.

      col_1 col_2 col_3
row_1 a     6.9   9
row_2 b     7.3   6
row_3 c     4.7   8

Loading from a file is fairly simple...

r.eval "mydata <- read.csv('filename.txt')"
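If the data starts out in Ruby rather than on disk, one option is to write it to a temporary CSV first and point read.csv at that file. The sketch below assumes Ruby's standard csv and tempfile libraries; write_csv_for_r is a hypothetical helper name, and the rows are simply the sample matrix from above.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: write a header and rows to a temporary CSV file and
# return the Tempfile so the caller decides when it is deleted.
def write_csv_for_r(header, rows)
  file = Tempfile.new(['r_input', '.csv'])
  contents = CSV.generate do |csv|
    csv << header
    rows.each { |row| csv << row }
  end
  file.write(contents)
  file.flush # make sure the data is on disk before R reads it
  file
end

header = %w[col_1 col_2 col_3]
rows   = [['a', 6.9, 9], ['b', 7.3, 6], ['c', 4.7, 8]]

csv_file = write_csv_for_r(header, rows)

# Text of the R command that could then be handed to r.eval.
load_command = "mydata <- read.csv('#{csv_file.path}')"
```

Once R has read the file, calling csv_file.close! removes it again.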

Loading from a database requires a few more steps. Let's load the data through Active Record so it is interpreted as the above matrix. When loading from Active Record results, a few things must be kept in mind. Data types (number, character or logical) within a column must be the same. It is often easier to load by column so that enclosing string values in quotes can be applied to an entire column set at one time.

Here is how we could load data from the last 3 records.

# Obtain the last 3 records from the Record model
data = Record.last(3)

# Add single quotes around each string value
col_1 = data.pluck(:col_1).map { |c1| "'#{c1}'" }
col_2 = data.pluck(:col_2)
col_3 = data.pluck(:col_3)

# Then we can load each column into a vector.
# Join the values so R receives c('a', 'b', 'c') rather than a Ruby array literal.
r.eval "col_1 <- c(#{col_1.join(', ')})"
r.eval "col_2 <- c(#{col_2.join(', ')})"
r.eval "col_3 <- c(#{col_3.join(', ')})"

# Load each column vector into a data frame and display the data as a matrix...
r.eval "sample_data <- data.frame(col_1, col_2, col_3, row.names=c('row_1', 'row_2', 'row_3'))"

# Now we can display the sample data matrix
r.eval "head(sample_data)"
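The per-column quoting above can be wrapped in a small helper so each column is converted the same way. This is only a sketch: r_vector is a hypothetical name, not part of rinruby, and it assumes each column holds strings, numbers, booleans or nil.

```ruby
# Hypothetical helper (not part of rinruby): turn a Ruby array into the text
# of an R vector literal such as c('a', 'b', 'c').
def r_vector(values)
  items = values.map do |value|
    case value
    when nil     then 'NA'
    when true    then 'TRUE'
    when false   then 'FALSE'
    when Numeric then value.to_s
    else
      # Escape backslashes and single quotes, then wrap the string in quotes.
      "'" + value.to_s.gsub('\\') { '\\\\' }.gsub("'") { "\\'" } + "'"
    end
  end
  "c(#{items.join(', ')})"
end

strings = r_vector(%w[a b c])
numbers = r_vector([6.9, 7.3, 4.7])
```

With a helper like this, a vector assignment could be written as r.eval "col_1 <- #{r_vector(values)}", passing the raw values from pluck so the manual quoting step is no longer needed.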

To reference an external file, you can use the following syntax...

r.eval "source('#{Rails.root}/assets/randomscript.R')"

Now we can run the R script by calling the helper method that contains it.

def run_r_script
  # run R script in helper
  @results = r_script
end

Once the script has run, you can pull results out of the R interpreter with the ‘pull’ command. For example, if the final results are returned in the R script with ‘return(final)’, then you can retrieve that R variable in Ruby with r.pull 'final'.

Since we ran the R script in a helper, we can return the result set so it is available as a Ruby object, and then we’ll close the R connection.

  final = r.pull 'final' # copy the R variable into Ruby
  r.quit                 # close the R connection
  final                  # return the result set
end
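If the R script raises partway through, the connection opened in the helper is never closed. One way to guard against that is an ensure block. The sketch below is an assumption about how you might structure it, not something rinruby provides: with_r is a hypothetical helper, and FakeRConnection is a stand-in that mimics only eval, pull and quit so the example runs without R installed.

```ruby
# Stand-in for a RinRuby connection so this sketch runs without R installed.
# It only mimics the three calls used in this article: eval, pull and quit.
class FakeRConnection
  attr_reader :commands

  def initialize
    @commands = []
    @closed = false
  end

  def eval(code)
    @commands << code
  end

  def pull(_name)
    [1.0, 2.0, 3.0] # canned value for illustration only
  end

  def quit
    @closed = true
  end

  def closed?
    @closed
  end
end

# Hypothetical helper: yield the connection and always close it afterwards,
# even when the block raises.
def with_r(connection)
  yield connection
ensure
  connection.quit
end

fake = FakeRConnection.new
result = with_r(fake) do |r|
  r.eval 'final <- c(1, 2, 3)'
  r.pull 'final'
end
```

In the Rails helper you would pass RinRuby.new in place of the stand-in; the block's last expression becomes the return value, just as in r_script.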

So now the returned value for the ‘r_script’ helper method is the result set stored in the R script variable 'final'. Since we assigned those results to an instance variable, they are available in the corresponding view. The results will come back as a matrix, but if you prefer you can convert them to an array.


@results = r_script.to_a
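To see what that conversion produces, here is a small sketch that assumes the pulled value is an instance of Ruby's Matrix class. The Matrix below is a hand-built stand-in with made-up numbers, not real output from R.

```ruby
require 'matrix'

# Stand-in for a value pulled from R; the numbers are for illustration only.
pulled = Matrix[[6.9, 9.0], [7.3, 6.0], [4.7, 8.0]]

# to_a gives one inner array per row.
rows = pulled.to_a

# column_vectors gives one Vector per column, which can also become arrays.
columns = pulled.column_vectors.map(&:to_a)
```

Row arrays are convenient for rendering a table in the view, while column arrays are handy when each column feeds a separate chart series.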


Interested in learning more? Our resident R / Ruby on Rails guru MJ will be doing a lunch and learn at our office this Friday at 12:30pm. We'll have pizza and you can come hang out w/ the team and learn more about R. Our address:


675 Drewry St. NE
Studio 4
Atlanta, GA 30306

MJ
Dec. 08 2014