Tuesday, 16 September 2014

Notes from the Kölner R meeting, 12 September 2014

Last Friday we had guests from Belgium and the Netherlands joining us in Cologne. Maarten-Jan Kallen from BeDataDriven came from The Hague to introduce us to Renjin, and the guys from DataCamp in Leuven, namely Jonathan, Martijn and Dieter, gave an overview of their new online interactive training platform.


Maarten-Jan gave a fascinating introduction to Renjin, an R interpreter in the Java virtual machine (JVM).
Why? Suppose all your other applications are in the Java ecosystem; then an R engine in the JVM lets you use your existing tools for profiling/debugging, project/dependency management, release/repository management, continuous integration, component lifecycle management, etc. Additionally, it would allow you to host R applications in the cloud via services such as Google App Engine, without the need to manage your own server.

Renjin's base is meant to be 100% compatible with R base version 2.14.2. Currently about 3/4 of all R primitive functions are implemented. Still, Renjin looks quite powerful already. You can test Renjin on Google App Engine.

You can find Maarten-Jan's slides on the Renjin website, which is also the best place to get started. If you want to dive deeper into the sources, then visit the Renjin repository on GitHub.

By the way, Renjin is not the only R engine that has emerged alongside GNU R. Other popular engines are pqR, fastR and TERR, all with their individual aims and purposes.


Over the last year the guys behind DataCamp have created a lot of momentum in the online R training space. Alongside their main site DataCamp.com they also develop and maintain Rdocumentation.org and R-Fiddle.org.

The slides can be accessed here.

DataCamp's ambitions are big: they want to make technical training scalable. Instructors can create course work in R Markdown and host it on DataCamp. Students can follow the interactive course and carry out exercises as they go along. The clever software behind DataCamp provides immediate feedback to the students, without the need for the instructor to mark their homework. DataCamp already provides access to some exciting free courses, with premium courses on special subjects such as data.table to launch soon.

Next Kölner R meeting

The next meeting is scheduled for 12 December 2014.

Please get in touch if you would like to present and share your experience, or indeed if you have a request for a topic you would like to hear more about. For more details see also our Meetup page.

Thanks again to Bernd Weiß for hosting the event and Revolution Analytics for their sponsorship.

Tuesday, 9 September 2014

Next Kölner R User Meeting: Friday, 12 September 2014

The next Cologne R user group meeting is scheduled for this Friday, 12 September 2014.

We have a great agenda with international speakers:
  • Maarten-Jan Kallen: Introduction to Renjin, the R interpreter for the JVM
  • Jonathan Cornelissen, Martijn Theuwissen: DataCamp - An online interactive learning platform for R
The event will be followed by drinks and schnitzel at the Lux.

For further details visit our KölnRUG Meetup site. Please sign up if you would like to come along. Notes from past meetings are available here.

The organisers, Bernd Weiß and Markus Gesmann, gratefully acknowledge the sponsorship of Revolution Analytics, who support the Cologne R user group as part of their vector programme.


Tuesday, 2 September 2014

Zoom, zoom, googleVis

The Google Charts API is quite powerful and via googleVis you can access it from R. Here is an example that demonstrates how you can zoom into your chart.

In the example below I set the maximum zoom level to 5% of the chart. Drag and pan with the left mouse button to zoom in; use a right mouse click to zoom out again. The functionality is available in other core charts as well, such as line, column and bar charts. For more configuration options of the explorer settings visit the Google documentation.

R code
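A minimal sketch of such a chart, assuming the googleVis package is installed. The simulated data frame and the object names are made up for illustration; the explorer setting is passed through to the Google Charts API as a JSON string, with maxZoomIn set to 0.05 to match the 5% mentioned above.

```r
library(googleVis)

# Hypothetical data: a noisy parabola, purely for illustration
set.seed(2014)
x <- seq(0, 100, by = 0.5)
y <- (50 - x)^2 + rnorm(length(x), sd = 100)
curvy <- data.frame(x, y)

# The explorer option switches on drag-to-zoom and right-click-to-reset;
# maxZoomIn = 0.05 limits zooming to 5% of the original chart area
zoom_chart <- gvisScatterChart(
  curvy,
  options = list(
    explorer = "{actions: ['dragToZoom', 'rightClickToReset'], maxZoomIn: 0.05}",
    legend = "none",
    width = 500, height = 400
  )
)

# Display the chart in the browser only when running interactively
if (interactive()) plot(zoom_chart)
```

Swapping gvisScatterChart for gvisLineChart or gvisColumnChart should work in the same way, as the explorer option belongs to the core charts.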

Tuesday, 26 August 2014

ChainLadder 0.1.8 released

Over the weekend we released version 0.1.8 of the ChainLadder package for claims reserving on CRAN.

What is claims reserving?

The insurance industry, unlike other industries, does not sell products as such but promises. An insurance policy is a promise by the insurer to the policyholder to pay for future claims in return for a premium received upfront.

As a result insurers don't know the cost of their service upfront, but rely on historical data analysis and judgement to predict a sustainable price for their offering. In General Insurance (or Non-Life Insurance, e.g. motor, property and casualty insurance) most policies run for a period of 12 months. However, the claims payment process can take years or even decades. Therefore often not even the delivery date of their product is known to insurers. The money set aside for those future claims payments is called reserves.

Over the years several methods and models have been developed to estimate both the level and variability of reserves for insurance claims, see [1] or [2] for an overview.

In practice the Mack chain-ladder and bootstrap chain-ladder models are used by many actuaries along with stress testing / scenario analysis and expert judgement to estimate ranges of reasonable outcomes, see the surveys of UK actuaries in 2002 [3], and across the Lloyd's market in 2012 [4].
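To give a flavour of the basic chain-ladder idea underlying these models, here is a minimal base R sketch. The 3x3 cumulative claims triangle is made up for illustration, and the code only computes volume-weighted development factors and point estimates of the reserves; it is not the Mack or bootstrap model, which additionally estimate the variability.

```r
# Hypothetical cumulative claims triangle: rows are origin years,
# columns are development years, NA marks payments not yet observed
tri <- matrix(c(100, 150, 165,
                110, 170,  NA,
                120,  NA,  NA),
              nrow = 3, byrow = TRUE,
              dimnames = list(origin = 2011:2013, dev = 1:3))
n <- ncol(tri)

# Volume-weighted development factors from one development year to the next,
# using only origin years where both columns are observed
f <- sapply(1:(n - 1), function(k) {
  rows <- !is.na(tri[, k + 1])
  sum(tri[rows, k + 1]) / sum(tri[rows, k])
})

# Fill the lower right of the triangle by rolling the factors forward
full <- tri
for (k in 1:(n - 1)) {
  missing <- is.na(full[, k + 1])
  full[missing, k + 1] <- full[missing, k] * f[k]
}

# Reserve = projected ultimate claims minus latest observed cumulative claims
latest <- apply(tri, 1, function(r) r[max(which(!is.na(r)))])
ultimate <- full[, n]
reserve <- ultimate - latest

print(round(f, 4))
print(round(reserve, 2))
```

The oldest origin year is treated as fully developed here, so its reserve is zero by construction; in practice a tail factor may be needed.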

The ChainLadder package provides various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance. You can get a very brief overview of the package and reserving from my R in Finance lightning talk.

The package vignette [5] gives more details about the various models and methods implemented.

More context and theory are given in the chapter Claims reserving and IBNR of [6], including the log-linear model of [7] and [8], which I discussed earlier on my blog.

Claims reserving is an active field of research as can be seen by the programme of the R in Insurance conference.


Version 0.1.8 fixes:
  • BootChainLadder produced warnings for triangles that had static developments when the argument process.distr was set to "od.pois"
  • as.triangle.data.frame didn't work for a data.frame with fewer than three rows
  • Arguments xlab and ylab were not passed through in plot.triangle when lattice=TRUE

Tuesday, 19 August 2014

googleVis 0.5.5 released

Earlier this week we released googleVis 0.5.5 on CRAN. The package provides an interface between R and Google Charts, allowing you to create interactive web charts from R. This is mainly a maintenance release, updating documentation and fixing minor issues.

Screen shot of some of the Google Charts

New to googleVis? Review the examples of all googleVis charts on CRAN.

Perhaps the best known example of the Google Chart API is the motion chart, popularised by Hans Rosling in his 2006 TED talk.

Tuesday, 12 August 2014

GrapheR: A GUI for base graphics in R

How did I miss the GrapheR package?

The author, Maxime Hervé, published an article about the package [1] in the same issue of the R Journal as we did on googleVis. Yet, it took me a package update notification on CRANberries to look into GrapheR in more detail - 3 years later! And what a wonderful gem GrapheR is.

The package provides a graphical user interface for creating base charts in R. It is ideal for beginners in R, as the user interface is very clear and the code is written alongside into a text file, allowing users to recreate the charts directly in the console.

Adding and changing legends? Messing around with the plotting window settings? It is much easier/quicker with this GUI than reading the help file and trying to understand the various parameters.

Here is a little example using the iris data set.
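A minimal sketch of that example, assuming the GrapheR package is installed and that run.GrapheR() is its GUI entry point. The guard around the call is my addition, so that the script is harmless in a non-interactive session:

```r
# Load the iris data set that ships with R
data(iris)

# Launch the GrapheR GUI; it only makes sense in an interactive session
# and requires the GrapheR package (assumed to be installed)
if (interactive() && requireNamespace("GrapheR", quietly = TRUE)) {
  GrapheR::run.GrapheR()
}
```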
This will bring up a window that helps me to create the chart and tweak the various parameters.

Once I am happy with my configuration I hit DRAW and R will create the chart for me.

Finally, I find the underlying R code in a file created by GrapheR. For more details read also the package vignette, which is available in English, French and German!

Tuesday, 5 August 2014

Thanks to R Markdown: Perhaps Word is an option after all?

In many cases Word is still the preferred file format for collaboration in the office. Yet, it is often a challenge to work with it, not so much because of the software, but because of how it is used and abused. Thanks to R Markdown it is no longer painful to include mathematical notation and R output in Word.

I have been using R Markdown for a while now and have grown very fond of it. Although I am quite happy with PDF and HTML output for basic reports, and happy to switch to Sweave/LaTeX for more complex documents, I was pleasantly surprised to learn that the new version of RStudio can produce MS Word files directly from R Markdown as well, thanks to the power of pandoc. Perhaps Word is an option after all?
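Outside RStudio the same conversion can be sketched from the R console. The example below is a minimal sketch, assuming the rmarkdown package and pandoc are available; the report content and the object names are made up for illustration. It writes a tiny R Markdown file with word_document as output format and renders it only if the tools are found.

```r
# Hypothetical R Markdown document with a formula and inline R output
rmd <- c(
  "---",
  "title: \"Hypothetical report\"",
  "output: word_document",
  "---",
  "",
  "The sample mean is defined as $\\bar{x} = \\frac{1}{n}\\sum_{i=1}^n x_i$.",
  "",
  "The cars data set has `r nrow(cars)` rows."
)

# Write the document to a temporary file
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd, rmd_file)

# Render to .docx only if rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  docx_file <- rmarkdown::render(rmd_file,
                                 output_format = "word_document",
                                 quiet = TRUE)
  print(docx_file)
}
```

The resulting file can then be opened and edited in Word like any other document.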