Eclipse and StatET – a working environment for R

In the endless search for an interface to the R statistics package that recreates the features of my favorite Matlab development environment, I finally ran across the Eclipse and StatET combination. The Eclipse project produces the program that acts as the integrated development environment (also useful for Java development, PHP development, Perl scripting, etc.). To integrate your R installation into this development environment, you use the StatET plug-in for Eclipse, available at http://www.walware.de/?page=/it/statet/. The result looks like the picture below:

The Eclipse/StatET development environment for R. There aren't normally crudely drawn red circles on it.

The upshot is that you get a development environment with the normal R console that you’re familiar with, plus a window for editing script files (with colored syntax highlighting and syntax checking), and an Object Browser that lists the objects (i.e. your data frames, arrays, lists, etc.) that exist in the current R session.

Here’s how you get it set up and running (on Windows 7 in this case; OS X and Linux versions are also available):

The first thing to do is fire up your normal R installation and install the rJava package, which is available from CRAN like any other package. Once you’ve done this, you can close R. If you’ve used R for more than five minutes, you’ve probably installed packages before, so I’ll skip the details.
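For anyone who would rather do this step from the R prompt, here is a minimal sketch. The helper name `missing_packages` is made up for illustration, and the `install.packages()` call is left commented out because it needs a network connection and a CRAN mirror.

```r
# Return the names in 'pkgs' that are not installed in the current R library.
# 'missing_packages' is a hypothetical helper, not part of R or StatET.
missing_packages <- function(pkgs) {
  # system.file() returns "" when a package cannot be found.
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  pkgs[!found]
}

to_install <- missing_packages("rJava")
print(to_install)

# Uncomment to download and install anything that is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

If `to_install` comes back empty, rJava is already in place and you can move on.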

Download the Eclipse package from http://www.eclipse.org/downloads. I used the “Eclipse IDE for Java EE Developers” package (about 190MB). This version includes tools for doing other stuff like HTML and PHP editing.

Unzip the Eclipse files, open the folder, and run eclipse.exe.

When you arrive at the Eclipse intro screen, choose the icon that takes you to the “Workbench”.

The splash screen for Eclipse. Click on the "Workbench" icon in the upper right to continue.

At the Workbench, go to the Help menu and choose Install New Software.

Go to Help>Install new software.

In the new window, hit the “Add” button on the right.

Add the StatET site.

In the window that appears, paste the Walware site address (listed in their install instructions) into the Location field. Leave the Name field blank. Hit OK.

Enter the StatET site address.

Back on the Install New Software window, click on the “Work with:” menu and choose the Walware site from the list of available sites. A set of objects should appear in the lower window. Check the box next to each of the items, and then hit Next. Go through the install windows, agree to all the licenses and what not, and restart Eclipse when it prompts you.

Select each item in the list to install.

Now, you still can’t do anything with R until you create a new project. Go to File>New>Project.

Create a new project.

In the New Project window, click on the StatET folder, and choose R-Project, then hit Next.

Select the StatET folder and click on the R-Project item.

Give your project a name and choose a location for this project file to live (the default location may be fine, or you may want to point it at the directory where your R data lives). Hit Finish. You will get a message that this project is “associated with the StatET perspective”. Tell it that you want this to be the default and hit OK.

Next we’ll set it up so that an R console runs within this Eclipse environment.

Go to Window>Preferences. In the window that opens, click on the StatET folder, then on Run/Debug and R Environments. In the new display on the right, click the Add button.

Setting up the R environment.

In the “Add R Environment Configuration” window that appears, give the R environment a name. In the Location field, click on the + sign on the right side and choose Browse Filesystem. Go to the folder where your current R installation lives. In my case, it’s the R-2.10.1 folder that contains all the R system files.

Tell Eclipse where your R installation lives.

My R installation lives in the R-2.10.1 folder.

Back on the “Add R Environment Configuration” window, hit the “Detect Default Properties/Settings” button, and then hit OK. Hit Apply near the bottom of the main Preferences window.

On the left side of the Preferences window, click on the R Interaction item, just below the R Environments item we just worked with. Choose “New Console inside Eclipse”.
Hit OK to exit the Preferences window.

Set this to use a New Console inside Eclipse, and leave the other items on their default values.

Back on the main Eclipse workbench, we’ll now fire up an R Console. Go to Run>Run Configurations. Click on the R Console item on the left, then click on the ‘New’ icon above it (looks like a piece of paper with a plus sign in the upper right). This will open a new display on the right. (Incidentally, when you start a new Eclipse session in the future, you can launch the R console by going to the Run>Run History menu and just choosing the name of the R console you set up previously).

Name your run configuration something useful, leave Launch Type as RJ (RMI/JRI), and set a working directory.

First, name your new configuration something besides “New_configuration” in the Name field. On the Main tab, leave the Launch Type set as RJ (RMI/JRI). In the Working Directory field, navigate to your normal R working directory. In the Options/Arguments field, you could enter some startup commands that you would like to be run every time you fire up this R console, such as loading a library or running a source file.
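As an illustration of the kind of startup commands meant here, the following is a sketch in plain R. The file name `my_functions.R` is made up, and exactly how these lines get handed to the console may depend on your StatET version, so treat it as an example of what you might want run at startup rather than something to paste in verbatim.

```r
# Load a package you use in most sessions (MASS ships with standard R installs).
library(MASS)

# Source a personal helper script if it exists in the working directory.
# "my_functions.R" is a hypothetical file name used only for illustration.
helper_file <- "my_functions.R"
if (file.exists(helper_file)) {
  source(helper_file)
}
```

The `file.exists()` guard keeps the console from stopping with an error on a machine where the helper file is missing.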

Now click on the R Console tab, where you have a few further options. “Pin initial console automatically” is fine to use. You can tell Eclipse to keep track of your history and save a transcript of the session as well in the second field. And down at the very bottom, there is a section called “Object DB”. Make sure this is checked (Enable) and that “Refresh DB automatically” is checked. The Object DB is the feature that lists all of the objects that are in memory while you’re working in R, so you’ll want it to be functional.

Make sure the Object DB check boxes are checked.

Click Apply and Run to start the new R Console. Wait a few seconds while the R console fires itself up. The normal output from R will now show up in the “console” window that appears in the lower portion of the workbench. In the Object Browser on the lower left, you should get a list of the environments in the current R session. This will include all the packages that you currently have loaded (such as the base stats, graphics, MASS, etc.) and it will also have a “.GlobalEnv” item at the top.

Now we have an R console running, along with the Object Browser showing the items in memory.

Once you’ve created some objects in memory, such as data frames or numeric arrays or whatever, they will appear under the .GlobalEnv item. If you mouse over a particular variable, it will show you a summary of the contents of that item (similar to when you use the str() command in the normal R environment).
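To see what that mouse-over summary corresponds to at the prompt, here is a small sketch. The data frame `trial_data` and its columns are invented purely for illustration.

```r
# A small made-up data frame, just to have something in the workspace.
trial_data <- data.frame(
  site  = c("A", "B", "C"),
  depth = c(1.5, 2.0, 3.2)
)

# str() prints a compact structure summary: the dimensions, the column names,
# each column's type, and the first few values of each column.
str(trial_data)
```

After running this in the console, `trial_data` should show up under .GlobalEnv in the Object Browser.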

The Object Browser showing some of the objects I entered in my R console, including a data frame and some character variables.

Incidentally, if you want to edit the contents of a data frame, you’ll still need to use the
> fix(dataframename)
or
> newdata <- edit(dataframename)
commands to make your changes. This functions just like it did in the R-Gui window, but you obviously can do it from the command prompt in your Eclipse/StatET environment.

Note that you can use the Object Browser to filter objects using the search field, or you can choose to display only objects of a certain type by clicking the down-arrow on the upper left of the Object Browser window. If you only want to see your own objects and not everything in the packages you have loaded, choose the “Show non-package Variables only” item.

Use this option to hide all of the objects contained in the R packages you have installed. This leaves only the objects that you create during your R session.

Having the package variables displayed can be helpful sometimes, because it lets you stroll through all the functions available in each package. For example, if I click on the ‘ncdf’ package, it shows me the various functions/commands that are available to me from that package. Maybe you’ll discover an interesting function that you didn’t previously know existed.
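You can take the same stroll from the command prompt. This sketch uses the stats package as a stand-in because it ships with every R installation; swap in "package:ncdf" or any other package you have attached.

```r
# List the names of everything an attached package makes available.
# "package:stats" is used as an example because stats is attached by default.
stats_objects <- ls("package:stats")
head(stats_objects)

# Narrow the list down to names containing "test" (t.test, chisq.test, ...).
test_functions <- grep("test", stats_objects, value = TRUE)
print(test_functions)
```

The Object Browser is still the nicer way to browse, but `ls()` plus `grep()` is handy when you half-remember a function name.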

A selection of the various functions contained in the ncdf package.

_______________________________
The last thing to set up is the script editing window. If you’ve used R much, you know that it’s often easiest to write scripts, i.e. lists of commands, in a file that you then submit to the R console to process. The Eclipse/StatET environment provides a script editor.

To write a new .R file, go to File>New>R-Script File. In the window that pops up, choose a folder to save this new file in. This will probably be the folder for the R project that you created. In the “File name” field below, enter a name for your new .R file, and hit Finish.

Create a new R-Script file. Choose a destination folder and give the script file a name.

Now a new file will open up in the R script editor on the workbench. This provides colored syntax highlighting, inserts closing parentheses and quotes, highlights possible syntax errors and so on. You can submit the whole script or just a highlighted section using the Run menu.
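Submitting a whole script is roughly what source() does at the prompt. The sketch below writes a tiny two-line script to a temporary file and runs it; the variable names are invented, and the temporary file simply stands in for a .R file you would normally write in the editor.

```r
# Write a tiny two-line script to a temporary file (a stand-in for a .R file
# you would normally create in the script editor).
script_path <- tempfile(fileext = ".R")
writeLines(c("squares <- (1:5)^2", "total <- sum(squares)"), script_path)

# source() runs every command in the file, in order, as if typed at the prompt.
source(script_path)
print(total)
```

Once the script has run, `squares` and `total` are ordinary objects in your session and will appear in the Object Browser.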

The script editor with syntax highlighting.

_______________________

Working in the Eclipse/StatET environment provides you with some extra features over just working in a normal R-Gui console. The biggest one is that you can “filter” through previous commands. You’ll recall that in the normal R-Gui, if you hit the Up Arrow while sitting at the command prompt, you can scroll through all the previous commands you’ve entered, in order. But sometimes the command you want to get to is quite a while back and you don’t want to have to type the whole thing again. In the Eclipse/StatET R console, you can type the first few letters of the command, then hit Ctrl + Alt + Up Arrow to cycle through the previously-used commands that start with those letters. If there are multiple potential matches, you just keep hitting Ctrl + Alt + Up Arrow until you find the right one.

For example, in the image below, I entered “str(data)” several steps previously, as you can see at the top of the console output. On the command prompt at the bottom, I’ve typed “str” again. If I now hit Ctrl + Alt + Up Arrow, Eclipse/StatET will automatically fill in “str(data)” and I can just hit Return to run that command. This is more useful than just hitting the Up Arrow repeatedly when you’re trying to re-enter long commands that you entered many steps back.

A view of the R console in Eclipse, and the command prompt at the bottom.

Hitting Ctrl + space bar at the command prompt shows you a list of objects that you currently have in memory as well. If you type the first few letters of an object name, Ctrl + space bar will only show you the potential matches that have those letters.
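If you ever want the same kind of prefix lookup as a command rather than a keystroke, ls() accepts a pattern. This is a plain-R sketch with two made-up object names, not a description of how StatET implements its completion list.

```r
# Two made-up objects that share a prefix.
survey_data <- data.frame(id = 1:3)
survey_notes <- "collected in spring"

# ls() with a pattern returns the names of objects in the current workspace
# that match a regular expression; "^survey" means "starts with survey".
matches <- ls(pattern = "^survey")
print(matches)
```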

You’ll also notice that Eclipse/StatET will do things like automatically fill in closing parentheses and closing quotes for you when you start typing those sorts of things, which can save you a few keystrokes here and there.

Some things will remain unchanged from how they worked when you used the standard R-Gui. When you plot data, the new plots spawn in their own window outside of the Eclipse environment, just like you’re used to. If you use the fix() or edit() commands to get to the data editor, the data editor window will also pop up outside the Eclipse environment, just like when you run R from the normal R-Gui.

Finally, for normal usage, you’ll now start Eclipse rather than starting R by itself. Once your Eclipse environment is up and running, you run R from within the Eclipse environment, as outlined above.