Monthly Archives: November 2014

Improving scientific graphics

I came across the blog post http://juretriglav.si/standards-for-graphic-presentation/ and was surprised to find that standards for creating scientific graphics actually exist, and that they were created in 1914. 100 years ago! Wow!

Generally, graphics are not considered as important as the textual components of a paper/book/blog post. This is a mistake. The suggestions for creating good graphics by Prof. Till Tantau (creator of the excellent TikZ/PGF graphics system), in the section Guidelines on Graphics of http://www.texample.net/media/pgf/builds/pgfmanualCVS2012-11-04.pdf, are also definitely worth reading. Check out the TikZ/PGF gallery at http://www.texample.net/tikz/examples/.

Getting familiar with R’s help system

So you’ve decided to seriously try R for your analysis. Now the question is: how do you get comfortable using it? Or, more importantly, how do you use R’s help system? I have some friends who really want to become more proficient at R but are not sure how to begin. This is my feeble attempt to provide some pointers for them to consider. Maybe you, too, might find some of it useful.

First of all, try something more user-friendly, like RStudio (http://www.rstudio.com/). You can switch to the base R console (or the OS console) once you’re comfortable with R. I use the base R console only because when I learned R (starting around 2007) there was no RStudio, and now I don’t see any benefit in switching to RStudio.

Let’s start exploring. So, now you have an R-session open and you’re staring at a blank area. You think you need help? Damn right, you need help!

  1. Type ?help in your session. A window or a webpage will pop up. This is what popped up for me!

    [Screenshot: the R help page for help]

  2. The top left of that page shows the function and the package that provides it. In our case, help comes from the utils package.
  3. If you really don’t know how to do something in R, just ask Google. Since I didn’t know how to read a CSV file in R, I searched for "read csv file in R" and learned that I have to use read.csv. So I looked up ?read.csv. And, lo and behold, it tells me that I’m reading the help for read.table {utils}, that is, read.table from the utils package.
  4. Just remember this for now.
  5. Scroll to the Usage section of the document and see how it’s defined and all the other fancy stuff. Don’t worry if it doesn’t make too much sense.
  6. Now scroll down to the bottom until you reach the Examples section.
  7. Everything you see in the Examples section can be tried out in an active R session. Try out the examples listed there by manually typing them in the R session.
  8. I know, I know. It is bogus!! There’s an easier way.
  9. Type example(read.csv) and see all the code in the Examples section executed in your session! This is one of the reasons why I like R a lot!
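
The steps above can be sketched as a few commands. This is only a minimal sketch using functions that ship with base R (help, ? and example); how the help page is displayed (window, browser, or plain text) depends on how you run R.

```r
# Steps 1-2: open the help page for the help function itself.
# The page header reads "help {utils}", i.e. function {package}.
?help

# Step 3: look up read.csv. The page that opens is the shared
# read.table {utils} page, because read.csv is documented there.
help("read.csv")   # same as ?read.csv

# Step 9: run all the code in the Examples section of that page.
example(read.csv)
```
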

Just remember that Google, ? (or help), and example can help you a lot. Try to build a mental map of each function name and the package it comes from, and you’re well on your way to becoming a competent R user/programmer.
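
If you want R itself to help with that mental map, here is a small sketch. It assumes nothing beyond base R; find, environmentName, environment, and apropos are all standard functions, and read.csv is just the example from above.

```r
# Which attached package provides read.csv?
pkg_on_search_path <- find("read.csv")
print(pkg_on_search_path)            # "package:utils"

# Same question, answered from the function's own namespace.
pkg_namespace <- environmentName(environment(read.csv))
print(pkg_namespace)                 # "utils"

# Not sure of the exact function name? List objects whose names match.
matches <- apropos("read.csv")
print(matches)
```

When you only know a keyword and not a function name, ??csv (short for help.search("csv")) searches the installed help pages instead.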

Grouping in R using data.table

Check out http://www.ats.ucla.edu/stat/r/faq/firstlast.htm

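Now compare that with the data.table steps below: sort the table by math, then take the last row of each prog group for the highest score and the first row for the lowest.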

> options(prompt="R=> ")
R=> hsb2 <- read.csv('http://www.ats.ucla.edu/stat/r/faq/hsb2.csv')
R=> library(data.table) # ?data.table
R=>
R=> hsb2dt <- as.data.table(hsb2)
R=> setkey(hsb2dt, math) # sort by math... See ?setkey
R=>
R=> highest <- hsb2dt[, tail(.SD, 1), by=prog] # ?tail
R=> highest
   prog  id female race ses schtyp read write math science socst
1:    3 143      0    4   2      1   63    63   75      72    66
2:    1 169      0    4   1      1   55    59   63      69    46
3:    2 200      0    4   2      2   68    54   75      66    66
R=>
R=> lowest <- hsb2dt[, head(.SD, 1), by=prog]
R=> lowest
   prog  id female race ses schtyp read write math science socst
1:    3   2      1    1   2      1   39    41   33      42    41
2:    1 167      0    4   2      1   63    49   35      66    41
3:    2 128      0    4   3      1   39    33   38      47    41
R=>
R=> quit()
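
To try the same idea without downloading anything, here is a minimal sketch on a tiny made-up table. The scores data and its values are hypothetical and only stand in for hsb2; it assumes the data.table package is installed. The last line shows an alternative that does not depend on the sort order.

```r
library(data.table)

# Hypothetical toy data standing in for hsb2 (values are made up).
scores <- data.table(
  id   = 1:6,
  prog = c(1, 1, 2, 2, 3, 3),
  math = c(50, 63, 75, 38, 33, 70)
)

# Sort rows by math (ascending); see ?setkey.
setkey(scores, math)

# Because rows are sorted by math, the last row in each prog group
# has the highest math score and the first row has the lowest.
highest <- scores[, tail(.SD, 1), by = prog]
lowest  <- scores[, head(.SD, 1), by = prog]

# Alternative: pick the row with the maximum math score directly,
# which works whether or not the table is sorted.
highest_direct <- scores[, .SD[which.max(math)], by = prog]

print(highest)
print(lowest)
```
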

Check out data.table at https://github.com/Rdatatable/data.table/wiki