This is an interview with Jeffrey D. Long, Professor in the Department of Psychiatry, Carver College of Medicine at the University of Iowa (USA) and author of the book “Longitudinal Data Analysis for the Behavioral Sciences Using R”. Dr. Long answers questions about his book and how he uses R in his work in behavioral sciences.
F4S: Hello Jeffrey. Please, give us a brief introduction about yourself.
I am a Professor in the Department of Psychiatry, Carver College of Medicine, University of Iowa (USA). My expertise is applied statistics in the behavioral and medical sciences. I am the head statistician for Neurobiological Predictors of Huntington’s Disease (PREDICT-HD), funded by the National Institutes of Health and the CHDI Foundation, Inc. PREDICT-HD is a longitudinal observational study of individuals at risk for Huntington’s Disease (HD), which is an inherited neurodegenerative disease. PREDICT-HD has several scientific sections that concentrate on different aspects of HD, including brain imaging, cognitive functioning, motor impairment, and psychiatric problems. My biostatistics team analyzes data from the scientific sections to answer substantive research questions.
F4S: How did you get involved with open source software?
I became involved with open source software through two routes: teaching and my daily professional activities. I taught applied statistics courses for many years to students with diverse backgrounds working in various settings. Commercial software was a concern because of its cost and availability, and I turned to the open source R system for statistical computing and graphics (http://www.r-project.org/). My professional activities involve statistical analysis and writing reports. I needed an integrated process that would handle both aspects. This is provided through R and the other open source packages Emacs (http://www.gnu.org/software/emacs/), Emacs Speaks Statistics (ESS; http://ess.r-project.org/), and Sweave (http://www.stat.uni-muenchen.de/~leisch/Sweave/).
The integrated components allow a program file to have both R and LaTeX syntax interwoven. The statistical analysis is performed in a pre-compilation step, and the results are embedded in the TeX file, which is compiled to produce the final PDF report. The method is highly desirable in my work, as databases are updated continually, which requires the reports to be updated as well. To the best of my knowledge, this cannot be performed in commercial software packages, as there is little or no cooperation between them.
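As a rough illustration of the workflow described above (and not Dr. Long's actual setup, which uses R and Sweave), the sketch below runs a small analysis first and then embeds the results in LaTeX source. It is written in Python with only the standard library; the `scores` data, the template text, and the `render_report` helper are hypothetical names made up for this example.

```python
import statistics
from string import Template

# Hypothetical scores standing in for the result of a database query.
scores = [12.0, 15.5, 9.0, 14.5]

# LaTeX source with $-placeholders that are filled in before compilation.
TEX_TEMPLATE = Template(r"""\documentclass{article}
\begin{document}
The sample contains $n observations with mean $mean and standard deviation $sd.
\end{document}
""")


def render_report(values):
    """Run the analysis, then embed the results in the LaTeX source."""
    return TEX_TEMPLATE.substitute(
        n=len(values),
        mean=f"{statistics.mean(values):.2f}",
        sd=f"{statistics.stdev(values):.2f}",
    )


# The returned string is what would be handed to a LaTeX compiler.
tex_source = render_report(scores)
print(tex_source)
```

When the underlying data change, rerunning the script regenerates the LaTeX source with updated numbers, which is the property that makes this style of reporting attractive for continually updated databases.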
F4S: Tell us the story behind your book “Longitudinal Data Analysis for the Behavioral Sciences Using R”.
The book is a culmination of teaching an applied longitudinal data analysis course for many years. There are relatively few people who use R in the behavioral sciences and I wanted to emphasize that it is a good option for researchers and students. R is free software that offers add-on packages authored by world leading authorities in longitudinal data analysis and graphing.
F4S: Who will benefit from reading it?
The beneficiaries are researchers and students who want a practical introduction to the analysis of longitudinal behavioral data. Longitudinal data is defined as repeated measurements taken on the same subjects over time. Such data is correlated within an individual and this should be considered in statistical modeling. The book focuses on the application of linear mixed effects regression (LMER). The book is specifically written for those who want a hands-on approach to using R for LMER modeling. The software and the data sets can be freely downloaded and all the examples can be reproduced on the reader’s own computer.
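To make the idea of repeated measurements on the same subjects concrete, here is a minimal sketch in Python (standard library only), not taken from the book. It is not LMER: it simply fits a separate least-squares line to each subject's measurements, which shows the subject-to-subject differences in starting level and rate of change that a mixed effects model would treat as random effects. The `records` data and the `ols_line` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format records: (subject, time, score), one row per measurement.
records = [
    ("s1", 0, 10.0), ("s1", 1, 12.0), ("s1", 2, 14.0),
    ("s2", 0, 20.0), ("s2", 1, 21.0), ("s2", 2, 22.0),
    ("s3", 0, 15.0), ("s3", 1, 18.0), ("s3", 2, 21.0),
]


def ols_line(times, scores):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(scores) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, scores))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


# Group the repeated measurements by subject.
by_subject = defaultdict(list)
for subject, time, score in records:
    by_subject[subject].append((time, score))

# Fit one line per subject (a simple two-stage approach, not a mixed model).
subject_lines = {}
for subject, rows in by_subject.items():
    times, scores = zip(*rows)
    subject_lines[subject] = ols_line(times, scores)

for subject, (intercept, slope) in subject_lines.items():
    print(f"{subject}: intercept={intercept:.1f}, slope={slope:.1f}")
```

Each subject has its own intercept and slope, so measurements from the same person are more alike than measurements from different people. A mixed effects model estimates an average trajectory together with subject-level deviations in a single model instead of fitting every subject separately.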
F4S: How would you describe your experience writing the book?
The joy of the book was digging deeper into the open source software. The entire book was written using the integrated components that I previously mentioned. I learned many tips and tricks along the way that have increased efficiency in my daily professional work.
F4S: Have you published other FLOSS related books?
Yes. Two former students (A. Zieffler and J. Harring) and I have published another book focusing on the application of R for the statistical comparison of groups: “Comparing Groups: Randomization and Bootstrap Methods Using R”.
F4S: Do you have plans for other books?
Not in the near future.
F4S: Why is free/libre open source scientific software important for your field?
For students, open source is important because it makes world-class statistical software available regardless of one’s personal or institutional resources (except for an Internet connection). For professionals in my field, open source means that software can be made available simultaneously with new methods that are introduced in professional publications. The result is that new and useful tools can be used immediately by researchers who need them.
F4S: Which projects, books, blogs or sites related to open source software for science can you recommend?
There are many websites I regularly consult for help with R, LaTeX, Emacs, and ESS. I will list the websites by topic.
F4S: Where can people contact you?
The book website is http://www.sagepub.com/books/Book234770?siteId=sage-us&prodTypes=any&q=longitudinal. R code and data sets are available on the “Supplements” tab.
F4S: Thanks for agreeing to the interview, Jeffrey.