Introduction to R
========================================================
author: Martin Morgan (mtmorgan@fhcrc.org), Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
date: 24 August 2014

```{r setup, include=FALSE}
options(width=44)
knitr::opts_chunk$set(cache=TRUE)
```

Outline
========================================================

Part I
- Vectors (data)
- Functions
- Help!

Part II
- Classes (objects)
- Generics & methods
- Help!

***

Part III
- Packages
- Help!

Part I: Vectors (data)
========================================================

```{r}
1                # vector of length 1
c(1, 1, 2, 3, 5) # vector of length 5
```

Part I: Vectors (data)
========================================================

- logical `c(TRUE, FALSE)`, integer, numeric, complex, character 
  `c("A", "beta")`
- list `list(c(TRUE, FALSE), c("A", "beta"))`
- Statistical concepts: `factor`, `NA`

Assignment and names
```{r}
x <- c(1, 1, 2, 3, 5)
y = c(5, 5, 3, 2, 1)
z <- c(Female=12, Male=3)
```
- `=` and `<-` both assign at the top level; inside a function call, `=` names an argument instead
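
A minimal sketch of the statistical concepts listed above; `sex` and `age` are made-up names for illustration. A `factor` stores categorical values as levels, and `NA` marks a missing value that most summaries propagate unless told otherwise.

```{r}
## hypothetical example data
sex <- factor(c("Female", "Male", "Female"))
age <- c(31, NA, 27)   # NA: missing value
levels(sex)            # distinct categories
table(sex)             # counts per level
mean(age)              # NA propagates
mean(age, na.rm=TRUE)  # drop NA first
z <- c(Female=12, Male=3)
z["Male"]              # access by name
```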

Part I: Vectors (data)
========================================================

Operations
```{r}
x + y        # vectorized
x / 5        # ...recycling
x[c(3, 1)]   # subset
```
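
To make these operations concrete, here is a small sketch with throwaway vectors `a` and `b` (illustrative names, not part of the running example). Recycling repeats the shorter vector to the length of the longer one; subsetting accepts positions, negative positions, or logical vectors.

```{r}
a <- c(1, 1, 2, 3, 5)
b <- c(10, 20)
a[1:4] + b    # b recycled: 10 20 10 20
a[a > 1]      # logical subset
a[-1]         # drop first element
sum(a > 1)    # TRUE counts as 1
```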

Part I: Functions
========================================================

Examples: `c()`, concatenate values; `rnorm()`, generate random normal deviates; `plot()`

```{r}
x <- rnorm(1000)    # 1000 normal deviates
y <- x + rnorm(1000, sd = 0.5)
```
- Optional, named arguments; positional matching
```{r}
args(rnorm)
```
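
As a sketch of how argument matching plays out, the calls below pass the same values to `rnorm()` by position and by name; `set.seed()` is used only so the two draws can be compared. The function `center()` is a hypothetical example of an optional argument with a default.

```{r}
set.seed(1)
by_pos <- rnorm(3, 10, 2)  # n, mean, sd
set.seed(1)
by_name <- rnorm(sd=2, mean=10, n=3)
identical(by_pos, by_name)
## hypothetical function, optional 'by'
center <- function(v, by = mean(v)) v - by
center(c(1, 2, 6))         # uses default
center(c(1, 2, 6), by = 2) # override
```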

Part I: Functions
========================================================

```{r}
plot(x, y)
```
- `formula`: another way to specify the plot, `plot(y ~ x)`

Part I: Help!
========================================================

Within R
```{r, eval=FALSE}
?rnorm
```
RStudio
- "Help" tab, search for "rnorm"

Main sections
- Title, Description, Usage, Arguments, Details, Value (result), See also, 
  Examples
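
Beyond `?`, a few other base R entry points are handy, especially when the exact function name is not known. A short sketch:

```{r, eval=FALSE}
help("rnorm")          # same as ?rnorm
example(rnorm)         # run the Examples
apropos("norm")        # names matching "norm"
help.search("normal")  # search installed help
```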

Part II: Classes (objects)
========================================================

Motivation: manipulate complicated data
- e.g., `x` and `y` from the previous example are related to one another --
  same length, and element `i` of `y` is derived from element `i` of `x`
  
Solution: a "data frame" to coordinate access
```{r}
df <- data.frame(X=x, Y=y)
head(df, 3)
```
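
A small sketch of why this helps, using a hypothetical three-row data frame `toy` rather than the simulated data: columns may differ in type but must agree in length, and each row keeps related values together.

```{r}
toy <- data.frame(
    id = c("a", "b", "c"),
    X = c(0.5, -1.2, 2.0))
toy$Y <- toy$X * 2   # stays aligned by row
str(toy)             # column types
nrow(toy); ncol(toy)
toy[order(toy$X), ]  # rows move together
```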

Part II: Generics & methods
========================================================

```{r}
class(df) # plain function
dim(df)   # generic & method for data.frame
head(df$X, 4)  # column access
```

Part II: Generics & methods
========================================================

```{r}
## create or update 'Z'
df$Z <- sqrt(abs(df$Y))
## subset rows and / or columns
head(df[df$X > 0, c("X", "Z")])
```
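
The same kind of update and subset can also be written with `transform()` and `subset()`, which look up column names inside the data frame. A sketch on a hypothetical data frame `toy`:

```{r}
toy <- data.frame(X = c(0.5, -1.2, 2.0),
                  Y = c(1.1, -2.0, 4.2))
toy <- transform(toy, Z = sqrt(abs(Y)))
subset(toy, X > 0, select = c(X, Z))
toy[["Z"]]   # same as toy$Z
```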

Part II: Generics & methods
========================================================
```{r}
plot(Y ~ X, df) # Y ~ X, values from 'df'
## lm(): linear model, returns class 'lm'
fit <- lm(Y ~ X, df)
abline(fit)  # plot regression line
```

Part II: Generics & methods
========================================================
```{r}
anova(fit)  
```
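
Other generics extract pieces of a fitted model. The sketch below fits a model to freshly simulated data (hypothetical names `sim` and `sim_fit`, fixed seed) so that it runs on its own:

```{r}
set.seed(123)
sim <- data.frame(X = rnorm(100))
sim$Y <- sim$X + rnorm(100, sd = 0.5)
sim_fit <- lm(Y ~ X, sim)
coef(sim_fit)            # intercept, slope
head(resid(sim_fit), 3)  # residuals
predict(sim_fit, data.frame(X = c(-1, 1)))
```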

Part II: Generics & methods
========================================================
- `fit`: object of class `lm`
- `anova()`: generic, with a method for class `lm`
```{r}
methods(anova)
```
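
How a generic finds its method can be sketched with a toy S3 example. The generic `area()` and the classes `square` and `circle` are hypothetical, invented here for illustration; `UseMethod()` dispatches on the class of the first argument.

```{r}
area <- function(shape, ...)
    UseMethod("area")
area.square <- function(shape, ...)
    shape$side^2
area.circle <- function(shape, ...)
    pi * shape$r^2
sq <- structure(list(side = 2),
                class = "square")
ci <- structure(list(r = 1),
                class = "circle")
area(sq)   # dispatches to area.square
area(ci)   # dispatches to area.circle
```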

Part II: Help!
========================================================

```{r, eval=FALSE}
## class of object
class(fit)

## method discovery
methods(class=class(fit))
methods(anova)

## help on generic, and specific method
?anova
?anova.lm
```
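
The same questions can be asked programmatically. A sketch using a self-contained model fit to the built-in `cars` data; `m` is an illustrative name:

```{r}
m <- lm(dist ~ speed, cars)
inherits(m, "lm")   # is 'm' of class lm?
## retrieve the anova method for class lm
is.function(getS3method("anova", "lm"))
```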

Part III: Packages
========================================================
Installed
- Base & recommended
- Additional packages
```{r}
length(rownames(installed.packages()))
```
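
The split between base, recommended, and additional packages shows up in the `Priority` column of `installed.packages()`; additional packages appear as `NA`. A sketch (counts depend on the local installation):

```{r}
ip <- installed.packages()
table(ip[, "Priority"], useNA = "ifany")
"stats" %in% rownames(ip)
```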
Available
- [CRAN](http://cran.r-project.org/web/packages/available_packages_by_name.html),
  [Bioconductor](http://bioconductor.org/packages/release/BiocViews.html#___Software);
- Also: [github](http://github.com), [rforge](https://r-forge.r-project.org/), ...

Part III: Packages
========================================================
'Attached' (installed and available for use):
```{r, eval=FALSE}
search()            # attached packages
ls("package:stats") # functions in 'stats'
```
Attaching (make installed package available for use)
```{r, eval=FALSE}
library(ggplot2)
```
Installing CRAN or Bioconductor packages
```{r, eval=FALSE}
install.packages("BiocManager") # once, from CRAN
BiocManager::install("GenomicRanges")
```
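
A function can also be used without attaching its package, via `pkg::name`, and `requireNamespace()` reports whether a package can be loaded. A sketch using the always-available `stats` package; `notInstalledPkg` is a made-up package name:

```{r}
## use a function without attaching
stats::median(c(5, 1, 3))
## guard code on package availability
ok <- requireNamespace("notInstalledPkg",
                       quietly = TRUE)
ok    # FALSE unless such a package exists
```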

Part III: Help!
========================================================

Packages
- Available packages. CRAN:
  [Package index](http://cran.r-project.org/web/packages/available_packages_by_name.html),
  [Task Views](http://cran.r-project.org/web/views/); Bioconductor: [BiocViews](http://bioconductor.org/packages/release/BiocViews.html#___Software)
- Package descriptions ('landing pages'), e.g.,
  [ggplot2](http://cran.fhcrc.org/web/packages/ggplot2/index.html),
  [GenomicRanges](http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html)
- Vignettes: narrative descriptions of how to use the package, e.g.,
  [minfi](http://bioconductor.org/packages/release/bioc/html/minfi.html)
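
For installed packages, the same information is reachable from within R. A sketch using the `grid` package, which ships with R:

```{r, eval=FALSE}
vignette(package = "grid") # list vignettes
help(package = "grid")     # package index
packageVersion("grid")     # installed version
```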
  
Part IV: Help!
========================================================

Best bet
- Other R users you know!

R
- [StackOverflow](http://stackoverflow.com/questions/tagged/r) search for `[R]`;
  R-help [mailing list](http://www.r-project.org/mail.html)

Bioconductor
- [Web site](http://bioconductor.org)
- [Mailing list](http://bioconductor.org/help/mailing-list/)
- Soon: [support site](http://support.bioconductor.org)

Acknowledgements
========================================================
Funding
- US NIH / NHGRI 2U41HG004059; NSF 1247813

People
- Seattle Bioconductor team: Sonali Arora, Marc Carlson, Nate Hayden,
  Valerie Obenchain, Herv&eacute; Pag&egrave;s, Dan Tenenbaum
- Vincent Carey, Robert Gentleman, Rafael Irizarry, Sean Davis, Kasper Hansen,
  Michael Lawrence, Levi Waldron