---
title: "AdductomicsR workflow"
author: "Josie Hayes"
date: '`r format(Sys.Date(), "%B %e, %Y")`'
output: rmarkdown::html_vignette
vignette: >
    %\VignetteIndexEntry{Adductomics workflow}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
---


# Getting Started


```{r, eval=FALSE}
#ensure you have mzR installed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("mzR", version = "3.8")

# install the package directly from Github
library(devtools)
devtools::install_github("JosieLHayes/adductomicsR")

#install the data package containing the data 
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ExperimentHub", version = "3.9")

#or download the packages and install from source
library(devtools)
devtools::install("path_to_dir/adductomicsR")
devtools::install("path_to_dir/adductData")

```

After installation of the adductomics package and all dependencies attach
the adductomics package by typing (copying and pasting) 
this line of code into the R console and hitting enter:

```{r, eval=FALSE}
# load the package
library(adductomicsR)
library(adductData)
library(ExperimentHub)
```

```{r, echo=FALSE}
suppressMessages(suppressWarnings(library(adductomicsR)))
suppressMessages(suppressWarnings(library(ExperimentHub)))

```

We have provided 2 mzXML files for use in this vignette in adductData.

# Preparation of the data
Mass drift correction: Usually mass drift is corrected using lock masses on
the mass spectrometer.
If this has not been done a python script is provided in the directory in 
which the package is saved on your computer at
/inst/extdata/thermo_MassDriftCalc.py and can be launched from within python 
using (replace the path to the python script in your system):
`exec(open(“thermo_MassDriftCalc.py“).read())`

# Retention time correction

Each sample is corrected for retention time drift using the rtDevModeling 
function. To run this with the default parameters enter the path of the
directory containing your mzXML files and the run order file (order in which
samples were run). For further information on parameters see
??rtDevModelling. An example run order file is available in inst/extdata
(within the directory where the package is saved on your computer) and 2
mzXML files are available in adductData/ExperimentHub.These files will
be used in this vignette automatically.

Download the mzXML files from ExperimentHub for use in this vignette. They 
must have .mzXML to be recognized by the package so they are renamed as 
well.

```{r, quiet=TRUE, error = TRUE}
eh  = suppressMessages(suppressWarnings(ExperimentHub::ExperimentHub()))
temp = suppressMessages(suppressWarnings(
AnnotationHub::query(eh, 'adductData')))
```

```{r, quiet=TRUE, error = TRUE}
suppressMessages(suppressWarnings(temp[['EH1957']])) #first mzXML file
file.rename(cache(temp["EH1957"]), file.path(hubCache(temp),
                                             'ORB35017.mzXML'))

temp[['EH1958']] #second mzXML file
file.rename(cache(temp["EH1958"]), file.path(hubCache(temp), 'ORB35022.mzXML'))
```

```{r, quiet=TRUE, eval=FALSE}
rtDevModelling(
  MS2Dir = hubCache(temp),
  nCores=4,
  runOrder =paste0(system.file("extdata", 
                               package ="adductomicsR"),'/runOrder.csv')
  )
```

# Identify adducts

The specSimPepId function detects adducts present on the peptide. To run this
with the default parameters (the largest triply charged peptide of human 
serum albumin) enter the path of your mzxml files and rtDevModels object. For 
further information on running this with different peptides see
??specSimPepId. This produces MS2 spectra plots, each in a separate
directory for each sample. A plot of the model spectrum is also saved in
the mzXML files directory for comparison. The spectra are grouped based 
on the mz and RT windows, and plots of these groups are also provided 
based on the raw RT and adjusted RT. These plots can be used to determine
whether multiple groups pertain to the same peak.
    
```{r, quiet=TRUE, eval=FALSE}
specSimPepId(
  MS2Dir = hubCache(temp),
  nCores=4, 
  rtDevModels =paste0(hubCache(temp),'/rtDevModels.RData')
  )
```

# Generate a target table for quantification
A list of the adducts for quantification and their monoisotopic mass (MIM), 
retention time (RT), peptide and charge is generated using the following 
command. Substitute the file path of the allResults file to the location of 
your allResults file from the previous step.

```{r, quiet=TRUE, error=TRUE}

generateTargTable(
  allresultsFile=paste0(system.file("extdata",package =
  "adductomicsR"),'/allResults_ALVLIAFAQYLQQCPFEDHVK_example.csv'),
  csvDir=tempdir(check = FALSE)
  )
```

It is recommended that the allGroups plot ( m/z vs RT) is used to ensure that 
the adducts in the target table do not pertain to the same peak, as the 
quantification step can be computationally intensive.

# Quantify adducts

See ??adductQuant for an explanation on the parameters for this function. To 
use your target table produced in the previous step, alter the value in the 
'targTable' option to the path of your target table. Similarly replaced the 
path to the directory of your own mzXML files in filePaths (set as 
"Users/Documents/mzXMLfiles" here.

```{r, quiet=TRUE, eval=FALSE}
adductQuant(
  nCores=2, 
  targTable=paste0(system.file("extdata", 
                               package="adductomicsR"),
                               '/exampletargTable2.csv'), 
  intStdRtDrift=30, 
  rtDevModels= paste0(hubCache(temp),'/rtDevModels.RData'),
  filePaths=list.files(hubCache(temp),pattern=".mzXML",
                       all.files=FALSE,full.names=TRUE),
  quantObject=NULL,
  indivAdduct=NULL,
  maxPpm=5,
  minSimScore=0.8,
  spikeScans=1,
  minPeakHeight=100,
  maxRtDrift=20,
  maxRtWindow=240,
  isoWindow=80,
  hkPeptide='LVNEVTEFAK', 
  gaussAlpha=16
  )

```

# Extract the results from the AdductQuantif Object
It is recommended that spectra for each of the adducts found are checked 
manually using LC-MS software, either at this step or before quantification. 

To load your adductquantif object set the path to the file on your system. In 
the example it assumes the file is present in your working directory.
```{r, quiet=TRUE, eval=FALSE, error=TRUE}
#load the adductquantif object 
load(paste0(hubCache(temp),"/adductQuantResults.Rda"))

#produce a peakTable from the Adductquantif object and save to a temporary
#directory
suppressMessages(suppressWarnings(outputPeakTable(object=
    object, outputDir=tempdir(check = FALSE))))
```

# Filter the results from the peak area table
Mass spectrometry data is inherently noisy, and the filterAdductTable() 
function will filter out samples and adducts based on set thresholds. It is 
recommended to use this filter function to remove adducts that have many 
missing values and samples where the housekeeping peptide is weak, suggestive 
of misintegration. Substitute the name of the peaklist file with the path and 
the name of your peaklist file produced in the previous step.

```{r, quiet=TRUE,eval=FALSE, error=TRUE}
filterAdductTable(
  paste0(tempdir(check = FALSE),"/adductQuantif_peakList_", Sys.Date(), ".csv")
  )
```

```{r, quiet=TRUE, error=TRUE}
#session info
sessionInfo()
```