@@ -893,7 +893,7 @@ EIC for serine and run a *centWave*-based peak detection on that data using
893893``` {r centWave-default}
894894#' Get the EIC for serine in all files
895895serine_chr <- chromatogram(mse, rt = c(164, 200),
896- mz = serine_mz + c(-0.01 , 0.01 ),
896+ mz = serine_mz + c(-0.005 , 0.005 ),
897897 aggregationFun = "max")
898898
899899#' Get default centWave parameters
@@ -906,7 +906,7 @@ chromPeaks(res)
906906
907907The peak matrix returned by ` chromPeaks ` is empty, thus, with the default
908908settings * centWave* failed to identify any chromatographic peak in the EIC for
909- serine. These default values are shown below:
909+ serine. The default values for the parameters are shown below:
910910
911911``` {r centWave-default-parameters}
912912#' Default centWave parameters
@@ -920,6 +920,7 @@ however see that these values are way too large for our UHPLC-based data set
920920(see below).
921921
922922``` {r, fig.cap = "Extracted ion chromatogram for serine."}
923+ #' Plot the EIC
923924plot(serine_chr)
924925```
925926
@@ -1273,21 +1274,21 @@ repeatedly measured QC samples (e.g. sample pools) and adjust the full
12731274experiment based on these. See the alignment section in the * xcms*
12741275[ vignette] ( https://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html )
12751276for more information on this subset-based alignment. Note that such a
1276- subset-based alignment requires the samples to be loaded in the order in which
1277- they were measured. Also, recently, functionality was added to * xcms* to perform
1278- the alignment on pre-selected signals (e.g. retention times of internal
1277+ subset-based alignment requires the samples to be organized in the order in
1278+ which they were measured. Also, recently, functionality was added to * xcms* to
1279+ perform the alignment on pre-selected signals (e.g. retention times of internal
12791280standards) or to align a data set against an external reference.
12801281
12811282For our example we use the * peakGroups* method that, as mentioned above, aligns
12821283samples based on the retention times of * anchor peaks* . To define these, we need
1283- to first run an initial correspondence analysis to group chromatographic peaks
1284+ to first run an initial correspondence analysis and group chromatographic peaks
12841285across samples. Below we use the * peakDensity* method for correspondence
12851286(details about this method and explanations on the choices of its parameters are
12861287provided in the next section). In brief, parameter ` sampleGroups ` defines to
12871288which sample group of the experiment individual samples belong, and parameter
12881289` minFraction ` specifies the proportion of samples (of one of the sample groups
12891290defined in ` sampleGroups ` ) in which a chromatographic peak needs to be detected
1290- to group them into an LC-MS feature. Chromatographic peaks will be grouped into
1291+ to group them into an LC-MS feature. Chromatographic peaks will be grouped to
12911292features if their difference in * m/z* and retention times is below the defined
12921293thresholds and if in at least ` minFraction * 100 ` percent of samples of at least
12931294one sample group a chromatographic peak was detected. For our example we use the
@@ -1303,7 +1304,7 @@ the samples, its settings does not need to be fully optimized.
13031304#' Define the settings for the initial peak grouping - details for
13041305#' choices in the next section.
13051306pdp <- PeakDensityParam(sampleGroups = sampleData(mse)$group, bw = 1.8,
1306- minFraction = 1, binSize = 0.02 , ppm = 10)
1307+ minFraction = 1, binSize = 0.01 , ppm = 10)
13071308mse <- groupChromPeaks(mse, pdp)
13081309```
13091310
@@ -1330,9 +1331,9 @@ pgm <- adjustRtimePeakGroups(mse, PeakGroupsParam(minFraction = 1))
13301331head(pgm)
13311332```
13321333
1333- Ideally, if possible, the anchor peaks should span a large range of the
1334- retention time range to allow alignment of the full LC runs. Below evaluate the
1335- distribution of retention times of the anchor peaks in the first sample.
1334+ Ideally, if possible, the anchor peaks should span most of the retention time
1335+ range to allow alignment of the full LC runs. Below we evaluate the distribution of
1336+ retention times of the anchor peaks in the first sample.
13361337
13371338``` {r}
13381339#' Evaluate distribution of anchor peaks' rt in the first sample
@@ -1346,9 +1347,9 @@ on the `minFraction` parameter) the algorithm minimizes the observed
13461347between-sample retention time differences for these. Parameter ` span ` defines
13471348the degree of smoothing of the loess function that is used to allow different
13481349regions along the retention time axis to be adjusted by a different factor. A
1349- value of 0 will most likely cause overfitting, while 1 would cause all retention
1350- times of a sample to be shifted by a constant value. Values between 0.4 and 0.6
1351- seem to be reasonable for most experiments.
1350+ value close to 0 will most likely cause overfitting, while a value of 1 would
1351+ cause all retention times of a sample to be shifted by a constant value. Values
1352+ between 0.4 and 0.6 seem to be reasonable for most experiments.
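
The effect of `span` can be explored outside of *xcms* with the base R `loess` function. The sketch below is only an illustration under stated assumptions: the retention time deviations are simulated, the variable names are made up, and the code does not reproduce the internals of the *peakGroups* method.

```r
#' Simulated retention time deviations (in seconds) of anchor peaks in one
#' sample relative to a reference; all values are made up for illustration
set.seed(123)
rt <- seq(10, 240, length.out = 60)
rt_dev <- 2 * sin(rt / 40) + rnorm(length(rt), sd = 0.3)

#' Fit loess curves with a small and a moderate span
fit_small <- loess(rt_dev ~ rt, span = 0.2)
fit_mid <- loess(rt_dev ~ rt, span = 0.5)

#' Spread of the residuals for both fits; a curve that follows the
#' individual anchor peaks more closely leaves smaller residuals
sd(residuals(fit_small))
sd(residuals(fit_mid))

#' Plot the simulated deviations with both fitted curves
plot(rt, rt_dev, xlab = "retention time", ylab = "rt deviation")
lines(rt, predict(fit_small), col = "#ff0000ff")
lines(rt, predict(fit_mid), col = "#0000ffff")
```

Comparing the two curves on real anchor peaks, rather than on simulated values, would be the more informative way to choose `span` for a given data set.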
13521353
13531354``` {r alignment-correspondence}
13541355#' Define settings for the alignment
@@ -1474,10 +1475,10 @@ assignment defined in `sampleData`.
14741475
14751476``` {r}
14761477#' Extract a chromatogram for a m/z range containing serine
1477- chr_1 <- chromatogram(data , mz = serine_mz + c(-0.005, 0.005))
1478+ chr_1 <- chromatogram(mse , mz = serine_mz + c(-0.005, 0.005))
14781479
14791480#' Default parameters for peak density; bw = 30
1480- pdp <- PeakDensityParam(sampleGroups = sampleData(data )$group, bw = 30)
1481+ pdp <- PeakDensityParam(sampleGroups = sampleData(mse )$group, bw = 30)
14811482
14821483#' Test these settings on the extracted slice
14831484plotChromPeakDensity(chr_1, param = pdp)
@@ -1497,22 +1498,22 @@ of this curve (which is created with the base R `density` function) is
14971498configured with the parameter ` bw ` . The * peakDensity* algorithm assigns all
14981499chromatographic peaks within the same * peak* of this density estimation curve to
14991500the same feature. Chromatographic peaks assigned to the same feature are
1500- indicated with a grey rectangle in the plot. In the present example, because
1501- retention times of the two chromatographic peaks are very similar, this
1502- rectangle is very narrow and looks thus more like a vertical line. Based on this
1503- result, the default settings (` bw = 30 ` ) seemed to correctly define features. It
1504- is however advisable to evaluate settings on multiple slices, ideally with
1505- signal from more than one compound being present. Such slices could be
1506- identified in e.g. a plot created with the ` plotChromPeaks ` function (see
1507- example in the chromatographic peak detection section).
1501+ indicated with a grey rectangle in the lower panel of the plot. In the present
1502+ example, because retention times of the two chromatographic peaks are very
1503+ similar, this rectangle is very narrow and looks thus more like a vertical
1504+ line. Based on this result, the default settings (` bw = 30 ` ) seemed to correctly
1505+ define features. It is however advisable to evaluate settings on multiple
1506+ slices, ideally with signal from more than one compound being present. Such
1507+ slices could be identified in e.g. a plot created with the ` plotChromPeaks `
1508+ function (see example in the chromatographic peak detection section).
15081509
15091510In our example we extract a chromatogram for an * m/z* slice containing signal
15101511for known isomers betaine and valine ([ M+H] + * m/z* 118.08625).
15111512
15121513``` {r correspondence-bw, fig.cap = "Correspondence analysis with default settings on an *m/z* slice containing signal from multiple ions."}
15131514#' Plot the chromatogram for an m/z slice containing betaine and valine
1514- mzr <- 118.08625 + c(-0.01 , 0.01 )
1515- chr_2 <- chromatogram(data , mz = mzr, aggregationFun = "max")
1515+ mzr <- 118.08625 + c(-0.005 , 0.005 )
1516+ chr_2 <- chromatogram(mse , mz = mzr, aggregationFun = "max")
15161517
15171518#' Correspondence in that slice using default settings
15181519plotChromPeakDensity(chr_2, param = pdp)
@@ -1527,14 +1528,14 @@ reduced value for parameter `bw`.
15271528
15281529``` {r correspondence-bw-fix, fig.cap = "Correspondence analysis with reduced bw setting on a *m/z* slice containing signal from multiple ions."}
15291530#' Reducing the bandwidth
1530- pdp <- PeakDensityParam(sampleGroups = sampleData(data )$group, bw = 1.8)
1531+ pdp <- PeakDensityParam(sampleGroups = sampleData(mse )$group, bw = 1.8)
15311532plotChromPeakDensity(chr_2, param = pdp)
15321533```
15331534
15341535Setting ` bw = 1.8 ` strongly reduced the smoothness of the density curve
15351536resulting in a higher number of density * peaks* and hence a nice grouping of
15361537(aligned) chromatographic peaks into separate features. Note that the height of
1537- the peaks of the density curve are not considered for the grouping.
1538+ the peaks of the density curve are not relevant for the grouping.
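
Since the density curve is created with the base R `density` function, the influence of `bw` can also be sketched without any LC-MS data. The retention times and the helper `count_density_peaks` below are hypothetical and not part of *xcms*; the helper simply counts the local maxima of the density curve.

```r
#' Made-up retention times (in seconds) of chromatographic peaks from two
#' compounds eluting about 6 seconds apart, each detected in three samples
rt <- c(180.1, 180.4, 180.2, 186.0, 186.3, 185.9)

#' Hypothetical helper (not part of xcms): count the local maxima of the
#' density curve estimated with bandwidth bw
count_density_peaks <- function(x, bw) {
    d <- density(x, bw = bw, from = min(x) - bw, to = max(x) + bw, n = 512)
    #' A local maximum is a change from an increasing to a decreasing curve
    sum(diff(sign(diff(d$y))) == -2)
}

#' Number of density peaks with the default and the reduced bandwidth
count_density_peaks(rt, bw = 30)
count_density_peaks(rt, bw = 1.8)
```

With the wide bandwidth the curve should show a single maximum, so all six chromatographic peaks would fall into one group, while the reduced bandwidth should resolve one maximum per compound.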
15381539
15391540By having defined a ` bw ` appropriate for our data set, we proceed and perform
15401541the correspondence analysis on the full data set. Other parameters of
@@ -1557,17 +1558,17 @@ allows to generate *m/z*-dependent bin sizes: the width of the *m/z* slices
15571558increases by ` ppm ` of the bin's * m/z* along the * m/z* axis.
15581559
15591560For our correspondence analysis we set the maximal acceptable difference of
1560- chrom peaks' * m/z* values with ` binSize = 0.02 ` and ` ppm = 10 ` , hence grouping
1561+ chrom peaks' * m/z* values with ` binSize = 0.01 ` and ` ppm = 10 ` , hence grouping
15611562chromatographic peaks with similar retention time and with a difference of their
1562- * m/z* values that is smaller than 0.02 + 10 ppm of their * m/z* values. By
1563+ * m/z* values that is smaller than 0.01 + 10 ppm of their * m/z* values. By
15631564setting ` minFraction = 0.4 ` we in addition require for a feature that a
15641565chromatographic peak was detected in ` >= ` 40% of samples of at least one sample
15651566group.
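
As a worked example of the rule described above (`binSize` plus `ppm` of the *m/z* value), the maximal accepted *m/z* difference can be calculated directly. The helper `max_mz_diff` is hypothetical and only restates the description in the text; it is not an *xcms* function.

```r
#' Hypothetical helper (not part of xcms): maximal accepted m/z difference
#' following the rule stated in the text, i.e. binSize + ppm of the m/z
max_mz_diff <- function(mz, binSize = 0.01, ppm = 10) {
    binSize + mz * ppm / 1e6
}

#' Accepted difference at the m/z of the [M+H]+ ion of betaine and valine
max_mz_diff(118.08625)

#' The accepted difference increases along the m/z axis
max_mz_diff(c(100, 500, 1000))
```

For small molecules the constant `binSize` term dominates, while the `ppm` term becomes more important at higher *m/z* values.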
15661567
15671568``` {r correspondence-analysis}
15681569#' Set in addition parameter ppm to a value of 10
15691570pdp <- PeakDensityParam(sampleGroups = sampleData(mse)$group, bw = 1.8,
1570- minFraction = 0.4, binSize = 0.02 , ppm = 10)
1571+ minFraction = 0.4, binSize = 0.01 , ppm = 10)
15711572
15721573#' Perform the correspondence analysis on the full data
15731574mse <- groupChromPeaks(mse, param = pdp)
@@ -1821,9 +1822,9 @@ l <- lm(log2(avg_filled) ~ log2(avg_detect))
18211822summary(l)
18221823```
18231824
1824- With a value of 0.994 , the slope of the line is thus very close to the slope of
1825+ With a value of 1.007 , the slope of the line is thus very close to the slope of
18251826the identity line and the two sets of values are also highly correlated (R
1826- squared of 0.79 ).
1827+ squared of 0.81 ).
18271828
18281829
18291830
@@ -1977,8 +1978,8 @@ available in the infrastructure provided through the *xcms*, *Spectra*,
19771978* MsCoreUtils* , * MetaboCoreUtils* and other related Bioconductor packages. It
19781979would for example be easily possible to extract specific information for
19791980selected chromatographic peaks or LC-MS features from an * xcms* result object
1980- and perform some additional visualizations or analyses on them. Below we first
1981- identify chromatographic peaks that would match the * m/z* of serine.
1981+ and perform some additional visualizations or analyses on them. As an example, we
1982+ below first identify chromatographic peaks that would match the * m/z* of serine.
19821983
19831984``` {r}
19841985#' Extract chromatographic peaks matching the m/z of the [M+H]+ of serine
@@ -2004,10 +2005,10 @@ serine_ms1_2 <- chromPeakSpectra(mse, msLevel = 1, method = "closest_rt",
20042005 peaks = rownames(serine_pks)[2])
20052006```
20062007
2007- For LC-MS/MS data, this function would allow to select all MS2 spectra from the
2008- data set with their precursor m/z (and retention time) within the
2009- chromatographic peak's * m/z* and retention time width using parameters `msLevel
2010- = 2` and ` method = "all"`.
2008+ For LC-MS/MS data, this function would also allow extracting all MS2 spectra
2009+ from the data set with their precursor m/z (and retention time) within the
2010+ chromatographic peak's * m/z* and retention time width by using parameters
2011+ ` msLevel = 2` and ` method = "all" ` .
20112012
20122013Below we plot the EIC and the MS1 scan for the selected chromatographic peak.
20132014
@@ -2033,9 +2034,9 @@ and retention time ranges of the chromatographic peak in that sample,
20332034` featureChromatograms ` will instead integrate the signal from the * m/z* and
20342035retention time area of the ** feature** , i.e. will use a single area and
20352036integrate the signal from that same area in each sample. This * m/z* - retention
2036- time area might eventually be larger than the respective ranges for a single
2037+ time area might however be larger than the respective ranges for a single
20372038chromatographic peak in one sample. This * m/z* - retention time area for
2038- features can be extracted using the ` featureArea ` function:
2039+ features can also be extracted (and evaluated) using the ` featureArea ` function:
20392040
20402041``` {r}
20412042#' Extract the m/z - retention time area for features
@@ -2075,10 +2076,10 @@ cols[iso_idx[[1]]] <- "#ff0000ff"
20752076plotSpectra(serine_ms1_2, col = cols, lwd = 2)
20762077```
20772078
2078- While in the example above were specifically looking for potential isotopes of a
2079- single, selected, mass peak (by setting ` seedMz ` to the * m/z* value of that
2080- peak), we could also use ` isotopologues ` to identify all potential isotope
2081- groups in a spectrum.
2079+ While in the example above we were specifically looking for potential isotopes
2080+ of a single, selected, mass peak (by setting ` seedMz ` to the * m/z* value of that
2081+ peak), we could also use ` isotopologues ` without specifying ` seedMz ` to identify
2082+ all potential isotope groups in a spectrum.
20822083
20832084``` {r}
20842085#' Identify all potential isotope peaks in the MS1 spectrum
@@ -2151,7 +2152,7 @@ space from an LC-MS experiment.
21512152
21522153We below subset the data to the first sample and visualize the identified
21532154chromatographic peaks in the * m/z* - retention time plane using the
2154- ` plotChromPeaks ` function already used before.
2155+ ` plotChromPeaks ` function that we already used before.
21552156
21562157``` {r, fig.cap = "Position of identified chromatographic peaks in the first sample."}
21572158#' Plot identified chromatographic peaks in the first sample
@@ -2267,18 +2268,18 @@ particular how to adapt peak detection setting on a rather noisy
22672268* chromatographic* data. Below we load the example data from a text file.
22682269
22692270``` {r peaks-load}
2270- data <- read.table(
2271+ cdata <- read.table(
22712272 system.file("txt", "chromatogram.txt", package = "xcmsTutorials"),
22722273 sep = "\t", header = TRUE)
2273- head(data )
2274+ head(cdata )
22742275```
22752276
22762277Our data has two columns, one with * retention times* and one with
22772278* intensities* . We can now create a ` Chromatogram ` object from that and plot the
22782279data.
22792280
22802281``` {r peaks-plot, fig.width = 12, fig.height = 2.15}
2281- chr <- Chromatogram(rtime = data $rt, intensity = data $intensity)
2282+ chr <- Chromatogram(rtime = cdata $rt, intensity = cdata $intensity)
22822283par(mar = c(2, 2, 0, 0))
22832284plot(chr)
22842285```