Correlate is an Excel plug-in that performs sparse canonical
correlation analysis (sparse CCA).
If two sets of assays (e.g., gene expression and DNA copy number) have
been performed on the same set of patient samples, then sparse CCA can
be used to find a set of variables in assay 1 that is maximally correlated
with a set of variables in assay 2.
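To make the idea concrete, here is a minimal Python sketch of a rank-one sparse CCA, loosely following the penalized matrix decomposition formulation (unit L2 norm and bounded L1 norm on each weight vector, with the within-set covariance treated as the identity). It is an illustration only, not the code that Correlate or PMA runs. The function names, the bounds c1 and c2, and the iteration counts are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink every entry of `a` toward zero by `delta`."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def l1_bounded_unit_vector(a, c, n_steps=50):
    """Soft-threshold `a` just enough that, once scaled to unit L2 norm,
    its L1 norm is at most `c` (assumed c >= 1)."""
    norm = np.linalg.norm(a)
    if norm == 0.0:
        return np.zeros_like(a)
    if np.abs(a).sum() / norm <= c:
        return a / norm  # constraint already holds, no shrinkage needed
    lo, hi = 0.0, np.abs(a).max()
    best = None
    for _ in range(n_steps):  # binary search on the threshold
        mid = 0.5 * (lo + hi)
        s = soft_threshold(a, mid)
        s_norm = np.linalg.norm(s)
        if s_norm == 0.0:
            hi = mid
        elif np.abs(s).sum() / s_norm <= c:
            best, hi = s / s_norm, mid
        else:
            lo = mid
    if best is None:  # fallback: keep only the largest entry
        best = np.zeros_like(a)
        k = np.argmax(np.abs(a))
        best[k] = np.sign(a[k])
    return best


def sparse_cca(X, Y, c1, c2, n_iter=100):
    """Return sparse weight vectors (u, v) and the correlation of X @ u with Y @ v.

    X and Y are (samples x variables) arrays measured on the same samples.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    K = X.T @ Y  # cross-product matrix between the two data sets
    v = np.linalg.svd(K, full_matrices=False)[2][0]  # start from the leading right singular vector
    for _ in range(n_iter):  # alternate the two sparse updates
        u = l1_bounded_unit_vector(K @ v, c1)
        v = l1_bounded_unit_vector(K.T @ u, c2)
    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=60)  # hidden signal shared by both assays
    X = rng.normal(size=(60, 20))
    Y = rng.normal(size=(60, 15))
    X[:, :3] = z[:, None] + 0.3 * rng.normal(size=(60, 3))
    Y[:, :2] = z[:, None] + 0.3 * rng.normal(size=(60, 2))
    u, v, corr = sparse_cca(X, Y, c1=1.5, c2=1.3)
    print("variables selected in assay 1:", np.flatnonzero(u))
    print("variables selected in assay 2:", np.flatnonzero(v))
    print("correlation:", corr)
```

Smaller values of c1 and c2 force more weights to exactly zero, so each weight vector picks out fewer variables.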
Overview of Correlate:
- Correlate is a very flexible tool for correlating any pair of
data sets with measurements taken on the same set of samples. For
instance you can use it to correlate a set of clinical variables with
a set of genomic measurements.
- Correlate is a point-and-click Excel interface for the R package
PMA.
- Correlate implements methods proposed in the following paper:
Witten DM, Tibshirani R, and Hastie T (2009) A penalized matrix
decomposition, with applications to sparse principal
components and canonical correlation analysis. Biostatistics
10(3): 515-534. [pdf]
- See some Correlate screenshots.
- Meet the authors!
If you have visited this site recently, you may need to clear your browser history and cache, because a recent change affected a redirect.
Getting started with Correlate:
- Download Correlate and follow the installation instructions.
- Flip through the Correlate manual.
- Step through a typical Correlate analysis:
  - Put the data in a single Excel workbook containing two
    worksheets: one containing data set 1 and the other containing
    data set 2. An example is here.
  - Open the Add-Ins menu item in Excel
    and click on "Correlate".
  - Load the data into Correlate.
  - Run Correlate using automatic tuning parameter selection.
  - Inspect the resulting plots to choose a tuning parameter value.
  - Re-run Correlate using a large number of permutations to get a
    meaningful p-value.
  - The resulting weight vectors for the two data sets define a set
    of variables in the first data set that is maximally correlated with a
    set of variables in the second data set.
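The last two steps can be illustrated with a short Python sketch: a generic permutation p-value, and a helper that lists the variables with nonzero weight. This is not Correlate's own procedure. The function names are hypothetical, and the statistic used here (the leading singular value of the cross-product matrix) is only a stand-in for the sparse CCA correlation. The sketch also shows why many permutations are needed: with B permutations the smallest attainable p-value is 1/(B+1).

```python
import numpy as np


def leading_singular_value(X, Y):
    """Stand-in association statistic: the largest singular value of the
    cross-product of the standardized data sets (an assumption for this sketch)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    return np.linalg.svd(Xs.T @ Ys, compute_uv=False)[0]


def permutation_p_value(X, Y, statistic, n_perm=1000, seed=0):
    """Shuffle the samples of X to break the pairing with Y, and count how
    often the permuted statistic reaches the observed one."""
    rng = np.random.default_rng(seed)
    observed = statistic(X, Y)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        if statistic(X[perm], Y) >= observed:
            exceed += 1
    # The +1 terms keep the estimate away from zero, so it is never below 1 / (n_perm + 1).
    return (exceed + 1) / (n_perm + 1)


def selected_variables(weights, names, tol=1e-12):
    """Names of the variables whose weight is nonzero."""
    return [name for name, w in zip(names, weights) if abs(w) > tol]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    z = rng.normal(size=60)  # hidden signal shared by both data sets
    X = rng.normal(size=(60, 20))
    Y = rng.normal(size=(60, 15))
    X[:, :3] = z[:, None] + 0.3 * rng.normal(size=(60, 3))
    Y[:, :2] = z[:, None] + 0.3 * rng.normal(size=(60, 2))
    print("p-value:", permutation_p_value(X, Y, leading_singular_value, n_perm=199))
    print(selected_variables([0.0, 0.7, -0.2], ["gene_a", "gene_b", "gene_c"]))
```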