So, you have a ton of hard disks spinning with RNA-Seq data you need to analyse? Excellent. But first you need some software to do that. This post follows the Nature protocol by Trapnell et al. described here (something freely accessible from Nature Publishing Group, must be my lucky day today).
First, you will need the Boost libraries installed. You might be tempted to do this via sudo apt-get install libboost-all-dev, you lazy… That is a terrible idea in this instance. You may miss this little note of relevance: WARNING: Due to a serious issue with Boost Serlialization [sic] library introduced in version 1.56, Cufflinks currently can only be built with Boost version 1.55 or lower. The issue is expected to be fixed in the upcoming Boost v1.59. Guess what version is installed by apt-get in Ubuntu 16.04 LTS? Yeah, 1.58. Bad luck bro.
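If you want to check a Boost version string before building against it, here is a minimal sketch. The function name boost_version_ok is made up for this post, and treating 1.59 and later as fine is an assumption based on the warning's "expected to be fixed" wording plus the fact that 1.61 built for me.

```shell
# Return 0 if a Boost version string (e.g. "1.58.0") lies outside the
# 1.56 to 1.58 range singled out by the Cufflinks warning, 1 otherwise.
# Assumption: 1.59 and later are treated as fixed.
boost_version_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second version component
    if [ "$major" -eq 1 ] && [ "$minor" -ge 56 ] && [ "$minor" -le 58 ]; then
        return 1
    fi
    return 0
}

# Possible usage with the version apt would install:
#   boost_version_ok "$(apt-cache policy libboost-all-dev | awk '/Candidate:/ {print $2}')" \
#       || echo "apt's Boost is in the broken range, build from source"
```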
Download the latest version of the Boost libraries from sourceforge (sourceforge in 2016!, wait let me send a Fax with a complaint). Installing this is relatively straightforward nowadays. Here is what you should do ($ is the bash prompt, you don’t need that):
$ tar zxvf boost_1_61_0.tar.gz
$ cd boost_1_61_0
$ ./bootstrap.sh --prefix=/opt/boost_1_61_0
$ ./b2
$ sudo ./b2 install
The prefix defines where the Boost libraries get installed. It is critical to set this, so that each installation is self-contained (and easily removed or updated). Also, you will need to add these two lines to your ~/.profile:
export BOOST_ROOT=/opt/boost_1_61_0
export LD_LIBRARY_PATH=$BOOST_ROOT/lib:$LD_LIBRARY_PATH
Make sure to re-source the file after you make these changes (either reopen the terminal or run source ~/.profile).
Next on our install list is bowtie2, a short sequence aligner. Download the latest version from sourceforge (arrgghhh what is wrong with you bioinf people?!). Note that you need to download the source version (as we are going to build the binaries ourselves).
$ unzip bowtie2-2.2.9-source.zip
$ sudo apt-get install libtbb-dev
$ cd bowtie2-2.2.9
$ make WITH_TBB=1
$ cd ..
$ sudo mv bowtie2-2.2.9 /opt
The tbb library is a threading library which offers shorter running times on multi-core architectures.
make WITH_TBB=1 will compile binaries which make use of this library. Some more detailed instructions can be found in the user manual.
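The threading only pays off if you actually ask bowtie2 for more than one thread, which is what its -p option does. A small sketch for picking a thread count; pick_threads is a hypothetical helper, and the index and read file names in the usage comment are placeholders.

```shell
# Pick a thread count for bowtie2's -p option: all cores but one, minimum 1.
# Falls back to 1 when nproc is not available.
pick_threads() {
    cores=$(nproc 2>/dev/null || echo 1)
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        echo 1
    fi
}

# Possible usage (my_index, reads.fq and out.sam are placeholders):
#   bowtie2 -p "$(pick_threads)" -x my_index -U reads.fq -S out.sam
```

Leaving one core free is just a habit of mine so the desktop stays responsive; use all of them if the box does nothing else.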
TopHat uses Bowtie for read alignment but adds the ability to align reads across splice junctions. Nowadays TopHat is superseded by the (more efficient and accurate) HISAT2, but the protocol described by Trapnell et al. still uses it. Download the latest release of TopHat (presently version 2.1.1).
$ tar xzvf tophat-2.1.1.tar.gz
$ cd tophat-2.1.1
$ ./configure --prefix=/opt/tophat-2.1.1 --with-boost=/opt/boost_1_61_0
$ make
$ sudo make install
Download the test data and execute the following commands:
$ tar zxvf test_data.tar.gz
$ cd test_data
$ tophat -r 20 test_ref reads_1.fq reads_2.fq
You should see the words Run complete: 00:00:00 elapsed if the test case ran successfully.
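If you would rather not eyeball the console, you can save the output and grep for that message. A minimal sketch; the helper name tophat_run_ok and the log file name tophat_test.log are my own inventions, not something TopHat provides.

```shell
# Succeed only if the given log file contains TopHat's completion message.
tophat_run_ok() {
    grep -q 'Run complete' "$1"
}

# Possible usage (tophat_test.log is an arbitrary file name):
#   tophat -r 20 test_ref reads_1.fq reads_2.fq 2>&1 | tee tophat_test.log
#   tophat_run_ok tophat_test.log && echo "TopHat test passed"
```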
Cufflinks requires SAMtools to be installed (yet another dependency, and sourceforge once again). This requires some tinkering around to get it to work. Warning: do not go for the latest version (from github) of SAMtools, as this will not work with the current version of cufflinks. Version 0.1.19 works with cufflinks 2.2.1. Also, there is a note that htslib is now more accurate and efficient; please ignore it.
$ tar xjvf samtools-0.1.19.tar.bz2
$ cd samtools-0.1.19
$ make
$ sudo mkdir -p /opt/samtools-0.1.19/bin /opt/samtools-0.1.19/include/bam /opt/samtools-0.1.19/lib
$ sudo cp libbam.a /opt/samtools-0.1.19/lib
$ sudo cp *.h /opt/samtools-0.1.19/include/bam
First you need to install the Eigen library. You may again be tempted to do this via apt-get install. This didn’t work for me (building cufflinks comet vomits on Eigen). My installation works with Eigen v3.2.8. It is safest to download it from the Eigen repo and install it in the following manner.
$ tar xjvf 3.2.8.tar.bz2
$ cd eigen-eigen-07105f7124f9
$ sudo mkdir /opt/eigen3
$ sudo cp -rv ./Eigen /opt/eigen3
$ export EIGEN_CPPFLAGS="-I/opt/eigen3"
The --with-eigen= configure option does not work (annoyingly), which is why you need to set EIGEN_CPPFLAGS to the Eigen3 location.
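Since a typo in that variable only shows up deep into the cufflinks build, a quick sanity check before running configure can save some time. A sketch under the assumption that the variable holds a single -I flag, as it does above; eigen_flags_ok is a hypothetical helper name.

```shell
# Given a single include flag such as "-I/opt/eigen3", succeed if the
# Eigen/Dense header exists underneath that directory.
eigen_flags_ok() {
    dir=${1#-I}     # strip the leading -I
    [ -f "$dir/Eigen/Dense" ]
}

# Possible usage:
#   eigen_flags_ok "$EIGEN_CPPFLAGS" || echo "Eigen headers not found"
```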
Download the latest version of Cufflinks (from github, finally). The current version is 2.2.1. Then:
$ tar xzvf cufflinks-2.2.1.tar.gz
$ cd cufflinks-2.2.1
$ ./configure --prefix=/opt/cufflinks-2.2.1 --with-boost=/opt/boost_1_61_0 --with-bam=/opt/samtools-0.1.19
$ make
$ sudo make install
This will take a while. After it finishes, you need to set some more environment variables in your ~/.profile. BOOST_ROOT should have already been set in a previous step (somewhere above).
export BOOST_ROOT=/opt/boost_1_61_0
export SAMTOOLS=/opt/samtools-0.1.19
export LD_LIBRARY_PATH=$BOOST_ROOT/lib:$SAMTOOLS/lib:$LD_LIBRARY_PATH
export RNA_SEQ_TOOLS=/opt/bowtie2-2.2.9:/opt/tophat-2.1.1/bin:$SAMTOOLS/bin:/opt/cufflinks-2.2.1/bin
export PATH=.:$RNA_SEQ_TOOLS:$PATH
Remember you need to source ~/.profile after these changes.
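To confirm the PATH changes took effect, you can ask the shell which of the tools it can actually find. A minimal sketch; missing_tools is a made-up helper, while bowtie2, tophat and cufflinks are the executables installed in the steps above.

```shell
# Print the name of every listed command that cannot be found on PATH.
# Prints nothing when all of them are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Possible usage:
#   missing_tools bowtie2 tophat cufflinks
```

Empty output means you are good to go; any name printed points at a PATH entry worth double-checking.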
Download the cufflinks test file. Now from the directory where you saved the file run
cufflinks ./test_data.sam. This will create some console output and a file called
transcripts.gtf (this should have some exons if you
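A quick way to check the output for exons is to count them. In a GTF file the feature type sits in the third tab-separated column, so a sketch like this does the job; count_exons is just a name I picked.

```shell
# Count exon records in a GTF file. The feature type is the third
# tab-separated column; prints 0 when there are none.
count_exons() {
    awk -F '\t' '$3 == "exon" { n++ } END { print n + 0 }' "$1"
}

# Possible usage:
#   count_exons transcripts.gtf
```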
Hope you made it this far in considerably less time than I did. It took me close to two days to get set up with cufflinks …