
As previously mentioned, the files produced by the sequencer are stored on the varbank server.

The following files from the bioinformatic process are available:

BAM: binary version of a SAM file

SAM: a tab-delimited text file that contains sequence alignment data

BAM_INDEX: index of the BAM file

FUNC: functional annotation of variations

FASTQ: gzip-compressed FASTQ, containing raw sequences and quality scores

HS_METRICS: Picard hybrid selection metrics

EXONCOV_ALL: total exon coverage statistics

EXONCOV_LOW: low-coverage part of the EXONCOV_ALL table

VCF: raw VCF (Variant Call Format) files from different variant callers

To download files, tick the check box to the left of each file you are interested in, then click Create and Send Download Links at the foot of the table. A window with the download links will appear.

We are interested in FASTQ files, the ones coming directly from the sequencing machine.
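Once a FASTQ file is on the local server, a quick look inside can confirm that it is what we expect. The sketch below assumes the usual FASTQ layout of exactly four lines per read (header, sequence, '+' separator, quality string); the function names are made up for illustration and are not part of varbank or any other tool.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a first look at a gzip-compressed FASTQ file.
# Assumption: every read occupies exactly four lines.

# Print the number of reads in the file given as first argument.
count_fastq_reads() {
    local fastq_gz="$1"
    local lines
    # gzip -dc decompresses to stdout and leaves the file on disk untouched
    lines=$(gzip -dc "$fastq_gz" | wc -l)
    echo $(( lines / 4 ))
}

# Print the first read (its four lines) of the file given as first argument.
show_first_read() {
    gzip -dc "$1" | head -n 4
}
```

For example, count_fastq_reads SN7640211_14143_P1H03_MND428_1_sequence.fq.gz prints the read count of the first file in the wget example on this page. On files of several GB this takes a while, since the whole file has to be decompressed.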

Because there are many files and they are relatively large (3.5-4.1 GB when compressed), it is best to download them directly to the local server using wget and the links provided in the pop-up message from varbank, something like this:

wget http://varbank.ccg.uni-koeln.de/downloads/2b3c87c39f158033283457199ef6e50a/SN7640211_14143_P1H03_MND428_1_sequence.fq.gz

wget http://varbank.ccg.uni-koeln.de/downloads/2b3c87c39f158033283457199ef6e50a/SN7640211_14074_P1A01_MND1014_1_sequence.fq.gz

If there are many files to download, it is easier to copy all the links into a text file (e.g. files-to-download.txt) and then download them with

wget -i files-to-download.txt

where -i is short for --input-file, meaning that wget reads the URLs from a local or external file.
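With files this large a transfer can be interrupted halfway, so it may help to resume instead of starting over, and to check the files afterwards. The following is only a sketch: the function names are hypothetical, -c (--continue) is a standard wget option that resumes partially downloaded files when the server supports it, and gzip -t tests the integrity of a compressed file without unpacking it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the function names are illustrative only.

# Download every URL listed in the file given as first argument.
# -c (--continue) resumes partial files instead of restarting them.
download_all() {
    wget -c -i "$1"
}

# Test every .gz file in a directory (default: current directory) and
# print the ones that fail the gzip integrity check.
# Returns non-zero if at least one file is damaged.
verify_gz() {
    local dir="${1:-.}"
    local status=0
    local f
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue   # skip when the directory has no .gz files
        if ! gzip -t "$f" 2>/dev/null; then
            echo "CORRUPT: $f"
            status=1
        fi
    done
    return "$status"
}
```

A possible session would be download_all files-to-download.txt followed by verify_gz . in the download directory. Any file reported as CORRUPT is most likely truncated and can be fetched again by rerunning download_all while the links are still valid.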

It is very important NOT to close the varbank page, since the download links are valid only for the duration of a session: 14 days or until you log out.

genetica/bioinf_process/download.txt · Last modified: 2020/08/04 10:58 by 127.0.0.1