Help text | Blat is an alignment tool like BLAST, but it is structured differently.
On DNA, Blat works by keeping an index of an entire genome in memory.
Thus, the target database of BLAT is not a set of GenBank sequences, but
instead an index derived from the assembly of the entire genome.
By default, the index consists of all non-overlapping 11-mers except for
those heavily involved in repeats, and it uses less than a gigabyte of RAM.
This smaller size means that Blat is far more easily mirrored than BLAST.
Blat on DNA is designed to quickly find sequences of 95% or greater
similarity that are 40 bases or longer. It may miss more divergent or
shorter sequence alignments. (The default settings and expected behavior
of standalone Blat differ slightly from those of the graphical version
of Blat.)
On proteins, Blat uses 4-mers rather than 11-mers, finding protein sequences
of 80% or greater similarity to the query that are 20 or more amino acids long.
The protein index requires slightly more than 2 gigabytes of RAM.
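
The indexing idea described above can be illustrated with a small sketch. This is not Blat's actual implementation: the function names `build_index` and `find_hits` and the `max_occurrences` cutoff are hypothetical, chosen only to show how non-overlapping k-mers of the target are stored and how overused (repeat-like) k-mers could be dropped. Use k=11 for DNA and k=4 for protein to mirror the text.

```python
from collections import defaultdict


def build_index(target, k=11, max_occurrences=None):
    """Map each non-overlapping k-mer of `target` to its start positions.

    `max_occurrences` is a hypothetical cutoff: k-mers seen more often
    than this are discarded, mimicking the exclusion of repeat k-mers.
    """
    index = defaultdict(list)
    # Step by k so that the indexed k-mers do not overlap.
    for start in range(0, len(target) - k + 1, k):
        index[target[start:start + k]].append(start)
    if max_occurrences is not None:
        index = {kmer: pos for kmer, pos in index.items()
                 if len(pos) <= max_occurrences}
    return dict(index)


def find_hits(index, query, k=11):
    """Look up every overlapping k-mer of `query` in the index.

    Returns (query_position, target_position) pairs for exact k-mer matches.
    """
    hits = []
    for q_pos in range(len(query) - k + 1):
        for t_pos in index.get(query[q_pos:q_pos + k], ()):
            hits.append((q_pos, t_pos))
    return hits
```

Only the target is indexed sparsely; the query is scanned at every offset, so a matching region is found regardless of its phase relative to the indexed k-mers. A real aligner would then extend and chain such hits into alignments, which this sketch omits.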
In practice -- due to sequence divergence rates over evolutionary time --
DNA Blat works well within humans and primates, while protein Blat continues
to find good matches within terrestrial vertebrates and even earlier organisms
for conserved proteins. Within humans, protein Blat gives a much better
picture of gene families (paralogs) than DNA Blat. However, BLAST and
psi-BLAST at NCBI can find much more remote matches.
From a practical standpoint, Blat has several advantages over BLAST:
- speed (no queues, response in seconds), at the price of lesser homology depth
- the ability to submit a long list of simultaneous queries in FASTA format
- five convenient output sort options
- a direct link into the UCSC browser
- alignment block details in natural genomic order
- an option to launch the alignment later as part of a custom track
Blat is commonly used to look up the location of a sequence in the genome
or to determine the exon structure of an mRNA. Expert users can also run large
batch jobs and adjust the internal sensitivity parameters by using the
command-line Blat on bwUniCluster.
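
As a rough sketch of such a batch run, the following Python helper builds a standalone `blat` command line and runs it once per query file. The file names and the helper names `build_blat_command` and `run_batch` are hypothetical, and the values for `-minIdentity` and `-tileSize` are illustrative rather than recommended settings. Check the usage text printed by `blat` itself before relying on any option.

```python
import shutil
import subprocess


def build_blat_command(database, query, output, min_identity=95, tile_size=11):
    """Return the argument list for one standalone blat run.

    The parameter values are illustrative assumptions, not tuned defaults.
    """
    return [
        "blat",
        database,
        query,
        f"-minIdentity={min_identity}",
        f"-tileSize={tile_size}",
        output,
    ]


def run_batch(database, queries):
    """Run blat once per query file, writing <query>.psl for each input."""
    if shutil.which("blat") is None:
        raise RuntimeError("blat not found on PATH; load the blat module first")
    for query in queries:
        command = build_blat_command(database, query, query + ".psl")
        subprocess.run(command, check=True)
```

For example, `run_batch("genome.2bit", ["reads_a.fa", "reads_b.fa"])` would align two hypothetical FASTA files against a hypothetical 2bit genome file. On a cluster, such a script would normally be submitted through the batch system rather than run on a login node.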
DOCUMENTATION/FAQ
http://genome.ucsc.edu/FAQ/FAQblat.html#blat1
http://genome.ucsc.edu/cgi-bin/hgBlat
The command line executables of this module can be found in the folder:
/opt/bwhpc/common/bio/blat/35/bin/i386
This folder is also added to your PATH.
Usage examples for bwHPC clusters can be found in:
/opt/bwhpc/common/bio/blat/35/bwhpc-examples/
Please read the 'README.bwhpc' file in /opt/bwhpc/common/bio/blat/35/bwhpc-examples/.
In case of problems, please contact 'rainer.rutka@uni-konstanz.de'. |