Phylogenetically related organisms have a similar
oligonucleotide composition, or genomic
signature. This similarity is often also observed among the different
elements of the same prokaryotic cell: chromosomes, plasmids and
phages. The results obtained by comparing the oligonucleotide composition
of those elements may be retrieved on this site. We have evaluated five algorithms to compute
distances, and our results are available here.
According to our calculations with random sequences, the GC-content
of a sequence may affect the distance between the complete
sequence and its subsequences when some statistical methods are
used to compute distances. This effect has been observed
for Hamming distance, Global distance and Tetranucleotide Usage
Deviation, so those statistics have been removed from our
calculations. Consequently, this service uses only the
following statistical methods to compute distances based on the oligonucleotide
composition of sequences:
Both methods have been used extensively on this website, although they are not available simultaneously in all tools/data. The reasons for selecting these statistical methods are described here.
Assignment of sequences to genomes
We evaluated the suitability of both methods
for assigning a DNA subsequence from sequenced
prokaryotes to its source genome. The results of our calculations are
available here.
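As a rough illustration of the assignment idea, the Python sketch below assigns a fragment to the candidate genome whose tetranucleotide frequency profile is closest in Euclidean distance. The function names (`tetra_freqs`, `euclidean`, `assign_fragment`), the use of raw relative frequencies and the synthetic toy sequences are assumptions made for this example; it is not the code behind this service.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order, so that profiles are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetra_freqs(seq):
    """Return the relative frequency of each of the 256 tetranucleotides."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skip windows containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]

def euclidean(u, v):
    """Euclidean distance between two equally long frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def assign_fragment(fragment, genomes):
    """Return the name of the genome whose profile is closest to the fragment."""
    frag = tetra_freqs(fragment)
    return min(genomes, key=lambda name: euclidean(frag, tetra_freqs(genomes[name])))

# Hypothetical toy data: two synthetic "genomes" with very different composition.
genomes = {
    "at_rich": "ATATTAATTATAATATTTAAATTA" * 50,
    "gc_rich": "GCGGCCGCGCCGGGCGCGCCCGGC" * 50,
}
fragment = genomes["gc_rich"][100:400]
print(assign_fragment(fragment, genomes))
```

Real genomes differ far less than these toy sequences, so the accuracy of such an assignment depends on fragment length and on the distance measure chosen.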
Based on our data, the following conclusions have been reached:
When generating oligo-skews, the plots obtained by computing Pearson distances for z-scores of tetranucleotides were very flat. In contrast, Euclidean distances reveal peaks in the plots. Although those peaks have not been investigated yet, we believe they could be related to horizontal transfer events. Oligo-skews have been generated for all sequenced prokaryotes using Euclidean distance.
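One plausible way to picture an oligo-skew is a sliding window along the genome, with the Euclidean distance between the tetranucleotide profile of each window and that of the whole sequence plotted against position. The sketch below follows that reading. The window and step sizes, the function names and the synthetic "genome" with an inserted foreign block are assumptions for illustration only, and a peak in such a profile is merely a candidate region, not evidence of horizontal transfer.

```python
from collections import Counter
from math import sqrt

def tetra_profile(seq):
    """Relative tetranucleotide frequencies of seq, as a dict."""
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}

def euclidean(p, q):
    """Euclidean distance between two sparse frequency profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))

def oligo_skew(genome, window=200, step=100):
    """Return (window start, distance to whole-genome profile) pairs."""
    whole = tetra_profile(genome)
    return [
        (start, euclidean(tetra_profile(genome[start:start + window]), whole))
        for start in range(0, len(genome) - window + 1, step)
    ]

# Hypothetical toy genome: a repetitive host with a compositionally
# different block inserted at position 1000.
host = "ATGCGTACCTGAAGCTTGCA" * 100    # 2000 bp
island = "GGGCCCGGGCCCGCGCGGCC" * 20   # 400 bp
genome = host[:1000] + island + host[1000:]

skew = oligo_skew(genome)
peak_start, peak_dist = max(skew, key=lambda item: item[1])
print(peak_start, round(peak_dist, 3))
```

In this toy case the largest distance falls on a window inside the inserted block, which is the kind of peak described above.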