G+C content is slightly higher in one type III secretion gene cluster, but not the other

B. pseudomallei and a variety of other Gram-negative pathogens use type III secretion as a conserved and highly adapted virulence mechanism (Hueck, 1998). Type III secretion systems are made up of clusters of homologous proteins with export functions that deliver virulence factors directly to host cells. A type III secretion system gene cluster is therefore a type of pathogenicity island (PAI) in bacterial genomes. From the calculated results, the two currently known
type III secretion gene clusters of B. pseudomallei (Winstanley et
al., 1999; Attree and Attree, 2002) differ in G+C content
relative to the putative ORFs: one is slightly higher, whereas the other is not.
PAIs are usually generated by horizontal gene transfer.
Therefore, they often consist of DNA regions that differ from the
core genome in G+C content and codon usage.
However, differences in G+C content between PAIs and the core genome
will not be observed if the DNA of the donor and that of the recipient
have similar or identical G+C content (Hacker and Kaper, 2000).
Furthermore, laterally transferred genes may, over time, adopt the genome-wide
G+C tendencies of their new host (Karlin, 2001).
This may account for the difference we observed between the two
type III secretion systems in terms of their G+C content, and it highlights
the possibility that the two type III secretion systems were
acquired at different times during evolution.
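
The comparison behind this observation, the G+C content of a single ORF against the genome-wide value, can be sketched in a few lines of Python. This is only a minimal illustration and not the programme used in the study: the function names `gc_content` and `gc_deviation` and the toy sequences are hypothetical, and ambiguous bases are assumed to be ignored.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / (gc + at)


def gc_deviation(orf: str, genome: str) -> float:
    """G+C fraction of one ORF minus the G+C fraction of the whole genome."""
    return gc_content(orf) - gc_content(genome)


# Toy data (hypothetical, not B. pseudomallei sequence).
genome = "ATGCGCGCATTAGCGCGGCCATATGCGC"
orf = "ATGCGCGCA"
print(round(gc_deviation(orf, genome), 3))
```

A positive value means the ORF is richer in G+C than the genome as a whole; a value near zero is what one would expect either for a native gene or for a transferred gene whose donor had a similar G+C content.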
In addition, the difference between the two clusters calls into question the validity of using
G+C content as one of the universal indicators of PAIs.

Genome signature contrast

Nearly all the putative ORFs and the ORFs in the two
type III secretion systems have
A few ORFs have

Codon and amino acid usage contrast

A gene is considered putative alien (pA) if
the biases
However, as noted by Karlin (2001), not
every pA gene cluster is a PAI, and conversely, not every PAI is a pA gene
cluster. Therefore, this
criterion cannot serve as a definitive indicator of a gene cluster being a
PAI even if it is pA.

In summary

Karlin proposed five criteria for detecting anomalous gene clusters and pathogenicity islands in bacterial genomes based on a sliding window W of length 10, 20, …, 50 kb. In the current study, we instead applied the five criteria to each in silico generated ORF, because the B. pseudomallei genome is still being assembled (i.e., the size of W is smaller). As we have seen, a smaller W may give larger relative errors in the calculations. Therefore, a study based on sliding windows of 10, 20, …, 50 kb should be designed to reassess the potential for pathogenicity once the B. pseudomallei genome is assembled.

Because the B. pseudomallei genome is still being sequenced and assembled, the contig file we downloaded from the Sanger Institute may contain redundant sequences, and some genome sequences may not be represented in it. This could also contribute to the errors in our study when we compute the genome-wide average values of several parameters.

Another limitation of our method is that the ORFs are generated by a simple programme (getorf), so the resulting ORFs do not correspond exactly to the genes that the bacterium actually transcribes and translates. We obtained 53,516 putative ORFs using getorf, as counted by the programme in Appendix I. This is obviously an over-representation of the genes of B. pseudomallei (for comparison, humans are estimated to have only about 40,000 genes). This problem can be partially overcome by raising the minimum ORF length during ORF generation. Nevertheless, the ORFs generated by our method, although not identical to the real genes of B. pseudomallei, are exhaustive.

In addition to using known
type III secretion system gene clusters to validate the five criteria,
validation should also be carried out using currently known non-pathogenic
genes. Systematic statistical
methods should be developed to analyse the calculated results.
For better management and exploitation of the data, a database interface
could be included in our programme so that the calculated values
are automatically stored in a database for more convenient data mining.
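
As a starting point for the window-based reassessment suggested above, the following Python sketch scans a sequence with a sliding window W and flags windows whose G+C fraction departs from the genome-wide value. It is a simplified illustration under stated assumptions, not an implementation of Karlin's five criteria: the function names `sliding_gc` and `anomalous_windows`, the `step` parameter, and the fixed `threshold` are all hypothetical, and non-ACGT symbols are counted as non-G+C.

```python
from typing import Iterator, List, Tuple


def sliding_gc(seq: str, window: int, step: int) -> Iterator[Tuple[int, float]]:
    """Yield (start, G+C fraction) for every full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


def anomalous_windows(seq: str, window: int, step: int,
                      threshold: float) -> List[Tuple[int, float]]:
    """Return windows whose G+C fraction differs from the whole-sequence
    G+C fraction by more than `threshold` (a hypothetical cut-off)."""
    if not seq:
        return []
    upper = seq.upper()
    genome_gc = (upper.count("G") + upper.count("C")) / len(upper)
    return [(start, gc) for start, gc in sliding_gc(seq, window, step)
            if abs(gc - genome_gc) > threshold]


# Toy example: a G+C-rich stretch embedded in an A+T-rich background.
toy = "AT" * 10 + "GC" * 5 + "AT" * 10
print(anomalous_windows(toy, window=10, step=10, threshold=0.5))
```

On an assembled genome, `window` would be set to 10,000, 20,000, and so on up to 50,000 bases, in line with the window sizes Karlin proposed, and the threshold would need to be chosen with a proper statistical method rather than fixed by hand.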