Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions

Sims, Gregory E ; Jun, Se-Ran ; Wu, Guohong A ; Kim, Sung-Hou

Proceedings of the National Academy of Sciences - PNAS, 2009-02, Vol.106 (8), p.2677-2682 [Peer-reviewed journal]

United States: National Academy of Sciences

Full text available

  • Title:
    Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions
  • Author: Sims, Gregory E ; Jun, Se-Ran ; Wu, Guohong A ; Kim, Sung-Hou
  • Subjects: Alphabets ; Biological Sciences ; Genes ; Genetic research ; Genome ; Genomes ; Genomics ; High frequencies ; Homology ; Introns ; Methods ; Mutation ; Nucleotide sequence ; Nucleotide sequences ; Nucleotides ; Optimization ; Phylogeny ; Physical Sciences ; Test ranges ; Topology
  • Is part of: Proceedings of the National Academy of Sciences - PNAS, 2009-02, Vol.106 (8), p.2677-2682
  • Notes:
    Author contributions: G.E.S. and S.-H.K. designed research; G.E.S., S.-R.J., G.A.W., and S.-H.K. performed research; G.E.S., S.-R.J., G.A.W., and S.-H.K. contributed new reagents/analytic tools; G.E.S., S.-R.J., G.A.W., and S.-H.K. analyzed data; and G.E.S. and S.-H.K. wrote the paper.
    Contributed by Sung-Hou Kim, December 30, 2008
  • Description: For comparison of whole-genome (genic + nongenic) sequences, multiple sequence alignment of a few selected genes is not appropriate. One approach is to use an alignment-free method in which feature (or l-mer) frequency profiles (FFP) of whole genomes are used for comparison, a variation of a text or book comparison method that uses word frequency profiles. In this approach it is critical to identify the optimal resolution range of l-mers for the given set of genomes compared. The optimum FFP method is applicable for comparing whole genomes or large genomic regions even when there are no common genes with high homology. We outline the method in 3 stages: (i) We first show how the optimal resolution range can be determined with English books that have been transformed into long character strings by removing all punctuation and spaces. (ii) Next, we test the robustness of the optimized FFP method at the nucleotide level, using a mutation model with a wide range of base substitutions and rearrangements. (iii) Finally, to illustrate the utility of the method, phylogenies are reconstructed from concatenated mammalian intronic genomes; the FFP-derived intronic genome topologies for each l within the optimal range are all very similar. The topology agrees with the established mammalian phylogeny, revealing that intron regions contain a similar level of phylogenic signal as do coding regions.
  • Publisher: United States: National Academy of Sciences
  • Language: English
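
The description above outlines the FFP idea: count every overlapping l-mer in a sequence, normalize the counts into a frequency profile, and compare profiles instead of alignments. The following is a minimal Python sketch of that idea, not the authors' implementation. The function names (`to_character_string`, `ffp`, `js_divergence`) are hypothetical, and the use of Jensen-Shannon divergence as the distance between two profiles is an assumption, since the record does not name a distance measure.

```python
import math
import re
from collections import Counter


def to_character_string(text):
    """Lowercase the text and drop everything except the letters a-z,
    mimicking the removal of punctuation and spaces described for books."""
    return re.sub(r"[^a-z]", "", text.lower())


def ffp(sequence, l):
    """Return the relative frequency of every overlapping l-mer in sequence."""
    counts = Counter(sequence[i:i + l] for i in range(len(sequence) - l + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse profiles.
    Assumed distance measure; features missing from a profile count as 0."""
    divergence = 0.0
    for feature in p.keys() | q.keys():
        pf = p.get(feature, 0.0)
        qf = q.get(feature, 0.0)
        mf = 0.5 * (pf + qf)  # midpoint profile
        if pf > 0:
            divergence += 0.5 * pf * math.log2(pf / mf)
        if qf > 0:
            divergence += 0.5 * qf * math.log2(qf / mf)
    return divergence
```

To explore the resolution range mentioned in the description, one could compute `js_divergence(ffp(a, l), ffp(b, l))` for several values of `l` and check over which values the resulting distances (and any tree built from them) stay stable. How the optimal range is actually determined is described in the article itself and is not reproduced here.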
