Comparative analysis of clustering algorithms for ESTs ("expressed sequence tags")

AUTHOR(S)
PUBLICATION DATE

2006

ABSTRACT

Expressed sequence tags (ESTs) are short, single-pass sequences generated by sequencing randomly selected clones from a cDNA library. Because they carry information about the genes expressed in a cell, they suit several applications, chiefly gene discovery and expression profiling. Because ESTs are intrinsically redundant, it is important to group them so that each group contains all the sequences derived from the same transcript. This process, called EST clustering, reduces data complexity and provides estimates of mRNA abundance. Several specialized tools are described in the literature for this task. The main objective of this study was to provide the first objective comparison of the accuracy of five such tools (CAP3, d2_cluster, ESTate, TGICL and XSACT) against a reference clustering built from the complete human genome sequence. Several comparative analyses showed that the clustering tools agree well with the reference clustering and that they produce similar results. In some cases, however, the results are affected by the cDNA library itself. Based on several criteria, XSACT gave the most consistent results; nevertheless, no significant difference was found that would favor the use of one specific EST clustering tool.

SUBJECT(S)

EST clustering; ESTs; genomics; biological sequence analysis; biological analysis; genetics
