An approximation to the Longest Common Subsequence for Multiple Sequences problem using Shannon’s Entropy

Authors

DOI:

https://doi.org/10.17268/sel.mat.2019.02.13

Keywords:

Heuristic Algorithm, Subsequence, Common Subsequence, Alignment, Entropy, Simulation

Abstract

The Longest Common Subsequence for Multiples Sequences problem (LCSM), has been studied for more than 40 years, mainly motivated by its multiple applications in Bioinformatics. In this article we present a heuristic algorithm that estimates one or more solutions to the LCSM problem, using the Shannon’s Entropy as a measure of information in order to determine alignments with the greatest number of matches.

References

Blin, G., Bulteau, L., Jiang, M., Tejada, P. J. And Vialette, S. Hardness of Longest Common Subsequence for Sequences with Bounded Run-Lengths. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012; 138–148.

Cover, T. M., and Thomas, J. A. Elements of information theory. John Wiley & Sons, 2012.

Forero, W. E. S. Estudio comparativo de algoritmos para el problema de la subsecuencia común más larga restringida / comparative study of algorithms for the constrained longest common subsequence problem. Magíster en ciencias de la Ingeniería de sistemas y computación, Junio 2010.

Hakata, K., and Imai, H. The longest common subsequence problem for small alphabet size between many strings. Algorithms and Computation (1992), 469–478.

Huang, K., Yang, C.-B., Tseng, K.-T., et al. Fast algorithms for finding the common subsequence of multiple sequences. In Proceedings of the International Computer Symposium, 2004; 1006–1011.

Korkin, D. A new dominant point-based parallel algorithm for multiple longest common subsequence problem. Technical Report TR01-148, Univ. of New Brunswick, Tech. Rep. 2001.

Li, Y., Li, H., Duan, T., Wang, S., Wang, Z., and Cheng, Y. A real linear and parallel multiple longest common subsequences (mlcs) algorithm. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016; ACM:1725–1734.

Maier, D. The complexity of some problems on subsequences and supersequences. Journal of the ACM (JACM), 1978; 25(2), 322–336.

Published

2019-12-24

How to Cite

Sáenz, R. M. (2019). An approximation to the Longest Common Subsequence for Multiple Sequences problem using Shannon’s Entropy. Selecciones Matemáticas, 6(02), 275-282. https://doi.org/10.17268/sel.mat.2019.02.13