Data Mining Approach for Amino Acid Sequence Classification

Dr. Sheshang Degadwala
Dhairya Vyas

Abstract

Computerized applications are used all around the world, and they collect an enormous amount of data. The valuable information contained in these data is attracting scholars from a variety of disciplines to examine how to extract the knowledge hidden inside them. Data mining is the process of extracting useful and valuable knowledge from large amounts of data. Its subfields include text mining, image mining, sequential pattern mining, and web mining. Sequence mining is one of the most important techniques in this field because it helps discover sequential relationships in data. It is used in a variety of applications, including analysis of customer buying trends, analysis of web access trends, atmospheric observation, amino acid sequence analysis, and gene sequencing. In protein and DNA analysis, sequence mining techniques are used for sequence alignment, pattern searching, and pattern classification. Within amino acid sequence analysis, researchers are showing interest in amino acid sequence classification, which can find recurrent patterns in homologous proteins. This study describes the methods used in previous studies to classify proteins and gives an overview of the most important sequence classification techniques.

Article Details

How to Cite
Degadwala, D. S., & Vyas, D. (2021). Data Mining Approach for Amino Acid Sequence Classification. International Journal of New Practices in Management and Engineering, 10(04), 01–08. https://doi.org/10.17762/ijnpme.v10i04.124
