- Rank: Research
- Journal: Journal of Health Management and Informatics
- Article type: Journal Article
- Keywords:
- Abstract:
- English abstract: Introduction: Manipulation of protein stability is important for understanding the principles that govern protein thermostability, both in basic research and in industrial applications. Various data mining techniques exist for predicting thermostable proteins. Artificial neural network (ANN) methods in particular have attracted significant attention for thermostability prediction, because they are well suited to mapping non-linear input-output relationships and to massively parallel computing.
  Method: An Extreme Learning Machine (ELM) was applied to estimate the thermal behavior of 1289 proteins. In the proposed algorithm, the parameters of the ELM were optimized using a Genetic Algorithm (GA), which tuned a set of input variables, hidden-layer biases, and input weights to enhance prediction performance. The method was executed on a set of amino acids, yielding a total of 613 protein features. A number of feature selection algorithms were used to build subsets of the features. The 1289 protein samples and 613 protein features were calculated from the UniProt database to understand which features contribute to the enzymes’ thermostability and to find the main features that influence this valuable characteristic.
  Results: At the primary structure level, Gln, Glu, and polar were the features that contributed most to protein thermostability. At the secondary structure level, Helix_S, Coil, and charged_Coil were the most important features affecting protein thermostability. These results suggest that the thermostability of proteins is mainly associated with primary structural features: the influence of primary structure on the thermostability of a protein was greater than that of secondary structure. The prediction accuracy of the ELM (mean square error) improved dramatically with the GA, reaching error rates of RMSE = 0.004 and MAPE = 0.1003.
  Conclusion: The proposed approach significantly improves the accuracy of the ELM in predicting thermostable enzymes. An ELM tends to require more neurons in the hidden layer than conventional tuning-based learning algorithms. To overcome this, the proposed approach uses a GA that optimizes the structure and the parameters of the ELM. In summary, optimizing the ELM with a GA yields an efficient prediction method; numerical experiments showed that the approach produces excellent results.
  Keywords: Protein Stability, Primary and secondary structures, Extreme learning machine, Neural networks, Genetic algorithm
- Published: 10-07-1395 (Solar Hijri calendar)
- Authors: Jalal Rezaeenour, Mansoureh Yari Eili, Zahra Roozbahani, Mansour Ebrahimi
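The abstract above reports its results as RMSE and MAPE values. As a reading aid, here is a minimal sketch of how these two error measures are commonly defined. It is not the authors' code: the function names and the sample numbers are illustrative assumptions, and the abstract does not say whether its MAPE is expressed as a fraction or as a percentage (this sketch returns a fraction).

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.10 means 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only; these are not data from the paper.
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 2.5, 3.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is in the units of the predicted quantity, while MAPE is relative to each actual value, so the two figures are not directly comparable.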
- Rank: Research
- Journal: Journal of Health Management and Informatics
- Article type: Journal Article
- Keywords:
- Abstract:
- English abstract: Introduction: Cancer is a major cause of mortality in the modern world and one of the most important health problems in societies. In recent years, research on cancer as a systems biology disease has focused on molecular differences between cancer cells and healthy cells. Most of the proposed methods for classifying cancer using gene expression data act as black boxes and lack biological interpretability. The goal of this study is to design an interpretable fuzzy model for classifying gene expression data of lymphoma cancer.
  Method: In this research, the investigated microarray contained 45 samples of lymphoma, and the total number of genes was 4026. First, we offer a hybrid approach to reduce the data dimension for detecting genes involved in lymphoma cancer. In the lymphoma microarray, six out of 4026 genes were selected. Then, an interpretable fuzzy classifier was presented for classification of the data. Fuzzy inference was performed using the two rules that had the highest scores. Weka 3.6.9 was used to reduce the features, and the fuzzy classifier model was implemented in MATLAB R2010a. The results of this study were assessed by two measures: accuracy and precision.
  Results: In the pre-processing stage, in order to classify gene expression data of lymphoma, six out of 4026 genes were identified as cancer-causing genes, and the fuzzy classifier model was then applied to the resulting data. The classification accuracy was 96 percent using the 10 rules with the highest scores and about 98 percent using the 2 rules with the highest scores.
  Conclusion: In the proposed approach, for the first time, a fully fuzzy method named minimal rule fuzzy classification (MRFC) was introduced for extracting fuzzy rules with biological interpretability and for extracting meaning from gene expression data. Among the most outstanding features of this method is its ability to extract a small set of rules to interpret effective gene expression in cancer patients. Another result of this approach is that the proposed Filter-Wrapper Feature Selection (FWFS) method successfully addresses the disproportion between the number of samples and the number of genes in microarrays.
  Keywords: Lymphoma Cancer, Cancer Diagnosis, Microarray, Gene Expression, Fuzzy Classifier
- Published: 09-10-1395 (Solar Hijri calendar)
- Authors: Zahra Roozbahani, Jalal Rezaei Noor, Mansoureh Yari Eili, Ali Katanforoush
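The abstract above describes classifying samples with a very small set of fuzzy rules. The sketch below illustrates the general idea of rule-based fuzzy classification with two rules; it is not the MRFC method itself, whose details the abstract does not give. The gene names, fuzzy sets, rule definitions, and class labels are all hypothetical, and the choice of triangular membership functions with a minimum t-norm is an assumption made for illustration.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at or beyond the endpoints, 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over a normalized expression level in [0, 1].
LOW = (-0.5, 0.0, 0.6)
HIGH = (0.4, 1.0, 1.5)

# Two hypothetical rules: each maps gene names to fuzzy sets and names a class.
RULES = [
    ({"gene_a": HIGH, "gene_b": LOW}, "class_1"),
    ({"gene_a": LOW, "gene_b": HIGH}, "class_2"),
]


def classify(sample):
    """Return (label, strength) of the rule that fires most strongly.

    A rule's firing strength is the minimum membership over its antecedents.
    """
    best_label, best_strength = None, -1.0
    for antecedent, label in RULES:
        strength = min(
            triangular(sample[gene], *fuzzy_set)
            for gene, fuzzy_set in antecedent.items()
        )
        if strength > best_strength:
            best_label, best_strength = label, strength
    return best_label, best_strength


# Illustrative sample only; not data from the paper.
print(classify({"gene_a": 0.9, "gene_b": 0.1}))
```

Because each rule reads as "if gene_a is HIGH and gene_b is LOW then class_1", a two-rule model of this kind can be inspected directly, which is the interpretability property the abstract emphasizes.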