TY - JOUR
T1 - Prediction of the anomeric configuration, type of linkage, and residues in disaccharides from 1D (13)C NMR data
AU - Pereira, Florbela
PY - 2011/1/1
Y1 - 2011/1/1
N2 - A machine learning approach was explored for the prediction of the anomeric configuration, residues, and type of linkages of disaccharides using (13)C NMR chemical shifts. For this study, 154 pyranosyl disaccharides were used that are dimers of the alpha or beta anomers of D-glucose, D-galactose or D-mannose residues bonded through alpha or beta glycosidic linkages of types 1 -> 2, 1 -> 3, 1 -> 4, or 1 -> 6, as well as methoxylated disaccharides. The (13)C NMR chemical shifts of the training set were calculated using the CASPER (Computer Assisted SPectrum Evaluation of Regular polysaccharides) program, and chemical shifts of the test set were experimental values obtained from the literature. Experiments were performed for (1) classification of the anomeric configuration, (2) classification of the type of linkage, and (3) classification of the residues. Classification trees could correctly classify 67%, 74%, and 38% of the test set for the three tasks, respectively, on the basis of unassigned chemical shifts. The results for the same experiments using Random Forests were 93%, 90%, and 68%, respectively.
KW - Machine learning techniques
KW - Random Forest
KW - Classification tree
KW - (13)C NMR
KW - Carbohydrate
KW - Disaccharide
U2 - 10.1016/j.carres.2011.02.017
DO - 10.1016/j.carres.2011.02.017
M3 - Article
C2 - 21440245
SN - 0008-6215
VL - 346
SP - 960
EP - 972
JO - Carbohydrate Research
JF - Carbohydrate Research
IS - 7
ER -