Quadratic Logistic References
Here are some references that describe the Quadratic-Logistic method for protein secondary structure prediction. If you don't have access to these in your library, please email me at munson@helix.nih.gov for copies of the articles. Be sure to include your mailing address.
- 1995. Di Francesco, V., P. J. Munson and J. Garnier, Use of Multiple Alignments in Protein Secondary Structure Prediction. 28th Hawaii International Conference on System Sciences, 5: p. 285-291, IEEE, Los Alamitos, CA.
Abstract: Using a new database of 20 proteins not included in any of the previously used training datasets, we have incorporated multiple alignment information from homologous proteins into two well-characterized prediction methods: COMBINE (a jury method) and the Q-L (or quadratic logistic) method. It is found that the increase in accuracy from the use of related proteins is similar for both methods (5.8% and 6.3%, respectively) yielding a per residue prediction accuracy (Q3) of 68.7% and 69.0%, respectively, for a three state prediction. Most of the improvement came from consideration of averaging, profiling or consensus predictions. Of this improvement, a small amount (0.5%) came from recognition that "gap-permissive" positions in the alignment are most frequently in the coil state. Our finding is consistent with the hypothesis of a common secondary structure for the aligned family, and that improved accuracy is due to reduced noise in the prediction.
- 1994. Munson, P. J., V. Di Francesco and R. Porrelli, Protein Secondary Structure Prediction using Periodic-Quadratic-Logistic Models: Theoretical and Practical Issues. 27th Annual Hawaii International Conference on System Sciences, 5: p. 375-384, IEEE, Los Alamitos, CA.
Abstract: We extend logistic discriminant function methodology to compete effectively with neural networks and "information theory" methods in prediction of protein secondary structure. Unlike "black-box" methods, our model produces 400 pairwise interaction parameters which are interpretable from a molecular standpoint. Under optimal conditions, our model can produce up to 65.9% crossvalidated prediction accuracy on three states. A broad family of models is searched using a semi-parametric (penalized) approach combined with stepwise parameter selection. We show that optimal models have about 800 effective parameters for this data set. The highest prediction accuracy is concentrated in a fraction of the total residues, and the confidence of a prediction can be easily calculated. Such high-confidence predictions may be useful as the basis for prediction of the complete structure of the protein.
- 1993. Munson, P. J., L. Cao, V. Di Francesco and R. Porrelli, Semiparametric and Kernel Density Estimation Procedures for Prediction of Protein Secondary Structure. Statistical Computing Section, American Statistical Association Annual Meeting, p. 107-113.
Abstract: We have investigated the problem of prediction of protein secondary structure from amino acid sequence information, using parametric, semi-parametric and nonparametric approaches. In a three state model of secondary structure (helix, sheet, coil) which describes the local conformation of the protein polymer chain, we are able to attain an accuracy rate between 63 and 67% using each of these approaches. Maximum likelihood estimates are more satisfactory than the "information theory" method used by previous authors. In the fully parametric approach, parameter values have meaningful biophysical interpretations. The nonparametric approach is a variation of "homology" prediction methods familiar to molecular biologists. The semi-parametric approach produces a compromise result, providing some parametric information together with protection from model mis-specification. Computation intensive crossvalidation is necessary to establish the correct prediction rates.
- 1993. Munson, P. J., V. Di Francesco and R. Porrelli, Secondary Structure Prediction using Penalized Likelihood Models. Computing Science and Statistics: 25th Symposium on the Interface, Mike Tarter, ed., The Interface Foundation, Alexandria, VA: p. 203-209.