Extracting Protein Structural Information from Deep Mutational Scanning Data and Fitness Deviations Induced by Co-mutations
Talk title (Chinese):
利用深度突变扫描数据和共突变引起的适应度偏离提取蛋白质结构信息
Abstract
Protein structure dictates biological function, and accurate residue–residue contact prediction is crucial for modeling protein folds and guiding design. Traditional coevolution-based methods depend on multiple sequence alignments, which severely limits their performance in systems with few homologs, such as orphan or engineered proteins. We introduce CODAprot, a machine-learning framework that predicts residue contacts from single- and double-mutant fitness data obtained by deep mutational scanning (DMS). Using only a set of artificially generated homologous sequences with high sequence identity, CODAprot predicts protein contact maps without requiring extensive multiple sequence alignments. CODAprot employs a support vector regression model to estimate the expected double-mutant fitness, defines a covariant deviation of activity (CODA) between predicted and experimental values, and maps this deviation to contact probabilities via a naïve Bayes classifier. Across multiple protein domains, CODAprot recovers biologically meaningful contact patterns and achieves an accuracy of approximately 50% at the Top-L/2 threshold, outperforming both traditional coevolutionary and existing DMS-derived approaches. Importantly, CODAprot operates independently of natural homologous sequence information, offering a robust machine-learning route to structural inference in sequence-sparse systems. This work provides a complementary paradigm to coevolution analysis and opens a practical avenue for integrating DMS data into protein structure prediction and design.
Slide presentation
Speaker: 翟嘉琪
Talk title (English):
GLMYsymm: Inferring Symmetry Order of Protein Complexes from Single Sequences Using Persistent GLMY Homology
Talk title (Chinese):
GLMYsymm:基于持续 GLMY 同调从单序列推断蛋白质复合物的对称阶数
Abstract
Proteins often perform essential biological functions in the form of complexes. The assembly of protein complexes typically exhibits symmetry, which contributes to a more stable structural organization. Therefore, investigating structural symmetry is of great importance for protein complex structure prediction. However, existing methods for predicting protein structural symmetry are limited in number and generally show unsatisfactory accuracy. To address this issue, we propose GLMYsymm, a single-sequence–based model for symmetry prediction of homologous protein complexes. The key innovation of this work lies in the use of persistent GLMY homology to extract topological information from protein sequences, which is further integrated with the ESM2 pretrained model to enable end-to-end symmetry prediction directly from sequence data.
To the best of our knowledge, this is the first study to apply persistent GLMY homology to protein sequence feature extraction, without relying on structural information or multiple sequence alignment (MSA). Through extensive ablation and comparative experiments, we further demonstrate the effectiveness of GLMY homology–based topological sequence features for training deep learning models. On the same test dataset, GLMYsymm outperforms both Seq2symm and QUEEN by 0.32 in terms of the Macro AUC-PR metric. In the field of protein structure prediction, GLMYsymm can be used to assess the symmetry of predicted protein structures, thereby assisting in protein structure quality evaluation, and can also serve as an important reference for stoichiometry prediction.
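For readers unfamiliar with the Macro AUC-PR metric mentioned above, the following is a minimal sketch of one common way to compute it, not the evaluation code used for GLMYsymm. It makes several assumptions: each complex carries a single symmetry label (one-vs-rest), the area under the precision-recall curve is estimated by average precision, scores contain no ties, and the class names `C2` and `C3` are purely illustrative.

```python
from typing import List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one class: mean of precision at each true positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Visit samples from highest to lowest score (ties are assumed absent).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def macro_auc_pr(
    y_true: Sequence[str],
    y_score: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> float:
    """Unweighted mean of per-class average precision (one-vs-rest)."""
    per_class: List[float] = []
    for c_idx, cls in enumerate(classes):
        labels = [1 if y == cls else 0 for y in y_true]
        if sum(labels) == 0:
            continue  # skip classes absent from this test set
        per_class.append(average_precision(labels, [row[c_idx] for row in y_score]))
    if not per_class:
        raise ValueError("no class has positive examples")
    return sum(per_class) / len(per_class)


# Toy example with two hypothetical symmetry classes.
toy_classes = ["C2", "C3"]
toy_true = ["C2", "C3", "C2", "C3"]
toy_score = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]
toy_macro = macro_auc_pr(toy_true, toy_score, toy_classes)
```

Because the macro average weights every class equally, rare symmetry classes influence the score as much as common ones, which is presumably why it is reported for an imbalanced classification task.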
Slide presentation
Speaker: 高帅
Talk title (English):
Robust Adaptive Learning Control for a Class of Non-affine Nonlinear Systems
Talk title (Chinese):
一类非仿射非线性系统的鲁棒自适应学习控制
Abstract
We address the tracking problem for a class of uncertain non-affine nonlinear systems with high relative degrees, performing non-repetitive tasks. We propose a rigorously proven, robust adaptive learning control scheme that relies on a gradient descent parameter adaptation law to handle the unknown time-varying parameters of the system, along with a state estimator that estimates the unmeasurable state variables. Furthermore, despite the inherently complex nature of the non-affine system, we provide an explicit iterative computation method to facilitate the implementation of the proposed control scheme. The paper includes a thorough analysis of the performance of the proposed control strategy, and simulation results are presented to demonstrate the effectiveness of the approach.
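As a rough illustration of what a gradient-descent parameter adaptation law does, the sketch below estimates a single unknown parameter from measurements. It is a deliberately simplified toy and not the scheme proposed in the talk: it assumes a scalar, linear-in-parameter measurement model y = theta * phi with a constant parameter, whereas the abstract concerns time-varying parameters in a non-affine system. The function name, the normalized update, and the regressor signal are all assumptions made for this example.

```python
import math
from typing import Callable


def gradient_adaptation(
    theta_true: float,
    regressor: Callable[[int], float],
    steps: int,
    gain: float = 0.5,  # adaptation gain; assumed to lie in (0, 2) for stability
    theta_init: float = 0.0,
) -> float:
    """Estimate theta in y = theta * phi with a normalized gradient update."""
    theta_hat = theta_init
    for k in range(steps):
        phi = regressor(k)
        y = theta_true * phi            # measured output (noise-free toy model)
        error = y - theta_hat * phi     # prediction error
        # Gradient step on 0.5 * error**2, normalized by 1 + phi**2
        # so the step size stays bounded for large regressor values.
        theta_hat += gain * phi * error / (1.0 + phi * phi)
    return theta_hat


# Toy run: the regressor stays within [0.5, 1.5], so it never vanishes.
toy_estimate = gradient_adaptation(
    theta_true=2.0,
    regressor=lambda k: 1.0 + 0.5 * math.sin(k),
    steps=200,
)
```

Each update shrinks the estimation error by a factor between roughly 0.65 and 0.9 in this toy setting, so the estimate approaches the true value of 2.0 within a few hundred steps.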