Using NLP to Detect Knots in Protein Structures | July 17th-23rd 2023

Abstract

Proteins are essential components of our bodies, and their function often depends on their 3D structure. However, uncovering that structure has long come at the cost of months of hard work in the lab. Recent advances in machine learning and natural language processing have made it possible to build models (e.g. AlphaFold) capable of predicting a protein's 3D structure with the same precision as experimental methods.

In this talk, I will explore an even more specific application of language models for proteins - the detection of a knot in a protein's 3D structure solely from its amino acid sequence. Knotting in proteins is a phenomenon that can affect their function and stability. Thanks to NLP and interpretation techniques, we can try to uncover why and how proteins tie themselves into a knot. In this research, we rely on many Python-based tools, from Biopython to PyMOL and the Hugging Face Transformers library.

Talk | PyData: Deep Learning, NLP, CV (2023)

The speaker

Eva Klimentová

Full-time PhD student in Bioinformatics doing research, teaching and being taught. I'm currently exploring the world of proteins, their 3D structure and function, with a focus on proteins with a knot in their structure, and I combine this with state-of-the-art machine learning approaches. In the meantime, I also enjoy teaching a machine learning course for the first time. Before I started my PhD, I helped with classical bioinformatics tasks in the Bioinformatics core facility and also did some machine learning on genomic data.