SeriesFusion
Science, curated & edited by AI

The lifespan of every protein in your body is written in a hidden grammar of amino acids that AI can now read.

Some proteins last only minutes while others survive for years within a single cell. This longevity is not just a result of overall chemical composition or random chance. Specific patterns of lysine spacing and proline repeats act as a code that helps determine how fast a protein is recycled. AI language models can now pick up on these patterns to estimate how long a protein will keep functioning. Engineers could use this grammar to design synthetic proteins with custom expiration dates for medical use.
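To make the idea of a sequence "grammar" concrete, here is a minimal Python sketch that measures the two kinds of pattern mentioned above: the spacing between lysines (K) and the length of proline (P) repeats. It is an illustration only, not the paper's method, which relies on protein language model embeddings. The function names and the toy sequence are hypothetical.

```python
import re


def lysine_gaps(seq: str) -> list[int]:
    """Return the distances between consecutive lysines (K) in a sequence."""
    positions = [i for i, aa in enumerate(seq) if aa == "K"]
    return [b - a for a, b in zip(positions, positions[1:])]


def longest_proline_run(seq: str) -> int:
    """Return the length of the longest uninterrupted stretch of prolines (P)."""
    runs = re.findall(r"P+", seq)
    return max((len(run) for run in runs), default=0)


# Toy sequence, invented for illustration (not a real protein).
toy = "MKAPPPGKLLKDPPE"
print(lysine_gaps(toy))          # gaps between lysines at positions 1, 7, 10
print(longest_proline_run(toy))  # longest run of consecutive prolines
```

Simple counts like these capture only local patterns. The abstract describes a multi-scale organization of features, which is why the authors turn to language model representations rather than hand-built rules.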

Original Paper

Multi-Scale Sequence Encoding Distinguishes Long-Lived and Short-Lived Proteins Revealed by Protein Language Model Embeddings

Tangilal Dihan Chowdhury, Fasiha Tanzeem Taiba, Md Ushama Shafoyat, Maruf Hasan, Kazy Noor e Alam Siddiquee, Kaiissar Mannoor, Md. Shabiul Islam

Research Square  ·  rs-9298535

Abstract

Protein stability and turnover are fundamental determinants of proteome regulation, yet how protein lifetime is specified by amino-acid sequences remains incompletely understood. Here, we identify a previously uncharacterized, multi-scale organization of sequence features associated with protein stability using protein language model (PLM) representations. Using experimentally derived half-life data from four human cell lines, we uncover a conserved stability-associated axis in embeddin