Matches in SemOpenAlex for { <https://semopenalex.org/work/W4321765061> ?p ?o ?g. }
- W4321765061 abstract "Proteolysis-targeting chimeras (PROTACs) are hetero-bifunctional molecules. They induce the degradation of a target protein by recruiting an E3 ligase to the target. The PROTAC can inactivate disease-related genes that are considered understudied, and thus has great potential to be a new type of therapy for the treatment of incurable diseases. However, only hundreds of proteins have been experimentally tested for whether they are amenable to PROTACs. It remains elusive what other proteins can be targeted by the PROTAC in the entire human genome. For the first time, we have developed an interpretable machine learning model, PrePROTAC, which is based on a transformer-based protein sequence descriptor and random forest classification to predict genome-wide PROTAC-induced targets degradable by CRBN, one of the E3 ligases. In the benchmark studies, PrePROTAC achieved a ROC-AUC of 0.81, a PR-AUC of 0.84, and over 40% sensitivity at a false positive rate of 0.05. Furthermore, we developed an embedding SHapley Additive exPlanations (eSHAP) method to identify positions in the protein structure that play key roles in the PROTAC activity. The key residues identified were consistent with our existing knowledge. We applied PrePROTAC to identify more than 600 novel understudied proteins that are potentially degradable by CRBN, and proposed PROTAC compounds for three novel drug targets associated with Alzheimer's disease. Many human diseases remain incurable because disease-causing genes cannot be selectively and effectively targeted by small molecules. Proteolysis-targeting chimera (PROTAC), an organic compound that binds to both a target and a degradation-mediating E3 ligase, has emerged as a promising approach to selectively target disease-driving genes that are not druggable by small molecules. Nevertheless, not all proteins can be accommodated by E3 ligases and effectively degraded. 
Knowledge of the degradability of a protein will be crucial for the design of PROTACs. However, only hundreds of proteins have been experimentally tested for whether they are amenable to PROTACs. It remains elusive what other proteins can be targeted by the PROTAC in the entire human genome. In this paper, we propose an interpretable machine learning model, PrePROTAC, that takes advantage of powerful protein language modeling. PrePROTAC achieves high accuracy when evaluated on an external dataset drawn from gene families different from those of the proteins in the training data, suggesting the generalizability of PrePROTAC. We apply PrePROTAC to the human genome and identify more than 600 understudied proteins that are potentially responsive to the PROTAC. Furthermore, we design three PROTAC compounds for novel drug targets associated with Alzheimer's disease." @default.
- W4321765061 created "2023-02-25" @default.
- W4321765061 creator A5051041953 @default.
- W4321765061 creator A5066245750 @default.
- W4321765061 date "2023-02-24" @default.
- W4321765061 modified "2023-09-27" @default.
- W4321765061 title "Elucidation of Genome-wide Understudied Proteins targeted by PROTAC-induced degradation using Interpretable Machine Learning" @default.
- W4321765061 cites W1488181101 @default.
- W4321765061 cites W1565539350 @default.
- W4321765061 cites W1964633739 @default.
- W4321765061 cites W1968426398 @default.
- W4321765061 cites W1968682237 @default.
- W4321765061 cites W1982267716 @default.
- W4321765061 cites W2020257412 @default.
- W4321765061 cites W2030205108 @default.
- W4321765061 cites W2037487430 @default.
- W4321765061 cites W2043338013 @default.
- W4321765061 cites W2045911289 @default.
- W4321765061 cites W2047672715 @default.
- W4321765061 cites W2054068479 @default.
- W4321765061 cites W2062294630 @default.
- W4321765061 cites W2074370114 @default.
- W4321765061 cites W2074440680 @default.
- W4321765061 cites W2076425088 @default.
- W4321765061 cites W2081120309 @default.
- W4321765061 cites W2086405243 @default.
- W4321765061 cites W2092975210 @default.
- W4321765061 cites W2097632784 @default.
- W4321765061 cites W2114358087 @default.
- W4321765061 cites W2124361787 @default.
- W4321765061 cites W2129888542 @default.
- W4321765061 cites W2144114582 @default.
- W4321765061 cites W2147863530 @default.
- W4321765061 cites W2152339342 @default.
- W4321765061 cites W2170564629 @default.
- W4321765061 cites W2336040088 @default.
- W4321765061 cites W2336330109 @default.
- W4321765061 cites W2409568753 @default.
- W4321765061 cites W2553908226 @default.
- W4321765061 cites W2559947248 @default.
- W4321765061 cites W2597078857 @default.
- W4321765061 cites W2604930678 @default.
- W4321765061 cites W2614370829 @default.
- W4321765061 cites W2767251210 @default.
- W4321765061 cites W2767856718 @default.
- W4321765061 cites W2771127066 @default.
- W4321765061 cites W2792397649 @default.
- W4321765061 cites W2793168264 @default.
- W4321765061 cites W2799524357 @default.
- W4321765061 cites W2805296794 @default.
- W4321765061 cites W2806672448 @default.
- W4321765061 cites W2883077013 @default.
- W4321765061 cites W2891213440 @default.
- W4321765061 cites W2901669535 @default.
- W4321765061 cites W2908223531 @default.
- W4321765061 cites W2909353434 @default.
- W4321765061 cites W2913179893 @default.
- W4321765061 cites W2913446986 @default.
- W4321765061 cites W2936661052 @default.
- W4321765061 cites W2951276412 @default.
- W4321765061 cites W2952481429 @default.
- W4321765061 cites W2968449718 @default.
- W4321765061 cites W2982554531 @default.
- W4321765061 cites W2983166786 @default.
- W4321765061 cites W2984761660 @default.
- W4321765061 cites W2989702634 @default.
- W4321765061 cites W2991059836 @default.
- W4321765061 cites W2994376311 @default.
- W4321765061 cites W2997958114 @default.
- W4321765061 cites W2999615587 @default.
- W4321765061 cites W3011847211 @default.
- W4321765061 cites W3011997563 @default.
- W4321765061 cites W3012111617 @default.
- W4321765061 cites W3012324425 @default.
- W4321765061 cites W3049656208 @default.
- W4321765061 cites W3079973136 @default.
- W4321765061 cites W3088011183 @default.
- W4321765061 cites W3088870471 @default.
- W4321765061 cites W3090754737 @default.
- W4321765061 cites W3094762523 @default.
- W4321765061 cites W3096897583 @default.
- W4321765061 cites W3107242996 @default.
- W4321765061 cites W3126039337 @default.
- W4321765061 cites W3130894852 @default.
- W4321765061 cites W3146944767 @default.
- W4321765061 cites W3177828909 @default.
- W4321765061 cites W3199468887 @default.
- W4321765061 doi "https://doi.org/10.1101/2023.02.23.529828" @default.
- W4321765061 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36865212" @default.
- W4321765061 hasPublicationYear "2023" @default.
- W4321765061 type Work @default.
- W4321765061 citedByCount "0" @default.
- W4321765061 crossrefType "posted-content" @default.
- W4321765061 hasAuthorship W4321765061A5051041953 @default.
- W4321765061 hasAuthorship W4321765061A5066245750 @default.
- W4321765061 hasBestOaLocation W43217650611 @default.
- W4321765061 hasConcept C104317684 @default.
- W4321765061 hasConcept C119857082 @default.
- W4321765061 hasConcept C134459356 @default.
- W4321765061 hasConcept C141231307 @default.