npj Comput. Mater.: The challenge of on-demand polymer design may begin to ease today

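In polymer research, forward screening and reverse design are key steps in moving polymers from laboratory research to market application. Polymer development, however, faces a major obstacle: large-scale polymer datasets are scarce. As a result, using materials informatics to design polymers that meet specific requirements from small datasets has become a research focus. Conventional forward screening has made progress, but it can only find polymers that are already present in a finite candidate library, so its success depends on how diverse that library is. Enumerating every possible polymer structure by human imagination is clearly impractical. This raises the question of how to carry out "on-demand reverse design", which has become an important direction in the polymer field.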
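To address this challenge, the team led by researcher Zhao-Yan Sun at the Changchun Institute of Applied Chemistry proposed a polymer generative model, PolyTAO. The model is pretrained on a large database of nearly one million polymer structure-property pairs using a Transformer-assisted oriented pretraining approach, which makes on-demand reverse design of polymers feasible. In top-1 generation mode, PolyTAO reaches 99.27% chemical validity over roughly 200,000 generated polymers, which the paper reports as the highest success rate among published polymer generative models. Beyond producing large numbers of chemically valid polymers, the model also matches the requested properties closely. Across 15 predefined properties, the average R² between the properties of the generated polymers and their expected values is 0.96, which indicates that the model has learned polymer structure-property relationships well enough to generate structures with the requested properties.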
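Figure 1. Generation performance on the 15 predefined properties
Figure 2. The model covers nearly all chemical elements commonly found in polymers (data for any missing elements can be added as later tasks require)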
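The two headline numbers above, chemical validity and R², are simple to compute once generated structures and their property values are in hand. The following is a minimal sketch of both metrics, not the authors' evaluation code. The function names, the toy polymer strings, and the toy property values are all hypothetical, and `balanced_parens` is only a stand-in for a real validity check. An actual pipeline would parse each string with a cheminformatics toolkit, for example RDKit's `Chem.MolFromSmiles`, which returns `None` for strings it cannot parse. The sketch also assumes the requested property values serve as the reference when computing R².

```python
from typing import Callable, Sequence


def validity_rate(candidates: Sequence[str], is_valid: Callable[[str], bool]) -> float:
    """Fraction of generated strings accepted by the supplied checker."""
    if not candidates:
        raise ValueError("no candidates to score")
    return sum(1 for c in candidates if is_valid(c)) / len(candidates)


def r_squared(expected: Sequence[float], achieved: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: the requested (expected) values are treated as the reference.
    """
    if len(expected) != len(achieved) or len(expected) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean = sum(expected) / len(expected)
    ss_tot = sum((e - mean) ** 2 for e in expected)
    ss_res = sum((e - a) ** 2 for e, a in zip(expected, achieved))
    if ss_tot == 0:
        raise ValueError("expected values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def balanced_parens(s: str) -> bool:
    """Toy stand-in for a chemical validity check (hypothetical).

    It only verifies that branch parentheses are balanced; it is NOT a
    real chemistry check and is here only to keep the sketch dependency-free.
    """
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


if __name__ == "__main__":
    # Hypothetical generated strings; the last one has an unclosed branch.
    generated = ["*CC*", "*C(C)C*", "*C(C*"]
    print(f"validity: {validity_rate(generated, balanced_parens):.2f}")  # 0.67

    # Hypothetical requested vs. achieved property values.
    requested = [1.0, 2.0, 3.0, 4.0]
    obtained = [1.1, 1.9, 3.2, 3.8]
    print(f"R^2: {r_squared(requested, obtained):.2f}")  # 0.98
```

Passing the checker in as a callable keeps the metric independent of any particular toolkit, so the toy check can be swapped for a real parser without changing the scoring code.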
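To test how well PolyTAO adapts to practical use, the team fine-tuned it on three publicly available small polymer datasets, using both semi-template and template-free generation paradigms. The results show that the pretrained model and its fine-tuned versions can generate polymers with specified target properties in the semi-template setting and in the more challenging template-free setting, which points to the model's flexibility and broad range of possible applications.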
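Figure 3. PolyTAO generates polymers with a specified atomization energy in semi-template mode
Figure 4. PolyTAO generates polymers with a specified band gap in template-free mode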
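PolyTAO offers a new direction for on-demand reverse design of polymers and eases the constraints that small datasets and limited candidate-pool diversity place on conventional approaches. It advances polymer generative modeling and offers useful lessons for reverse design methods across materials science.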
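As the model is applied in other areas, PolyTAO may become a useful tool for designing and discovering more materials and help accelerate their development and application. The paper was recently published in npj Computational Materials 10: 273 (2024). The English title and abstract follow.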
On-demand reverse design of polymers with PolyTAO
Haoke Qiu & Zhao-Yan Sun
The forward screening and reverse design of drug molecules, inorganic molecules, and polymers with enhanced properties are vital for accelerating the transition from laboratory research to market application. Specifically, due to the scarcity of large-scale datasets, the discovery of polymers via materials informatics is particularly challenging. Nonetheless, scientists have developed various machine learning models for polymer structure-property relationships using only small polymer datasets, thereby advancing the forward screening process of polymers. However, the success of this approach ultimately depends on the diversity of the candidate pool, and exhaustively enumerating all possible polymer structures through human imagination is impractical. Consequently, achieving on-demand reverse design of polymers is essential. In this work, we curate an immense polymer dataset containing nearly one million polymeric structure-property pairs based on expert knowledge. Leveraging this dataset, we propose a Transformer-Assisted Oriented pretrained model for on-demand polymer generation (PolyTAO). This model generates polymers with 99.27% chemical validity in top-1 generation mode (approximately 200k generated polymers), representing the highest reported success rate among polymer generative models, and this was achieved on the largest test set. Importantly, the average R² between the properties of the generated polymers and their expected values across 15 predefined properties is 0.96, which underscores PolyTAO’s powerful on-demand polymer generation capabilities. To further evaluate the pretrained model’s performance in generating polymers with additional user-defined properties for downstream tasks, we conduct fine-tuning experiments on three publicly available small polymer datasets using both semi-template and template-free generation paradigms. Through these extensive experiments, we demonstrate that our pretrained model and its fine-tuned versions are capable of achieving the on-demand reverse design of polymers with specified properties, whether in a semi-template generation or the more challenging template-free generation scenarios, showcasing its potential as a unified pretrained foundation model for polymer generation.