Information Technology, Science

Simulating scientists: New tool for AI-powered scientific discovery

Monash University

An Australian team led by Monash University researchers has developed a generative AI tool that mimics scientists to support and speed up scientific discovery. The research is published in Nature Machine Intelligence.

 

Named LLM4SD (Large Language Model 4 Scientific Discovery), the new AI system is an interactive large language model (LLM) tool that can carry out basic steps of scientific research, namely retrieving useful information from the literature and developing hypotheses from data analysis. The tool is freely available and open source.

 

When asked, the system can also provide insights that explain its results, a feature that many current scientific validation tools lack.

 

LLM4SD was tested with 58 separate research tasks relating to molecular properties across four different scientific domains: physiology, physical chemistry, biophysics and quantum mechanics. 

 

Lead co-author of the research, PhD candidate Yizhen Zheng, is from the Department of Data Science and AI at Monash University’s Faculty of Information Technology. 

 

“Just like ChatGPT writes essays or solves math problems, our LLM4SD tool reads decades of scientific literature and analyses lab data to predict how molecules behave—answering questions like, ‘Can this drug cross the brain’s protective barrier?’ or ‘Will this compound dissolve in water?’,” Mr Zheng said. 

 

“Apart from outperforming current validation tools that operate like a ‘black box’, this system can explain its analysis process, predictions and results using simple rules, which can help scientists trust and act on its insights.”

 

The LLM4SD tool outperformed state-of-the-art scientific tools that are currently used to carry out these tasks; for example, it boosted accuracy by up to 48 per cent in predicting quantum properties critical for materials design. 

The study’s lead co-authors also include PhD candidate Huan Yee Koh, who is jointly affiliated with Monash University’s Department of Data Science and AI and the Monash Institute of Pharmaceutical Sciences, and PhD candidate Jiaxin Ju from the School of Information and Communication Technology at Griffith University.

“Rather than replacing traditional machine learning models, LLM4SD enhances them by synthesizing knowledge and generating interpretable explanations,” Ms Ju said.

“This approach ensures that AI-driven predictions remain reliable and accessible to researchers across different scientific disciplines,” Mr Koh added.

Data scientist, AI expert and co-author of the research, Professor Geoff Webb from Monash’s Faculty of Information Technology, said that LLMs can accurately mimic the key scientific discovery skills of synthesising knowledge from the literature and developing hypotheses by interpreting data. 

 

“We are already fully immersed in the age of generative AI and we need to start harnessing this as much as possible to advance science, while ensuring we are developing it ethically,” Professor Webb said. 

 

“This tool has the potential to make the drug discovery process easier, faster and more accurate and become a supercharged research support for scientists in every field all across the world.”

 

Research co-author Professor Shirui Pan is a data mining and machine learning expert and an ARC Future Fellow with the School of Information and Communication Technology at Griffith University.

 

“A model like LLM4SD can rapidly synthesize decades of prior knowledge and then turn around to spot new patterns in the data that might not be widely reported,” Professor Pan said. 

 

“We see this as a key development in speeding up research and development processes and beyond.”

 

The research was a collaboration between AI and drug discovery researchers at Monash University’s Faculty of Information Technology, Monash Institute of Pharmaceutical Sciences and Griffith University. 

 

The project was supported by an Australian Research Council (ARC) grant, a National Health and Medical Research Council of Australia Ideas grant and an ARC Future Fellowship. 

 

Co-authors of the research, PhD candidate Yizhen Zheng and Professor Geoff Webb from Monash’s Department of Data Science and Artificial Intelligence at the Faculty of Information Technology, are available for interviews. 

 

To read the full research paper, please click here

 

- ENDS -

MEDIA ENQUIRIES 

Teju Hari Krishna, Monash University 

T: +61 450 501 248 E: [email protected] 

For more Monash media stories, visit our news and events site
