Using LLMs to Mutate Java Code | HackerNoon
Briefly

This article evaluates large language models (LLMs) for generating code mutations in Java programs. It details a systematic approach to extracting relevant context information to build prompts for LLMs, followed by a filtering step that discards unusable mutations. The study compares LLMs on usability, cost, and their ability to produce compilable code mutations. Insights into the effects of prompt variation and model type reveal significant performance differences among LLMs, informing future improvements to mutation generation strategies.
In this study, we comprehensively assess the capabilities of various LLMs in generating code mutations, comparing their effectiveness, usability, and performance across several metrics.
Our findings reveal that certain LLMs outperform others depending on prompt structure, highlighting the importance of tailored prompts for producing high-quality mutations.
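As a rough illustration of how a tailored prompt might be assembled from extracted context, consider the sketch below. The template wording and the buildPrompt helper are hypothetical stand-ins, not the study's actual prompt.

```java
/**
 * Hypothetical sketch of prompt construction for LLM-based mutation
 * generation. The template text and names are illustrative assumptions,
 * not the prompt used in the study.
 */
public class MutationPromptBuilder {

    /** Builds an LLM prompt from the extracted method and the line to mutate. */
    public static String buildPrompt(String methodSource, String targetLine) {
        return """
            You are performing mutation testing on Java code.
            Given the method below, produce one mutated version of the target
            line that still compiles but changes the method's behavior.

            Method:
            %s

            Target line:
            %s

            Return only the mutated line.
            """.formatted(methodSource, targetLine);
    }

    public static void main(String[] args) {
        String method = "int add(int a, int b) { return a + b; }";
        System.out.println(buildPrompt(method, "return a + b;"));
    }
}
```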
We filter out non-compilable, redundant, and identical mutations to streamline the mutation generation process, ensuring that the mutations the LLMs produce are relevant and usable.
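A minimal sketch of such a filtering step appears below, assuming identical and duplicate mutants are detected by exact string comparison and compilability is checked with the standard javax.tools compiler API; the class and method names are illustrative, not the paper's implementation.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/**
 * Illustrative mutant filter: drops mutations that are identical to the
 * original code, duplicates of an earlier candidate, or fail to compile.
 */
public class MutantFilter {

    /** Wraps a mutated source string so the compiler API can read it. */
    static class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"),
                  Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns true if the source compiles (requires a JDK, not a bare JRE). */
    static boolean compiles(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // Note: this sketch writes .class output to the working directory.
        return compiler.getTask(null, null, diag -> { /* ignore diagnostics */ },
                null, null, List.of(new StringSource(className, source))).call();
    }

    /** Keeps only compilable, non-identical, non-duplicate mutants. */
    static Set<String> filter(String original, String className,
                              List<String> candidates) {
        Set<String> kept = new LinkedHashSet<>();
        for (String mutant : candidates) {
            if (mutant.equals(original)) continue;  // identical to the original
            if (!kept.add(mutant)) continue;        // duplicate of an earlier mutant
            if (!compiles(className, mutant)) {
                kept.remove(mutant);                // non-compilable: discard
            }
        }
        return kept;
    }
}
```

In a real pipeline the compilability check would run against the project's full classpath; this sketch compiles each mutant in isolation for brevity.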
Understanding the limitations of different models is crucial for future research; we also discuss threats to validity and suggest improvements to increase mutation generation accuracy.
Read at HackerNoon