The second challenge lies in verifying whether the code generated by the model is correct. End-user programmers rarely test such code thoroughly: in GridBook, users relied on 'eyeballing' the final output rather than on rigorous testing. Participants, especially those with low computer self-efficacy, may also overestimate the AI's accuracy, compounding the overconfidence that end-user programmers already tend to have in the correctness of their programs. Together, these behaviors point to the inadequacy of existing testing practices and the risks of depending on automated code generation tools. A concrete contrast between eyeballing and testing is sketched below.
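To make the contrast concrete, the following is a minimal sketch (not from the paper, and not GridBook's interface) of what checking AI-generated code against a few hand-computed cases could look like, as opposed to visually inspecting a single output. The function name `generated_total` and its behavior are hypothetical stand-ins for model-generated code.

```python
# Hypothetical model-generated function; in practice this body would come
# from the code-generating model, not be written by the user.
def generated_total(prices, tax_rate):
    return sum(prices) * (1 + tax_rate)


def check_generated_code():
    # Hand-computed expectations on inputs small enough to verify mentally,
    # instead of eyeballing one output and assuming the code is correct.
    cases = [
        (([10.0, 20.0], 0.0), 30.0),   # no tax: plain sum
        (([10.0, 20.0], 0.1), 33.0),   # 10% tax on 30.0
        (([], 0.1), 0.0),              # edge case: empty list
    ]
    for args, expected in cases:
        result = generated_total(*args)
        assert abs(result - expected) < 1e-9, (
            f"generated_total{args} returned {result}, expected {expected}"
        )
    print("All hand-computed checks passed.")


if __name__ == "__main__":
    check_generated_code()
```

Even a handful of such assertions would surface errors that a single visual check of the final output can miss, particularly for edge cases the user did not think to inspect.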