The New ChatGPT Has a Huge Problem in Chinese
Briefly

According to MIT Technology Review, nearly all of the 100 longest Chinese-language tokens the model uses to parse Chinese prompts consist of spammy porn and gambling content.
Experts suggest that the uncleaned training data would have been relatively easy to fix, for example by detecting and filtering problematic keywords.
The failure to clean the Chinese-language tokens contrasts with the apparently clean English tokens, and it could significantly hamper the quality of OpenAI's Chinese-language output.
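
As a rough illustration of the keyword check the experts describe, the sketch below takes the longest tokens in a vocabulary and flags any that contain a blocklisted keyword. The vocabulary, the blocklist, and the function names are hypothetical placeholders, not OpenAI's actual tokenizer data or cleaning pipeline.

```python
def longest_tokens(vocab, n):
    """Return the n longest tokens, longest first (ties keep input order)."""
    return sorted(vocab, key=len, reverse=True)[:n]


def flag_tokens(tokens, blocklist):
    """Return the tokens that contain at least one blocklisted keyword."""
    return [t for t in tokens if any(k in t for k in blocklist)]


# Hypothetical vocabulary and blocklist, for illustration only.
vocab = ["你好", "天气预报", "免费彩票开奖结果", "在线彩票平台", "人工智能"]
blocklist = ["彩票"]  # "lottery", standing in for a gambling-related keyword

top = longest_tokens(vocab, 3)
flagged = flag_tokens(top, blocklist)
print(flagged)
```

A real cleaning pass would more plausibly filter the training corpus before the tokenizer is built, so that such strings never become tokens at all; this sketch only shows the detection step.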
Read at Futurism