Home Office travel records used in a trial of a controversial anti-fraud crackdown under which thousands of parents lost their child benefit were so flawed that almost half of the families initially flagged as having emigrated were still living in the UK, it has emerged. The pilot scheme saved HMRC £17m but left 46% of the families targeted incorrectly suspected of fraud, a margin of error far in excess of the 1% to 5% considered scientifically acceptable.
In early August, data from the Los Angeles Homeless Services Authority showed only two out of 88 beds at an East Hollywood homeless shelter were occupied, a shockingly low rate in a county where some 47,000 people sleep on the streets. There's just one big problem, according to the nonprofit PATH, which operates the shelter. The data were dead wrong. PATH's internal data showed 84 beds were filled.
Anshul Kundaje sums up his frustration with the use of artificial intelligence in science in three words: "bad benchmarks propagate". He is concerned about questionable claims that researchers make about AI models. These claims can take months to verify and often turn out to be false because the benchmarks behind them are poorly defined. Enthusiastic users then misuse the flawed benchmarks, spreading misinformation and wrong predictions. The lack of reliable benchmarks threatens to slow scientific progress rather than realize AI's potential to accelerate it.
In the version of the article originally published, there were calculation and transcription errors in the data for Fig. 5g, Extended Data Fig. 7c,g and Extended Data Fig. 8e. The corrected data has led to minor changes in the graphs.