Advances in OCR for Historical Chinese, Japanese, Coptic, and Greek Texts | HackerNoon
Briefly

Historical Chinese characters are difficult for pattern recognition because the character set is very large and writing styles vary widely. Li et al. proposed improving recognition by incorporating style transfer mapping (STM) into a modified quadratic discriminant function (MQDF) classifier. In extensive experiments on both traditional Chinese characters and historical documents, the method significantly reduced error rates. Introducing a nonlinear transfer may improve accuracy further, and the results indicate that labeling only a small number of samples can substantially lower the error rate.
Historical Chinese characters pose significant challenges for pattern recognition due to their large character set and varying writing styles, impacting classification accuracy.
Li et al. proposed a method for recognizing historical Chinese characters that combines STM with an MQDF classifier, achieving a marked reduction in error rates in experiments.
Introducing nonlinear transfer into the classification process, together with supervised STM, significantly improves the classifiers' generalization and extends their recognition capability on historical texts.
Testing showed that by labeling just 10% of samples, the error rate for recognized documents could be reduced by up to 60%, highlighting the impact of strategic labeling.
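To make the idea of a style transfer mapping concrete, here is a minimal sketch in Python, assuming STM can be approximated as a regularized affine map fitted on a handful of labeled samples. This is a generic ridge-style formulation, not necessarily the exact one used by Li et al.: the function names `fit_stm` and `apply_stm`, the parameters `beta` and `gamma`, and the omission of per-sample confidence weights and of the MQDF classifier itself are all assumptions made for illustration.

```python
import numpy as np

def fit_stm(source, target, beta=1.0, gamma=1.0):
    """Fit an affine map (A, b) so that A @ s + b is close to t.

    source: (n, d) features of labeled samples in the new writing style.
    target: (n, d) points each sample should map to (assumed here to be
            the class prototypes the classifier was trained on).
    beta, gamma: hypothetical regularization weights that keep A near the
            identity and b near zero, so few labeled samples suffice.
    """
    n, d = source.shape
    X = np.hstack([source, np.ones((n, 1))])   # append a bias column
    R = target - source                        # residual the map must explain
    reg = np.diag([beta] * d + [gamma])        # penalize deviation from (I, 0)
    # Ridge solution for the deviation D from the identity mapping.
    D = np.linalg.solve(X.T @ X + reg, X.T @ R)  # shape (d + 1, d)
    A = np.eye(d) + D[:d].T
    b = D[d]
    return A, b

def apply_stm(A, b, feats):
    """Map (n, d) features into the classifier's original style space."""
    return feats @ A.T + b
```

Under these assumptions, the mapped features would then be passed to the existing classifier unchanged. The regularization toward the identity map is what lets a small labeled subset adapt the features without overfitting.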