Artificial intelligence (from Hackernoon, 7 months ago): Training Tesseract for Low-Resource Languages | HackerNoon. Tesseract OCR was trained on 1,233 Kurdish text lines from pre-1950 documents to advance the digitization of Kurdish historical materials.
Data science (from Hackernoon, 7 months ago): Key Challenges in OCR Research and Future Directions | HackerNoon. Historical Kurdish documents are difficult to digitize due to scarce resources, unclear text, non-standard spacing, complex layouts, and limitations of current OCR methods.