The description is submitted to:


Session R4830 ("Can You Read This One?" New Advances and Issues in Digital Recognition of Printed/Handwritten Arabic Characters), 2017
For almost a year now, I have been fortunate to work with a group of colleagues on the development of an open-source OCR solution for printed Arabic script. Together with an MA student in computer science, who added a neural network library to an existing open-source OCR engine, we have run extensive tests on printed Arabic books, consistently achieving accuracy rates in the high 90s. Our system can easily be trained for new typefaces or even new scripts (our tests on Syriac, a language that also uses a connected script, yielded the same high-accuracy results). It still lacks a user-friendly interface, and we are currently putting together an open online pipeline that will make our system easy and efficient to use. I will share our experience of assembling a team and working toward a common goal with very limited funding. Although miscommunication between humanists and computer scientists often creates problems, it is becoming more and more important for humanists to maintain a certain level of computer science literacy in order to find efficient solutions. This is especially so because most problems have already been solved, and what is required is an engineering solution that can be implemented by a savvy master's-level computer science student.