Paperwork: OCR-based scanned document manager

Paperwork keeps document management simple: scan your documents and leave the rest to Paperwork, then find the relevant ones later with a text search. In the background, Paperwork uses Tesseract OCR to detect the text in scanned documents and Whoosh to index and search them.


  • Search suggestions
  • Document labels to make documents easier to find
  • Settings for resolution, orientation, language, etc.


Paperwork depends on SANE (scanning), Tesseract (OCR), GTK/Glade (GUI), and Whoosh (indexing, searching documents, and providing keywords). An official package is not yet available for Ubuntu, but build instructions are available for compiling it on Ubuntu.
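To make the OCR-then-search idea more concrete, here is a minimal sketch in plain Python. It is not Paperwork's code and it does not use Whoosh or Tesseract: the `TinyIndex` class, its method names, and the sample documents are all hypothetical, and the text passed to `add` simply stands in for whatever the OCR step would produce. It only illustrates how recognized text can be turned into a keyword index that supports searching and simple prefix-based suggestions.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical): maps each word to the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # In a real pipeline, `text` would be the OCR output for a scanned page.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[word].add(doc_id)

    def search(self, query):
        # Return the documents that contain every word in the query.
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return set()
        results = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            results &= self.postings.get(word, set())
        return results

    def suggest(self, prefix):
        # Naive search suggestions: indexed words that start with the prefix.
        prefix = prefix.lower()
        return sorted(w for w in self.postings if w.startswith(prefix))


# Sample data is made up for illustration only.
index = TinyIndex()
index.add("invoice-2013", "Invoice for electricity, March 2013")
index.add("lease", "Apartment lease agreement, signed March 2012")

print(index.search("march"))        # both documents mention "March"
print(index.search("march lease"))  # only the lease contains both words
print(index.suggest("in"))          # indexed words beginning with "in"
```

A real search library such as Whoosh adds much more on top of this (ranking, stemming, persistent storage), so treat the sketch purely as an illustration of the concept.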

On GitHub: Paperwork
