OCR4all Setup Guide

Preparation
Choosing the correct Docker Version
Linux and macOS
Docker for Windows
Docker Toolbox

Preparation

• Prepare the following folder structure (or simply download it from https://github.com/OCR4all/getting_started):

o ocr4all (main folder)

▪ data (folder for the documents you want to recognize)

▪ models (folder for the OCR models)

• This structure can initially be created or downloaded anywhere on your system. However, depending on your operating system (Linux, Windows, macOS), you may need to move it later; see below.
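If you prefer to create the folder structure by hand rather than download it, the following is a minimal sketch, assuming a POSIX shell (Linux, macOS, or a Bash-style terminal on Windows) and that you run it in the directory where the ocr4all folder should live:

```shell
# Create the main folder with its two subfolders in the current directory.
mkdir -p ocr4all/data ocr4all/models

# List the subfolders to confirm the layout.
ls ocr4all
```

Folders created this way are empty. The browser check described later in this guide looks for two demo books, which presumably ship with the downloadable structure, so that check is only meaningful if you downloaded the structure instead.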

Choosing the correct Docker Version

• You will need the Community Edition of Docker for the installation.

• If you can choose between several operating systems, or are willing to set one up just for OCR4all, we strongly recommend using Linux rather than Windows.

• Linux: https://docs.docker.com/install/ (choose your distribution in the left menu and follow the installation instructions)

• Windows:

o There are two ways of using Docker on Windows: Docker for Windows (recommended) and the Docker Toolbox.

o Docker for Windows:

▪ Requires 64-bit Windows 10 Pro, Enterprise or Education (you can look up your edition under System Information)

▪ https://docs.docker.com/docker-for-windows/release-notes/ (do not choose “Download Docker for Windows” right away; instead use “Download” under the “Stable Releases” section below to skip registration)

o Docker Toolbox for other (older) versions of Windows: https://docs.docker.com/toolbox/toolbox_install_windows/

• macOS:

o As on Windows, there are two options: Docker for Mac and the Docker Toolbox.

o Docker for Mac: https://docs.docker.com/docker-for-mac/ (do not choose “Download Docker for Mac” right away; instead use “Download” under the “Stable Releases” section below to skip registration)

o Docker Toolbox: https://docs.docker.com/docker-for-mac/docker-toolbox/

▪ For models older than 2010 (< macOS Sierra 10.12).

▪ Will not be covered in this guide.

Below you will find three separate guides: one for Linux and macOS, one for Docker for Windows, and one for the Docker Toolbox (on Windows).

You can copy the terminal commands without line breaks from the accompanying file calls.txt.

If you have any questions or remarks, or run into any problems, please do not hesitate to contact us (…wuerzburg.de) or to open an issue on GitHub!

Linux and macOS

Docker Setup

• Follow the instructions under https://docs.docker.com/install/ …

• … and appreciate that everything works without further adjustments!

OCR4all Setup

• The OCR4all folder structure detailed above can be located anywhere you want.

• Open a terminal inside the OCR4all folder and pull the OCR4all image using the following command (this will take a few minutes and requires a stable internet connection):

sudo docker pull ls6uniwue/ocr4all

• Create the OCR4all container using the following command:

sudo docker run -p 1476:8080 -p 5000:5000 \
    -u `id -u root`:`id -g $USER` --name ocr4all \
    -v $PWD/data:/var/ocr4all/data \
    -v $PWD/models:/var/ocr4all/models/custom \
    -it ls6uniwue/ocr4all

(Once again, this may take a while)
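Because the -v options use $PWD, the mapping only works if the terminal is really open inside the OCR4all folder. The following is a minimal sketch of a pre-check you could run before creating the container. The function name check_ocr4all_layout is a hypothetical helper, not part of OCR4all or Docker:

```shell
# Hypothetical helper: succeeds only if the given folder (default: the
# current directory) contains the data/ and models/ subfolders.
check_ocr4all_layout() {
    dir="${1:-$PWD}"
    for sub in data models; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing: $dir/$sub" >&2
            return 1
        fi
    done
    echo "layout ok: $dir"
}

# Check the current directory and print a hint if it is not the OCR4all folder.
check_ocr4all_layout "$PWD" || echo "Open the terminal inside the ocr4all folder first."
```

If the check reports a missing folder, change into the OCR4all folder before running the "docker run" command.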

Browser Access and further use

• OCR4all is optimized for Chrome/Chromium.

• Browser Access: http://localhost:1476/OCR4all_Web/

• In the browser, check Project Overview → Project selection: if you can find the two demo books called “Cirurgia” and “GNM”, the mapping (-v $PWD/data:/…) is working properly. Otherwise, there was most likely a typo in the “docker run” command, and you have to create the container again. First, delete the container you just created: stop the process in the terminal using CTRL+C, then type:

sudo docker rm ocr4all

Then check and correct your command, especially the “-v $PWD/data:/…” parts, and run it again.

• If everything is set up properly, you can (and should!) start OCR4all in the future by using:

sudo docker start -ia ocr4all

Docker for Windows

Docker Setup

• Follow the installation guide at https://docs.docker.com/docker-for-windows/release-notes/. Make sure to grant all required permissions, install all additional drivers, etc.

• Start Docker.

• Adjust the Docker settings (right-click the Docker icon among the hidden icons at the bottom right of the taskbar, then choose Settings):

o Shared Drives: select a drive (or partition).

▪ You will need at least one. Our recommendation is to simply use C.

▪ Click Apply. (Attention: this requires a valid, non-empty Windows password. Changing or removing the password later silently removes your Docker privileges!)

o Advanced: adjust CPUs (max) and Memory (2GB+) if you want to.

OCR4all Setup

• Move the OCR4all folder structure detailed above to a shared drive (or partition). In the following example, we use “C:\Users\Public\ocr4all\...”. We strongly recommend using the same location for the first setup.

• Inside the OCR4all folder, open PowerShell (Shift+right-click inside the OCR4all folder → Open PowerShell window here) and pull the OCR4all image using the following command (this will take a few minutes and requires a stable internet connection):

docker pull ls6uniwue/ocr4all

• Create the OCR4all container using the following command (note: this works only for the recommended setup, i.e. when the ocr4all folder is located in “C:\Users\Public\...”):

docker run -p 1476:8080 -p 5000:5000 --name ocr4all `
    -v C:\Users\Public\ocr4all\data:/var/ocr4all/data `
    -v C:\Users\Public\ocr4all\models:/var/ocr4all/models/custom `
    -it ls6uniwue/ocr4all

(In PowerShell, a backtick at the end of a line continues the command on the next line.)

Alternatively, you will have to adjust the host paths in the two -v lines (the C:\Users\Public\ocr4all\... parts).

o Use absolute paths and autocompletion!

o Using the current working directory (PWD) is not recommended in this case.

Browser Access and further use

• OCR4all is optimized for Chrome/Chromium.

• Browser Access: http://localhost:1476/OCR4all_Web/

• In the browser, check Project Overview → Project selection: if you can find the two pre-loaded books called “Cirurgia” and “GNM”, the mapping (-v C:\Users\...) is working properly. Otherwise, there might be a typo in the “docker run” command, and you have to create the container again. First, delete the container you just created: stop the process in PowerShell using CTRL+C, then type:

docker rm ocr4all

Then check and correct your command (as in most terminals, you can step through your previous commands using the arrow keys), especially the two “-v C:\Users\...” lines, and run it again.

• If everything is set up properly, you can (and should!) start OCR4all in the future by using:

docker start -ia ocr4all

Docker Toolbox

Docker Setup

• Follow the installation guide at https://docs.docker.com/toolbox/toolbox_install_windows/. Make sure to grant all required permissions, install all additional drivers, etc.

• Start the Docker Quickstart Terminal and wait for all processes to finish (grant the requested permissions; this requires a stable internet connection).

• Close the Docker Quickstart Terminal.

• Open Oracle VM VirtualBox.

o Right-click on “default” → Close → Power Off.

o Click on “default” → Settings → System → adjust CPUs (almost max) and memory (2GB+) if you want to → OK.

o It is possible to share additional drives (partitions); however, this is quite complicated and is neither recommended nor explained further here.

• Restart the Docker Quickstart Terminal.

OCR4all Setup

• Move the OCR4all folder structure detailed above into a folder under C:\Users. In the following example, we use “C:\Users\Public\ocr4all\...”. We strongly recommend using the same location.

• Inside the OCR4all folder, open PowerShell (Shift+right-click inside the OCR4all folder → Open PowerShell window here) and pull the OCR4all image using the following command (this will take a few minutes and requires a stable internet connection):

docker pull ls6uniwue/ocr4all

• Open the Docker Quickstart Terminal again and create the OCR4all container using the following command (note: this works only for the recommended setup, i.e. if the ocr4all folder is located in C:\Users\Public\...):

docker run -p 1476:8080 -p 5000:5000 --name ocr4all \
    -v /c/Users/Public/ocr4all/data:/var/ocr4all/data \
    -v /c/Users/Public/ocr4all/models:/var/ocr4all/models/custom \
    -it ls6uniwue/ocr4all

Alternatively, you will have to adjust the host paths in the two -v lines (the /c/Users/Public/ocr4all/... parts).

o Use absolute paths and autocompletion!

o Using the current working directory (PWD) is not recommended in this case.
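If your folder is not in the recommended location, the host paths have to be written in the /c/... form used above rather than the C:\... form. The following is a minimal sketch of that conversion, assuming a Bash-style shell such as the Docker Quickstart Terminal. The function name to_toolbox_path is a hypothetical helper, not a Docker command, and it only handles simple drive-letter paths:

```shell
# Hypothetical helper: convert a Windows path (C:\Users\...) into the
# /c/Users/... form used in the -v options above.
to_toolbox_path() {
    # First character is the drive letter; lowercase it.
    drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
    # Everything after "C:" is the path; turn backslashes into slashes.
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/%s%s\n' "$drive" "$rest"
}

# Single quotes keep the backslashes literal.
to_toolbox_path 'C:\Users\Public\ocr4all\data'
# prints /c/Users/Public/ocr4all/data
```

The converted path goes before the colon in the corresponding -v option; the part after the colon (/var/ocr4all/...) stays unchanged.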

Browser Access and further use

• OCR4all is optimized for Chrome/Chromium.

• Browser Access: http://192.168.99.100:1476/OCR4all_Web/

• In the browser, check Project Overview → Project selection: if you can find the two pre-loaded books called “Cirurgia” and “GNM”, the mapping (-v …) is working properly. Otherwise, there was most likely a typo in the “docker run” command, so you will have to create the container again. First, delete the container you just created: stop the process in the Docker Quickstart Terminal using CTRL+C, then type:

docker rm ocr4all

Then check and correct your command, especially the two “-v …” lines, and run it again.

• If everything is set up properly, you can (and should!) restart OCR4all in the future by using:

docker start -ia ocr4all
