Cbrtekstraktor Manual V04 20180627

cbrTekStraktor

“Strange adventures on other planets”
Space Detective - Issue 4
Published April 1952 by Avon Publications.

D E S I G N

0

Chapter

C U S T O M I Z A T I O N

Preface
Notices
Copyright (c) 2017 - 2018 - cbrTekStraktor
cbrTekStraktor is free software
Permission is granted to copy, distribute and/or modify this software under the terms
of the GNU General Public License as published by the Free Software Foundation;
either version 2 of the License, or (at your option) any later version.
You should have received a copy of the GNU General Public License along with this
program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite
330, Boston, MA 02111-1307 USA to obtain the GNU General Public License
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
GNU General Public License: www.gnu.org/copyleft/gpl.html
Contact details for copyright holder: cbrtekstraktor@gmail.com


Trademarks
cbrTekStraktor relies on the following freely available technology:
Java SDK 1.5 through 1.9 or higher
Tesseract 4 (released under the Apache License 2.0)
Google TensorFlow 1.4 or higher (optional)
The cbrTekStraktor Java source code has been developed to be deployed on various
platforms and operating systems. It has been tested on the following platforms and
operating systems.

Platform       Operating system
Intel - AMD    Windows 7 (32 and 64 bit)
Intel - AMD    Linux Ubuntu 16.04
Intel - AMD    Windows 10 (64 bit)

Release note
Release history of this document
Date          Document version            cbrTekStraktor version
2017-06-05    Draft                       V01 - Build 2017_06_05
2018-05-01    Updated version             V02 - Build 2018_05_01
2018-06-27    Added TensorFlow support    V04 - Build 2018_06_27


YouTube channel
https://www.youtube.com/channel/UCy0NfU7-N8RcyI-rj3fSCEw

Comments welcome
Mail your comments or defect reports to: cbrtekstraktor@gmail.com

Chapter 1

C B R T E K S T R A K T O R

Introduction
cbrTekStraktor is an application to automatically extract text from the text bubbles or
speech balloons present in comic book reader files (CBR). Its prime goal is to perform
analysis on the texts of comic books. cbrTekStraktor can however also be used for
scanlation or similar purposes.
The application also lets you manually define text areas in CBR files. It includes
a simple graphical editor for further processing the extracted text.
The text extraction is achieved by a combination of statistical and graphical processing
operations. It is based on the following 3 major algorithms
• Binarization of color images (Niblack and other methods)
• Connected components
• K-Means clustering
Tesseract is used to perform Optical Character Recognition on the extracted
text.
Google's TensorFlow Inception visual recognition Convolutional Neural Network can
optionally be used to fine-tune the speech balloon detection.
cbrTekStraktor has some known limitations. It has been designed to extract
Western (Roman) characters and will only work on comic pages with a
light background.
Subsequent versions of the application will
• Integrate with translation software in order to provide automated translation of
comic book texts.
• Provide a mechanism to automatically re-inject translated text into the text
balloons.





[Wikipedia] Scanlation (also scanslation) is the scanning, translation, and editing
of comics from a language into another language. Scanlation is done as an
amateur work and is nearly always done without express permission from the
copyright holder. The word "scanlation" is a portmanteau of the words scan and
translation.

[Wikipedia] A comic book archive or comic book reader file (also called
sequential image file) is a type of archive file for the purpose of sequential
viewing of images, commonly for comic books. Comic book archive files mainly
consist of a series of image files, typically PNG or JPEG files, stored as a single
archive file. The file name extension indicates the archive type used, e.g. CBR or
CBZ.

Associated documents
[01] Christophe Rigaud, Norbert Tsopze, Jean-Christophe Burie and Jean-Marc Ogier :
Robust frame and text extraction from comic books, La Rochelle (France) and
Yaoundé (Cameroon)
[02] Christophe Rigaud, Dimosthenis Karatzas, Joost Van De Weijer, Jean-Christophe
Burie and Jean-Marc Ogier : Automatic text location in scanned comic books,
Barcelona (Spain), 2013
[03] Christophe Rigaud, Jean-Christophe Burie and Jean-Marc Ogier : An active
contour model for speech balloon detection in comics, La Rochelle (France)
[04] Karl Tombre, Salvatore Tabbone, Loïc Pélissier, Bart Lamiroy and Philippe Dosch :
Text/graphics separation revisited, Vandoeuvre-lès-Nancy (France)
[05] Muhammad Muzamil Luqman, Hoang Nam Ho, Jean-Christophe Burie and
Jean-Marc Ogier : Automatic indexing of comic page images for query by example based
focused content retrieval, La Rochelle (France)
[06] Zhongliang Fu, Fulin Bian, Songtao Zhou and Qingwu Hu : Algorithm for fast
detection and identification of characters in gray-level images, Wuhan (Republic of
China)
[07] Olivier Augereau, Motoi Iwata and Koichi Kise : A survey of comics research in
computer science. November 2017, 14th International Conference on Document
Analysis and Recognition, Kyoto (Japan).

Public domain Comic Book Archives

The Comic Book Images which are used in this manual have been downloaded from
the “Digital Comic Museum”. DCM is a great site for downloading free public domain
Golden Age Comics. All files here have been researched by DCM's staff and users to
make sure they are copyright free and in the public domain.
http://digitalcomicmuseum.com/

Chapter 2

Installation
Distribution
The most recent version of cbrTekStraktor is published on
• GitHub (https://github.com/cbrTekStraktor/cbrTekStraktor)
• SourceForge (https://sourceforge.net/projects/cbrTekStraktor)

The material distributed via GitHub and SourceForge comprises the source code, an
executable JAR file and this reference manual.

Quick Installation

Prerequisites

A recent Java Runtime Environment (JRE) or Java Development Kit (JDK) is
required.
cbrTekStraktor has been tested with 64-bit Oracle Java SDK 7 and 8 on
Windows 7 and Linux Ubuntu 16.04.
The application is based on standard Java Swing functionality and will
therefore more than likely also function correctly on other operating systems
(e.g. OS X, Red Hat, Windows 10, etc.) and with other JREs and JDKs.

W I N D O W S

Windows users need to manually create the folders C:\temp and
C:\temp\cbrTekStraktor\bin.

L I N U X

Linux users need to create the directories $HOME/cbrTekStraktor and
$HOME/cbrTekStraktor/bin, in which $HOME is to be substituted by the
actual location of the Linux user's home directory.


Installation

Just put the JAR file (cbrTekStraktor.jar) in c:\temp\cbrTekStraktor\bin (or
$HOME/cbrTekStraktor/bin)

Starting the application

It should suffice to double-click the cbrTekStraktor JAR file to start and run
the application.
In the event that double-clicking the JAR file does not work, you can
start the application manually as follows.

W I N D O W S

Open a Windows command window
cd c:\temp\cbrTekStraktor\bin
java -jar cbrTekStraktor.jar

L I N U X

Open a Linux command window
cd $HOME/cbrTekStraktor/bin
java -jar ./cbrTekStraktor.jar

Command line parameters

The following command line parameter is supported:
-D {project folder name}. The -D option specifies the folder name
of the project to be opened. If the project root folder is not specified as a
command line parameter, it defaults to “c:\temp\cbrTekStraktor” on Windows or
$HOME/cbrTekStraktor on Linux.
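
The handling of the -D parameter can be pictured with the following minimal Java sketch. It is an illustration only and is not taken from the cbrTekStraktor source code: the class name ProjectFolderArgs, its methods and the way the operating system is detected are assumptions. Only the -D option and the two default folders come from this manual.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the cbrTekStraktor source code. */
final class ProjectFolderArgs {

    /** Default root folder as documented in this manual. */
    static Path defaultRoot(String osName, String userHome) {
        boolean windows = osName.toLowerCase().startsWith("windows");
        return windows
                ? Paths.get("c:\\temp\\cbrTekStraktor")
                : Paths.get(userHome, "cbrTekStraktor");
    }

    /** Returns the folder following -D, or the default root when -D is absent. */
    static Path resolve(String[] args, String osName, String userHome) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-D".equals(args[i])) {
                return Paths.get(args[i + 1]);
            }
        }
        return defaultRoot(osName, userHome);
    }

    public static void main(String[] args) {
        Path folder = resolve(args,
                System.getProperty("os.name"),
                System.getProperty("user.home"));
        System.out.println("Project folder: " + folder);
    }
}
```

Note that in this sketch the -D option must follow the JAR file name on the command line; an option placed before -jar would be interpreted by the Java launcher itself rather than by the application.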

First time usage

The following dialog will be shown when the application is started for the first
time or whenever one of the required file system components is found to be
missing.
The dialog reports on all folders which are missing and will prompt you to
confirm whether those missing folders can be created automatically. Click
“Yes” if you want to have the missing folders created.

Upon having successfully completed the creation of the missing folders, the
following dialog will be displayed. You should close the application at this
stage and then restart.

Source code installation
Download or clone the cbrTekStraktor application package from GitHub or
SourceForge using the approach you are comfortable with and install the material in
the workspace folder of your preferred Java IDE.


The source code of cbrTekStraktor was created using the Eclipse Neon IDE. The
following screenshot depicts the structure of the Java packages.



[Wikipedia] A JAR (Java ARchive) is a package file format typically used to
aggregate many Java class files and associated metadata and resources
(text, images, etc.) into one file for distribution. JAR files are archive files
that include a Java-specific manifest file. They are built on the ZIP
format and typically have a .jar file extension.

Apache Tesseract Installation
Tesseract is an optical character recognition engine for various operating systems. It is
free software, released under the Apache License, Version 2.0. Development has been
sponsored by Google since 2006. Tesseract is considered one of the most accurate
open-source OCR engines available.
cbrTekStraktor uses Tesseract (e.g. tesseract-ocr 4.00.00alpha) to perform OCR. Given
that only basic OCR functionality is needed, it can safely be assumed that other
versions of Tesseract will integrate with cbrTekStraktor too.
Read the Tesseract home page on GitHub for a quick introduction
https://github.com/tesseract-ocr/tesseract/wiki
It is recommended to use the latest version of tesseract; you should therefore
regularly upgrade or reinstall Tesseract.

M I C R O S O F T   W I N D O W S

The binaries of the Tesseract OCR engine can be found on
https://github.com/tesseract-ocr/tesseract/wiki/Downloads
cbrTekStraktor V02 was tested using the University of Mannheim's
experimental 64 bit tesseract-ocr-w64-setup-v4.0.0-beta version available at
https://github.com/UB-Mannheim/tesseract/wiki

While installing Tesseract, make a note of where the installer puts the
binaries. cbrTekStraktor accesses the Tesseract OCR client via the Windows command
shell or the Linux shell. You therefore need to set the name of the folder holding the
Tesseract binaries via the Project Configuration dialog.
You might also consider installing additional language packs for Tesseract.
cbrTekStraktor will detect which language packs have been installed and use those
when appropriate. Language packs can be found via
https://github.com/tesseract-ocr/tesseract/wiki
Once installed, Tesseract's installation directory should resemble the following.


You should run an installation test on Tesseract by issuing the following instruction
from a command window: tesseract. This will result in an exhaustive usage message.
Alternatively run the command tesseract --version. Note that a double dash is required.


L I N U X ( U B U N T U )

The installation of Tesseract 3 on Ubuntu 16.04 is straightforward. There are
plenty of resources on the World Wide Web explaining how to install
and use Tesseract on Linux. The following URL explains how to install
Tesseract on Ubuntu 16.04.
https://www.howtoforge.com/tutorial/tesseract-ocr-installation-and-usage-on-ubuntu-16-04/

In a nutshell, you need to run the following commands consecutively in a Unix shell
>sudo apt install tesseract-ocr
>sudo apt-get install tesseract-ocr-[lang]
The first command will prompt for your password and then install the latest
version of Tesseract for Ubuntu.
The second command will install a language pack. For example tesseract-ocr-fra will
install the French language pack. Re-run the command for any of the language packs
you want to install.
Once the installation has completed you should perform a quick installation test on
Linux. Open a Unix command window and run the following two commands:
• In order to determine where tesseract has been installed, issue the command
“type -a tesseract”. In most cases this will result in /usr/bin/tesseract.
Should the installation directory be different from /usr/bin you will need to
correctly configure the Tesseract Installation Folder parameter on your
cbrTekStraktor project. The section “HowTo: Projects” provides detailed
instructions on cbrTekStraktor projects and how to configure the Tesseract
installation folder.
• Next simply run the command “tesseract” in the Unix shell. Detailed status and error
messages will be displayed.
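
Since cbrTekStraktor calls the Tesseract client through the operating system shell, the installation test can also be performed from Java. The following is a minimal sketch under stated assumptions: the class TesseractCheck and its methods are hypothetical and not taken from the cbrTekStraktor source code, and the folder /usr/bin is only the typical Linux location mentioned above. The only Tesseract option used is --version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical installation check, not part of the cbrTekStraktor source code. */
final class TesseractCheck {

    /** Builds the command "<installFolder>/tesseract --version". */
    static List<String> versionCommand(Path installFolder) {
        String executable = installFolder.resolve("tesseract").toString();
        return Arrays.asList(executable, "--version");
    }

    /** Runs the command and returns everything Tesseract printed. */
    static String runVersion(Path installFolder) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(versionCommand(installFolder));
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append(System.lineSeparator());
            }
        }
        process.waitFor();
        return output.toString();
    }

    public static void main(String[] args) {
        // The folder is an assumption; pass your own Tesseract folder as first argument.
        Path folder = Paths.get(args.length > 0 ? args[0] : "/usr/bin");
        try {
            System.out.print(runVersion(folder));
        } catch (IOException e) {
            System.out.println("Tesseract could not be started from " + folder + ": " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

If the program reports that Tesseract could not be started, the folder passed to it is most likely not the Tesseract installation folder.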

Chapter 3

Main screen
Summary
This section provides step by step instructions on how to use the cbrTekStraktor
application.
Starting the application
See previous section to learn how to start the application.

The above picture shows the main screen. The size and the location of the main
screen are reused when the application is restarted.
The major functions of the application can be accessed via a set of buttons located on
the left of the application's main canvas.
Status information is displayed in the top right corner of the application's main
window.


Image

This button lets you select a scanned image of a comic book page (or any
other image in JPG, GIF or PNG format) and display it on the canvas.

Extract Text

This button lets you select a scanned image of a comic book page and
extract the textual information on it.

Edit

Pressing this button opens the graphical editor, in which you can examine
the various graphical components of a single comic book page.
The editor also lets you manually select or deselect text or graphical
components and further edit or translate the extracted text.

OCR

The Optical Character Recognition functionality is accessed via this button.

Translate

The translation component is currently not implemented.

Report

Pressing this button opens the reporting component, which provides
access to summarized graphical and statistical information on a single comic
book page.

Re-inject

This will make it possible to re-inject text into the previously identified speech
balloons or other text areas. This option is currently under development.

Other

The “Bulk processor” checkbox activates the bulk processing option. This
performs the text extraction on a set of comic book pages stored in
a single folder. This option can therefore be used to extract and OCR the
text of an entire comic book.
The “spinner” component enlarges or shrinks the image displayed on
the screen. It should only be used in “Image” mode. It is recommended
not to use it in “Edit” mode.


Menu bar
All of the above and additional functions can also be accessed via the menu bar items
“File” and “Tools”.
See the picture below for a quick overview of the available menu items.

Pop-up menu
Right-clicking on the main canvas opens a pop-up menu, comprising menu items
similar to the ones on the menus discussed in the above section.


Additional major functionality
Additional functionality can be accessed via the menu bar. There are 2 main
menu items
• Files
• Tools

Properties

The Files>Properties menu item provides access to the “Edit option” and
“Tesseract Option” dialogs.
The “Edit option” dialog lets you customize the look and feel of the Comic
Page Editor, e.g. the backdrop. See section “Edit mode”.

The “Tesseract Option” dialog lets you pick one of the many
Tesseract options and parameters and set it to an appropriate value. See
section “OCR mode”.


Statistics

The Tools>Statistics function collects statistical information on the entire set
of scanned comic book files available in the $PROJECTDIR folder. See the
“Developer Notes” section for more information on the folder structure of the
application and to learn where to locate the $PROJECTDIR folder.

Housekeeping

Tools>Housekeeping. This will prune (remove) temporary files from the
cbrTekStraktor workfolders.

Import

Files>Import. The import function is currently not implemented.

Export

Files>Export. The Export function creates a file of all extracted textual
information within a project (see cbrTekStraktor Projects).

Chapter 4

How To : Image mode
Introduction
The following picture shows the cbrTekStraktor application when running in “Image”
mode.
When you click on the “Image” button a file selection dialog is presented, in which
you can browse through your comic book (or other) image files. In this example,
the image file “1024px-Caricature_gillray_plumpudding.jpg”, an early 19th century
caricature, is rendered.
Note: The 5 most recently selected images can quickly be re-accessed on the “File”
menu.


Marquee

The following buttons are available on the marquee

Save, this enables to save the image

Refresh, this will reload the original image

Info, this will open the “Image Info” dialog

Colour histogram

A color histogram is shown in the bottom left corner. The histogram (or
frequency diagram) shows the distribution of the Red, Green and Blue (RGB)
color components of the pixels in a picture (JPG, GIF, PNG) on a scale
of 0 to 255, where 0 is the darkest and 255 the lightest value.
The circles drawn away from the vertical axis show the median of each RGB
component.
The circles on the vertical axis show the mean of each RGB component.
The histogram also shows the frequency distribution of the luminance or the
“Alpha” channel.
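
How such a histogram is obtained can be illustrated with a short Java sketch. It is not the cbrTekStraktor implementation: the class RgbHistogram and its methods are hypothetical names. It counts, per color component, how often each value between 0 and 255 occurs, and derives the mean that is plotted on the vertical axis.

```java
import java.awt.image.BufferedImage;

/** Hypothetical histogram helper, not part of the cbrTekStraktor source code. */
final class RgbHistogram {

    /** Returns counts[channel][value]; channel 0 = red, 1 = green, 2 = blue. */
    static int[][] compute(BufferedImage image) {
        int[][] counts = new int[3][256];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y);      // packed as 0xAARRGGBB
                counts[0][(rgb >> 16) & 0xFF]++;
                counts[1][(rgb >> 8) & 0xFF]++;
                counts[2][rgb & 0xFF]++;
            }
        }
        return counts;
    }

    /** Mean component value of one channel, computed from its histogram. */
    static double mean(int[] histogram) {
        long total = 0;
        long weighted = 0;
        for (int value = 0; value < histogram.length; value++) {
            total += histogram[value];
            weighted += (long) value * histogram[value];
        }
        return total == 0 ? 0.0 : (double) weighted / total;
    }

    public static void main(String[] args) {
        // A tiny in-memory image: one pure red pixel and one black pixel.
        BufferedImage image = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        image.setRGB(0, 0, 0xFF0000);
        image.setRGB(1, 0, 0x000000);
        int[][] counts = compute(image);
        System.out.println("Mean red value: " + mean(counts[0]));
    }
}
```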


Image filters
A choice of image filters is available on the drop-down list to further process the
image.

Filter                    Description
Bleach                    RGB to HSB conversion followed by lowering the hue component
                          (see the detailed info on HSL and HSV in the appendix to this
                          reference manual).
Blueprint                 Binarization (by default Niblack is used) and subsequent
                          reduction to the Blue color component.
Convolution Blur          Blurs the image via convolution (explained in the appendix).
Convolution Edge          Applies the edge convolution filter.
Convolution Gaussian      Applies a Gaussian blur filter.
Convolution Sharpen       Applies a sharpening convolution filter.
Gradient narrow           Applies a narrow gradient transformation.
Gradient wide             Applies a wide gradient transformation (explained in the
                          appendix).
Grayscale                 RGB to grayscale conversion using the formula described at the
                          end of this section.
Histogram Equalization    Histogram equalization image transformation (explained in the
                          appendix).
Info                      Opens a pop-up window displaying the image properties.
Invert                    Inverts the colors of the RGB image.
Mainframe                 Binarization and subsequent reduction to the Green color
                          component.
Monochrome (Niblak)       Binarization using the Niblack transformation (see appendix).
Monochrome (Otsu)         Monochromization or binarization using the Otsu transformation
                          (explained in the appendix).
Monochrome (Sauvola)      Binarization using the Sauvola transformation (explained in the
                          appendix).
Original                  Redisplays the original image.
Sobel                     Applies a Sobel filter (explained in the appendix).
Sobel on grayscale        Applies a Sobel filter on the grayscaled image.
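
To give an idea of what a local binarization such as Niblack's does, the sketch below applies the threshold commonly given in the literature, T = m + k * s, in which m and s are the mean and the standard deviation of the gray values in a small window around each pixel. This is an illustration only, not the cbrTekStraktor implementation: the class name NiblackSketch, the window radius and the value k = -0.2 are assumptions.

```java
/** Hypothetical Niblack-style binarization, not part of the cbrTekStraktor source code. */
final class NiblackSketch {

    /**
     * Marks a pixel as ink (true) when its gray value lies below the local
     * threshold mean + k * standardDeviation of the surrounding window.
     */
    static boolean[][] binarize(int[][] gray, int radius, double k) {
        int height = gray.length;
        int width = gray[0].length;
        boolean[][] ink = new boolean[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                double sum = 0;
                double sumSq = 0;
                int n = 0;
                // The window is clipped at the image borders.
                for (int yy = Math.max(0, y - radius); yy <= Math.min(height - 1, y + radius); yy++) {
                    for (int xx = Math.max(0, x - radius); xx <= Math.min(width - 1, x + radius); xx++) {
                        int v = gray[yy][xx];
                        sum += v;
                        sumSq += (double) v * v;
                        n++;
                    }
                }
                double mean = sum / n;
                double variance = Math.max(0.0, sumSq / n - mean * mean);
                double threshold = mean + k * Math.sqrt(variance);
                ink[y][x] = gray[y][x] < threshold;
            }
        }
        return ink;
    }

    public static void main(String[] args) {
        // A white 3x3 patch with a single black pixel in the middle.
        int[][] page = { {255, 255, 255}, {255, 0, 255}, {255, 255, 255} };
        boolean[][] ink = binarize(page, 1, -0.2);
        System.out.println("centre is ink: " + ink[1][1]);
        System.out.println("corner is ink: " + ink[0][0]);
    }
}
```

The naive double loop keeps the sketch short; it recomputes every window from scratch and is therefore slow on full comic pages.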


The next picture shows the result of applying the “Invert” image filter. Apart from a
bizarre aesthetic interest there is no known practical use for inverting the color
scheme of an image.

You can save the result of the image processing by pressing “Save” and providing a
file name. The screenshot below shows how the result of the “Blueprint” image
filter is about to be saved to a PNG file.


Comic Page Info Screen
The Comic Page Info Screen provides access to a selection of characteristics of a
Comic Page Image. One can either examine or specify various characteristics of a
comic book and page.
The Comic Page Info Screen can be accessed via the marquee buttons, the menu bar
or the pop-up menu. Click on “Info” to open this dialog.

The top of the screen comprises
• The RGB histogram of the image, located on the left hand side.
• The histogram of the gray scale information, on the right. The peaks and valleys
on this histogram are used to determine whether the picture is a monochrome
image or not.
• The Box Plot of the RGB information. The Box Plot diagram shows the first and
third quartiles, median and mean of the RGB and Grayscale values of the pixels.

[Figure: box plot annotated with standard deviation, third quartile, mean, median and
first quartile]
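
The quartiles and the median shown on a box plot can be derived directly from a 256-bin histogram. The sketch below is an illustration under stated assumptions: the class BoxPlotStats is a hypothetical name, and the nearest-rank definition of a percentile is only one of several common definitions, so the values may differ slightly from those displayed by cbrTekStraktor.

```java
/** Hypothetical box plot helper, not part of the cbrTekStraktor source code. */
final class BoxPlotStats {

    /**
     * Nearest-rank percentile: the smallest value whose cumulative count
     * reaches ceil(p * total). Use p = 0.25, 0.5 and 0.75 for the first
     * quartile, the median and the third quartile.
     */
    static int percentile(int[] histogram, double p) {
        long total = 0;
        for (int count : histogram) {
            total += count;
        }
        if (total == 0) {
            throw new IllegalArgumentException("empty histogram");
        }
        long rank = Math.max(1, (long) Math.ceil(p * total));
        long cumulative = 0;
        for (int value = 0; value < histogram.length; value++) {
            cumulative += histogram[value];
            if (cumulative >= rank) {
                return value;
            }
        }
        return histogram.length - 1;
    }

    public static void main(String[] args) {
        int[] histogram = new int[256];
        // Four pixels with the gray values 10, 20, 30 and 40.
        histogram[10] = 1;
        histogram[20] = 1;
        histogram[30] = 1;
        histogram[40] = 1;
        System.out.println("First quartile: " + percentile(histogram, 0.25));
        System.out.println("Median:         " + percentile(histogram, 0.50));
        System.out.println("Third quartile: " + percentile(histogram, 0.75));
    }
}
```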


The histograms on the top of the screen can be collapsed in order to reduce the
clutter on your desktop by setting the “Hide histogram” option to “Yes”.



[Wikipedia] In descriptive statistics, a box plot or boxplot is a convenient
way of graphically depicting groups of numerical data through their
quartiles. Box plots may also have lines extending vertically from the boxes
(whiskers) indicating variability outside the upper and lower quartiles, hence
the terms box-and-whisker plot and box-and-whisker diagram. Outliers
may be plotted as individual points. Box plots are non-parametric: they
display variation in samples of a statistical population without making any assumptions of the
underlying statistical distribution. The spacing between the different parts of the box indicate
the degree of dispersion (spread) and skewness in the data, and show outliers. In addition to
the points themselves, they allow one to visually estimate various L-estimators, notably the
interquartile range, midhinge, range, mid-range, and trimean. Box plots can be drawn either
horizontally or vertically. Box plots received their name from the box in the middle.


The bottom of the Image Info Screen consists of

Label                     Description
CMXUID                    Comic Book Unique Identifier, which is a simplified
                          normalization of the filename of the Comic Book Page picture.
                          The normalization consists of removing all non-alphabetical
                          characters from the file's name.
UID                       A unique identifier consisting of a hexadecimal number of 32
                          characters.
ISBN                      The International Standard Book Number of the comic book. You
                          will need to enter the ISBN manually. Most ISBNs can be found
                          on www.isbn.org.
Series                    Name of the series of the comic book.
Series sequence           The sequence number of a comic book in a comic book series.
Book title                Comic book title.
Page                      The page number of the image in the comic book.
Penciller, Colorer,       Information on the various authors who contributed to the
Writer                    creation of the comic book.
Comment                   Can be used to provide additional comments.
Folder                    The folder where the comic book page image is stored.
File                      The name of the scan (or picture) of the comic book page.
Size                      The size of the picture in pixels and in bytes, as well as Dots
                          per Inch (DPI) information (if present in the image).
Color schema              cbrTekStraktor will determine whether the picture has a
                          monochrome, grayscale or color scheme. If needed you can
                          overrule the color schema detected.
Language                  The language used in the speech balloons or text areas. If the
                          matching language pack has been installed, Tesseract will be
                          instructed to use it.
Binarization technique    One can select the binarization technique to be used when
                          processing the comic book page image. The options are
                          {NIBLAK, SAUVOLA, OTSU, BLEACHED, ITERATIVE}.
Cluster classification    This is the method used when determining which one of the
method                    connected components clusters comprises the textual
                          information, a.k.a. the character cluster or text paragraph.
                          By default the method is set to “automatic”. If needed one can
                          select “Cluster 1” through “Cluster 5” to override the
                          automatically identified character cluster.
                          See section “Text Extraction” for a detailed discussion of
                          cluster classification.
Proximity tolerance       {WIDE, LENIENT, TIGHT, ULTRA_WIDE} The proximity level is used
                          when adjacent characters are combined into words and
                          paragraphs. The default value is “Tight”.
Crop Image                By selecting this option, one can remove the margins of a comic
                          book page. Cropping reduces the size of an image and will
                          therefore shorten, albeit marginally, the duration of the text
                          extraction process.
TesseractCuration         This option applies additional image processing filters prior
                          to performing the OCR step, in particular blurring an image or
                          increasing the DPI (Dots Per Inch). Tesseract is known to
                          perform best on images of 300 DPI or more.

The characteristics of a comic page are stored in the zMetadata_.xml file,
which is part of the Archive file. See section “Developer Notes” for a discussion of the
contents of the cbrTekStraktor Archive file.
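
The two identifiers described in the table can be pictured with the following Java sketch. It is an illustration only: the class ComicIdentifiers is a hypothetical name, it is an assumption that the file extension takes part in the normalization, and deriving the 32-character UID from a random UUID is an assumption as well. The manual only states that the UID is a hexadecimal number of 32 characters.

```java
import java.util.UUID;

/** Hypothetical identifier helpers, not part of the cbrTekStraktor source code. */
final class ComicIdentifiers {

    /** Removes every character that is not a letter A-Z or a-z. */
    static String cmxuid(String fileName) {
        return fileName.replaceAll("[^A-Za-z]", "");
    }

    /** Returns 32 hexadecimal characters (a random UUID without its dashes). */
    static String newUid() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(cmxuid("Space_Detective-04 p.12.jpg"));
        System.out.println(newUid());
    }
}
```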


Troubleshooting
The cbrTekStraktor application works best when you are using a large monitor
and a display card supporting the higher resolution ranges. Images
in CBR files often have widths and heights of more than 1000 pixels, so you
will also need a powerful computer to support cbrTekStraktor's image processing
activities.

Chapter 5

How To : Projects
A project is an arbitrary grouping of comic book images. One could, for example,
create a cbrTekStraktor project that comprises all images of a single comic book, or
a project of comic book images created by the same penciller.
A project is physically little more than the name of the folder on the Windows or Linux
file system containing a predefined set of files and folders which are required by the
cbrTekStraktor application.
See “Developer Notes” for detailed information on the folders and files that constitute
a project.


Editing a project
The properties or settings of a project can be set and modified via the menu
item “File>Project>Edit project”.

When cbrTekStraktor is started for the first time the “Tutorial” project is automatically
created with default configuration settings. It is recommended to change these
settings to match your local environment and preferences.

Property                  Description
Encoding                  Options are {UTF-8, UTF-16, ISO-8859-01, ASCII}
                          You can select the encoding of the files created by
                          cbrTekStraktor. Default encoding is ISO-8859-01 (Latin 1).
Editor backdrop           Options are {BLEACHED, BLUEPRINT, GRAYSCALE RASTERIZED,
                          COLOUR RASTERIZED, NIBLAK, SAUVOLA, MAINFRAME, NONE, ORIGINAL}
                          When in Edit mode the backdrop setting defines the picture that
                          is displayed on the work area. Just choose the backdrop which
                          you like best.
Language                  The drop-down list contains all the languages supported by
                          Tesseract. The project language is the default language to be
                          used for all image files of that project.
                          The language can be overruled per page.
Browser                   The drop-down list contains the supported HTML browsers. These
                          browsers just render the reports that are stored in HTML
                          format. Options are {MOZILLA, EXPLORER, CHROME}
Project Name              The name of the project.
                          The name of the project will be stripped of non-alphanumeric
                          characters and will be used as the name of the $PROJECTDIR
                          folder.
Description               A short description of the project.
Mean Character count      See developer notes.
Horizontal vertical       See developer notes.
variance threshold
Tesseract folder          This is the name of the folder where the Tesseract binaries
                          are stored.
Preferred Font name       The name of the preferred Font. The Preferred Font is used on
                          all screens and dialogs.
                          Be aware that the default Font is set to “Comic Sans Serif”.
                          Change ad lib.
Preferred Font Size       The preferred size of the font. Sizes between 10 and 12 work
                          best.
Python home               This is the name of the folder where the Python 3.5 binaries
                          are installed. Python is required when using the Google
                          Artificial Intelligence Image Recognition (AI VR) software
                          components. The AI VR module is an optional module available
                          from cbrTekStraktor V04 onwards.
Maximum number of         This is the maximum number of threads that will be started
threads                   when using the AI VR component.
Logging Level             The logging level can range between 0 and 9. Level 0 is terse
                          logging, level 9 provides very detailed logging.
                          The logging and error information is displayed on stdout and
                          stderr and redirected to the log and error files in the
                          $PROJECTDIR folder.
Date format               The Java Date Format string. See docs.oracle.com on the options
                          of the Java Date Time format.
Size                      This is the total byte size of all objects in the current
                          project.
Number of archives        This is the number of all objects in the current project.
First accessed            Timestamp of the moment the current project was created.
Last accessed             Timestamp of the last time an object was created.

Prior to saving the Project properties a validation of the property values will be
performed. This will for example prevent you from specifying an incorrect Java Date Format.
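
A validation of this kind can be sketched in Java as follows. The sketch is not taken from the cbrTekStraktor source code: the class ProjectValidation and its methods are hypothetical, and it is an assumption that the date format is a java.text.SimpleDateFormat pattern. The second method illustrates how a project name could be stripped of non-alphanumeric characters, as described for the Project Name property.

```java
import java.text.SimpleDateFormat;

/** Hypothetical validation helpers, not part of the cbrTekStraktor source code. */
final class ProjectValidation {

    /** Returns true when the pattern is accepted by SimpleDateFormat. */
    static boolean isValidDateFormat(String pattern) {
        if (pattern == null || pattern.trim().isEmpty()) {
            return false;
        }
        try {
            new SimpleDateFormat(pattern); // throws on illegal pattern letters
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Removes every character that is not a letter or a digit. */
    static String folderNameFor(String projectName) {
        return projectName.replaceAll("[^A-Za-z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(isValidDateFormat("yyyy-MM-dd HH:mm:ss"));
        System.out.println(isValidDateFormat("bogus"));
        System.out.println(folderNameFor("My Comic Project #1"));
    }
}
```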


Creating a project
A new project can be created via “Files>Projects>New project”
The configuration settings of the current project will be inherited by the newly created
project.
When creating a new project it often suffices to merely provide the name and
description of the new project.


Selecting a project
You can switch between projects via the menu item “Files>Projects>OpenProject”. An
overview of the available projects is presented in the topmost drop-down list.
Select the project you want to switch to and then click on the “Switch”
button.


Project properties file and current project
The current project will be reused when the application is restarted. The current
project is saved in the Project Properties file.
The Project Properties file is located one folder up from the current project. On
Windows this will more than likely be C:\temp\cbrTekStraktor. On
Linux the Project Properties file will be stored in $HOME/cbrTekStraktor.
In both cases the file is named “cbrTekStraktorProjectConfig.txt”. See the “Developer
Notes” section for more information on the contents of this file.

Defining the project via the command line
You can set the project to be used via the command line option -D.
For example: java -jar cbrTekStraktor.jar -D C:\temp\myComicProject

Chapter 6

HowTo : Text extraction
The text extraction process is started by pressing the “Extract” button on the main
screen.
Alternatively when in “Image mode” you can opt to start the text extraction process on
the current image by double clicking on it or by pressing the “Extract” button.
The text extraction process runs through the following four steps. The extraction
process can be interrupted by pressing the “Stop” button on the marquee.

First step
You will be prompted to select an image for which the text is to be extracted. The
images need not be stored in the cbrTekStraktor $PROJECTDIR folder.


Second step
The image which was chosen in the previous step is displayed and the Comic Book
Metadata dialog opens.

In most cases it suffices to just select the language of the comic book's text on this
dialog. It is important to define the language correctly, because it is later used by
the Tesseract OCR engine.
Additional fine-tuning of the extraction process
Option                    Comment

Color schema

In the event that the color scheme was not correctly identified,
you can set it to Color, Grayscale or Monochrome.
Ensure that the correct color scheme is set before starting
the next phase of the text extraction.
The accuracy of the color scheme detection algorithm is
monitored for future enhancements. The information is stored in
the zMetadata XML file and is used to track the correctness of
the color scheme detection logic.

Binarize Method

Options are
{FAST_BLEACHED,ITERATIVE,SLOW_NIBLAK,SLOW_SAUVOLA,OTSU}
It is advised to use the default binarization method (Sauvola or
Niblack). Otsu and Bleached are valid options too. The “Iterative”
option is no longer recommended.

Cluster Classification Method

Options are {AUTOMATIC,CLUSTER1,CLUSTER2,CLUSTER3,CLUSTER4,CLUSTER5}
It is advised to use the “Automatic” option.
A key element of the text extraction process is the module that
decides which of the clusters created by the K-Means algorithm
contains characters. The idea is to create clusters of similar
components and subsequently classify these clusters into
• A cluster containing frames and borders
• A cluster containing groups of characters, i.e. paragraphs
• Clusters containing noise

The Cluster Classification module sometimes picks the wrong
cluster for the Paragraph cluster. This happens when the
characters on the comic page are rather large. The symptoms of
such a misclassification can easily be spotted in “Edit” mode: most
of the characters will be tagged as “noise” and smaller pictorial
elements resembling characters will be tagged as “Letter”.
(“Letter” is the tag used for components which are deemed to be
characters. It is a misnomer, typical for Dutch native speakers.)
In the event that the Automatic Cluster Classification designates the
wrong cluster as the text cluster, you can override it by setting it
to one of Clusters 1 through 5. A good approach is to start by
setting the Text Paragraph to Cluster2 and to restart the text
extraction process.
Proximity Tolerance

Options are {TIGHT,LENIENT,WIDE, ULTRA_WIDE}


The proximity tolerance is used to group characters into words
and paragraphs. In essence this is achieved by clustering
graphical components based on the distance between the
components. The default Proximity Tolerance is “Tight”. Tight
assumes the inter-character space to be rather narrow. Adapt to
Lenient or Wide if deemed appropriate.
Crop Image

Options are {YES,NO}
When Crop is set to Yes, the margins on a comic book page will
be detected and removed in order to reduce the size of the
image. The text extraction process runs faster when images are
cropped to their actual payload.

Tesseract Curation

Options are
{IGNORE_DPI,USE_IMAGE_DPI,INCREASE_AND_CONVOLVE,INCREASE_AND_FOURRIER}
Tesseract reads the DPI information from the image file
metadata. Tesseract works best on images of 200 DPI or more.
USE_IMAGE_DPI ensures that the DPI is set on the image
submitted to Tesseract. INCREASE_DPI will attempt to increase
the DPI of an image to approximately 300 DPI.
INCREASE_AND_CONVOLVE and INCREASE_AND_FOURRIER will
increase the DPI and apply some blurring to the image before
it is processed by Tesseract.
INCREASE_AND_FOURRIER is not implemented in cbrTekStraktor
V01.

The actual text extraction process will commence when pressing OK on the Image Info
dialog box.
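
The Proximity Tolerance option described in the table above groups graphical components by the distance between them. The following Python sketch illustrates that idea only; it is not the algorithm used by cbrTekStraktor. The box format (x, y, width, height), the pixel-based `tolerance` parameter and the function names are assumptions made for the example.

```python
def box_gap(a, b):
    """Return the (horizontal, vertical) gap in pixels between two boxes.

    Each box is a tuple (x, y, width, height). Overlapping ranges give a gap of 0.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(0, max(ax, bx) - min(ax + aw, bx + bw))
    dy = max(0, max(ay, by) - min(ay + ah, by + bh))
    return dx, dy


def group_by_proximity(boxes, tolerance):
    """Group box indices whose gaps are within the tolerance (union-find)."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            dx, dy = box_gap(boxes[i], boxes[j])
            if dx <= tolerance and dy <= tolerance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two neighbouring characters and one component far to the right.
components = [(0, 0, 10, 10), (12, 0, 10, 10), (100, 0, 10, 10)]
print(group_by_proximity(components, tolerance=3))
```

A small tolerance keeps the distant component apart, which corresponds to the behavior described for “Tight”; a larger value merges more components, as “Lenient” or “Wide” would.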


Third step
The original image is cropped, turned into a grayscale image and then binarized.
These intermediary images are displayed as the text extraction progresses. No user
actions are required during this phase.
Example grayscale image.

Example binarized (monochrome) image. This is an important step in the process. Its
purpose is to get crisp characters that stand out distinctly. By default the Sauvola or
Niblack binarization method is used.
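
As an illustration of what a Sauvola binarization does, here is a minimal pure-Python sketch. It is not the cbrTekStraktor implementation. It uses the published Sauvola threshold T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the pixels in a local window. The window size and the values of k and R are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def sauvola_binarize(gray, window=3, k=0.5, r=128.0):
    """Binarize a grayscale image given as a list of rows of 0..255 values.

    Returns rows of 1 (white) and 0 (black).
    """
    height, width = len(gray), len(gray[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image border.
            patch = [gray[j][i]
                     for j in range(max(0, y - half), min(height, y + half + 1))
                     for i in range(max(0, x - half), min(width, x + half + 1))]
            m, s = mean(patch), pstdev(patch)
            threshold = m * (1 + k * (s / r - 1))
            row.append(1 if gray[y][x] > threshold else 0)
        result.append(row)
    return result


# A single dark pixel on a bright background.
page = [[200, 200, 200],
        [200, 20, 200],
        [200, 200, 200]]
print(sauvola_binarize(page))
```

Because the threshold is computed per pixel from its neighbourhood, the method tolerates uneven backgrounds better than a single global threshold.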


Concluding Step
The extraction process ends by displaying cut-outs of the text paragraphs which have
been identified.

Character Paragraphs have a green border and non-character paragraphs have a red
border. Frames have a lilac border.
The text extraction process stores the results in an “Archive” file in the
$ROOTDIR/Output/Archive folder. An “Archive” file is a ZIP file that holds a broad
variety of results, e.g. statistical data, image information, etc. See the “Developer
Notes” section for detailed information.
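
Since an Archive file is a regular ZIP file, its contents can be inspected with any ZIP tool. A minimal Python sketch, using only the standard library; the archive file name in the usage comment is hypothetical.

```python
import zipfile


def list_archive(archive_path):
    """Return (name, size_in_bytes) for every entry of a ZIP archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Usage (hypothetical file name):
# for name, size in list_archive("spacedetective2_set.zip"):
#     print(name, size)
```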

Marquee

The following actions can be performed via the buttons on the marquee.

The Save button saves the resulting picture.

The Refresh button redisplays the picture.

The Info button opens the Comic Book Metadata dialog.
The “Edit mode” can be activated by double clicking on the canvas or by
pressing the “Edit” button.
Bulk mode
The text extraction process can run on entire sets of comic book pages. The bulk
extraction will process all images within a single folder.

The Bulk extraction process is similar to the single page text extraction process. It can
be interrupted by pressing the “Stop” button on the marquee.
Bulk mode initial step : folder selection
You need to set the “Bulk extraction” option on the main screen before the caption on
the “Extract” button will change to “Bulk”. When you click on this button you will be
prompted to select the folder containing the set of images to be processed.

Bulk mode second step : Comic Page Info
The Comic page Info screen will then appear. The settings that you define on this
screen will be applied to all images present in the bulk extraction folder. It is therefore
recommended to only run the bulk extraction process on images sharing the same
characteristics, e.g. pages from the same comic book, a set of images, which are all
monochrome, etc.


Bulk mode : Progress monitor
The text extraction process will be performed on each image file in the folder selected.
The progress of the text extraction can be observed on the monitoring screen.


Bulk mode : concluding step
The extraction process will stop by displaying the cut-out pictorial elements of the last
image in the folder. The monitor dialog will close automatically after a short period of
time.

Troubleshooting
The cbrTekStraktor application functions optimally when you are using a monitor of
substantial size and a display card supporting the higher resolution ranges.

Chapter 7

HowTo : Edit mode
This section describes the actions which can be performed in Edit mode. Edit
mode provides a GUI that enables you
• to modify the results of the text classification process: remove paragraphs, define
character paragraphs, define non-character paragraphs, etc.
• to manually enter the text within a speech balloon
• to translate the text within a speech balloon

Opening an archive for editing
Click on “Edit” to start editing a previously processed Comic Page. A file browse
dialog will open, enabling you to select an Archive file. Archive files are located in
$PROJECTDIR/Output/Archive and have a name ending in “_set.zip”.
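
The naming rule above makes it easy to enumerate the archives of a project outside the application. A minimal sketch, assuming that `project_dir` points at the folder this manual calls $PROJECTDIR; the function name is made up for the example.

```python
from pathlib import Path


def find_archives(project_dir):
    """Return the Archive files of a project, sorted by name."""
    archive_dir = Path(project_dir) / "Output" / "Archive"
    return sorted(p for p in archive_dir.glob("*_set.zip") if p.is_file())
```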

Alternatively you can double click on the canvas once the text extraction process has
been completed on a Comic Page image. You will be asked whether you want to start
editing the current comic book image.


Editing
After choosing the “spacedetective2_set” archive on the previous dialog the below
screen will open.

Marquee

The following buttons are active on the marquee:
• Save
• Quick
• Refresh
• Ingest
• Option

These functions are described in the remainder of this section.
The backdrop of the edit screen is the Comic Image file upon which an image
processing filter has been applied. It is recommended to select a backdrop that nicely
contrasts with the original comic page image. In the example above the “Black
Bleached” filter has been applied.
Character paragraphs have a green border.
Non-character paragraphs have an amber border.
There is a crosshair mouse pointer. Crosshair pointers are tacky, but have the
advantage that they let you select an image element precisely.
In the example above you will see that some character paragraphs have erroneously
been classified as non-character paragraphs, e.g. “Chapter One Spaceship of the
dead” has an orange border, whereas it is a character paragraph and therefore
should have a green border. In the next section you will be shown how to fix this.

Hovering
When hovering over a pictorial element an information box will be shown for a couple
of seconds, providing succinct information on that element. In the example above a
non-character paragraph, size 25x44 and featuring 2 child objects, is in the crosshairs.
When you move the crosshairs over an element enclosed by a border, its background
will momentarily adopt a rosy sheen and its constituent elements will be outlined in
red. In the following example the characters which are part of the “Chapter One” text
balloon are displayed.

Quick edit
The quick edit screen can be accessed by clicking on the “Quick” button on the
marquee.


The quick edit screen puts detailed information on character paragraphs,
non-character paragraphs, frames, noise and other types of component at your fingertips.
With the tick box in the first column you can define whether or not a paragraph contains
text.
The tick box in the “removed” column lets you remove (or delete) a paragraph.
The “Extracted text” column can be used to enter or edit the textual information of
the speech balloons. In the event that the image has been OCR'ed, it will contain the
automatically extracted textual information.
On the drop-down list you can select which type of pictorial information you want to
see displayed, e.g. noise, frames, potential text, etc.
If you changed the characteristics of a pictorial element the “Confirm” button will
become active.

Detailed edit – Ingest
The detailed edit dialog can be accessed by clicking on the “Ingest” button on the
marquee or by double clicking on a paragraph.


The dialog lets you navigate through the various paragraphs. Use the Previous and
Next buttons.
The image of a character paragraph is displayed between thick green vertical bars.
Non-character paragraphs have red borders.
You can define whether a paragraph contains text or not via the “Is a text paragraph”
tick box.
The paragraph can be deleted by pressing the “Delete” button.
The monochrome tick box can be used to display a monochrome version of the
paragraph image.

Keying in text
The topmost text entry box is used to edit the original text. The bottom text box can
be used to store translated text.


In the event that the OCR process has been performed, the OCR'ed text will be
displayed in the topmost text entry box. See the example above.

Edit options
The edit option dialog opens when you click on Edit Options

The edit option dialog is used to change the appearance of the Edit canvas
Option

Description

Show payload boundaries

This will set out the boundaries of the comic page margins in
lilac.

Show frames

This will put bluish lines around the frames within a comic book
page.

Show paragraphs

This will set out the non-character paragraphs in red.

Show text paragraphs

This will draw a green border around character paragraphs, e.g.
speech balloons.

Show characters

This will put a pink border on the image components that have
been identified to be characters.

Show noise

This will put pink borders around any image component.

Show valid components

This will show the components which are valid.

Show invalid components

Puts border on those image components which are invalid. See
“developer notes”.

Backdrop

This drop-down list lets you set the type of backdrop you
want to see displayed in Edit mode.

Wisker

This drop-down list lets you define the color of the crosshair
pointer.

How to delete a paragraph
If you select a paragraph and left-click on it for more than 2 seconds, a thick red
border will be put around the paragraph and you will be asked whether you want the
paragraph to be removed (deleted).


In the example above the face of the Space Detective hero and a snippet of text
have been combined into a single object. One way to correct this is
to remove the object and create a new object that only contains the text.
The objects which have been removed can be seen in the Quick Edit dialog by
selecting “Potential Text Area” and looking for images which have been crossed out
by a thick red line.

How to create a new text paragraph


A new paragraph can be created by positioning the pointer on the top-left corner of
the object to be created and dragging the cursor to the bottom-right corner of the
object to be created.
Whilst dragging the pointer, a light-blue rectangle will be displayed. When the
dragging operation is completed, you will be prompted to confirm the creation of a
new object.

It is recommended to refresh the screen to reflect the changes made (by pressing the
“Refresh” button on the marquee).
The freshly created object should now be visible in both the Quick Edit and Detailed
Edit dialog screens.


How To quickly delete or change the characteristics of a paragraph
A pop-up menu will open when you position the crosshairs over a paragraph and
right-click on it.

The pop-up menu lets you
• Delete the paragraph
• Toggle between character and non-character

Pop-up menu
The functionality described in this section can also be accessed via the pop-up menu
which is opened by right-clicking anywhere on the canvas.

Saving changes
In the event that changes have been made to any of the components of the comic
book page a greenish hue can be observed around the “Stop Edit” button. When you
click on this button you will be asked to confirm the changes.


Note. Changes to the image components will be stored in the _stat.xml file and
changes to the text information will be stored in the _language.xml file.
These XML files are part of the Archive file. The previous version of these XML files will
be timestamped and maintained in the Archive file. This enables you to roll-back any
of the changes made.

Chapter 8

HowTo : OCR
This section describes the Optical Character Recognition (OCR) process.
Tesseract is an optical character recognition engine for various operating systems. It is
free software, released under the Apache License, Version 2.0 and development has
been sponsored by Google since 2006.

Prerequisite
Tesseract must be installed before OCR can be performed. You can also opt
to install additional language packs.
You need to set the name of the folder in which the Tesseract binaries are stored via
the Project Configuration dialog.
The Text Extraction process must have been completed on a comic page image
before you can perform OCR. The OCR process uses the Archive file as its prime
input. If you are not satisfied with the result of the Text Extraction process, you can
manually change the contents of the Archive file via the “Edit” option.
The quality of the scanned image greatly affects the results of the OCR process, in
particular the resolution and DPI of an image. Lately the resolution of scanned
images has considerably been enhanced, up to resolutions of 1700x2300 and more.


Example
The next screenshot shows the Comic Page which is used as an example to perform
OCR upon.

Starting the OCR process
First step
The OCR process is started by clicking on the OCR button and selecting the Archive
file of the comic page that you want to OCR.

Second Step


The text in the speech balloons will be sourced from the results of the text extraction
and edit processes. The text within a paragraph will be extracted from the original
image and it will be flattened and put onto a single line. Each paragraph will be
preceded by a Unique Identifier (UID).
The results will be displayed and saved in an Image file (OCRoutput.png).

The header line of the Tesseract OCR Image has a reference to the Comic Page
Image file (p:briningupfather08-08), the DPI (d:150) and the Comic Page 32 Character
Hex UID (u:513A-3078-1D92-55B3-A709-1F58-C3AA-7E84).
In order to enhance the Tesseract OCR process, the resolution of the image file might
be increased to 300 DPI and might be slightly blurred (using a convolution or Fourier
transformation). See “Tesseract Curation” option on the Comic Page Info dialog.
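
The header line described above carries three tagged fields (p:, d: and u:). The following Python sketch shows how such a line could be parsed. It assumes that the fields are separated by whitespace, which this manual does not state, and the function name is made up for the example.

```python
def parse_ocr_header(line):
    """Parse a header such as 'p:<page> d:<dpi> u:<uid>' into a dictionary."""
    fields = {}
    for token in line.split():
        key, separator, value = token.partition(":")
        if separator and key in ("p", "d", "u"):
            fields[key] = value
    return {
        "page": fields.get("p"),
        "dpi": int(fields["d"]) if "d" in fields else None,
        "uid": fields.get("u"),
    }


header = "p:briningupfather08-08 d:150 u:513A-3078-1D92-55B3-A709-1F58-C3AA-7E84"
print(parse_ocr_header(header))
```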

Third step
The Tesseract options set via the “Tesseract Option” dialog are fetched and stored in a
parameter file (TesseractOptionRepository.xml, which is located in the
$PROJECTDIR/OCR folder).
The Tesseract OCR client is called via its command line interface using the recently
created OCR image file and the Tesseract parameter file.
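
For readers who want to reproduce a comparable call by hand, the sketch below assembles a Tesseract command line in Python. It relies on the standard Tesseract invocation (image, output base name, `-l` for the language and `-c name=value` for parameters). How cbrTekStraktor itself passes its parameter file is not documented here, so the function names and the way options are passed are assumptions.

```python
import subprocess


def build_tesseract_command(image, output_base, language="eng",
                            options=None, exe="tesseract"):
    """Build the argument list for a Tesseract command line call."""
    command = [exe, str(image), str(output_base), "-l", language]
    for name, value in sorted((options or {}).items()):
        # Each parameter is passed as "-c name=value".
        command += ["-c", f"{name}={value}"]
    return command


def run_tesseract(command):
    """Run Tesseract; it writes its text output to <output_base>.txt."""
    return subprocess.run(command, capture_output=True, text=True, check=False)


print(build_tesseract_command("OCRoutput.png", "OCRResult",
                              options={"textord_heavy_nr": 1}))
```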


The results of the Tesseract OCR process are stored in the OCR Result File
(OCRResult.txt) which is located in the $PROJECTDIR/OCR folder.
The header information on the Tesseract OCR file is used to determine the maximum
accuracy of the OCR process. The accuracy percentage is reported in the log file and
on the cbrTekStraktor status bar.
See the “Developer Notes” section for detailed information on the OCR Folder and
files.

Fourth Step
The OCR Result file is read and parsed. The text of each paragraph is stored in the
Language File which is part of the Archive file.
The OCR'ed text is read from the Archive file and displayed on the following screen.

Note. V02 applied some changes to the OCR process.
• The tags which precede each paragraph have been reworked to contain a
repetitive numerical pattern. This enhances the ability of the OCR post-process to
link a paragraph tag to its contents when extracting the OCR'ed texts from the
OCRResult.txt file.
• The horizontal alignment on the Tesseract input image of the various lines within
a single text bubble has been improved.
• The output of the Tesseract OCR process is “scrubbed” prior to being loaded into
the cbrTekStraktor application. The scrubbing process is rather coarse and
removes all characters outside the range 0x00000020 to 0x000000ff.
• The contents of the Language File are overwritten after each OCR run.
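
The scrubbing rule in the note above can be expressed in a few lines of Python. This is a sketch of the stated rule only, applied to a single line of text; the actual cbrTekStraktor code is written in Java and may differ in detail.

```python
def scrub(text):
    """Keep only the characters with code points 0x20 through 0xFF."""
    return "".join(ch for ch in text if 0x20 <= ord(ch) <= 0xFF)


# The tab, the typographic apostrophe and the euro sign fall outside the range.
print(scrub("Caf\u00e9\t\u2019s \u20ac5"))
```

Note that control characters such as tabs and line breaks are below 0x20, so a rule like this has to be applied per line if the line structure is to be preserved.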

Bulk mode
The OCR process can run on entire sets of comic book pages. The bulk extraction will
process all images within a single folder, of which the text has previously been
extracted.
The Bulk OCR process is similar to the single page OCR process. It can be interrupted
by clicking on the “Stop” button on the marquee.
OCR Bulk mode initial step : folder selection

OCR Bulk mode : progress monitor


Tesseract Option File
Tesseract has loads of control parameter settings which can be used to modify its
behavior. A list of all parameters with default value and short description can be
retrieved by issuing the following command: tesseract --print-parameters
The cbrTekStraktor application lets you browse through the various Tesseract V4
parameters and set or unset them.
The Tesseract option dialog is accessed via “File > Properties > Tesseract options”.


Note. $$DEBUGFILE$$ is an internal cbrTekStraktor variable for the default name of
the file holding Tesseract Logging information; TesseractLog.txt which is located in the
$PROJECTDIR/OCR folder.
The options which have been activated to be used are displayed at the beginning of
the list and have the tick box “Withhold” set. The default settings are documented in
the following table.
Parameter                 Setting
debug_file                c:\temp\cbrTekStraktor\Tutorial\Ocr\TesseractLog.txt
paragraph_debug_level     1
tessedit_char_whitelist   ABCDEFGHIJKLMNOPQRSDTUVWXYZ012345789
textord_heavy_nr          1
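
For experiments outside cbrTekStraktor, the same defaults can be written as a plain Tesseract config file, which holds one "name value" pair per line. The sketch below renders such a file. cbrTekStraktor itself keeps its options in TesseractOptionRepository.xml, so this is an alternative illustration, not a description of that file. The values are copied verbatim from the table above.

```python
DEFAULT_SETTINGS = {
    "debug_file": r"c:\temp\cbrTekStraktor\Tutorial\Ocr\TesseractLog.txt",
    "paragraph_debug_level": "1",
    "tessedit_char_whitelist": "ABCDEFGHIJKLMNOPQRSDTUVWXYZ012345789",
    "textord_heavy_nr": "1",
}


def render_tesseract_config(settings):
    """Render settings as the text of a Tesseract config file."""
    return "".join(f"{name} {value}\n" for name, value in settings.items())


print(render_tesseract_config(DEFAULT_SETTINGS))
```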

Note. If you close the monitor dialog during a bulk run, you can re-open it via “Tools
> More > Monitor”.

Chapter 9

HowTo : Artificial Intelligence Visual Recognition
Context
The 2017 version of cbrTekStraktor often gives false positives when classifying the
aggregated objects into text or non-text paragraphs, i.e. too many areas of a comic
book page are wrongly identified as speech balloons. The below screenshot depicts
the results of extracting text from the cover page of “Space Detective N04” (see last
page of this manual). The root cause of this misclassification is possibly located in the
process step that groups characters into text paragraphs based on a basic proximity
rule.

A first attempt to remedy this issue is to “bolt on” an additional classification
process, which leverages the visual recognition capabilities of recently
commodified artificial intelligence (AI) software components. In the particular case of
cbrTekStraktor, Google TensorFlow's Inception V3 Image Classifier now forms an
additional and concluding text extraction step.
See https://www.tensorflow.org/tutorials/image_recognition for more information on
the Inception V3 AI VR component.
The AI VR integration is an optional module in cbrTekStraktor, i.e. text extraction can
be performed without this module.

Conceptual design
A classifier determines to which category an object belongs. A classifier can
for example be used to detect speech balloons and non-textual graphical
components on a comic book page. Contemporary classifiers rely on artificial
intelligence techniques and are now readily available.
cbrTekStraktor provides an integration with Google's Inception V3 image classifier.
The classification capabilities of Inception V3 improve automatically through experience
gathering. This is commonly referred to as supervised machine learning. After
having been shown a number of representative examples of speech balloons and
non-textual image objects from a comic book page, Inception is able to make a
correct distinction between those. According to Google: “Inception uses a deep
convolutional neural network (CNN) to achieve reasonable performance on visual
recognition tasks”.

Technical design
cbrTekStraktor provides the following supporting functionality for the Google image
classifier.
• A module to create a set (or sets) of standardized training images. A standardized
image is an RGB color image in JPEG format, which is either cropped or centered
to fit in a 300x300 frame. The images are extracted from previously processed
comic book pages.
• A script to retrain the visual recognition model.
• A component that integrates cbrTekStraktor, Python and TensorFlow. In essence,
cbrTekStraktor merely calls a Python script via the operating system shell. Note.
Future releases of cbrTekStraktor will directly call the TensorFlow Java library.
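
The "cropped or centered to fit in a 300x300 frame" rule for standardized images can be sketched as a small geometry calculation. The code below is an illustration in Python, not the cbrTekStraktor module. It assumes that oversized images are cropped around their center, which this manual does not specify.

```python
def standardize_box(width, height, frame=300):
    """Return (crop_box, paste_offset) for fitting an image into a square frame.

    crop_box is (left, top, right, bottom) in source image coordinates.
    paste_offset is the (x, y) position of the cropped part inside the frame.
    """
    crop_w, crop_h = min(width, frame), min(height, frame)
    # Assumption: crop around the center of the source image.
    left, top = (width - crop_w) // 2, (height - crop_h) // 2
    offset = ((frame - crop_w) // 2, (frame - crop_h) // 2)
    return (left, top, left + crop_w, top + crop_h), offset


print(standardize_box(500, 200))  # wider than the frame: cropped horizontally
print(standardize_box(100, 100))  # smaller than the frame: centered
```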

Installation of Python and TensorFlow components
The combination of Python 3.5 and TensorFlow versions 1.4 through 1.7 proved to
be a well-functioning basis for the additional cbrTekStraktor classification step.
The installation of the Python and TensorFlow components is a bit cumbersome. A
step-by-step installation on the Windows platform is documented in the appendix.
Hopefully, using the AI Visual Recognition add-ons in cbrTekStraktor is more
intuitive.
There are 2 configuration settings that affect the AI Visual Recognition component:
“Python folder” and “Maximum number of threads”. See the “How-To: Projects”
section.

Accessing the AI Visual Recognition components
The AI Visual Recognition (VR) add-ons are located on the following menu item: “File
> TensorFlow”.

• TensorFlow Setting. This item is currently not implemented.
• Make training set. This item is used to create a set of images to train the Visual
Recognition model.
• Extract single set. The purpose of this item is to extract a set of image paragraphs
from a single comic book page and use this set to manually validate the
correctness of the Visual Recognition model.
• Readjust text bubbles via TensorFlow. This item performs an additional classification
once cbrTekStraktor has completed its standard text extraction steps. The results of
the AI Visual Recognition process are used to re-adjust the previously obtained results.

Training, test and validation of the Inception model
Supervised machine learning is most often based on the following three steps.
• The model is initially fit on a training dataset that consists of a set of example
images.
• Subsequently, the fitted model is used to predict the responses for the observations
in a second dataset, which is called the validation dataset.
• Finally, the test dataset is used to provide an unbiased evaluation of the
final model fit on the training dataset.
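
The three datasets mentioned above are usually obtained by splitting one pool of labeled images. A minimal Python sketch of such a split follows. The 10% validation and 10% test shares and the fixed seed are assumptions made for the example; the retrain script used later performs its own split.

```python
import random


def split_dataset(items, validation_pct=10, test_pct=10, seed=0):
    """Shuffle items and split them into training, validation and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_validation = len(shuffled) * validation_pct // 100
    n_test = len(shuffled) * test_pct // 100
    return {
        "validation": shuffled[:n_validation],
        "test": shuffled[n_validation:n_validation + n_test],
        "training": shuffled[n_validation + n_test:],
    }


# Hypothetical image names, for illustration only.
images = [f"image_{i:03d}.jpg" for i in range(100)]
parts = split_dataset(images)
print({name: len(subset) for name, subset in parts.items()})
```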

cbrTekStraktor facilitates the creation of the training set of images, but still requires
that you manually (re)train the Inception V3 model.
First step: Creation of the training image set
Start by “File > TensorFlow > Make training set”.
cbrTekStraktor will then extract the training images from comic book pages of which
the text has already been extracted. A substantial number of images is required
to train the visual recognition model. You need to have processed at least 30
comic book pages beforehand. Use the bulk extraction to do this. You might
encounter the following warning message.


A monitor dialog box similar to the one below is displayed whilst the sample images
are being extracted from your set of comic book pages.

The following dialog box will be displayed when the sample images have been
created.

Second step: manual classification
The example images are to be found in the “$ROOTDIR\Corpus\Images” folder. You will
need to determine visually which image constitutes a genuine speech
balloon and which image does not. A working approach is to configure Windows
Explorer to display “large icons” and just drag & drop the speech balloons and
non-textual image objects into two separate folders.


It is imperative (oh dear!) that you put the speech balloon images in a folder called
“validbubble” and the other images in a folder called “invalidbubble”. The TensorFlow
model retrain script will use these folder names for the various categories of the
classifier. cbrTekStraktor also uses these folder names when interpreting the results
of Google's image recognition component.
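
Because the folder names double as category labels, it can help to create them with a script and to check how many images each one holds before retraining. A minimal Python sketch; the function names are made up for the example, and the check counts only files with a .jpg extension.

```python
from pathlib import Path

LABELS = ("validbubble", "invalidbubble")


def prepare_label_folders(image_dir):
    """Create one sub-folder per category and return their paths."""
    root = Path(image_dir)
    for label in LABELS:
        (root / label).mkdir(parents=True, exist_ok=True)
    return {label: root / label for label in LABELS}


def count_images(image_dir):
    """Count the JPEG images found in each category folder."""
    root = Path(image_dir)
    return {label: len(list((root / label).glob("*.jpg"))) for label in LABELS}
```

A category with very few images after manual sorting is a signal to process more comic book pages before retraining.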
Third step: retrain the Inception model
We will use a Python script originally provided by Google to retrain the VR model
using the set of training images created in the 2nd step.
The Python scripts and the Operating System command scripts are to be found on
GitHub in the AncillarySourceCode folder of the cbrTekStraktor project
(https://github.com/cbrtekstraktor/cbrTekStraktor/tree/master/src/AncillarySourceCode).
Note. The script names refer to version 1.4 of TensorFlow.
There are a few prerequisite steps to be performed.
• Create a working folder, e.g. C:\Temp\cbrTekStraktor\VR\Tutorial, henceforth
referred to as $VRDIR. It is advised to keep this working directory separate from
the $PROJDIR\TensorFlow folder. This will enable you to experiment with
retraining the VR model.
• Create the folders to store the manually classified images, e.g.
$VRDIR\comics\validbubble and $VRDIR\comics\invalidbubble.
• Copy the images that comprise speech balloons into the $VRDIR\comics\validbubble
folder; copy the non-textual images into the $VRDIR\comics\invalidbubble folder.
• Download the retrain Python script (retrain14.py) from GitHub and put it in
$VRDIR.
• Download the retrain cmd script (cbrTekStraktorRetrain14.cmd) from GitHub and
put it in $VRDIR.
• Modify the cbrTekStraktorRetrain batch script to match the installation folder of
Python and the working folder on your system.

REM cbrTekStraktor V04 retrain all command file
REM
SET PYTHON_HOME=C:\temp\devtools\Python35
PATH=%PYTHON_HOME%;%PYTHON_HOME%\Scripts;%PATH%
SET KDIR=C:\temp\cbrTekStraktor\tutorial\VR
SET KSTART=%KDIR%
SET KPROG=%KDIR%
SET KDATA=%KDIR%\comics
python %KPROG%\retrain14.py --bottleneck_dir=%KSTART%\bottlenecks --how_many_training_steps 500 --model_dir=%KSTART%\inception --output_graph=%KSTART%\comics_graph.pb --output_labels=%KSTART%\comics_labels.txt --image_dir=%KDATA%
pause

Verify all of the above. You are all set to run the cbrTekStraktorRetrain14 batch either
by double clicking on it or from the command shell.
C:\temp\cbrTekStraktor\Tutorial\VR>python C:\temp\cbrTekStraktor\tutorial\VR\retrain14.py --bottleneck_dir=C:\temp\cbrTekStraktor\tutorial\VR\bottlenecks --how_many_training_steps 500 --model_dir=C:\temp\cbrTekStraktor\tutorial\VR\inception --output_graph=C:\temp\cbrTekStraktor\tutorial\VR\comics_graph.pb --output_labels=C:\temp\cbrTekStraktor\tutorial\VR\comics_labels.txt --image_dir=C:\temp\cbrTekStraktor\tutorial\VR\comics
>> Downloading inception-2015-12-05.tgz 100.0%
Successfully downloaded inception-2015-12-05.tgz 88931400 bytes.
Looking for images in 'InvalidBubble'
Looking for images in 'ValidBubble'
Creating bottleneck at C:\temp\cbrTekStraktor\tutorial\VR\bottlenecks\InvalidBubble\PObj_0dd79ede_46629736087335.jpg.txt
<< etc >>
2018-05-27 12:33:49.765481: Step 490: Train accuracy = 98.0%
2018-05-27 12:33:49.765481: Step 490: Cross entropy = 0.047499
2018-05-27 12:33:50.030682: Step 490: Validation accuracy = 100.0% (N=100)
2018-05-27 12:33:52.355086: Step 499: Train accuracy = 100.0%
2018-05-27 12:33:52.355086: Step 499: Cross entropy = 0.029145
2018-05-27 12:33:52.604686: Step 499: Validation accuracy = 100.0% (N=100)
Final test accuracy = 100.0% (N=70)
Converted 2 variables to const ops.

The first time you run the cbrTekStraktorRetrain command, TensorFlow will download
a readily available trained Inception model. It will take a couple of minutes before the
retraining is completed. This will result in a retrained model file “comics_graph.pb” and
a label file “comics_labels.txt”. $VRDIR should have a structure similar to the below.


Fourth step: verification


• Download the file “test14.py” from GitHub and put it in $VRDIR.
• Download the file “cbrTekStraktorTest14.cmd” from GitHub and put it in
$VRDIR.
• Adapt the cmd file to reflect where you installed Python and the working folder.
• Run the cbrTekStraktorTest14 batch either by double clicking on it or from the
command shell. Look for the scores for “validbubble” and “invalidbubble” and use
those to assess the accuracy of the image classification model which you have just
retrained.

Fifth step: deploy the model in cbrTekStraktor
Copy the “comics_graph.pb” and the label file “comics_labels.txt” from $VRDIR to
$PROGDIR\TensorFlow.


Running the VR post-process
Start the process via “File > TensorFlow > Readjust text bubbles via TensorFlow”. You will first
need to select the comic book page to be re-classified.
Subsequently the dialog below will pop up, enabling you to monitor the re-classification progress.

The re-classification process can be throttled by defining the number of parallel
threads. You will need to find a good balance: too many threads might put too
demanding a workload on your computer. See the section on cbrTekStraktor projects to
learn how this parameter can be set.
The result of the re-classification of the cover page of “Space Detective N04” can be
observed in the picture below: only the area comprising “10c N0.4” is maintained as a
text paragraph.

Chapter A

HowTo: Report
An overview report in HTML format is created when you click the “Report” button on the
main screen. The report is displayed in Mozilla or in whichever other browser you
have specified as the reporting client.

The Report functionality is rather sparse and will be enhanced in future releases.

Chapter B

HowTo: Miscellaneous items
Archive browser
The archive browser utility “Tools > More > Archive Browser” lets you examine the
contents of a cbrTekStraktor archive file.

The “Action” item in the dialog lets you:
- Explore the archive file, which gives an overview of all files in the archive
- Examine the objects extracted, which opens the STAT XML file in a web browser
- Examine the text extracted, which opens the Language file (extracted text and
translated text) in a web browser
- Zap an archive, i.e. delete an archive file


Exporting all extracted text
The option “File > Export > Export Text” lets you export all textual information to a
single file. The file comprises the results of the OCR process, the modifications to the
OCR'ed text and the translated texts. Exported text can be used to quickly modify or
translate texts.
First step
Select the folder containing the comic book pages from which the texts should be
exported.

Second step
The text extraction process runs. The batch monitor screen is displayed. The monitor
screen will close automatically a few seconds after the export is completed.


Third step
A dialog box opens in which you can provide the location and
name of the export file.

Example export





_____________________________________________________________





$30253054928566: THAT MUST pF A PICTURE THEY ARC REHEARING
$30253054928567: The count de Cay got me to buy a studio for $100000
$30253054928568: whatstrig?
$30253054928569: OH" IM JUST CRAZY To se A MOVE Stam
$30253054928571: Lovely
$30253054928583: HE Cemtaminr DID THAT WELL
$30253054928584: SHE THE STAQ? WHAT 15 THE NaME Of THIS PAN >
$30253054928585: THAT'S NO PLAy: THAT'S THE SHEmirP. Th?? STUDIO 1??
ATTACHED FOR Gaamics
$30253054928595: cises
$30253054928596: her'. Fextune Service

$30253054928567: Le comte de Cay m'a convaincu d’achèter une studio de
cinéma pour $100000
$30253054928569: Oh je suis folle. Je veux être une star du cinéma.
$30253054928571: Parfait.




The texts are grouped per image file, per language and per paragraph. Each
paragraph has a unique UID, e.g. 30253054928571. UIDs are enclosed between a
dollar sign and a colon. You can quickly assess the correctness of the OCR'd text
and modify it where deemed appropriate. Alternatively, you can quickly enter translated
text and prepare it for reloading. When entering or modifying text, make sure not to
change the structure of the file, i.e. leave a space between the UID's terminating colon and the text.
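
The line structure described above can be checked with a few lines of Python before re-importing a file. This is a minimal sketch, not the application's own parser: the pattern is inferred from the example export only, and treating a line without a UID as a continuation of the previous text is an assumption.

```python
import re

# Assumed line shape, taken from the example export: "$<digits>: <text>"
UID_LINE = re.compile(r"^\$(\d+): ?(.*)$")

def parse_export(lines):
    """Return a list of (uid, text) pairs in file order."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = UID_LINE.match(line)
        if match:
            records.append([match.group(1), match.group(2)])
        elif line.strip() and records:
            # Assumption: a non-empty line without a UID continues the previous text.
            records[-1][1] += " " + line.strip()
    return [(uid, text) for uid, text in records]

sample = [
    "$30253054928571: Lovely",
    "$30253054928585: THAT'S NO PLAy: THAT'S THE SHEmirP. Th?? STUDIO 1??",
    "ATTACHED FOR Gaamics",
]
records = parse_export(sample)
```

A list is returned rather than a dictionary because the same UID occurs once per language in an export file.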

Importing text
This option, which is to be found under “File > Import > Import text”, lets you load
text from a text source file into the cbrTekStraktor archive file. Its purpose is to re-import modified or translated texts into the cbrTekStraktor application.
The structure of the source file must adhere to the format of the text export file (see
previous section).
First step
Select the import file from which you want to import data.

Second step
The data from the file will be imported. A progress monitor screen will be shown.


Round tripping
Round tripping is a quick way of modifying and translating text. Just perform the
following steps.
- Export the text for a given folder.
- Modify, translate or spell-check the exported data in a text editor.
- Import the modified data.

Logging information
There are two log files to be found in the $PROJECTDIR folder.
- cbrTekStraktorErrFile.txt, which comprises the errors
- cbrTekStraktorLogFile.txt, which comprises the log lines created by the
application.

[TODO – Comments on details of the logging files in cbrTekStraktor will be provided
in future versions of this manual]

Chapter C

Developer notes
This section contains information that might be useful for developers.

Folder structure of the application
cbrTekStraktor relies on a predefined and static folder structure.
$PROJECTDIR folder
The root folder is the Project Directory ($PROJECTDIR).
By creating several root folders, the application is able to create multiple, separate
projects.
The root of the folder structure can be specified on the command line when starting
the application (option -D). If the root folder is not specified as a command line
parameter, it defaults to “c:\temp\cbrTekStraktor” or $HOME/cbrTekStraktor.
The $PROJECTDIR must adhere to the following structure.

$PROJECTDIR folders
Folder | Description
Cache | This is a mandatory directory. It is used to cache files temporarily, for example when in Edit mode files are cached in this folder.
Corpus | This is a mandatory directory. It comprises statistical information gathered by the application, e.g. timing information.
Output | Required directory. Additional information is to be found in the next section.
Temp | This is a required directory. It is used to store temporary files, for example when the Image screen Info dialog is opened, the boxdiagram.png of the RGB box plot diagram is stored in this directory.
OCR | Required directory. It is used to store the Tesseract configuration, debug and result files, as well as a PNG image of the text to be OCR'ed.
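
As an illustration of the required folder structure, the following Python sketch reports which of the folders listed above are missing from a project directory. It is a hypothetical helper, not part of cbrTekStraktor; the exact letter case of the folder names on disk is an assumption, and the demonstration runs against a temporary directory.

```python
import tempfile
from pathlib import Path

# Folder names as listed in the table above; exact letter case on disk is an assumption.
REQUIRED_FOLDERS = ["Cache", "Corpus", "Output", "Temp", "OCR"]

def missing_folders(project_dir):
    """Return the required sub-folders that do not exist under project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FOLDERS if not (root / name).is_dir()]

# Demonstration on a throw-away directory that only holds three of the five folders.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["Cache", "Corpus", "Output"]:
        (Path(tmp) / name).mkdir()
    missing = missing_folders(tmp)
```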

$PROJECTDIR files
File | Description
cbrTekStraktor.xml | This is a required configuration file.
properties.txt | This file comprises the GUI properties of the application, for example the width and height of the main canvas. It is created and maintained by the application.
cbrTekStraktorLogFile.txt | File with logging information
cbrTekStraktorErrFile.txt | File with error information

Output folder
These are the folders which are to be found in $PROJECTDIR/Output

Folder | Description
Archive | This is a required folder in which the cbrTekStraktor Archives are stored. An archive is a ZIP file comprising reports, statistical, pictorial and textual information, all of which have been generated by the application. See the next section for an overview of the content of an archive file.
HTML | This is a required folder in which the HTML reports are stored. It also comprises the CSS (Cascading Style Sheets) file (cbrTekStraktorCSS.txt). If the CSS file is missing, a new one will automatically be created by the application. There is a single HTML file for each file (.html). When the HTML report file is no longer required it is automatically transferred from the HTML folder into the applicable Archive file. Conversely, HTML reports are extracted from the Archive files when a report is requested.
Images | This is a required folder for temporarily storing images created by the application. In general temporary files will have a name preceded by 'z', e.g. zPeakDiag_wonderwoman.png is the RGB Peak histogram image.
Stats | This is a required folder for storing the statistical information on each image file in XML format in a file named .xml. Detailed information on the statistical elements maintained in this file is available at the end of this appendix.

Note. A housekeeping routine is automatically performed on a frequent basis. This
routine will remove obsolete files in the $PROJECTDIR folders. The housekeeping
routine can also manually be started from the “Tools>Housekeeping” menu.
OCR Folder
These files might be present in the $PROJECTDIR/OCR folder

File | Description
TesseractOptionRepository.xml | This file will only be present if you open the “properties>Tesseract Option” menu. The file comprises all options for Apache Tesseract 4.0.
TesseractLog.txt | This is the debug file which the Tesseract client creates during the OCR process.
TesseractConfig.txt | This is the Tesseract parameter file which is created by cbrTekStraktor before the Tesseract client is called. It comprises the Tesseract options that have been defined in “Properties>Tesseract Options”.
OCRTextResult.txt | This is the result file created by Tesseract.
OCROutput.png | This is the image that is created by extracting the characters from the comic book image.

Corpus folder
Filename | Comment
AllStat.txt | This file collates various metrics and statistical information on the comic book pages in the project. These statistics are created when pressing the “Statistics” button on the “Tools” menu.
TimingAccuracyStats.txt | Statistical information which is used to monitor the accuracy of the execution time predictive analysis. This module uses Euclidean distances.
TimingInputStats.txt | Timing information on the various processes, gathered by the application whilst executing these processes.


File structures of the application
Project File
A reference to the project that was last accessed is stored in the configuration file
“cbrTekStraktorProjectConfig.txt”.

This file is located one folder above the current $PROJECTDIR folder. In most cases this
will be C:\temp\cbrTekStraktor or, when using Linux, $HOME/cbrTekStraktor.
The content of the Project Properties File is rather sparse and might look like this:
=================================================
cbrTekStraktor V0.1 (07-May-2017)
Started=07-may-2017 13:37:30
Stopped=07-may-2017 13:38:36
=================================================
EntryFolder=c:\temp\cbrTekStraktor
RecentProject=Tutorial

cbrTekStraktor configuration file
“cbrTekStraktor.xml” is the main configuration file and is located in the $PROJECTDIR
folder.
Example configuration file




20180502102508
20180527093118
Tutorial
Created by cbrTekStraktor V0.3 (1-May-2018)
dd-MMM-yy HH:mm:ss


French
9
500
5
Verdana
12
ISO_8859_1
BLEACHED
C:\temp\devtools\Tesseract\Tesseract_4_64Bit
C:\Temp\devtools\Python35
6
MOZILLA


Tag | Comment
Browser | Defines which web browser is to be used by the reporting subcomponent. Supported values are {CHROME,MOZILLA,EXPLORER}
Created | The time the cbrTekStraktor project was created
Dateformat | Specifies the display format of date and time information.
Description | A description of the project
Encoding | {UTF8,UTF16,LATIN1, ASCI}
HorizontalVerticalVarianceThreshold | A threshold value that is used to separate connected components that are characters from non-textual graphical objects. This value should not be modified.
Logginglevel | A number ranging from 0 to 9 to define the level of detail of the logging information. Setting the level to 9 will provide the most detailed logging information.
MeanCharacterCount | This is a threshold value which is used to identify the cluster with textual information.
Name | The name of the cbrTekStraktor project
PreferredFont | The name of the font to be used in the majority of the application's screens and dialogs.
PreferredFontSize | The size of the preferred font.
TesseractFolder | This is the folder where the Tesseract OCR application is installed.
Updated | The time the cbrTekStraktor project was last updated
PythonFolder | Installation folder of Python
MaximumNumberOfThreads | The maximum number of threads to be used by the TensorFlow integration component.

Content of the archive file
The Archive files are to be found in the $PROJECTDIR/Output/Archive folder. An archive
file name always ends with “_set.zip”.
An archive is a ZIP file comprising reports, statistical, pictorial and textual information
generated by the application.
The archive file contains the following files.
Filename | Comment
Binarized_Out.png | This is the monochrome version of the comic book image file. This PNG file is used by the OCR component.
{CMXUID}.html | This is the HTML report file.
{CMXUID}_lang.xml | XML file containing the extracted and translated textual information
{CMXUID}_Lang_Ver_{YYMMDDHHMISS}.xml | All versions of the Lang.XML file are maintained in the archive. Versions are timestamped.
{CMXUID}_stat.xml | XML file comprising the statistical and graphical information of the comic book image file.
{CMXUID}_Stat_Ver_{YYMMDDHHMISS}.xml | Previous version of the Stat.XML file
{CMXUID}{00.NNN}.png | These are the cut-out images of the text and non-text paragraphs. These images are used by the reporting component.
zBoxBiagr_{CMXUID}.png | Box Plot diagram of the RGB histogram. This image is used by the reporting component.
zCharacts_{CMXUID}.png | This is the image comprising the cut-out paragraphs. It is created and displayed at the end of the text extraction process and stored in the archive for further usage by the reporting component.
zClusters_{CMXUID}.png | This is an image that comprises an overview of the clusters identified during the text extraction process.
zColrHist_{CMXUID}.png | An image of the RGB histogram used by the reporting component.
zGrayHist_{CMXUID}.png | An image of the grayscale histogram used by the reporting component.
zMetaData_{CMXUID}.xml | This is the Comic Book Metadata XML.
zPeakDiag_{CMXUID}.png | An image of the Peak Detection histogram.
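
Because an archive is an ordinary ZIP file, its content can be inspected outside the application with the Python standard library. The sketch below is illustrative only: the member names in the demonstration are hypothetical placeholders that merely follow the naming pattern of the table above, and the archive is built in memory instead of being read from the Archive folder.

```python
import io
import zipfile

def list_archive(archive):
    """Return the sorted member names of a cbrTekStraktor archive (a ZIP file)."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(zf.namelist())

# Demo on an in-memory ZIP with hypothetical member names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("ABCD_stat.xml", "<placeholder/>")
    zf.writestr("ABCD_lang.xml", "<placeholder/>")
members = list_archive(buffer)
```

For a real archive, pass the path of a “_set.zip” file to list_archive instead of the in-memory buffer.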

cbrTekstraktor STAT XML file
The STAT XML file comprises the majority of the results of the image and text
processing activities on an Image file.
Detailed information on this file is provided in a separate section at the end of this
appendix.

Language file
The language file contains the result of the OCR and translation processes.







20170501095026
20170501112243


French
Afrikaans
Albanian
.. etc ..



comic01.jpg
C:\temp\cmcProc\test
288775


comic01
6C58-E110-A319-547F-F401-5341-3623-2EC9




11

100
8870019149256
text
false
20170501131957














Tag | Comment
TextBundleChangeDate | The time the content of the paragraph was created or updated.
TextBundleIdx | This is the sequence number of the paragraph.
TextBundleRemoved | Possible values are {True,False}
TextBundleUID | This is the Unique Identifier of the paragraph
TextConfidence | Possible values are {Text,Nontext}
TextFrom | This field contains the text in its original language
TextOCR | This is the result of the OCR operation on the paragraph
TranslatedText_{language} | This field contains the translated text.

zFiles
The following zFiles are present in the $PROJECTDIR\temp folder when the extraction
and edit processes are active. The files are subsequently stored in the Archive file.
Image File | Description
zBoxDiagr | An image of the RGB frequency distribution's quartile box diagrams
zCharacters | An image file in which the character clusters are visualized
zClusters | An image file in which the clusters are visualized
zColHist | Color histogram image
zGrayHist | Grayscale histogram image
zPeakHist | Peak histogram, which is used to determine whether an image file comprises a black and white, grayscale or color image

zMetadata File
The characteristics of a single comic page are stored in the file
$PROJECTDIR\Output\Archive\zMetadata_.xml.
This is an example of a zMetaData _.xml file.




C:\temp\cmcProc\test
superman_01.jpg
6C58-E110-A319-547F-F401-5341-3623-2EC9
superman01


0

1



FRENCH
colour
TIGHT
SLOW_SAUVOLA
AUTOMATIC
USE_IMAGE_DPI
CHROME
CROP_IMAGE


Tesseract command file


The following command will be generated to run Tesseract. This is a temporary file
which is removed by the application once Tesseract has completed the OCR.
C:\temp\Tesseract-OCR-4\Tesseract-OCR>tesseract
c:\temp\cbrTekStraktor\Tutorial\Ocr\OCROutput.png
c:\temp\cbrTekStraktor\Tutorial\Ocr\OCRTextResult -l eng
c:\temp\cbrTekStraktor\Tutorial\Ocr\TesseractConfig.txt
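
The same command line can be assembled programmatically. The sketch below only mirrors the argument order of the example above (image, output base name, language, configuration file); the function name is hypothetical and this is not the code cbrTekStraktor itself uses.

```python
def build_tesseract_command(tesseract_exe, image_png, output_base, language, config_file):
    # Argument order follows the example command line above:
    # tesseract <image> <output base> -l <language> <config file>
    return [tesseract_exe, image_png, output_base, "-l", language, config_file]

ocr_dir = r"c:\temp\cbrTekStraktor\Tutorial\Ocr"
command = build_tesseract_command(
    "tesseract",
    ocr_dir + r"\OCROutput.png",
    ocr_dir + r"\OCRTextResult",
    "eng",
    ocr_dir + r"\TesseractConfig.txt",
)
# To actually run it (requires Tesseract to be installed):
# import subprocess; subprocess.run(command, check=True)
```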

Example TesseractOptionRepository file





falseallow_blob_division1Use divisible blobs
chopping


File abbreviated

Example OCROutput.PNG file


Example TesseractConfig file
debug_file c:\temp\cbrTekStraktor\Tutorial\Ocr\TesseractLog.txt
paragraph_debug_level 1
tessedit_char_whitelist ABCDEFGHIJKLMNOPQRSDTUVWXYZ012345789
textord_heavy_nr 1

Example OCRResultFile file
[p:bringingupfather08-08] [d:150] [u:5A3A-3078-1D92-55B3-A709-1F58-C3AA-7E84]
P22826615692200
MRJIGGS THIS 1S Miss Peacy~® JUST ENgAgep FOR OUR MELODRaMA
P22826615692201
THAT 1S MISS TAKE- SHE IS TO TAKE A STAR® PART IN THE COMEDy
P22826615692203
VERyGOOD
P22826615692205
SHE LOOKS LIKE A COMEDY
P22826615692208
ELL WHAT'S TO BE DONE IN THE STUDIO TODAY *
P22826615692210
we CAn Only - POY on ome Tooay | ETHER THE mELopRam - OR THE
COMEDY. | was - THiNKNq
P22826615692211
NEVER MIND | Thinkin' weir PUY On THE MELODRAmA
P22826615692218
Jo -g reatume. sarvice. 1
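
The result file alternates a paragraph marker line (“P” followed by the paragraph UID) with the OCR text of that paragraph. A minimal Python sketch of reading this layout is given below; it is inferred from the example alone and is not the application's own reader.

```python
import re

# A marker line consists of "P" followed by digits only (inferred from the example).
PARAGRAPH_ID = re.compile(r"^P(\d+)$")

def parse_ocr_result(lines):
    """Map each paragraph UID to the OCR text that follows its marker line."""
    paragraphs = {}
    current = None
    for raw in lines:
        line = raw.strip()
        match = PARAGRAPH_ID.match(line)
        if match:
            current = match.group(1)
            paragraphs[current] = ""
        elif current is not None and line:
            # Text lines are appended to the most recent paragraph marker.
            paragraphs[current] = (paragraphs[current] + " " + line).strip()
    return paragraphs

sample = [
    "[p:bringingupfather08-08] [d:150] [u:5A3A-3078-1D92-55B3-A709-1F58-C3AA-7E84]",
    "P22826615692203",
    "VERyGOOD",
    "P22826615692205",
    "SHE LOOKS LIKE A COMEDY",
]
paragraphs = parse_ocr_result(sample)
```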

Example TesseractLog file
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 8 [p:bringingupfather08-08][236SEl] [u:5A3A-3078-1D92-55B3-A709-1F58-C3AA-7E84][490sEl] [ 1, 0; 0, 0] S:1 [p:bringingupfather08-08] [d:150] [u:5A3A-3078-1D92-55B3-A709-1F58-C3AA-7E84]
1 12 P22826615692200[180Sel] P22826615692200[180sel] [ 1, 1;737, 0] C:1 P22826615692200
2 13 MRJIGGS[108Sel] MELODRaMA[147sel] [ 1, -1; 68, 0] S:1 MRJIGGS THIS 1S Miss Peacy~® JUST ENgAgep FOR OUR MELODRaMA
3 12 P22826615692201[177Sel] P22826615692201[177sel] [ 1, 2;739, 0] C:1 P22826615692201
4 13 THAT[62Sel] COMEDy[111sel] [ 1, 0; 0, 0] S:1 THAT 1S MISS TAKE- SHE IS TO TAKE A STAR® PART IN THE COMEDy
Active Paragraph Models:
1: margin: 1, first_indent: 0, body_indent: 0, alignment: LEFT
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 7 P22826615692203[180Sel] P22826615692203[180sel] [ 1, 1; 1, 1] U:0 P22826615692203
1 7 VERyGOOD[142Sel] VERyGOOD[142sel] [ 1, -1; 39, 1] U:0 VERyGOOD
Active Paragraph Models:
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 7 P22826615692205[180Sel] P22826615692205[180sel] [ 1, 1;136, 1] U:0 P22826615692205
1 8 SHE[44Sel] COMEDY[98sel] [ 1, -1; 1, 1] U:0 SHE LOOKS LIKE A COMEDY
Active Paragraph Models:
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 11 P22826615692208[180Sel] P22826615692208[180sel] [ 1, 1;453, 1] U:0 P22826615692208
1 11 ELL[47Sel] *[8SEL] [ 1, -1; 1, 1] U:0 ELL WHAT'S TO BE DONE IN THE STUDIO TODAY *
Active Paragraph Models:
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 10 P22826615692210[180Sel] P22826615692210[180sel] [ 1, 0;1017, 1] U:0 P22826615692210
1 11 we[39sel] THiNKNq[111sel] [ 1, -1; 1, 1] U:0 we CAn Only - POY on ome Tooay | ETHER THE mELopRam OR THE COMEDY. | was - THiNKNq
Active Paragraph Models:
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 7 P22826615692211[177Sel] P22826615692211[177sel] [ 1, 1;585, 1] U:0 P22826615692211
1 11 NEVER[84Sel] MELODRAmA[168sel] [ 1, -1; 1, 1] U:0 NEVER MIND | Thinkin' weir PUY On THE MELODRAmA
Active Paragraph Models:
# Final Paragraph Segmentation
#row space .. lword[widthSEL] rword[widthSEL] [lmarg,lind;rind,rmarg] model text
0 7 P22826615692218[180Sel] P22826615692218[180sel] [ 1, 1; 34, 1] U:0 P22826615692218
1 10 Jo[25Sel] 1[5SeL] [ 1, -1; -1, 1] U:0 Jo -g reatume. sarvice. 1
Active Paragraph Models:

AllStat File
The AllStat.txt file is to be found in $PROJECTDIR/Corpus/Stats. The file is
generated by selecting “Tools > More > Statistics”. Its purpose is to provide key
statistical information (metrics) on each comic book page processed. These metrics
can be used in further analysis or for various other prediction purposes.
The file is a comma-delimited ASCII text file.
Column | Comment
UID | The UID of the image processed
UncroppedWidth | The uncropped width in pixels
UncroppedHeigth | The uncropped height in pixels
PayloadWidth | The payload width
PayloadHeigth | The payload height
NbrOfElementsInLetterCluster | The number of elements found in the paragraph classified to be the text paragraph.
NbrOfParagraphs | The number of paragraphs
NbrOFLetterParagraphs | The number of paragraphs which have been identified to comprise text
NbrOfLetters | The total number of characters
Colour | {TRUE,FALSE} Set to TRUE by the color schema detection algorithm if a full color scheme is detected.
MonochromeDetected | {TRUE,FALSE} Set to TRUE if the color scheme detection logic found a monochrome schema
MonochromeDetectionStatus | {TRUE,FALSE}-{POSITIVE,NEGATIVE}. True-positive: correctly identified monochrome schema; False-positive: incorrectly identified monochrome schema; True-negative: correctly identified color schema; False-negative: incorrectly identified color schema
NbrPeak | The number of peaks found by the color schema detection logic
NbrValidPreak | The number of valid peaks
%PeakCoverage | Coverage of pixels within valid peaks

Example
UID,UncroppedWidth,UncroppedHeigth,PayloadWidth,PayloadHeigth,NbrOfElementsInLetterCluster,NbrOfParagraphs,NbrOFLetterParagraphs,NbrOfLetters,Colour,MonochromeDetected,MonochromeDetectionStatus,NbrPeak,NbrValidPreak,%PeakCoverage
e7f2-2af1-d553-b70e-8cb8-b350-ebc3bd42,975,1465,877,1355,858,25,15,826,unknown,false,false-negative,8,0,46%,
ce6d-3395-7b87-e18e-b53f-2f55-112fd37e,1200,1857,1093,1656,597,36,10,485,unknown,false,false-negative,9,0,21%,
795b-7aa8-39ea-7ee8-f1d4-3e10-06db0717,1024,1105,998,1012,321,31,10,250,unknown,true,false-positive,39,0,87%,
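
For further analysis, the file can be read with any CSV reader. The Python sketch below reads the header and one record from the example above and derives two simple figures; the derived names text_share and coverage are illustrative and do not occur in the file.

```python
import csv
import io

sample = (
    "UID,UncroppedWidth,UncroppedHeigth,PayloadWidth,PayloadHeigth,"
    "NbrOfElementsInLetterCluster,NbrOfParagraphs,NbrOFLetterParagraphs,NbrOfLetters,"
    "Colour,MonochromeDetected,MonochromeDetectionStatus,NbrPeak,NbrValidPreak,%PeakCoverage\n"
    "795b-7aa8-39ea-7ee8-f1d4-3e10-06db0717,1024,1105,998,1012,321,31,10,250,"
    "unknown,true,false-positive,39,0,87%,\n"
)

# The trailing comma of each record yields one extra empty field, which
# DictReader stores under the key None; it can safely be ignored.
rows = list(csv.DictReader(io.StringIO(sample)))
row = rows[0]

# Share of paragraphs that were identified as text paragraphs.
text_share = int(row["NbrOFLetterParagraphs"]) / int(row["NbrOfParagraphs"])
# Peak coverage as a fraction instead of a percentage string.
coverage = int(row["%PeakCoverage"].rstrip("%")) / 100
```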

TimingInputStat file
The TimingInputStat.txt file is to be found in $PROJECTDIR/Corpus/Stats. The file
stores the elapsed time in nanoseconds of various text extraction and image loading
processes.
The timing information is gathered during the image loading and text extraction
activities.
It is the prime source for estimating the duration of image loading and text extraction
activities, i.e. the lead time of a process is calculated using Euclidean distance and is
used by the progress indicator on the main screen.
It is a vertical-pipe-delimited ASCII file.

Column | Comment
UID | UID of the image
Width | Uncropped width of the image (in pixels)
Heigth | Uncropped height of the image (in pixels)
FileSize | The file size of the image (in bytes)
ColourScheme | {COLOR,GRAYSCALE,MONOCHROME}
FileType | {PNG,JPG,GIF,JPEG}
BinarizationType | This is the binarization mechanism used, e.g. OTSU, SAUVOLA, etc.
ConnectedComponents | The overall number of connected components
Paragraphs | The number of paragraphs
BWDensity | Black and white density of the entire picture
ImageLoadTime | The time required to load the image (in nanoseconds)
PageLoadTime | The time required to load the comic book page (the page is an object that envelops the image) in nanoseconds.
Preprocess | The duration of the pre-process step (in nanoseconds)
BinarizeTime | The duration of the binarization process (in nanoseconds)
CoCoTime | The duration of the connected components process (in nanoseconds)
LetterTime | The duration of the process that identifies and expands characters (in nanoseconds). See the section where the text extraction process is explained.
ParagraphTime | The duration for creating and processing the contents of paragraphs
OverheadTime | This is the duration of the end-to-end text extraction process minus all of the above durations.
Timestamp | The time the record was written to the file

Example
UID|Width|Heigth|FileSize|ColourScheme|FileType|ConnectedComponents|Paragraphs|BWDensity|ImageLoadTime|PageLoadTime|Preprocess|BinarizeTime|CoCoTime|LetterTime|ParagraphTime|OverheadTime
629E-BF04-2D3B-5C27-29BB-C6BC-FBFC-E7F2-2AF1-D553-B70E-8CB8-B350-EBC3BD42|975|1465|256874|COLOR|JPG|SLOW_SAUVOLA|10183|25|0.3010666072368622|245314798|61627908|199194031|4462646481|246614265|904632564|834179973|925567697|03-JUN-2017 08:27:33
CE6D-3395-7B87-E18E-B53F-2F55-112FD37E|1200|1857|634270|COLOR|JPG|SLOW_SAUVOLA|31328|36|0.38288450241088867|310097424|83530141|245702033|7934692773|346997986|1079128458|1232142512|1286570859|03-JUN-2017 08:27:46
795B-7AA8-39EA-7EE8-F1D4-3E10-06DB0717|1024|1105|354385|GRAYSCALE|JPG|SLOW_SAUVOLA|3518|31|0.18374991416931152|186662399|64333023|179845335|3157337579|63852631|549405945|691859669|989257304|03-JUN-2017 08:27:52
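
How such records could feed a duration estimate based on Euclidean distance is sketched below in Python. This is an assumption-laden illustration, not the application's algorithm: the manual only states that Euclidean distance is used, so the nearest-neighbour scheme, the choice of features (width, height, file size) and the function names are all hypothetical. The column order is taken from the column table, which lists 19 fields including BinarizationType and Timestamp.

```python
import math

# Column order taken from the column table above (19 fields per record).
COLUMNS = ["UID", "Width", "Heigth", "FileSize", "ColourScheme", "FileType",
           "BinarizationType", "ConnectedComponents", "Paragraphs", "BWDensity",
           "ImageLoadTime", "PageLoadTime", "Preprocess", "BinarizeTime", "CoCoTime",
           "LetterTime", "ParagraphTime", "OverheadTime", "Timestamp"]

def parse_record(line):
    """Split one pipe-delimited record into a dictionary keyed by column name."""
    return dict(zip(COLUMNS, line.strip().split("|")))

def estimate(history, width, height, file_size, column="BinarizeTime"):
    """Return the value of `column` from the record nearest to the new image."""
    def distance(rec):
        # Unscaled features are an assumption; file size dominates the distance.
        return math.dist(
            (float(rec["Width"]), float(rec["Heigth"]), float(rec["FileSize"])),
            (width, height, file_size),
        )
    return int(min(history, key=distance)[column])

history = [parse_record(line) for line in [
    "CE6D-3395-7B87-E18E-B53F-2F55-112FD37E|1200|1857|634270|COLOR|JPG|SLOW_SAUVOLA|31328|36|0.38288450241088867|310097424|83530141|245702033|7934692773|346997986|1079128458|1232142512|1286570859|03-JUN-2017 08:27:46",
    "795B-7AA8-39EA-7EE8-F1D4-3E10-06DB0717|1024|1105|354385|GRAYSCALE|JPG|SLOW_SAUVOLA|3518|31|0.18374991416931152|186662399|64333023|179845335|3157337579|63852631|549405945|691859669|989257304|03-JUN-2017 08:27:52",
]]
predicted_ns = estimate(history, width=1000, height=1100, file_size=350000)
```

In practice the features would probably need scaling so that the file size does not outweigh the pixel dimensions.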

TimingAccuracyStat
The TimingAccuracyStat.txt file is to be found in $PROJECTDIR/Corpus/Stats. The file
stores the difference between the calculated and actual process times of various
image loading and text processing activities.
The timing information is gathered during the image loading and text extraction
activities.
Its major objective is to report on the accuracy of the process time prediction
algorithm.
Column | Comment
Timestamp | The time the record was created
EstimatedImage | Estimated time to load the image
ActualImage | Actual time to load the image
Ratio | Correctness or accuracy ratio: (estimated - actual) / actual
EstimatedBeforePreprocess | Estimated before preprocess
ActualbeforPreprocess | Actual before preprocess
Ratio | Correctness ratio
EstimatedBeforeBinarize | Estimated before binarize step
ActualBeforeBinarize | Actual before binarize step
Ratio | Correctness ratio
EstimatedBeforeCoCo | Estimated before connected components
ActualBeforeCoCo | Actual before connected components
Ratio | Accuracy ratio

Example
TimeStamp (EstimatedImage,ActualImage,Ratio) (EstimatedBeforePreprocess,ActualbeforPreprocess,Ratio) (EstimatedBeforeBinarize,ActualBeforeBinarize,Ratio) (EstimatedBeforeCoCo,ActualBeforeCoCo,Ratio)
03-JUN-2017 13:05:45 ( 250ms, 283ms, -11%) ( 5886ms, 14896ms, -60%) ( 5882ms, 14896ms, -60%) ( 5882ms, 14896ms, -60%)
03-JUN-2017 13:12:00 ( 306ms, 408ms, -25%) ( 7879ms, 8606ms, -8%) ( 7879ms, 8606ms, -8%) ( 7879ms, 8606ms, -8%)
03-JUN-2017 13:12:13 ( 393ms, 301ms, 30%) ( 12518ms, 12436ms, 0%) ( 12518ms, 12436ms, 0%) ( 12518ms, 12436ms, 0%)
03-JUN-2017 13:12:19 ( 283ms, 151ms, 87%) ( 5882ms, 5926ms, 0%) ( 5882ms, 5926ms, 0%) ( 5882ms, 5926ms, 0%)
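
The Ratio columns follow from the definition (estimated - actual) / actual. The short Python sketch below recomputes them; truncating the percentage toward zero is an assumption, chosen because it reproduces the values in the example (for instance -11% for 250 ms against 283 ms).

```python
def accuracy_ratio(estimated_ms, actual_ms):
    """(estimated - actual) / actual, as defined for the Ratio columns."""
    return (estimated_ms - actual_ms) / actual_ms

def as_percent(ratio):
    # Truncation toward zero is an assumption; it matches the example records.
    return f"{int(ratio * 100)}%"

image_ratio = as_percent(accuracy_ratio(250, 283))
binarize_ratio = as_percent(accuracy_ratio(5882, 14896))
```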

cbrTekstraktor STAT XML file
The Stat file comprises detailed information on the characteristics of the file and
image; as well as the results of the image analysis and text extraction processes.


Major segments in the Stat file
The root node in the Stat file is . The following segments are its
immediate descendants.
Tag | Comment
ProcessHistory | Creation and modification timestamps
File | Details on the Image file
Image | High level details on the image
OriginalHistogram | The RGB histogram data of the original image
PayloadHistogram | The RGB histogram of the image once the margins have been removed.
ConnectedComponentFrequencyDistribution | Connected Components frequency distribution
ClusterClassification | Results of the Cluster classification process
ConnectedComponentClusters | Details on the connected components before the post-process
FinalConnectedComponentClusters | Details on the post-processed connected components, e.g. reallocation of character components.
paragraphs | Detailed information on the characteristics of the text and non-text paragraphs.
GraphicalEditorArea | A dump of the connected components and objects that are part of the text and non-text paragraphs
TimingInfoNanoSec | Timing information at nanosecond granularity.

ProcessHistory
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
File
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
Image
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
OriginalHistogram
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
PayloadHistogram
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
ConnectedComponentFrequencyDistribution
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
ClusterClassification

[TODO – Will be covered in future releases of the cbrTekStraktor manual]
ConnectedComponentClusters
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
FinalConnectedComponentClusters
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
Paragraphs
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
GraphicalEditorArea
[TODO – Will be covered in future releases of the cbrTekStraktor manual]
TimingInfoNanoSec
[TODO – Will be covered in future releases of the cbrTekStraktor manual]


Text extraction process
Step 1. Determining the payload area of a comic book page image (determining the
width and height of the margins).
Step 2. Cropping the image to its payload area (on the Comic Book Metadata Dialog
one can opt not to crop the image).
Step 3. The grayscale version of the cropped image is displayed.
Step 4. Binarization of the image. Various binarization methods can manually be
selected on the previously displayed Comic Book Metadata Dialog screen: Otsu,
Niblack or Sauvola. It is recommended to use either Niblack or Sauvola.
Step 5. Display of the binarized image.
Step 6. Gathering the connected components, i.e. gathering information on every
single graphical element present on the image. See Connected Component in
appendix.
Step 7. K-Means clustering of the connected components. This is a straightforward
classification of the connected components. The classification criterion is the pixel
height of a connected component.
On a comic book page, letters or characters more or less all have the same height.
The idea is to create groups of connected components which all have a similar height,
so that one of these clusters contains merely characters. cbrTekStraktor
uses K = 5. See K-Means clustering in appendix.
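
Clustering on a single criterion (the pixel height) with K = 5 can be illustrated with a plain K-Means loop in Python. This is a generic sketch of Lloyd's algorithm on scalar values, not the cbrTekStraktor implementation; the deterministic choice of starting centroids and the sample heights are assumptions made for the illustration.

```python
def kmeans_1d(values, k=5, iterations=50):
    """Plain Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: k values spread evenly over the sorted data (an assumption).
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Hypothetical pixel heights of connected components: specks, letters, larger shapes.
heights = [2, 3, 2, 12, 13, 12, 13, 30, 31, 80, 82, 200, 205]
centroids, labels = kmeans_1d(heights, k=5)
```

With these sample heights the components of height 12 and 13 end up in the same cluster, which is the behaviour the step relies on to isolate characters.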
Step 8. Identification of those connected components which are characters. This is
performed in various steps, e.g. by analyzing the number of vertical and horizontal
elements of a single connected component. Typically a character of the Latin Script
has 2 or 3 horizontal elements and 1 or 2 vertical elements. There are also more
white pixels than dark pixels in a character, i.e. the density of these connected
components should be less than 50%. Connected component having a pixel height
of less than 6 are excluded and classified as noise.
Step 9. Identification of the K-Means cluster comprising the characters. There are
typically between 300 and 500 characters on a single comic book page. So the cluster
holding approximately 500 connected components resembling a character is initially
chosen to be the “Text Cluster” (a.k.a. in Dutch as the Letter Cluster).Additional finetuning steps are performed .
23

C B R T E K S T R A K T O R

It is possible to override the automatic detection of the cluster comprising characters
via the “Cluster Classification Method” drop-down on the Image Info dialog.
Step 10. Expansion of the characters. Previously determined characters which are part
of the Text Cluster are subsequently grouped based on their proximity on the image.
This will result in a set of characters that are part of the same speech balloon or text
area. cbrTekStraktor uses the noun “paragraphs” for these groups. The “Proximity
Tolerance” can be set on the Comic Book Metadata screen to be either tight, lenient,
wide or ultra-wide.
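Grouping by proximity can be sketched with bounding boxes and a union-find structure, as shown below. This is an illustration under assumptions: boxes are (left, top, right, bottom) tuples, the tolerance is a pixel distance, and the value 5 is hypothetical; the manual does not state which distances the tight, lenient, wide and ultra-wide settings correspond to.

```python
def gap(a, b):
    """Largest axis-wise distance between two boxes (0 when they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)

def group_by_proximity(boxes, tolerance):
    """Return lists of box indices; boxes closer than the tolerance share a group."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if gap(boxes[i], boxes[j]) <= tolerance:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Three neighbouring characters and one far-away character.
boxes = [(0, 0, 10, 12), (13, 0, 22, 12), (25, 1, 34, 12), (200, 300, 210, 312)]
print(group_by_proximity(boxes, tolerance=5))
```

Because the grouping is transitive, the first and third box land in the same paragraph even though they are more than 5 pixels apart.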
Step 11. Adjustment of characters. Every connected component that lies within the area
bounded by the previously found text paragraphs is re-assessed to decide
whether it is a potential character or not. Paragraphs that comprise few or no
characters are then set to be “non-character” or “non-text” paragraphs; the remaining
paragraphs are henceforth referred to as “character” or “text” paragraphs.
Step 12. The results of the text extraction are stored in an archive file (see the
definition of the stat.xml and language file). Cut-outs of the paragraphs are displayed.
Text paragraphs have a green border and non-textual paragraphs have a red border.
Frames have a bluish border.

Image processing concepts
This section comprises a quick overview of the image processing concepts, e.g. image
processing filters, used in the application.

Concept    Comment

Convolution

[Wikipedia] In mathematics convolution is a mathematical operation
on two functions; it produces a third function, that is typically viewed
as a modified version of one of the original functions, giving the
integral of the pointwise multiplication of the two functions as a
function of the amount that one of the original functions is translated.
It has applications that include probability, statistics, computer vision,
natural language processing, image and signal processing,
engineering, and differential equations.
In image processing, a kernel, convolution matrix, or mask is a small
matrix. It is used for blurring, sharpening, embossing, edge detection,
and more. This is accomplished by doing a convolution between a
kernel and an image.
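A discrete 2-D convolution between a kernel and an image can be sketched as follows. The sketch assumes a grayscale image stored as a list of rows and computes only the positions where the kernel fits entirely inside the image; the function name convolve2d is hypothetical.

```python
def convolve2d(image, kernel):
    """'Valid' 2-D convolution of a grayscale image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    # Convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            row.append(sum(flipped[j][i] * image[y + j][x + i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out

# A 3x3 kernel of ones sums each neighbourhood (an unnormalised box blur).
box_sum = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
image = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]
print(convolve2d(image, box_sum))
```

Dividing the kernel entries by 9 would turn the sum into an average, i.e. a simple blur.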

Gaussian blur

[Wikipedia] In image processing, a Gaussian blur is the result of
blurring an image by a Gaussian function. It is a widely used effect in
graphics software, typically to reduce image noise and reduce detail.
The visual effect of this blurring technique is a smooth blur
resembling that of viewing the image through a translucent screen.
Gaussian smoothing is also used as a pre-processing stage in
computer vision algorithms in order to enhance image structures at
different scales.
Mathematically, applying a Gaussian blur to an image is the same as
convolving the image with a Gaussian function. Since the Fourier
transform of a Gaussian is another Gaussian, applying a Gaussian blur
has the effect of reducing the image's high-frequency components; a
Gaussian blur is thus a low pass filter.
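The Gaussian weights themselves are easy to generate. The sketch below builds a normalised 1-D kernel and blurs a single row of pixels; since the 2-D Gaussian is separable, a full blur can be obtained by blurring all rows and then all columns. The radius, the sigma and the border handling (repeating the edge pixel) are assumptions, and the function names are hypothetical.

```python
import math

def gaussian_kernel(radius, sigma):
    """1-D Gaussian weights, normalised so that they sum to 1."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_row(row, kernel):
    """Blur one row of pixels; borders repeat the edge pixel."""
    radius = len(kernel) // 2
    last = len(row) - 1
    return [sum(k * row[min(max(x + i - radius, 0), last)]
                for i, k in enumerate(kernel))
            for x in range(len(row))]

kernel = gaussian_kernel(radius=1, sigma=1.0)
print(kernel)
print(blur_row([0, 0, 100, 0, 0], kernel))
```

The single bright pixel is spread over its neighbours, which is the low pass behaviour described above.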

Grayscale

[Wikipedia] In photography and computing, a grayscale digital image
is an image in which the value of each pixel is a single sample, that is,
it carries only intensity information. Images of this sort, also known as
black-and-white, are composed exclusively of shades of gray, varying
from black at the weakest intensity to white at the strongest.
A common strategy is to use the principles of photometry or, more
broadly, colorimetry to match the luminance of the grayscale image
to the luminance of the original color image.
To convert a color from a colorspace based on an RGB color model
to a grayscale representation of its luminance, weighted sums must
be calculated in a linear RGB space, that is, after the gamma
compression function has been removed first via gamma expansion.
Formula: 0.2126R + 0.7152G + 0.0722B
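The formula can be applied as in the sketch below. It assumes 8-bit sRGB input, so the gamma expansion uses the sRGB transfer function; other RGB spaces would need their own expansion. The function names are hypothetical.

```python
def srgb_to_linear(channel):
    """Gamma expansion of one sRGB channel given as 0..255."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def luminance(r, g, b):
    """Relative luminance (0..1) using the weights from the formula above."""
    return (0.2126 * srgb_to_linear(r)
            + 0.7152 * srgb_to_linear(g)
            + 0.0722 * srgb_to_linear(b))

print(luminance(255, 0, 0), luminance(0, 255, 0), luminance(0, 0, 255))
```

Pure green contributes most and pure blue least, which matches the size of the three weights.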

Histogram equalization

[Wikipedia] Histogram equalization is a method in image processing
of contrast adjustment using the image's histogram.
This method usually increases the global contrast of many images,
especially when the usable data of the image is represented by close
contrast values. Through this adjustment, the intensities can be better
distributed on the histogram. This allows for areas of lower local
contrast to gain a higher contrast. Histogram equalization
accomplishes this by effectively spreading out the most frequent
intensity values.
The method is useful in images with backgrounds and foregrounds
that are both bright or both dark. In particular, the method can lead
to better views of bone structure in x-ray images, and to better detail
in photographs that are over or under-exposed. A key advantage of
the method is that it is a fairly straightforward technique and an
invertible operator. A disadvantage of the method is that it is
indiscriminate. It may increase the contrast of background noise,
while decreasing the usable signal.
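A minimal sketch of histogram equalization for a list of gray values is given below. It uses the common mapping based on the cumulative histogram; the function name equalize and the sample values are hypothetical, and the sketch makes no claim about how cbrTekStraktor implements the filter.

```python
def equalize(pixels, levels=256):
    """Histogram-equalise a flat list of gray values in the range 0..levels-1."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative histogram: number of pixels at or below each level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    span = len(pixels) - cdf_min
    if span == 0:                 # flat image: nothing to stretch
        return list(pixels)
    return [round((cdf[p] - cdf_min) * (levels - 1) / span) for p in pixels]

# Gray values squeezed into the narrow range 100..103.
narrow = [100, 100, 101, 102, 103, 103]
print(equalize(narrow))
```

The close contrast values of the input are spread over the full range, while their relative order is preserved.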

HSL/HSV

[Wikipedia] HSL and HSV are the two most common cylindrical-coordinate
representations of points in an RGB color model. The two
representations rearrange the geometry of RGB in an attempt to be
more intuitive and perceptually relevant than the Cartesian (cube)
representation. Developed in the 1970s for computer graphics
applications, HSL and HSV are used today in color pickers, in image
editing software, and less commonly in image analysis and computer
vision.
HSL stands for hue, saturation, and lightness (or luminosity), and is
also often called HLS. HSV stands for hue, saturation, and value, and
is also often called HSB (B for brightness). A third model, common in
computer vision applications, is HSI (I for intensity). However, while
typically consistent, these definitions are not standardized, and any of
these abbreviations might be used for any of these three or several
other related cylindrical models.
In each cylinder, the angle around the central vertical axis
corresponds to "hue", the distance from the axis corresponds to
"saturation", and the distance along the axis corresponds to
"lightness", "value" or "brightness". Note that while "hue" in HSL and
HSV refers to the same attribute, their definitions of "saturation" differ
dramatically.
Because HSL and HSV are simple transformations of device-dependent
RGB models, the physical colors they define depend on
the colors of the red, green, and blue primaries of the device or of
the particular RGB space, and on the gamma correction used to
represent the amounts of those primaries. As a result, each unique
RGB device has unique HSL and HSV absolute color spaces to
accompany it (just as it has unique RGB absolute color space to
accompany it), and the same numerical HSL or HSV values (just as
numerical RGB values) may be displayed differently by different
devices.
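The difference between the two saturation definitions can be seen with the colorsys module of the Python standard library. The sketch below is illustrative; the helper name describe and the sample colour are hypothetical. Note that colorsys returns HLS in the order hue, lightness, saturation.

```python
import colorsys

def describe(r, g, b):
    """Return the (HSV, HLS) tuples for an 8-bit RGB colour."""
    rgb = (r / 255.0, g / 255.0, b / 255.0)
    return colorsys.rgb_to_hsv(*rgb), colorsys.rgb_to_hls(*rgb)

hsv, hls = describe(255, 128, 128)     # a pale red
print("HSV: hue=%.3f saturation=%.3f value=%.3f" % hsv)
print("HLS: hue=%.3f lightness=%.3f saturation=%.3f" % hls)
```

Both models report the same hue for this colour, but the HSV saturation is roughly 0.5 whereas the HSL saturation is 1.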

Image gradient

[Wikipedia] An image gradient is a directional change in the intensity
or color in an image. In graphics software for digital image editing,
the term gradient or color gradient is also used for a gradual blend
of color which can be considered as an even gradation from low to
high values, for example from white to black.
Another name for this is color progression.
Mathematically, the gradient of a two-variable function (here the
image intensity function) at each image point is a 2D vector with the
components given by the derivatives in the horizontal and vertical
directions. At each image point, the gradient vector points in the
direction of largest possible intensity increase, and the length of the
gradient vector corresponds to the rate of change in that direction.

Niblack / Sauvola

Niblack and Sauvola thresholds are local thresholding techniques that
are useful for images where the background is not uniform, especially
for text recognition. Instead of calculating a single global threshold
for the entire image, a separate threshold is calculated for every pixel
by using specific formulae that take into account the mean and
standard deviation of the local neighborhood (defined by a window
centered around the pixel).
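The two local thresholds are commonly written as T = m + k·s (Niblack) and T = m·(1 + k·(s/R − 1)) (Sauvola), with m the local mean and s the local standard deviation. The sketch below uses these textbook forms; the parameter values (k = -0.2, k = 0.5, R = 128) and the window radius are typical defaults chosen for illustration, not values taken from cbrTekStraktor, and all function names are hypothetical.

```python
import math

def local_stats(image, y, x, radius):
    """Mean and standard deviation of the window centred on (y, x)."""
    values = [image[j][i]
              for j in range(max(0, y - radius), min(len(image), y + radius + 1))
              for i in range(max(0, x - radius), min(len(image[0]), x + radius + 1))]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

def niblack_threshold(mean, std, k=-0.2):
    return mean + k * std

def sauvola_threshold(mean, std, k=0.5, r=128.0):
    return mean * (1.0 + k * (std / r - 1.0))

def binarize(image, radius=1, method=sauvola_threshold):
    """Return 1 for dark (ink) pixels and 0 for background pixels."""
    return [[1 if image[y][x] < method(*local_stats(image, y, x, radius)) else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]

# One dark pixel on a bright background.
page = [[200, 200, 200],
        [200, 20, 200],
        [200, 200, 200]]
print(binarize(page))
print(binarize(page, method=niblack_threshold))
```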

Otsu

[Wikipedia] In computer vision and image processing, Otsu's method,
named after Nobuyuki Otsu, is used to automatically perform
clustering-based image thresholding or the reduction of a graylevel
image to a binary image. The algorithm assumes that the image
contains two classes of pixels following a bi-modal histogram
(foreground pixels and background pixels); it then calculates the
optimum threshold separating the two classes so that their combined
spread (intra-class variance) is minimal, or equivalently (because the
sum of pairwise squared distances is constant), so that their inter-class
variance is maximal.
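A minimal sketch of Otsu's method is shown below. It scans all gray levels and keeps the one that maximises the inter-class variance (up to a constant factor). The function name and the sample pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximises the inter-class variance.

    Pixels with a value <= t form one class, pixels above t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue                  # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break                     # all pixels are in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# Bi-modal sample: dark ink around 10 and bright paper around 200.
sample = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 60
print(otsu_threshold(sample))
```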

RGB

[Wikipedia] The RGB color model is an additive color model in which
red, green and blue light are added together in various ways to
reproduce a broad array of colors. The name of the model comes
from the initials of the three additive primary colors, red, green and
blue.
The main purpose of the RGB color model is for the sensing,
representation and display of images in electronic systems, such as
televisions and computers, though it has also been used in
conventional photography. Before the electronic age, the RGB color
model already had a solid theory behind it, based in human
perception of colors.
To form a color with RGB, three light beams (one red, one green and
one blue) must be superimposed (for example by emission from a
black screen or by reflection from a white screen). Each of the three
beams is called a component of that color, and each of them can
have an arbitrary intensity, from fully off to fully on, in the mixture.
Zero intensity for each component gives the darkest color (no light,
considered the black), and full intensity of each gives a white; the
quality of this white depends on the nature of the primary light
sources, but if they are properly balanced, the result is a neutral white
matching the system's white point. When the intensities for all the
components are the same, the result is a shade of gray, darker or
lighter depending on the intensity. When the intensities are different,
the result is a colorized hue, more or less saturated depending on the
difference of the strongest and weakest of the intensities of the
primary colors employed.
When one of the components has the strongest intensity, the color is
a hue near this primary color (reddish, greenish or bluish), and when
two components have the same strongest intensity, then the color is
a hue of a secondary color (a shade of cyan, magenta or yellow).

RGBA

[Wikipedia] RGBA stands for red green blue alpha. While it is
sometimes described as a color space, it is actually simply a use of
the RGB color model, with extra alpha channel information. The color
is RGB, and may belong to any RGB color space, but an integral
alpha value as invented by Catmull and Smith between 1971 and
1972 enables alpha compositing.
The alpha channel is normally used as an opacity channel. If a pixel
has a value of 0% in its alpha channel, it is fully transparent (and,
thus, invisible), whereas a value of 100% in the alpha channel gives a
fully opaque pixel (traditional digital images). Values between 0% and
100% make it possible for pixels to show through a background like a
glass, an effect not possible with simple binary (transparent or
opaque) transparency. It allows easy image compositing.
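Compositing a partly transparent pixel over an opaque background reduces to a weighted average per channel. The sketch below assumes an alpha value expressed as a fraction between 0.0 and 1.0 and 8-bit colour channels; the function name blend is hypothetical.

```python
def blend(foreground, background):
    """Composite an RGBA foreground pixel over an opaque RGB background pixel."""
    r, g, b, alpha = foreground        # alpha: 0.0 transparent .. 1.0 opaque
    return tuple(round(alpha * f + (1.0 - alpha) * bg)
                 for f, bg in zip((r, g, b), background))

print(blend((255, 0, 0, 1.0), (0, 0, 255)))     # opaque red hides the blue
print(blend((255, 0, 0, 0.0), (0, 0, 255)))     # fully transparent: blue remains
print(blend((200, 100, 0, 0.5), (0, 100, 200)))
```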

Sobel

[Wikipedia] The Sobel operator, sometimes called the Sobel–Feldman
operator or Sobel filter, is used in image processing and computer
vision, particularly within edge detection algorithms where it creates
an image emphasizing edges.
Technically, it is a discrete differentiation operator, computing an
approximation of the gradient of the image intensity function. At
each point in the image, the result of the Sobel–Feldman operator is
either the corresponding gradient vector or the norm of this vector.
The Sobel–Feldman operator is based on convolving the image with
a small, separable, and integer-valued filter in the horizontal and
vertical directions and is therefore relatively inexpensive in terms of
computations.
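The operator can be sketched with its two well-known 3x3 kernels. The sketch below slides the kernels over the image without flipping them, which only changes the sign of the two derivatives and not the gradient magnitude; border pixels are skipped. The function name sobel and the sample image are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical intensity change

def sobel(image):
    """Gradient magnitude for every interior pixel of a grayscale image."""
    out = []
    for y in range(1, len(image) - 1):
        row = []
        for x in range(1, len(image[0]) - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out

# A vertical edge between a black and a white half.
edge = [[0, 0, 255, 255]] * 4
print(sobel(edge))
```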

Chapter D

HowTo: Install Google Inception on Windows
Summary
TensorFlow is most often used on Linux. It is however possible to install and use
TensorFlow in CPU mode on Windows. This appendix describes how to install Google
TensorFlow locally on a 64-bit Windows operating system, e.g.
• Windows 7 Service Pack 1
• Windows 10
Caveat
TensorFlow only works with Python 3.5 and 3.6.
Source: https://stackoverflow.com/questions/38896424/tensorflow-not-found-in-pip
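Whether an installed interpreter meets this caveat can be checked with a few lines of Python. The sketch below is a convenience only and not part of cbrTekStraktor; the set of supported versions simply restates the caveat above, and the function names are hypothetical.

```python
import sys

SUPPORTED = {(3, 5), (3, 6)}    # Python versions named in the caveat above

def python_supported(version=sys.version_info):
    """True when the major.minor version is one of the supported ones."""
    return (version[0], version[1]) in SUPPORTED

def is_64bit():
    """True on a 64-bit interpreter (this appendix targets 64-bit Windows)."""
    return sys.maxsize > 2 ** 32

print("Python %d.%d supported: %s" % (sys.version_info[0], sys.version_info[1],
                                      python_supported()))
print("64-bit interpreter:", is_64bit())
```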

Install Python
Prerequisites
• If you are using Windows 7, you need to have Service Pack 1 installed, otherwise
the installation will not even start.
• Some authors state that TensorFlow only works on Python 3.5.2, so better be safe
than sorry and stick to this Python version.
• TensorFlow is installed via the Python package installer pip3. You need pip3
version 1.8 in particular, which is not part of the Python 3.5.2 installation. We will
need to upgrade pip3.

Download
Download version 3.5.2 from https://www.python.org/downloads/release/python352/
Select the “Windows x86-64 executable installer” file.
Perform the Python installation via “Run as administrator”
Optionally tick the following boxes:
• Install for all users. Python 3.5.2 will then be installed in “c:\Program
Files\Python35” rather than in your home folder. You might also opt to store
Python in a bespoke folder and hence maintain multiple versions of Python on
your computer.
• “Update PATH” at the beginning of the installation process. This is not
necessary for cbrTekStraktor, because the integration scripts explicitly set the
PATH.
Install TensorFlow
Next, we are going to install the TensorFlow packages via “pip3”
Prepare TensorFlow installation
Create a Windows Batch or Command file (.bat/.cmd). The idea is to extend the PATH
and then open another shell in which python will be executed from the command line.
During installation of TensorFlow, you will be warned that the Scripts folder is not
included on the PATH, so let us add this also in the batch file.
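REM Folder of the Python 3.5.2 installation (adapt if you installed elsewhere)
SET "PYTHON_HOME=c:\Program Files\Python35"
REM Put Python and its Scripts folder (pip3) at the front of the PATH
SET "PATH=%PYTHON_HOME%;%PYTHON_HOME%\Scripts;%PATH%"
cmd.exe
pause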

Note. If during the installation you opted to have the PATH updated, then you only
need to extend the PATH with the %PYTHON_HOME%\Scripts folder.
Upgrade pip
Run the batch file (via “Run as administrator”, so you will have extended privileges when
upgrading pip).
From the command prompt, upgrade pip via the following command.
python -m pip install --upgrade pip

Note. You might get a message stating that you are on pip3 version 1.8 and a
suggestion to upgrade pip to 1.10. Do not perform this upgrade; TensorFlow only
works on pip3 1.8.
Note. Although pip3 is to be upgraded, the package name is pip and not pip3.
Install TensorFlow
Caveats: The current version of TensorFlow is V1.8.0. This version however appears to be
incompatible with Python 3.5.2. TensorFlow 1.4 has successfully been tested with
Python 3.5.2 and pip3 1.8.
From the command line
pip3 install --upgrade tensorflow==1.4

Perform a smoke test
Just run Python from the command line and enter:
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))

You will know whether the installation is successful as soon as you enter the first
command (i.e. import tensorflow as tf).
The error messages provided by TensorFlow are extensive. The following typical issues
might occur.


(A) msvcp140.dll is missing
The Visual C++ 2015 redistributable DLLs are missing (msvcp140.dll).
The msvcp140.dll file is part of the Microsoft Visual C++ 2015 Redistributable 64-bit
component. It is located in C:\Windows\System32
The TensorFlow error message will provide the URL where you can download the
Visual C++ redistributable package.
https://www.microsoft.com/en-us/download/details.aspx?id=53587
Download the installer (vc_redist.exe) and install it via “Run as administrator”.
Note. This issue has only been observed on Windows 7 SP1.



(B) Libraries which fail to be loaded, e.g. python\pywrap_tensorflow_internal.py",
line 18, in swig_import_helper (followed by a complete stack trace)

The issue appears to be related to the TensorFlow 1.8 version and the type of
CPU of your computer (a.k.a. the AVX processor issue).
https://github.com/tensorflow/tensorflow/issues/17386
The issue can be solved by downgrading to a version of TensorFlow, known to
work on Python 3.5.2, for example V1.4. You might then gradually uninstall
TensorFlow and upgrade to 1.5 and so forth (or just stick to V1.4).

TensorFlow can be uninstalled with: pip3 uninstall tensorflow
To force a reinstall: pip3 install --upgrade --no-deps --force-reinstall tensorflow==1.5

Re-do the smoke test
Run Python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))

You are all set (hopefully)

Retrain script issues
The retrain.py script is no longer part of the TensorFlow examples on GitHub. To
make things worse, the replacement script appears to be incompatible with the
current TensorFlow version. The issues are related to TensorFlow Hub (tensorflow_hub).
Previous versions of this script are however still available on GitHub, so go back in
time until you find a functioning one.
An easier approach: a functioning copy of retrain.py can be found here:
https://raw.githubusercontent.com/tensorflow/tensorflow/r1.1/tensorflow/examples/image_retraining/retrain.py
Source: https://stackoverflow.com/questions/41433282/tensorflow-retrain-on-windows

Space Detective - Issue 4
Published April 1952 by Avon Publications.
A groovy funny mystery book, cover art by Gene Fawcette and artwork by Gerald
McCann.



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Page Count                      : 115
Language                        : nl-BE
Tagged PDF                      : Yes
Author                          : Berton Koen
Creator                         : Microsoft® Office Word 2007
Create Date                     : 2018:05:27 10:42:24
Modify Date                     : 2018:05:27 10:42:24
Producer                        : Microsoft® Office Word 2007