INCEpTION User Guide
The INCEpTION Team
Version 0.3.1

Table of Contents

Introduction
System Requirements
Workflow
Installation
  Run as Java application
    Optional configuration
Upgrade
  Make a backup
  Version 3.2.x to 3.3.0
  Version 2.3.1 to 3.0.0
Logging in
Main Menu
Annotation
  Opening a Document
  Navigation
  Creating annotations
    Spans
    Relations
    Chains
    Primitive Features
    Link Features
    Choosing Layers
    Changing role names
  Settings
  Export
  Recommendation
    Recommendation Sidebar
  Active Learning
Curation
Monitoring
  Document Status
  Agreement
Projects
  Users
  Documents
  Layers
    Creating a custom layer
    Built-in layers
    Properties
    Technical Properties
    Behaviours
    Features
  Tagsets
  Constraints
  Guidelines
  Import
  Export
  Recommendation
    ClassificationTool types
Constraints
  Conditional features
  Constraints for slot features
  Constraints language grammar
User Management
Formats
WebAnno TSV 3.2 File format
  Encoding and Offsets
  File Header
  File Body / Annotations
    Reserved Characters
    Sentence Representation
    Token and Sub-token Annotations
    Span Annotations
    Disambiguation IDs
    Slot features
    Chain Annotations
    Relation Annotations
Troubleshooting

Introduction

This guide summarizes the functionality of INCEpTION from the user’s perspective. It is assumed that you plan to test the INCEpTION standalone version or an already existing server installation of INCEpTION. For information on how to set up INCEpTION for a group of users on a server, please refer to the Administrator Guide.

All materials, including this guide, are available via the INCEpTION homepage.

System Requirements

Table 1. Requirements for users
Browser | Chrome or Safari

Table 2. Requirements to run the standalone version
Java Runtime Environment | version 8 or higher

Table 3.
Requirements to run the server version
Java Runtime Environment | version 8 or higher
Apache Tomcat | version 8.5 or higher (Servlet API 3.1.0)
MySQL Server | version 5 or higher

Workflow

An exemplary workflow of an annotation project with INCEpTION consists of the following steps.

First, the project needs to be set up. This means that users are added, guidelines are provided, documents are uploaded, tagsets are defined and uploaded, etc. Setting up and administrating a project is described in detail in Projects.

After the project has been set up, the users who were assigned the task of annotation annotate the documents according to the guidelines. The task of annotation is further explained in Annotation.

The work of the annotators is managed and controlled by monitoring. Here, the person in charge assigns the workload. For example, in order to prevent redundant annotation, documents which have already been annotated by several annotators and need not be annotated by another person can be blocked for others. The person in charge can also follow the progress of individual annotators. All these tasks are described in more detail in Monitoring. The person in charge should control not only the quantity but also the quality of annotation by looking more closely at the annotations of individual annotators. This can be done by logging in with the credentials of the annotators.

After at least two annotators have finished the annotation of the same document by clicking on Done, the curator can start their work. The curator compares the annotations and corrects them if needed. This task is further explained in Curation.

The document merged by the curator can be exported as soon as the curator has clicked on Done for the document. The export of curated documents is also explained in Projects.
Installation

Run as Java application

This is the all-in-one version, which does not require a database server or servlet container to be set up.

By default, INCEpTION creates and uses an embedded database. This configuration is not recommended for production use. Instead, please use a database server in production. For more information, please refer to the Administrator Guide.

Get the stand-alone JAR from the downloads page and start it with a double-click in your file manager. The application stores its data in a folder called .inception (dot inception) within your home folder.

Optional configuration

Alternatively, you can start INCEpTION from the command line, in particular if you wish to provide it with additional memory (here 1 GB) or if you want it to store its data in a different folder.

    java -Xmx1g -Dinception.home=/my/inception/home -jar inception-app-standalone-XXX.jar

Remember to replace /my/inception/home with the path of a folder where the application can store its data.

By default, the server starts on port 8080 and you can access it via a browser at http://localhost:8080 after you have started it. You can add the parameter -Dserver.port=9999 at the end of the command line to start the server on port 9999 (or choose any other port).

INCEpTION uses Spring Boot. If you need to set additional parameters of the embedded web server of the stand-alone version, please refer to the Spring Boot embedded container documentation.

Upgrade

This section describes how to upgrade the standalone version of INCEpTION using an embedded database. For further information on how to upgrade INCEpTION, in particular the WAR version when using a MySQL database or older versions of INCEpTION, please refer to the Administrator Guide.

Make a backup

Before any upgrade, make a copy of your INCEpTION home folder. If INCEpTION is configured to use an external database, e.g. MySQL, make a backup of this database as well.
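Copying the home folder can be scripted. The following is a minimal sketch, not an official tool: the function name backup_home and the timestamped folder naming are assumptions made for illustration, and it presumes that INCEpTION has been stopped before the copy so that the embedded database files are not changing. It does not cover an external database such as MySQL.

```python
import shutil
import time
from pathlib import Path


def backup_home(home: Path, backup_root: Path) -> Path:
    """Copy an INCEpTION home folder into a new timestamped folder.

    `backup_home` is a hypothetical helper, not part of INCEpTION.
    """
    if not home.is_dir():
        raise FileNotFoundError(f"home folder not found: {home}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{home.name}-backup-{stamp}"
    # copytree raises an error if the target folder already exists,
    # so an earlier backup is never overwritten silently.
    shutil.copytree(home, target)
    return target
```

For the default location described in this guide, the call would be backup_home(Path.home() / ".inception", Path.home()); if you started INCEpTION with -Dinception.home, pass that folder instead.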
See the Administrator Guide for further information.

Version 3.2.x to 3.3.0

• When upgrading from 3.2.x or earlier to 3.3.0 or later, Automation projects break.

Version 2.3.1 to 3.0.0

• The access permissions of the super admin have changed. Super admins can no longer access annotation, curation, and monitoring pages for all projects. They can only access them if they are annotators, admins, or curators in the respective projects. However, they still have full access to the project settings of all projects and can simply give themselves the missing permissions. After an upgrade to 3.0.0, all super admins who require project permissions on existing projects should assign these permissions to themselves. This also applies when importing old projects. For new projects, the creator of the project always starts with annotator, curator, and project admin permissions. If these permissions are not required by the project creator, they should be removed after project creation.

Logging in

Upon opening the application in the browser, the login screen opens. Please enter your credentials to proceed.

When INCEpTION is started for the first time, a default user called admin with the password admin is automatically created. Be sure to change the password for this user after logging in (see User Management).

Main Menu

After login, you will be presented with the overview screen. This screen can be reached at any time from within the GUI by clicking on the Home link in the upper left corner.
Here, you can navigate to one of the currently seven options:

• Annotation - The page to perform annotations
• Curation - Compare and merge annotations from multiple users (only for curators)
• Correction - Correcting automatic annotation (under development)
• Automation - Creating automatically annotated data
• Projects - Set up or change annotation projects (only for administrators)
• Monitoring - Allows you to see the projects and their progress and to change the document status (only for administrators and curators)
• User Management - Allows you to manage the rights of users

Please click on the functionality you need. The individual functionalities are explained in the following chapters.

Annotation

This functionality is only available to annotators, project managers, and administrators. Annotators and project managers only see projects in which they hold the respective roles.

Opening a Document

When navigating to the Annotation page, a dialog opens that allows you to select a project and a document within the project. If you want to open a different project or document later, click on Open to open the dialog again.

Projects appear as folders and contain the documents of the project. Double-click on a document to open it for annotation. Document names written in black show that the document has not yet been opened by the current user, blue font means that it has already been opened, and red font indicates that the document has already been marked as done.

Navigation

Sentence numbers on the left side of the annotation page show the exact sentence numbers in the document.

The arrow buttons first page, next page, previous page, last page, and go to page allow you to navigate accordingly. The Prev. and Next buttons in the Document frame allow you to go to the previous or next document in your project list. You can also use the following key bindings to navigate using only your keyboard.

Table 4.
Navigation key bindings
Key | Action
HOME | jump to first sentence
END | jump to last sentence
PAGE DOWN | move to the next page, if not already on the last page
PAGE UP | move to the previous page, if not already on the first page
SHIFT+PAGE DOWN | go to next document in project, if available
SHIFT+PAGE UP | go to previous document in project, if available

A click on the Help button displays the Guidelines for the tool and The Annotator’s Guide to NER Annotation.

When you have finished annotating or curating a document, please click on the Done button so that the document can be processed further. If the button above Done shows a cross symbol, the document has already been finished. If it shows a tick, the document is still open.

Annotation of spans works by selecting the span or by double-clicking on a word. This activates the Actions box on the right, where you can choose a layer. You can also type the initial letters and choose the needed layer. After a layer has been chosen, the drop-down menu inside the Features box displays the features you can use during the annotation. The tag can be selected from the drop-down menu inside the Features box, which contains the tags of the chosen layer.

To change or delete an annotation, double-click on the annotation (span or link annotations). The Actions box is now activated. Changes and deletions are possible via the respective buttons.

Link annotations (between POS tags) are created by selecting the starting POS tag, then dragging the arrow to connect it to its target POS tag. All possible targets are highlighted.

Creating annotations

To create annotations, you have mainly two options:

• select a span of text to create a span annotation
• click on an existing span annotation and drag an arc to another span annotation

The definition of layers is covered in section Layers.

Spans

To create an annotation over a span of text, click with the mouse on the text and drag the mouse to create a selection.
When you release the mouse, the selected span is activated and highlighted in orange. The annotation detail editor is updated to display the text you have currently selected and to offer a choice of the layer on which the annotation is to be created. As soon as a layer has been selected, it is automatically assigned to the selected span. To delete an annotation, select a span and click on Delete. To deactivate a selected span, click on Clear.

Depending on the layer behavior configuration, span annotations can have any length, can overlap, can stack, can nest, and can cross sentence boundaries.

Example

For example, for NE annotation, select the options as shown below (red check mark):

NE annotation can be chosen from a tagset and can span several tokens within one sentence. Nested NE annotations are also possible (in the example below: "Frankfurter" in "Frankfurter FC").

Lemma annotation, as shown below, is freely selectable over a single token.

POS can be chosen over one token out of a tagset.

Zero-width spans

To create a zero-length annotation, hold SHIFT and click on the position where you wish to create the annotation. To avoid accidental creation of zero-length annotations, a simple single-click triggers no action by default. The lock to token behavior cancels the ability to create zero-length annotations.

A zero-width span between two tokens that are directly adjacent, e.g. the full stop at the end of a sentence and the token before it (end.), is always considered to be at the end of the first token rather than at the beginning of the next token. So an annotation between d and . in this example would be rendered at the right side of end rather than at the left side of the full stop.

Forward annotation

To improve the speed of POS annotation, select forward annotation in the Actions box on the left side of your screen. This allows you to select POS tags via the keys of your keyboard.
Pressing a key several times in succession proposes, one after another, every POS tag starting with the respective letter inside the Features box. Pressing a key whose letter does not match the beginning of any tag selects the first tag in the tagset. Once a POS tag has been selected, pressing the Space or Enter key automatically assigns the POS tag to the token in focus, and the next token can be annotated as described. Note that the Enter key does not work in the Safari browser. Also, forward annotation works only for span annotations 1) with a tagset and 2) on a layer with only one feature.

Co-reference annotation can be made over several tokens within one sentence. A single token sequence can have several co-reference spans simultaneously.

Relations

To create a relation annotation, click on a span annotation and drag the mouse to another span annotation. While you drag, an arc is drawn. It is not possible to create arbitrary relation annotations. In order to create one, a corresponding relation layer needs to be defined between the source and target spans.

Depending on the layer behavior configuration, relation annotations can stack, can cross each other, and can cross sentence boundaries.

Self-looping relations

To create a relation from a span to itself, press the SHIFT key before starting to drag the mouse and hold it until you release the mouse button.

To abort the creation of an annotation, hold the CTRL key when you release the mouse button.

Currently, there can be at most one relation layer per span layer. Relations between spans of different layers are not supported.

Not all arcs displayed in the annotation view belong to chain or relation layers. Some are induced by Link Features.

When moving the mouse over an annotation with outgoing relations, the info popup includes the yield of the relations. This is the text transitively covered by the outgoing relations. This is useful e.g. in order to see all text governed by the head of a particular dependency relation.
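The notion of a yield can be illustrated with a small sketch. This is not INCEpTION's implementation: the function relation_yield, the representation of spans as token indices, and the choice to leave out the starting span itself are assumptions made for illustration. The sketch follows outgoing relations transitively and joins the text of every span reached.

```python
from collections import defaultdict


def relation_yield(tokens, relations, source):
    """Return the text transitively covered by relations leaving `source`.

    tokens:    list of token strings, indexed by position
    relations: list of (source_index, target_index) pairs
    source:    index of the span whose yield is requested
    """
    outgoing = defaultdict(list)
    for src, tgt in relations:
        outgoing[src].append(tgt)

    reached = set()
    stack = [source]
    while stack:
        node = stack.pop()
        for tgt in outgoing[node]:
            if tgt not in reached:  # guards against cycles
                reached.add(tgt)
                stack.append(tgt)

    # Present the covered tokens in document order.
    return " ".join(tokens[i] for i in sorted(reached))


tokens = ["The", "quick", "fox", "jumps"]
# Hypothetical dependency arcs: jumps -> fox, fox -> The, fox -> quick
relations = [(3, 2), (2, 0), (2, 1)]
print(relation_yield(tokens, relations, 3))  # The quick fox
```

Starting from "fox" (index 2) instead, the same call returns "The quick", because only the spans reachable from that span are collected.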
The text may be abbreviated.

Figure 1. Example of the yield of a dependency relation

Chains

A chain layer combines both span and relation annotations in a single structural layer. Creating a span annotation in a chain layer basically creates a chain of length one. Creating a relation between two chain elements has different effects depending on whether the linked list behavior is enabled for the chain layer or not. To enable or disable the linked list behavior, go to Layers in the Projects Settings mode. After choosing Coreference, linked list behavior is displayed as a checkbox that can either be marked or unmarked.

Figure 2. Configuration of a chain layer in the project settings

Figure 3. Example of chain annotations

To abort the creation of an annotation, hold CTRL when you release the mouse button.

Table 5. Chain behavior
Linked List | Condition | Result
disabled | the two spans are already in the same chain | nothing happens
disabled | the two spans are in different chains | the two chains are merged
enabled | the two spans are already in the same chain | the chain will be re-linked such that a chain link points from the source to the target span, potentially creating new chains in the process
enabled | the two spans are in different chains | the chains will be re-linked such that a chain link points from the source to the target span, merging the two chains and potentially creating new chains from the remaining prefix and suffix of the original chains

Primitive Features

Supported primitive feature types are string, boolean, integer, and float. Boolean features are displayed as a checkbox that can either be marked or unmarked. Integer and float features are displayed using a number field. String features are displayed using a text field or, in case they have a tagset, using a combobox.

Link Features

Link features can be used to link one annotation to others. Before a link can be made, a slot with a role must be added.
Enter the role label in the text field and press the add button to create the slot. Next, click on the field in the newly created slot to arm it. The field’s color will change to indicate that it is armed. Now you can fill the slot by double-clicking on a span annotation. To remove a slot, arm it and then press the del button.

Choosing Layers

• Choose one of the predefined layers in the Actions box on the right side of the screen.

The Actions box always shows the presently activated layer, whereas the Features box shows the layer of the activated instance. Consequently, the settings of the two boxes can differ. To change settings during the annotation process, cancel the previously selected layer by clicking on Clear in the Actions box and choose a new layer.

Changing role names

To change a previously selected role name, no prior deletion is needed. Just double-click on the instance you want to change (it will be highlighted in orange) and choose another role name.

Settings

Once the document is opened, a default of 5 sentences is loaded on the annotation page. The Settings button allows you to specify the settings of the annotation layer.

Next to Annotation layers, you can select the annotation layers which are displayed during annotation. This is useful to reduce clutter if there are many annotation layers. Mind that hiding a layer which has relations attached to it will also hide the respective relations. E.g. if you disable POS, then no dependency relations will be visible anymore.

The Remember layer checkbox controls whether the annotation layer selected in the Actions box is remembered. If it is, this layer works as the main layer during the annotation process. Only instances of this layer will be created, even if an annotation in another layer is selected. If necessary, it is possible to change active instances. Still, if a new instance is selected, the main layer is automatically activated.

The Sidebar size controls the width of the sidebar containing the annotation detail editor and actions box.
In particular on small screens, increasing this can be useful. The sidebar can be configured to take between 10% and 50% of the screen.

The Number of sentences controls how many sentences are visible in the annotation area. The more sentences are visible, the slower the user interface will react.

The Auto-scroll setting controls whether the annotation view is centered on the sentence in which the last annotation was made. This can be useful to avoid manual navigation.

If Use the same color for all tags in a layer is chosen, annotations are colored per layer. If this option is off, then annotations are colored by their labels (all annotations with the same label also have the same color). Mind that there is a limited number of colors, so that eventually colors will be reused.

Export

Annotations are always immediately persisted in the backend database. Thus, it is not necessary to save the annotations explicitly. Also, losing the connection through network issues or timeouts does not cause data loss. To obtain a local copy of the current document, click on the Export button. The following frame will appear:

Choose your preferred format. Please note that the plain text format does not contain any annotations and that files in the binary format need to be unpacked before further usage. For further information on the supported formats, please consult the corresponding chapter Formats.

The document will be saved to your local disk and can be re-imported by a project administrator by adding the document to a project. Please export your data periodically, at least when finishing a document or when not continuing annotation for an extended period of time.

Recommendation

If you have configured one or more recommenders in the Project Settings, you will see recommendations like in the screenshot below.

Clicking Reset in the Workflow area will remove all predictions. However, it will also remove all hand-made annotations.
Predictions made by a specific recommender can be deleted by removing the corresponding recommender in the Project Settings.

Recommendation Sidebar

Clicking the speech bubble on the left opens the recommendation sidebar. There you can set the maximum number of recommendations for each token. Don’t forget to click on Submit.

Active Learning

Active learning is a family of methods that seek to optimize the learning rate of classification algorithms by intelligently soliciting labels from a human user for items on which the system only has low confidence. This means that recommenders can make better suggestions with fewer user interactions, allowing the user to annotate more quickly and more accurately.

Once the recommenders are set in the Project Settings, and assuming the project contains documents for annotation and enough annotations for recommenders to generate recommendations, you can switch to the annotation page. The recommendations should be shown above the tokens:

You can now click the active learning icon on the left side, and the Active Learning sidebar shows up.

You can now select the layer for annotation, e.g. the POS layer, and click Start to start an active learning session:

The Active Learning sidebar will then start showing recommendations, one by one, according to the uncertainty sampling learning strategy. For every recommendation, it shows the related text, the suggested annotation, the confidence score, and a delta that represents the difference between the given score and the closest score calculated for another suggestion made by the same recommender for that text. The recommendation is also highlighted in the central annotation editor. You can now Accept, Reject or Skip this recommendation in the Active Learning sidebar:

The acceptance, rejection or skipping will be recorded and displayed in the learning history of the Active Learning sidebar. After a suggestion is accepted, the text is annotated with the recommended annotation.
If the user rejects a suggestion, the recommendation is deleted. Finally, if the user skips a suggestion, that recommendation continues to be shown in the central annotation editor. It may be offered again at the end of the active learning session, when no unhandled suggestions remain. After the user takes an action on the current suggestion, the next recommendation shows up in the sidebar, and the central annotation editor jumps to its location (which may be in another document).

The learning history contains a log of all the actions taken by the user regarding the suggestions given by the recommenders (acceptances, rejections and skips). Any entry in the learning history can be deleted by clicking the corresponding trash bin icon. If the entry is a valid acceptance, a confirmation dialog pops up after the learning record is deleted, asking whether to delete the annotation too.

The user may finish the current active learning session at any time. If there are pending suggestions, they might be shown in the next active learning session the user starts.

Curation

This functionality is only available to project managers (managers of existing projects), curators, and administrators. Curators and project managers only see projects in which they hold the respective roles.

When navigating to the Curation Page, the procedure for opening projects and documents is the same as in Annotation. The navigation within the document is also the same as in Annotation.

Table 6. Explanation of the project colors in the curation open document dialog
No curatable documents: Red
Curatable documents: Green

Table 7. Explanation of the document colors in the curation open document dialog
New: Black
Annotation in progress: Black
Curation in progress: Blue
Curation finished: Red

In the left frame of the window, named Sentences, an overview of the chosen document is displayed.
Sentences are represented by their number inside the document. Sentences containing a disagreement between annotators are colored in red. Click on a sentence to select it and to edit it in the central part of the page.

The central part of the page contains the Annotation pane, which is a full-scale annotation editor and contains the final data from the curation step. Below it are multiple read-only panes containing the annotations from individual annotators. Clicking on an annotation in any of the annotators’ panes transfers the respective annotation to the Annotation pane.

When a document is opened for the first time in the curation page, the application analyzes agreements and disagreements between annotators. All annotations on which all annotators agree are automatically copied to the Annotation pane. Any annotations on which the annotators disagree are skipped.

The annotators’ panes are color-coded according to their relation to the contents of the Annotation pane and according to the agreement status. If the annotations are the same, they are marked grey in the lower panes. If the annotations differ, they are marked dark blue in the lower panes. By default, they are not taken into the merged file. If you choose an annotation as correct by clicking on it, the chosen annotation turns green in the pane of the corresponding annotator. Also, the annotation will say USE next to the classification. Note that the Annotation pane is not color-coded. It uses whatever coloring strategy is configured in the Settings dialog. The annotations which were not chosen to be in the merged file are marked dark blue. The annotations which were wrongly classified are marked in red.

Table 8.
Explanation of the annotation colors in the annotators’ panes (lower panes)
Grey: all annotators agree
Blue: disagreement requiring curation; annotators disagree and there is no corresponding annotation in the upper Annotation pane yet
Green: accepted; matches the corresponding annotation in the upper Annotation pane
Red: rejected; differs from the corresponding annotation in the upper Annotation pane

Monitoring

This functionality is only available to project managers (managers of existing projects), curators, and administrators. Curators and project managers only see projects in which they hold the respective roles.

As an administrator, you are able to observe the progress and document status of projects you are responsible for. Moreover, you are able to see the time of the last login of every user and observe the agreement between the annotators. After clicking on Monitoring in the main menu, the following page is displayed:

In the right frame, the overall progress of all projects is displayed. The left frame lists all projects in which you have an administrator role. Clicking on one of the projects on the left selects it and opens the following view:

The percentage of the workload completed by individual annotators can be viewed, as well as the number of finished documents.

Below the document overview, a measure of inter-annotator agreement can be selected by opening the Measure dropdown menu. Three measures are available: Cohen’s kappa as implemented in DKPro Statistics, Fleiss' kappa and Krippendorff’s alpha. Below the Measure dropdown menu, an export format can be chosen. Currently, only the CSV format is available. Above the Measure dropdown menu, the Feature box allows the selection of layers for which agreement shall be computed.
Double-clicking on a layer starts the computation of the agreement, and an outline is shown to the left side of the box:

Document Status

The following table explains the symbols that indicate the status of a document for a user and the given task.

Symbol    Meaning
(image)   Annotation has not started yet
(image)   Document not available to user
(image)   Annotation is in progress
(image)   Annotation is complete
(image)   Curation is in progress

You can also alter the document status of annotators. By clicking on the symbols, you can switch between Done and In Progress. You can also switch between the New and Locked status. The second column of the document status frame displays the status of the curation. As there is only one curator per document, the curation status is not divided by individual curators.

Scrolling down, two further frames become visible. The left one, named Layer, allows you to choose a layer for which pairwise kappa agreement between annotators will be calculated.

Agreement

Agreement can be inspected on a per-feature basis and is calculated pairwise between all annotators across all documents. The first time a feature is selected for agreement inspection, it takes a moment to calculate the differences between the annotated documents. Switching between different features subsequently is much faster.

Agreement is calculated in two steps:

1. Generation of positions and configuration sets - all documents are scanned for annotations, and annotations located at the same position are collected in configuration sets. To determine whether two annotations are at the same position, different approaches are used depending on the layer type. For a span layer, the begin and end offsets are used. For a relation layer, the begin and end offsets of the source and target annotations are used. Chains are currently not supported.
2. Calculation of pairwise agreement - based on the generated configuration sets, agreement is calculated.
There are two cases where a configuration set may be omitted from the pairwise agreement calculation:
a. one of the users did not make an annotation at the position;
b. one or both of the users did not assign a value to the feature on which agreement is calculated at the position.

The lower part of the agreement matrix displays how many configuration sets were used to calculate agreement and how many were found in total. The upper part of the agreement matrix displays the pairwise Cohen’s kappa scores.

The agreement calculation considers an unset feature (with a null value) to be equivalent to a feature with the value of an empty string. Empty strings are considered valid labels and are not excluded from the agreement calculation.

Annotations for a given position are considered complete when both annotators have made an annotation. Unless the agreement measure supports null values (i.e. missing annotations), incomplete annotations are implicitly excluded from the agreement calculation. If the agreement measure does support incomplete annotations, then whether to exclude them is the user's choice.

Table 9. Possible combinations for agreement
Feature value annotator 1    Feature value annotator 2    Agreement    Complete
X                            X                            yes          yes
X                            Y                            no           yes
no annotation                Y                            no           no
empty                        Y                            no           yes
empty                        empty                        yes          yes
null                         empty                        yes          yes
empty                        no annotation                no           no

Multiple interpretations in the form of stacked annotations are not supported in the agreement calculation! This also includes relations for which the source or target spans are stacked.

Projects

This functionality is only available to project managers (managers of existing projects), project creators (users with the ability to create new projects), and administrators. Project managers only see projects in which they hold the respective role. Project creators only see projects in which they hold the project manager role.

This is the place to specify/edit annotation projects.
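As a side note to the Agreement section above, the two-step calculation for a span layer can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by INCEpTION or DKPro Statistics: the tuple layout of the input, the function names, and the handling of the degenerate case are all hypothetical. Stacked annotations are not handled, in line with the restriction stated above.

```python
from collections import Counter, defaultdict
from itertools import combinations


def configuration_sets(annotations):
    """Step 1: group span annotations by position (begin, end).

    `annotations` is an iterable of (annotator, begin, end, feature_value)
    tuples (hypothetical layout). A null value is treated like an empty string.
    """
    sets = defaultdict(dict)
    for annotator, begin, end, value in annotations:
        sets[(begin, end)][annotator] = "" if value is None else value
    return sets


def cohen_kappa(pairs):
    """Cohen's kappa for a list of (label_a, label_b) pairs."""
    n = len(pairs)
    if n == 0:
        return None
    observed = sum(a == b for a, b in pairs) / n
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return None  # undefined: both annotators used one identical label only
    return (observed - expected) / (1.0 - expected)


def pairwise_agreement(annotations):
    """Step 2: kappa per annotator pair, plus used and total set counts."""
    annotations = list(annotations)
    sets = configuration_sets(annotations)
    annotators = sorted({entry[0] for entry in annotations})
    result = {}
    for a, b in combinations(annotators, 2):
        # Positions where one of the two made no annotation are incomplete
        # and are left out of the calculation.
        pairs = [(cfg[a], cfg[b]) for cfg in sets.values() if a in cfg and b in cfg]
        result[(a, b)] = (cohen_kappa(pairs), len(pairs), len(sets))
    return result
```

The second and third elements of each result correspond to the counts shown in the lower part of the agreement matrix: how many configuration sets were used and how many were found in total.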
You can either select one of the existing projects for editing, or click Create Project to add a project.

Although correction and automation projects function similarly, their management differs after the creation of the document. For further description, see the corresponding chapters [sect_automation] and [sect_correction].

Only admins are allowed to create projects. Click on Create Project to create a new project.

After doing so, a new pane is displayed, where you can name and describe your new project. It is also important to choose the kind of project you want to create. You have the choice between annotation, automation, and correction. Please do not forget to save.

After saving the details of the new project, it can be treated like any other existing project. Also, a new pane with many options to organize the project is displayed.

To delete a project, click on it in the frame Details. The project details are displayed. Now, click on Delete.

The pane with the options to organize and edit a project, as described above, can also be reached by clicking on the desired project in the left frame.

By clicking on the tabs, you can now set up the chosen project.

Users

After clicking on Users, a new pane is displayed in which you can add new users by clicking on the button Add User. After doing so, you get a list of users in the system who can be added to the project. Tick the checkbox in front of a login to select that user. Please do not forget to save after choosing all members of the project. Close the pane by clicking on Cancel. Users added this way have the rights of an annotator. If you want to extend a user’s permissions, you can do so by clicking on the user and then on Change Permission. The following frame will pop up.

After ticking the desired permissions, click on Update.

To remove a user, click on the login and then Remove User.
Documents

To add or delete documents, click on the tab Documents in the project pane. Two frames will be displayed. In the first frame, you can import new documents. Choose a document by clicking on Choose Files. Please mind the format, which you have to choose above. Then click on Import Document. The imported documents can be seen in the frame below. To delete a document from the project, click on it and then click on Delete in the lower right corner.

Layers

All annotations belong to an annotation layer. Each layer has a structural type that defines whether it is a span, a relation, or a chain. It also defines how the annotations behave and what kind of features they carry.

Creating a custom layer

This section provides a short walkthrough on the creation of a custom layer. The following sections act as reference documentation providing additional details on each step. In the following example, we will create a custom layer called Sentiment with a feature called Polarity that can be negative, neutral, or positive.

1. Create the layer Sentiment
   ◦ Go to the Layers tab in your project’s settings and press the Create layer button
   ◦ Enter the name of the layer in Layer name: Sentiment
   ◦ Choose the type of the layer: Span
   ◦ Enable Allow multiple tokens because we want to mark sentiments on spans longer than a single token.
   ◦ Press the Save layer button
2. Create the feature Polarity
   ◦ Press the New feature button
   ◦ Choose the type of the feature: uima.cas.String
   ◦ Enter the name of the feature: Polarity
   ◦ Press Save feature
3. Create the tagset Polarity values
   ◦ Go to the Tagsets tab and press Create tagset
   ◦ Enter the name of the tagset: Polarity values
   ◦ Press Save tagset
   ◦ Press Create tag, enter the name of the tag: negative, press Save tag
   ◦ Repeat for neutral and positive
4. Assign the tagset Polarity values to the feature Polarity
   ◦ Back in the Layers tab, select the layer: Sentiment and select the feature: Polarity
   ◦ Set the tagset to Polarity values
   ◦ Press Save feature

Now you have created your first custom layer.

Built-in layers

INCEpTION comes with a set of built-in layers that allow you to start annotating immediately. Also, many import/export formats only work with these layers as their semantics are known. For this reason, the ability to customize the behaviors of built-in layers is limited and it is not possible to extend them with custom features.

Table 10. Built-in layers
Layer                   Type                 Enforced behaviors
Chunk                   Span                 Lock to multiple tokens, no stacking, no sentence boundary crossing
Coreference             Chain                (no enforced behaviors)
Dependency              Relation over POS    No stacking, no sentence boundary crossing
Lemma                   Span                 Locked to token offsets, no stacking, no sentence boundary crossing
Named Entity            Span                 (no enforced behaviors)
Part of Speech (POS)    Span                 Locked to token offsets, no stacking, no sentence boundary crossing

The coloring of the layers signals the following:

Table 11. Color legend
green: built-in annotation layer, enabled
blue: custom annotation layer, enabled
red: disabled annotation layer

To create a custom layer, select Create Layer in the Layers frame. Then, the following frame will be displayed.

Properties

Table 12. Properties
Layer name: The name of the layer (obligatory)
Description: A description of the layer. This information will be shown in a tooltip when the mouse hovers over the layer name in the annotation detail editor panel.
Enabled: Whether the layer is enabled or not. Layers can currently not be deleted, but they can be disabled.

When a layer is first created, only ASCII characters are allowed for the layer name because the internal UIMA type name is derived from the initial layer name. After the layer has been created, the name can be changed arbitrarily.
The internal UIMA type name will not be updated. The internal UIMA name is used, for example, when exporting data or in constraint rules.

Technical Properties

In the frame Technical Properties, the user may select the type of annotation that will be made with this layer: span, relation, or chain.

Table 13. Technical Properties
Internal name: Internal UIMA type name
Type: The type of the layer (obligatory, see below)
Attach to layer (Relations): Determines which span layer a relation attaches to. Relations can only be created between annotations of this span layer.

The layer type defines the structure of the layer. Three different types are supported: spans, relations, and chains.

Table 14. Layer types
Span: Continuous segment of text delimited by a start and end character offset. The example shows two spans.
Relation: Binary relation between two spans visualized as an arc between spans. The example shows a relation between two spans.
Chain: Directed sequence of connected spans in which each span connects to the following one. The example shows a single chain consisting of three connected spans.

For relation annotations, the type of the spans which are to be connected can be chosen in the field Attach to layer. Here, only non-default layers are displayed. To create a relation, the span annotations need to be created first. Currently, for each span layer there can be at most one relation layer attaching to it.

It is currently not possible to create relations between spans in different layers. For example, if you define span layers called Men and Women, it is impossible to define a relation layer Married to between the two. To work around this limitation, create a single span layer Person with a feature Gender instead. You can then set the feature Gender to Man or Woman and define a relation layer Married to attaching to the Person layer.

Behaviors

Table 15.
Behaviors
Read-only: The layer may be viewed but not edited.
Lock to token offsets (span, chain): Annotation boundaries are forced to coincide with token boundaries. If the selection is smaller than a token, the annotation is expanded to the next larger token covering the selection. If the selection covers multiple tokens, the annotation is reduced to the first covered token.
Allow multiple tokens (span, chain): Like Lock to token offsets except that the annotation may cover multiple tokens. If this is enabled, then Lock to token offsets is automatically disabled.
Allow stacking: Allow multiple annotations in this layer to be made at exactly the same position. If this option is disabled, a new annotation made at the same location as an existing annotation will replace the existing annotation.
Allow crossing sentence boundary (chain): Allow annotations to cross sentence boundaries.
Behave like a linked list: Controls what happens when two chains are connected with each other. If this option is disabled, then the two entire chains will be merged into one large chain. Links between spans will be changed so that each span connects to the closest following span; no arc labels are displayed. If this option is enabled, then the chains will be split if necessary at the source and target points, reconnecting the spans such that exactly the newly created connection is made; arc labels are available.

In order to create sub-token annotations, both Lock to token offsets and Allow multiple tokens need to be disabled.

Features

In this section, features and their properties can be configured.

When a feature is first created, only ASCII characters are allowed for the feature name because the internal UIMA name is derived from the initial feature name. After the feature has been created, the name can be changed arbitrarily. The internal UIMA feature name will not be updated. The internal UIMA name is used, for example, when exporting data or in constraint rules.

Table 16. Feature properties
Internal name: Internal UIMA feature name
Type: The type of the feature (obligatory, see below)
Name: The name of the feature (obligatory)
Description: A description that is shown when the mouse hovers over the feature name in the annotation detail editor panel.
Enabled: Features cannot be deleted, but they can be disabled
Show: Whether the feature value is shown in the annotation label. If this is disabled, the feature is only visible in the annotation detail editor panel.
Remember: Whether the annotation detail editor should carry values of this feature over when creating a new annotation of the same type. This can be useful when creating many annotations of the same type in a row.
Tagset (String): The tagset controlling the possible values for a string feature.

The following feature types are supported.

Table 17. Feature types
uima.cas.String: Textual feature that can optionally be controlled by a tagset. It is rendered as a text field or as a combobox if a tagset is defined.
uima.cas.Boolean: Boolean feature that can be true or false and is rendered as a checkbox.
uima.cas.Integer: Numeric feature for integer numbers.
uima.cas.Float: Numeric feature for decimal numbers.
uima.tcas.Annotation (Span layers): Link feature that can point to any arbitrary span annotation
other span layers (Span layers): Link feature that can point only to the selected span layer.

Please take care that when working with non-custom layers, they have to be exported and imported if you want to use the resulting files in e.g. correction projects.

Tagsets

To administer the tagsets, click on the tab Tagsets in the project pane.

To administer one of the existing tagsets, select it by clicking on it. Then, the tagset characteristics are displayed.
In the frame Tagset details, you can change them, export the tagset, save the changes you made to it, or delete it by clicking on Delete tagset. To change an individual tag, select it in the list displayed in the frame Tags. You can then change its description or name, or delete it by clicking Delete tag in Tag details. Please do not forget to save your changes by clicking on Save tag. To add a new tag, click on Create tag in Tag details. Then add the name and, optionally, the description. Again, do not forget to click Save tag or the new tag will not be created.

To create your own tagset, click on Create tagset and fill in the fields that are displayed in the new frame. Only the first field is obligatory. Adding new tags works the same way as described for existing tagsets. If you want free-text annotation, as could be used for lemma or meta information annotation, do not add any tags.

To export a tagset, choose the export format at the bottom of the frame and click Export tagset.

Constraints

To import a constraints file, go to Project and click on the particular project name. On the left side of the screen, a tab bar opens. Choose Constraints. You can now choose a constraints file by clicking on Choose Files. Then, click on Import. Upon import, the application checks whether the constraints file is well formed. If it conforms to the rules for writing constraints, the constraints are applied.

Guidelines

To add or delete guidelines, which will be accessible to users in the project, select the tab Guidelines. Two new frames will be displayed. To upload guidelines, click on Choose files in the first frame (Add guideline document), select a file from your local disk and then click Import guidelines. Uploaded guidelines are displayed in the second frame (Guideline documents). To delete a guideline document, click on it and then on Delete in the lower right corner of the frame.
Import

This functionality is only available to administrators.

Projects are associated with the accounts of users that act as project managers, annotators, or curators. When importing a previously exported project, you can choose to automatically generate missing users (enabled by default). If this option is disabled, projects still maintain their association to users by name. If the respective user accounts are created manually after the import, the users will start showing up in the projects.

Generated users are disabled and have no password. They must be explicitly enabled and a password must be set before the users can log in again.

Export

Two modes of exporting projects are supported:

• Export the whole project for the purpose of creating a backup, of migrating it to a new INCEpTION version, of migrating to a different INCEpTION instance, or simply in order to reimport it as a duplicate copy.
• Export curated documents for the purpose of getting easy access to the final annotation results. If you do not have any curated documents in your project, this export option is not offered.

The format of the exported annotations is selected using the Format drop-down field. When AUTO is selected, the file format corresponds to the format of the source document. If there is no write support for the source format, the file is exported in the WebAnno TSV3 format instead.

Some browsers automatically extract ZIP files into a folder after the download. Zipping this folder and trying to re-import it into the application will generally not work because the process introduces an additional folder level within the archive. The best option is to disable the automatic extraction in your browser. E.g. in Safari, go to Preferences → General and disable the setting Open "safe" files after downloading.

When exporting a whole project, the structure of the exported ZIP file is as follows:

•.json - project metadata file
• annotation
◦