
GASATaD – USER'S GUIDE
Version 1.4
Daniel Pereira Alonso, Leandro Rodríguez-Liñares, María J. Lado
Escola Superior Enxeñería Informática de Ourense
Universidade de Vigo
Ourense, Spain
dpalonso@esei.uvigo.es, leandro@uvigo.es, mrpepa@uvigo.es

TABLE OF CONTENTS
1. OVERVIEW
2. GETTING STARTED
2.1. License and Disclaimer
2.2. System Requirements
2.3. Downloading GASATaD
2.4. Installing GASATaD
2.5. Uninstalling GASATaD
2.6. Updating GASATaD
2.7. Known issues
3. USING GASATaD
3.1. Running GASATaD
3.2. Main window
3.3. Task Bar
3.3.1. File
3.3.2. Edit
3.3.3. Options
3.3.4. About
3.4. Left Panel
3.4.1. Basic statistics
3.4.2. Significance tests
3.4.3. Histogram Plot
3.4.4. Scatter Plot
3.4.5. Pie Chart
3.4.6. Box Plot
3.4.7. Bar Chart

1. OVERVIEW
Statistical analysis is becoming increasingly important, since measurement systems,
applications and programs yield vast amounts of data that need to be analyzed. When
performing statistical analysis, data collection becomes an essential task [1]. Moreover,
data can be available in many different forms, depending on the type of study. Apart
from the data format, the selection of adequate modeling and analysis methods is
fundamental to obtaining coherent, conclusive results [2].
At the moment, many statistical software packages capable of dealing with data analysis can be
found. Many of them are commercial, proprietary applications, and users must pay for a license of
use and/or maintenance. As an alternative, more and more free, open source programs are being
published, allowing users to perform statistical analyses without having to purchase a license.
Sometimes, due to the nature of the study to be performed, standard software routines are not
adequate, and customization of the existing code is a good option. This is one of the main
advantages of free software: users can modify the code to adapt it to their particular needs.
Frequently, researchers need to unify software functionality into one open source and easily
extensible tool. To address these needs, we have developed GASATaD, a free, open source software
package for statistical analysis that can be used to analyze data coming from different files.
GASATaD includes an intuitive graphical interface, since the central ideas behind its design are
ease of installation and ease of use.
GASATaD has been implemented using Python [3], a programming language based on the
object-oriented programming paradigm (although it also supports imperative and functional
programming) that gives clean and legible code, thus improving software maintenance. Besides,
it makes quite efficient use of memory and is very extensible, thanks to the libraries available
to programmers. The main functionalities of GASATaD are:
● It can merge data from more than one file (.csv and .xlsx formats can be imported).
● It can calculate basic statistics.
● It can be used to compare data by means of different significance tests.
● It can plot data and export figures to different graphic formats.

Details of the implementation and installation instructions are given in the next Section.

[1] Efron, Bradley, and Robert Tibshirani. Statistical data analysis in the computer age. Science
(1991): 390-395.
[2] Ott, R. Lyman, and Michael T. Longnecker. An introduction to statistical methods and data
analysis. Nelson Education, 2015.
[3] http://python.org

2. GETTING STARTED
2.1. License and Disclaimer
Copyright (c) 2018 LIA2 Research Group, University of Vigo, Spain
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or
substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

2.2. System Requirements
Binaries of GASATaD are available for Microsoft Windows, Debian-based GNU Linux
platforms and Apple MacOS. Minimal system requirements are:
● Microsoft Windows: Windows XP or newer, 32 or 64 bits.
● Apple MacOS: Mac OS X Lion or newer.
● GNU Linux: any recent distribution based on deb packages. Tested on Ubuntu 16.04 LTS
Xenial Xerus, Debian 9 stretch and Linux Mint 18.3 Sylvia.

2.3. Downloading GASATaD
GASATaD is available from https://milegroup.github.io/gasatad/. To download GASATaD,
click either on the DOWNLOAD button or on the DOWNLOAD link in the upper right corner,
and select the version corresponding to your operating system (Figure 1). Version 1.4 is a stable and
recommended version, and is the one explained in this manual. In this documentation, X and Y refer
to the version and the subversion of the GASATaD program.
Apart from the binaries for the different operating systems, you can also download the source
code as a .zip file, as well as example files in .csv and .xlsx formats.


Figure 1. Binary files downloadable from

2.4. Installing GASATaD
Installation of GASATaD is as follows.
Microsoft Windows
The Windows binary is available as a GASATaD_X_Y.msi file. Installation is straightforward for
each version: double-click on the file and follow the instructions to complete the installation process.
Once the license agreement has been accepted, GASATaD is installed on your system. Installation
also creates a link to the tool on your desktop.
Apple MacOS
Installation of GASATaD on a MacOS system is done by a script available on the program web
page. Simply open a terminal and paste the following text:
bash <(curl -fsSL https://github.com/milegroup/gasatad/raw/master/docs/packages/GASATaD_1_4.MacInstall.sh)

This installs GASATaD and all the packages it depends on (root access rights are required). The
download and installation process takes some minutes and needs approximately 900 MB of disk
space.
GNU Linux
GASATaD is distributed as a GASATaD_X.Y_all.deb package. The easiest way to install is to
download this package, open a terminal and change to the directory where the file is. Then use the
following commands:
$ sudo apt install gdebi
$ sudo gdebi GASATaD_X.Y_all.deb

gdebi will install GASATaD and its dependencies. The program is available both in the Start
menu and in console mode as GASATaD.
You can also install GASATaD using the Software Centre available in some distributions of
GNU Linux.


Source Code
GASATaD sources are distributed as a file named GASATaD_X.Y.zip. Advanced users can
run the program from the source code. A working installation of the Python programming
language (version 2) is needed, including the following libraries:
● Matplotlib
● wxPython
● Numpy
● Scipy
● Pandas
● xlrd, xlwt and openpyxl
In Debian-based Linux systems, just open a terminal and use the following command:
$ sudo apt install python-numpy python-wxgtk3.0 python-matplotlib python-scipy python-pandas python-xlrd python-xlwt python-openpyxl

Then, go to the directory containing the .zip file, uncompress it and use:
$ python GASATaD_X_Y.py
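
If the program fails to start from source, a missing library is a likely cause. The following is a minimal sketch, not part of GASATaD, that reports which of the libraries listed above cannot be imported. The helper name missing_modules is hypothetical, and the import names (for example wx for wxPython) are the usual ones for these libraries. It should be run with the same Python interpreter that will run GASATaD.

```python
import importlib

# Import names of the libraries listed above (wxPython is imported as "wx").
REQUIRED = ["matplotlib", "wx", "numpy", "scipy", "pandas", "xlrd", "xlwt", "openpyxl"]


def missing_modules(names):
    """Return the names that the current interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries: " + ", ".join(missing))
    else:
        print("All required libraries were found.")
```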

2.5. Uninstalling GASATaD
To remove GASATaD from your computer, use the following instructions.
Microsoft Windows
To uninstall GASATaD, go to Start -> Control Panel -> Add or Remove programs, select
GASATaD and press the Remove button.
Apple MacOS
To uninstall GASATaD, just open a terminal and type:
bash <(curl -fsSL https://github.com/milegroup/gasatad/raw/ghpages/packages/GASATaD.MacUninstall.sh)

GNU Linux
To uninstall GASATaD, open a terminal and use the command:
$ sudo apt remove GASATaD

2.6. Updating GASATaD
When a new version is available, GASATaD can be updated simply by installing the new
version as if it were a fresh install.

2.7. Known issues
Apple MacOS
There is a problem with window focus in the Apple MacOS implementation of the wxPython
libraries. As a result, sometimes, just after running GASATaD, new windows are not
shown. Normally, clicking on the GASATaD icon in the dock solves this problem.

GNU Linux
If the Linux version is launched from a terminal, a warning (Gtk-WARNING **: Unable to
retrieve the file info...) may appear in the terminal when saving a new project. This is caused by
the implementation of file dialogs in some libraries. Users can ignore this message because the
file is saved correctly.


3. USING GASATaD
3.1. Running GASATaD
GASATaD provides a wide range of functionalities related to statistical analysis, and can be
easily and intuitively used, presenting results in a very illustrative and attractive way for the user. To
run GASATaD just double-click on the corresponding icon (Figure 2).

Figure 2. GASATaD icon: double-click for running the application.

When opening GASATaD, a notification button at the lower left corner may appear if a new
version is available (Figure 3). By clicking on this button, the new version can be downloaded and
installed.

Figure 3. Initial view for GASATaD indicating new version.

Once the application is running, the appearance of GASATaD is quite similar to that of other
widely used software packages (Figure 4).


Figure 4. Initial view for GASATaD.

3.2. Main window
The main window of GASATaD is composed of several areas, each of which gives the user
access to part of its functionalities:
● Task bar: contains the File, Edit, Options and About menus.
● Right panel: in this area, data are displayed when files are loaded. Columns are labelled
with letters in alphabetical order, while rows are numbered from 1 to the total number of
rows of the file. Initially, interactions with rows and columns are not allowed; operations
on rows and columns only become possible once files are loaded.
● Left panel: two areas can be distinguished. The upper one shows some information about
the data; just under this area are the options that allow users to manage data and files and
to perform statistical analyses (basic statistics or significance tests). Users can also
generate a wide variety of plots. Initially, all these options are disabled; they become
available automatically when a data file is loaded.

When starting GASATaD, only a few actions can be performed: the user can open a new file
(File menu), select the file format options (Options menu) or consult the information in the About
menu. The remaining functionalities are activated once data are available and the corresponding
action becomes possible.
A detailed description of all the options is given in the next Sections of this manual. The
data used for demonstration purposes are fictitious, specifically fabricated for this task. They can be
downloaded as files testfile1.xlsx and testfile2.xlsx, available on the GASATaD website.

3.3. Task Bar
Menus File, Edit, Options and About are included in the task bar, allowing users to perform
different operations.


3.3.1. File
Open new file...
A new file to process can be opened with this option. As an example, we have used
testfile1.xlsx. The shortcut Ctrl+N can also be used to open the file, as well as the Open new file
option on the left panel. Once the file is opened, all the options of GASATaD become available to
users. Numerical values appear in white cells, while a yellow background is reserved for textual,
categorical variables.
Add file...
Any number of additional files can be added to the data by either clicking on this option, using
the shortcut Ctrl+O, or by clicking on the Add file button on the left panel. For this to be performed,
the number of rows of the previous data must match the number of rows of the file. If not, an error
message appears (Figure 5).

Figure 5. Error message displayed when opening files with a different number of rows.

When an additional file is correctly opened, its data are added to the existing data as columns. If
column names coincide, some characters are added to the name of the new column (in Figure 6,
“Case” is renamed to “Case_2”). Cells can be edited using double-click, and values can be modified.
Empty cells are marked as “null”, and are not considered in the different analyses.

Figure 6. Combining data from different files in GASATaD.
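
The column-wise merge described above can be sketched with the Pandas library. This is an illustration under stated assumptions, not GASATaD's actual code: the function name add_columns, the error message and the sample values are made up, and only the "_2" suffix follows the behaviour shown in Figure 6.

```python
import pandas as pd


def add_columns(current, new):
    """Append the columns of `new` to `current`, renaming clashing names."""
    if len(current) != len(new):
        raise ValueError("Files have a different number of rows")
    taken = set(current.columns)
    renamed = {}
    for name in new.columns:
        candidate, suffix = name, 2
        # Try name_2, name_3, ... until the column name is unique.
        while candidate in taken:
            candidate = "{}_{}".format(name, suffix)
            suffix += 1
        taken.add(candidate)
        renamed[name] = candidate
    return pd.concat(
        [current.reset_index(drop=True),
         new.rename(columns=renamed).reset_index(drop=True)],
        axis=1,
    )


# Made-up values, for illustration only.
first = pd.DataFrame({"Case": [1, 2, 3], "Age": [34, 51, 27]})
second = pd.DataFrame({"Case": [1, 2, 3], "Height": [1.62, 1.80, 1.75]})
merged = add_columns(first, second)
print(list(merged.columns))  # ['Case', 'Age', 'Case_2', 'Height']
```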


Save data...
The data present in GASATaD can be saved in a new file by clicking on the Save data... option
of the File menu. Alternatively, users can type Ctrl+S, or press the Save data button on the left
panel. As an example, combined data from files testfile1.xlsx and testfile2.xlsx were saved. A
dialogue box appears, and the user can select the output format as either .csv or .xlsx. If the user
selects the .csv option, a dialogue window with the csv export options is opened.
Close data
Data can be closed by clicking on this option, typing Ctrl+W, or pressing the corresponding
button on the left panel. When this operation is finished, functionalities are disabled, until the user
opens a new file.
Quit
To exit the application, this option should be pressed or, alternatively, the shortcut Ctrl+Q must
be typed.
3.3.2. Edit
This menu includes functionalities to work with data in rows and columns. When one or more
rows or columns are selected, the corresponding options become available. Many of the options
for columns are also displayed by selecting the corresponding column and clicking the right
mouse button (Figure 7). Each option is explained below.

Figure 7. Functionalities on columns available using right-click.

Undo
The last operation can be undone by clicking on this option.
Delete selected columns/rows
When a row/column (or a group of them) is selected, users have the possibility of deleting it. This
task can also be performed by clicking the right mouse button. In our example, column “Case_2” has
been removed from the data (Figure 8).


Figure 8. Deleting a selected column.

Rename selected column
By pressing this button, a dialogue box appears, and the column/row name can be changed
(Figure 9).

Figure 9. Renaming a column.

Move selected column
Sometimes, it can be useful to move columns from one position to another. This can be
done with this option. A dialogue box is displayed in which the user can select the new position
for the selected column (Figure 10).

Figure 10. Moving a selected column to another position.


Sort using selected column
By selecting a column and clicking on this option, all data can be sorted according to the
selected column. As an example, rows were sorted according to increasing values of the “Height”
column (Figure 11).

Figure 11. Sorting rows according to “Height” column.

Convert selected column to text
On some occasions, it becomes necessary to convert numerical values to text, which means that
numerical variables become categorical variables. This functionality is available in GASATaD
through this option. As an example, the “Age” column has been converted to text, and its values
now appear with a yellow background, as expected (Figure 12).

Figure 12. Converting “Age” data to text.

Convert selected column to numbers
Similarly, text data can be converted to numerical values by clicking on this option.
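
For readers who want to reproduce the sorting and conversion operations described above outside the program, the following is a minimal sketch using the Pandas library. It is an assumption about one possible way to obtain the same results, not GASATaD's own implementation, and the sample values are made up.

```python
import pandas as pd

# Made-up values, for illustration only.
data = pd.DataFrame({"Age": [34, 51, 27], "Height": [1.80, 1.62, 1.75]})

# Sort all rows by increasing values of the "Height" column.
by_height = data.sort_values("Height").reset_index(drop=True)

# Numbers to text: the column now holds strings, i.e. categorical values.
as_text = by_height.copy()
as_text["Age"] = as_text["Age"].astype(str)

# Text back to numbers; entries that are not numeric would become NaN.
as_numbers = as_text.copy()
as_numbers["Age"] = pd.to_numeric(as_numbers["Age"], errors="coerce")

print(by_height["Height"].tolist())  # [1.62, 1.75, 1.8]
print(as_text["Age"].tolist())       # ['51', '27', '34']
print(as_numbers["Age"].tolist())    # [51, 27, 34]
```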
Add text column...
This option allows users to incorporate new categorical data from a numerical column. A
dialogue box appears and the user can select the new category name. In the example, a “Smoker
Category” column was added from data belonging to the “Number of cigarettes per day” column
(Figure 13).

Figure 13. Creating the “Smoker Category” column.

As a result, a new text column was added to the right side of the data (Figure 14).

Figure 14. Data incorporating the new column constructed.
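
Deriving a categorical column from a numerical one amounts to assigning each value to a labelled interval. The sketch below shows this idea with the Pandas function cut. The thresholds and the labels other than the column names are hypothetical, chosen only for this illustration; they are not the ones used in the example of Figure 13.

```python
import pandas as pd

# Made-up values, for illustration only.
data = pd.DataFrame({"Number of cigarettes per day": [0, 3, 12, 25, 40]})

# Hypothetical interval limits and labels: (-1, 0], (0, 10], (10, 20], (20, inf].
bins = [-1, 0, 10, 20, float("inf")]
labels = ["Non-smoker", "Light", "Moderate", "Severe"]

# Each number is replaced by the label of the interval it falls into,
# and the result is stored as a new text column on the right.
data["Smoker Category"] = pd.cut(
    data["Number of cigarettes per day"], bins=bins, labels=labels
).astype(str)

print(data["Smoker Category"].tolist())
# ['Non-smoker', 'Light', 'Moderate', 'Severe', 'Severe']
```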

Delete columns
Finally, one or more columns can be simultaneously selected and deleted by using this option.
A dialogue box appears, and the user can select columns to be deleted (Figure 15).

Figure 15. Deleting a set of selected columns.

3.3.3. Options
Discard first column in csv files
Sometimes, the first column of the CSV file contains information that is irrelevant to the
study. In this case, it can be discarded when opening the file by selecting this
option.
CSV character separator
With this menu entry, the CSV character separator of the file can be selected. Options are
Comma, Semicolon, and Tabulator.
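
The effect of these two options can be illustrated with the csv module of the Python standard library. This is a sketch under stated assumptions, not GASATaD's code: the function name read_csv_text, its parameters and the sample text are hypothetical.

```python
import csv
import io

# The three separators offered in the Options menu.
SEPARATORS = {"Comma": ",", "Semicolon": ";", "Tabulator": "\t"}


def read_csv_text(text, separator="Comma", discard_first_column=False):
    """Parse CSV text and return the header and the list of data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=SEPARATORS[separator])
    rows = [row for row in reader if row]  # skip empty lines
    if discard_first_column:
        rows = [row[1:] for row in rows]
    return rows[0], rows[1:]


# Made-up content, for illustration only.
sample = "Id;Age;Height\n1;34;1.80\n2;51;1.62\n"
header, rows = read_csv_text(sample, separator="Semicolon", discard_first_column=True)
print(header)  # ['Age', 'Height']
print(rows)    # [['34', '1.80'], ['51', '1.62']]
```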


Reset options
By using this entry, all the options are reset to their default values.
3.3.4. About
About GASATaD
It opens a dialogue box with general information about the program.

Figure 16. About GASATaD.

3.4. Left Panel
3.4.1. Basic statistics
GASATaD automatically identifies columns containing only integer values, which can be used
to select a range of data for analysis. In our example, Case, Age, Years of Schooling, Age of
initiation in smoking, and Number of cigarettes per day are detected as integer values, and ranges
can be established to perform different analyses (Figure 17).

Figure 17. Detection of integer values (in bold) allowing to establish numerical ranges.


If any value in one of these columns is changed to a non-integer value, the column is no longer
included in this list.
A quantitative analysis of numerical data can be performed by pressing the Basic statistics
button. A dialogue box appears, and the user can select one or more numerical columns to perform
the analysis. As an example, Figure 18 shows the statistical analysis obtained for the “Age of
initiation in smoking” column, for women categorized as severe smokers. Number of cases,
minimum, maximum, mean, median and mode of values are presented, as well as standard
deviation, variance and covariance, 25%-50%-75% quartiles, Pearson correlation, kurtosis and
data skew.

Figure 18. Basic statistics.
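
To make the single-column quantities listed above concrete, the following sketch computes them with the statistics module of the Python standard library (version 3.8 or newer). It is not GASATaD's implementation: the function name basic_statistics and the sample values are made up, and the conventions used here (sample variance, default quartile interpolation, moment-based skewness and excess kurtosis) are assumptions that may differ from those of the program.

```python
import statistics


def basic_statistics(values):
    """Summarise one numerical column; empty ("null") cells must be removed first."""
    n = len(values)
    mean = statistics.mean(values)
    # Central moments, used for the moment-based skewness and excess kurtosis.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "cases": n,
        "minimum": min(values),
        "maximum": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "standard deviation": statistics.stdev(values),  # sample (n - 1) version
        "variance": statistics.variance(values),         # sample (n - 1) version
        "quartiles": (q1, q2, q3),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Made-up values, for illustration only.
summary = basic_statistics([14, 15, 15, 16, 18, 21])
for name, value in summary.items():
    print("{}: {}".format(name, value))
```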

3.4.2. Significance tests
Several significance tests (standard t-test, Welch’s unequal variances t-test, Kolmogorov-Smirnov
and Wilcoxon rank-sum) can be used to compare sets of data. GASATaD does not
check whether the test selected by the user is the most appropriate for a specific set of data; thus,
it is the user’s responsibility to verify the validity of the test applied.
When the Significance test button is pressed, a dialogue box appears, and options to perform
the analysis can be selected. When data and subsets by category have been selected, and the Show
Results option has been clicked, results appear on the right panel of the dialogue box (Figure 19).


Figure 19. Significance tests.
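
The four tests named above are all available in the Scipy library, so they can be reproduced on two subsets of data as sketched below. Whether GASATaD calls exactly these functions, and with which settings, is an assumption; the two groups contain made-up values.

```python
from scipy import stats

# Made-up values for two subsets of the same variable, for illustration only.
group_a = [14, 15, 15, 16, 18, 21, 17, 16]
group_b = [17, 19, 20, 18, 22, 21, 23, 19]

results = {
    "Standard t-test": stats.ttest_ind(group_a, group_b),
    "Welch's unequal variances t-test": stats.ttest_ind(group_a, group_b, equal_var=False),
    "Kolmogorov-Smirnov": stats.ks_2samp(group_a, group_b),
    "Wilcoxon rank-sum": stats.ranksums(group_a, group_b),
}

# Each result carries the test statistic and the associated p-value.
for name, result in results.items():
    print("{}: statistic={:.3f}, p-value={:.4f}".format(
        name, result.statistic, result.pvalue))
```

A small p-value suggests that the two subsets differ, but, as noted above, each test has its own assumptions that the user must check.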

3.4.3. Histogram Plot
Data can be plotted in different ways. The first option is the histogram plot. By clicking
on the corresponding icon, a dialogue box appears, where users can indicate their preferences, such
as title, axis labels, display settings, legend position, x variable and tag (Figure 20).

Figure 20. Histogram plot generation.

The corresponding plot is then generated, according to users’ selection (Figure 21).


Figure 21. Final histogram plot generated with GASATaD.

3.4.4. Scatter Plot
In a similar way, a scatter plot can be generated when the corresponding button is pressed.
Users can select title, axis labels, display settings, legend, and x and y (one or more) variables.
Furthermore, linear fit can be also estimated and plotted in the graph (Figure 22).

Figure 22. Scatter plot generation.


Figure 23 shows the final result for the scatter plot.

Figure 23. Final scatter plot generated with GASATaD.
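
A scatter plot with a linear fit like the one described above can be approximated with Numpy and Matplotlib, as in the following sketch. It is an illustration under stated assumptions, not the code GASATaD runs: the data, title and labels are made up, and the fit is an ordinary least-squares straight line.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up values, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares straight line y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

fig, ax = plt.subplots()
ax.scatter(x, y, label="data")
ax.plot(x, slope * x + intercept, label="linear fit")
ax.set_title("Scatter plot with linear fit")
ax.set_xlabel("x variable")
ax.set_ylabel("y variable")
ax.legend(loc="upper left")

# Export the figure; other formats such as "pdf" or "svg" work the same way.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)

print("slope={:.2f}, intercept={:.2f}".format(slope, intercept))
```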

3.4.5. Pie Chart
Pie charts can also be generated with GASATaD. Similar to previous plots, title, legend, and
tag can also be selected according to users’ preferences (Figure 24).

Figure 24. Pie chart generation.


As a result, a pie chart is then generated (Figure 25).

Figure 25. Final pie chart generated with GASATaD.

3.4.6. Box Plot
If a box plot is needed, it can be obtained by clicking on the corresponding button of the left
panel. A dialogue box appears, and title, display settings and variables can be selected. GASATaD
also offers the possibility to group data to be plotted according to the established data categories
(Figure 26).

Figure 26. Box plot generation.


Figure 27 shows the box plot corresponding to the user’s preferences.

Figure 27. Final box plot generated with GASATaD.

3.4.7. Bar Chart
Finally, it is also possible to generate bar charts. When this button is pressed, a dialogue box
appears that allows users to define title, axis labels, display setting, x variables and tags, and different
operations to be applied over data (Figure 28).

Figure 28. Bar chart generation.


The bar chart generated according to the options indicated in Figure 28 is now presented (Figure
29).

Figure 29. Final bar chart generated with GASATaD.

All the windows of the plots created by GASATaD include a set of buttons which give access
to additional functionalities of the graphics (Figure 30).

Figure 30. Tools to manage plots.

● Move the plot along both the x and y axes.
● Zoom in to a selected area of the plot.
● Configure the margins of the plot.
● Undo/redo.
● Revert the plot to its initial state after modifications.
● Save the plot in different formats, such as .eps, .pgf, .pdf, .png, .ps, .raw, .svg and .tiff.