Thought Spot Administration Guide 4.2

User Manual: Pdf

Open the PDF directly: View PDF PDF.
Page Count: 320

DownloadThought Spot Administration Guide 4.2
Open PDF In BrowserView PDF
ThoughtSpot
Administration Guide
Version 4.2
February 2017

Table of Contents

Contents
Chapter 1: Installation and setup............................................................ 8
Login credentials for administration........................................................................... 10
Log in to the Linux shell using SSH............................................................................ 10
Log in to ThoughtSpot from a browser...................................................................... 11
Set your ThoughtSpot locale..........................................................................................12
Software updates.................................................................................................................12
Test network connectivity between nodes............................................................... 12
Set the relay host for SMTP (email)............................................................................ 13
Verify that email is working................................................................................. 13
Set up a fiscal calendar year..........................................................................................14
About SSL (secure socket layers)................................................................................ 15
Configure SSL for web traffic............................................................................. 15
Set the recommended TLS version.................................................................. 16
Configure SAML................................................................................................................... 17
About LDAP integration................................................................................................... 18
Configure OpenLDAP............................................................................................. 18
Configure LDAP for Active Directory............................................................. 20
Add the SSL certificate for LDAP.....................................................................22
Test the LDAP configuration...............................................................................23
Sync users and groups from LDAP..................................................................23
Mount a NAS file system.................................................................................................25
Add a custom support contact.................................................................................... 27
Set up monitoring.............................................................................................................. 29
Connect to the ThoughtSpot Support file server................................................. 30
Set up remote support access.......................................................................................31
Enable the call home capability................................................................................... 32
Network ports...................................................................................................................... 33

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

2

Table of Contents

About load balancing and proxies............................................................................. 40

Chapter 2: Load and manage data..................................................... 43
About case configuration............................................................................................... 46
Generate CSV files with the data to be loaded..................................................... 46
Load data from a web browser....................................................................................47
Append data from a web browser............................................................................. 50
Plan the schema..................................................................................................................52
Data types..................................................................................................................54
Constraints..................................................................................................................57
Sharding...................................................................................................................... 60
Chasm traps...............................................................................................................63
Chasm trap limitations..........................................................................................64
Build the schema................................................................................................................66
About TQL, the SQL command line interface..............................................67
Connect to the database with the ThoughtSpot SQL Command Line
(TQL)....................................................................................................................... 67
Create the schema in TQL.................................................................................. 68
Write a SQL script to create the schema......................................................69
Schema creation examples................................................................................. 70
Import a schema (use the SQL editor).......................................................... 73
Change the schema...........................................................................................................75
Change the primary key for a table................................................................ 75
Change a relationship between tables........................................................... 76
Change sharding on a table............................................................................... 78
About data type conversion...............................................................................79
Load data with ThoughtSpot Loader.........................................................................83
Import CSV files with ThoughtSpot Loader................................................. 84
Use a script to load data..................................................................................... 85
Bulk load files in parallel...................................................................................... 86

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

3

Table of Contents

Delete a data source.........................................................................................................88
Delete a data source from the browser......................................................... 89
Delete or change a table in TQL...................................................................... 92
About the Schema Viewer............................................................................................. 93

Chapter 3: Model, link, and tag your data for searching............. 97
Model the data for searching........................................................................................98
Model data in the ThoughtSpot application................................................ 98
Model data in bulk in the modeling file....................................................... 100
Data modeling settings.......................................................................................104
Link tables using relationships.................................................................................... 123
Create a relationship............................................................................................ 124
Delete a relationship.............................................................................................128
About stickers.................................................................................................................... 129
Create stickers........................................................................................................ 130
Apply a sticker......................................................................................................... 131
Filter by a sticker................................................................................................... 132

Chapter 4: Simplify searching with worksheets........................... 134
Create a new worksheet................................................................................................ 137
Add sources and columns to a worksheet.................................................. 137
How the inclusion rule works............................................................................ 141
How the worksheet join rule works................................................................143
About formulas in worksheets......................................................................... 150
Edit a worksheet............................................................................................................... 153
Rename a worksheet or table.......................................................................... 154
Change the inclusion or join rule for a worksheet....................................155
Delete a worksheet or table.........................................................................................157

Chapter 5: Manage users, groups, and privileges........................ 160

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

4

Table of Contents

About privileges................................................................................................................ 162
Add a group and set security privileges.................................................................163
Edit or delete a group....................................................................................................166
Add a user........................................................................................................................... 167
Add multiple users to a group....................................................................................169
Edit or delete a user....................................................................................................... 170
Forgotten password.......................................................................................................... 171

Chapter 6: Manage jobs......................................................................... 172
About scheduled pinboards......................................................................................... 173
Schedule a pinboard job.....................................................................................175
Scheduled pinboards management................................................................178

Chapter 7: About security..................................................................... 182
System security..................................................................................................................183
Get audit logs..........................................................................................................183
Security policies..................................................................................................... 185
Data security.......................................................................................................................186
Share tables and columns.................................................................................. 187
Share worksheets.................................................................................................. 190
Share a pinboard....................................................................................................192
Revoke access (unshare).................................................................................... 194
Row level security................................................................................................. 196
Network security............................................................................................................... 212

Chapter 8: System administration...................................................... 213
System monitoring........................................................................................................... 214
About the Space Utilization chart................................................................. 220
Generate and send a log bundle............................................................................... 222
Send logs to the administrator...................................................................................222

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

5

Table of Contents

Set up recording for Replay Search......................................................................... 223

Chapter 9: Backup and restore.......................................................... 228
About backups..................................................................................................................230
Take a snapshot.....................................................................................................232
Configure periodic snapshots.......................................................................... 233
Take a backup of a snapshot........................................................................... 235
Configure periodic backups............................................................................. 236
About restore operations..............................................................................................238

Chapter 10: About troubleshooting................................................. 240
Get logs.................................................................................................................................241
Upload logs to ThoughtSpot Support......................................................... 243
Network connectivity issues....................................................................................... 244
Change the timezone.....................................................................................................245
Browser untrusted connection error........................................................................245
Characters not displaying correctly......................................................................... 246
Clear the browser cache...............................................................................................247
Cannot open a saved answer that contains a formula..................................... 249
Data loading too slowly................................................................................................. 251
Search results contain too many blanks.................................................................252

Chapter 11: Reference............................................................................. 254
TQL reference.................................................................................................................... 255
ThoughtSpot Loader flag reference.........................................................................266
tscli command reference.............................................................................................. 270
Formula reference........................................................................................................... 290
Date and time formats reference..............................................................................305
Row level security rules reference............................................................................307

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

6

Table of Contents

Contact ThoughtSpot..............................................................................317
Open source software.............................................................................319
Copyright.....................................................................................................320

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

7

Installation and setup

Chapter 1: Installation and setup
Installation and setup

Topics:
•

Login credentials for
administration

•

Log in to the Linux shell
using SSH

•

•

you through the basic steps required to set up and
configure ThoughtSpot. It will also assist you in
troubleshooting some common problems, finding
additional resources, and contacting ThoughtSpot.

Log in to ThoughtSpot

ThoughtSpot enables you to access and analyze

from a browser

your data through a search-based user interface. You

Set your ThoughtSpot
locale

•

Software updates

•

Test network connectivity
between nodes

•

This ThoughtSpot Administrator Guide will walk

Set the relay host for

can create your searches on the fly by typing into a
search bar, like you do when using an internet search
engine. ThoughtSpot makes it easy to see your data,
get your questions answered, create interactive
graphs, and customize pinboards. You do not need to
understand how the data is stored or know SQL to do
these things.

SMTP (email)
•

Set up a fiscal calendar
year

•

About SSL (secure socket
layers)

•

Configure SAML

•

About LDAP integration

•

Mount a NAS file system

•

Add a custom support

ThoughtSpot gives administrators the ability to
modify data properties to meet business needs, for
example by providing search synonyms for common
terms, boosting the importance of a column in
search results, or formatting how the data appears.
Collaboration and security features make it easy for
you to protect sensitive data and for users to share
information safely with others.

contact
•

Set up monitoring

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

8

Installation and setup

•

Connect to the
ThoughtSpot Support file
server

•

Set up remote support
access

•

Enable the call home
capability

•

Network ports

•

About load balancing and
proxies

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

9

Installation and setup

Login credentials for administration
You will need administrative permissions to perform the actions discussed in this
guide. You can access ThoughtSpot via SSH at the command prompt and from a
Web browser.
There are two separate default administrator users, an operating system user
that you type in at the Linux shell prompt, and an application user for access
through a browser. Make sure you use the correct login and password for the
method you are using to log in. Passwords are case sensitive.
Table 1: Default administrative user credentials
Login Type

User

Access Method

Password

OS user

admin

Access remotely via SSH from

Contact ThoughtSpot to obtain

the command prompt on a

the default password.

client machine.
Application user

tsadmin

Access through a Web

Contact ThoughtSpot to obtain

browser.

the default password.

Log in to the Linux shell using SSH
To perform basic administration such as checking network connectivity, starting
and stopping services, and setting up email, log in remotely as the Linux
administrator user "admin". To log in with SSH from a client machine, you can
use the command shell or a utility like Putty.
In the following procedure, replace  with the hostname or IP
address of a node in ThoughtSpot. The default SSH port (22) will be used.
1. Log in to a client machine and open a command prompt.
2. Issue the SSH command, specifying the IP address or hostname of the
ThoughtSpot instance:
ssh admin@

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

10

Installation and setup

3. Enter the password for the admin user.

Log in to ThoughtSpot from a browser
To set up and explore your data, access ThoughtSpot from a standard Web
browser using a username and password.
Before accessing ThoughtSpot, you need:
• The Web address (IP address or server name) for ThoughtSpot.
• A network connection.
• A Web browser.
• A username and password for ThoughtSpot.
Supported Web browsers include:
Table 2: Supported browsers
Browser

Version

Operating System

Google Chrome

20 and above

•

Windows 7 or greater

•

Linux

•

MacOS

•

Windows 7 or greater

•

Linux

•

MacOS

•

Windows 7 or greater

Mozilla Firefox

Internet Explorer

14 and above

11

To log in to ThoughtSpot from a browser:
1. Open the browser and type in the Web address for ThoughtSpot:
http://

2. Enter your username and password and click Enter Now.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

11

Installation and setup

Set your ThoughtSpot locale
In addition to American English, ThoughtSpot also supports German and
Japanese.
The language displayed in ThoughtSpot is based off of your browser locale. So
if you set Japanese as your default language in your browser settings, then the
interface will update to reflect that after you refresh your page.
Keywords, operators, and error messages are included in the translated material.
Formulas, however, are not translated. Also, all metadata will remain as user
inputted.
This feature is supported on all browsers that support ThoughtSpot.
To set your ThoughtSpot locale:
1. Go to the settings page of your browser.
2. Change your default language to one that is supported by ThoughtSpot.
3. Save your settings.
4. Refresh your ThoughtSpot browser page.
Your ThoughtSpot interface should reflect your new chosen language.

Software updates
The ThoughtSpot software is updated by ThoughtSpot Support.
ThoughtSpot Support will contact you to schedule an update when one
becomes available.

Test network connectivity between nodes
This procedure tests the network connectivity between the ThoughtSpot nodes,
and to the LAN. If you can perform these steps successfully, the network settings
on ThoughtSpot are correct.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

12

Installation and setup

1. Log in to the Linux shell using SSH.
2. Ping each of the other nodes in the cluster.
3. Ping another machine that exists outside of the cluster, for example, a
machine that you will use to stage data to be loaded.
Note: If you cannot perform these tests successfully, there is a problem
with the network setup.
4. If the tests fail, check Network connectivity issues.

Set the relay host for SMTP (email)
To enable alert emails, you'll need to set up a relay host for SMTP traffic
from ThoughtSpot. This routes the alert and notification emails coming from
ThoughtSpot through an SMTP email server.
To set up a relay host:
1. Log in to the Linux shell using SSH.
2. Issue the setup command, providing the IP address of the relay host:
$ tscli smtp set-relayhost 

3. Verify your settings:
$ tscli smtp show-relayhost

4. Verify that email is working.

Verify that email is working
Check if the email settings are working properly by using this procedure.
ThoughtSpot uses emails for sending critical notifications to ThoughtSpot
Support.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

13

Installation and setup

ThoughtSpot sends alerts to the email address specified during installation. If no
email address was entered, no alerts will be sent. But you can add an email to
receive alerts by issuing:
$ tscli monitoring set-config --email 

You can add a list of email addresses separated by commas, with no spaces.
To verify that the send email function is working correctly:
1. Log in to the Linux shell using SSH.
2. Try sending an email to yourself by issuing:
$ echo | mail -s Hello 

3. If you receive the email at the address you supplied, email is working correctly.

Set up a fiscal calendar year
Many companies start their fiscal calendar in a month other than January. If
this is the case in your company, setting a fiscal calendar quarter makes the
ThoughtSpot date searches reflect your fiscal year.
When you set a custom fiscal year, you will designate the month on which your
company's fiscal year begins. All the date language will then reflect your change,
so if someone searches for this quarter or q3, the answer will conform to the
fiscal quarter in use. When you make this change, existing pinboards also change
to reflect the custom fiscal calendar. Because of this, if you make this change
after your users have been using ThoughtSpot for any period of time, you should
alert them of the change you will be making and how it affects previous saved
searches.
Call ThoughtSpot Support, so they can help you set the custom fiscal year.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

14

Installation and setup

About SSL (secure socket layers)
You should use SSL (secure socket layers) for sending data to and from
ThoughtSpot. SSL provides authentication and data security. This section applies
to both SSL to enable secure HTTP and secure LDAP.
Many IT departments require SSL for their applications that access data. To use
SSL with ThoughtSpot, you'll need your company's own SSL certificate. The
certificate is issued per domain, so if you want to use SSL for both HTTP and
LDAP, you will need two separate certificates - one for the HTTP domain and one
for the LDAP domain.
If you do not have an SSL certificate:
• Check with your IT department to see if they already have an SSL certificate
you can use.
• If not, you will need to obtain the certificate from an issuing authority.
• Alternatively, you may disable SSL if you don't want the security it provides by
using the command tscli ssl off.
There are many SSL vendors to choose from. Check with your existing Web
hosting provider first, to see if they can provide the certificate for you.
When you apply for the SSL certificate, you may specify a SAN, wildcard, or
single domain certificate. Any of these can work with ThoughtSpot.

Configure SSL for web traffic
This procedure shows how to add SSL (secure socket layers) to enable secure
HTTP (HTTPS) in ThoughtSpot.
To set up SSL, you will need:
• The SSL certificate
• The private key
To install the SSL certificate:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

15

Installation and setup

1. Follow the instructions from your certifying authority to obtain the certificate.
This is usually sent via email or available by download.
2. Copy the certificate and key files to ThoughtSpot:
$ scp   admin@:

3. Log in to the Linux shell using SSH.
4. Change directories to where you copied the certificate:
$ cd 

5. Issue the tscli command to install the certificate:
$ tscli ssl add-cert  

6. To test that the certificate was installed correctly, Log in to ThoughtSpot from
a browser.
You should see that the URL begins with https://.

Set the recommended TLS version
There are a couple of security vulnerabilities due to SSL certificates supporting
older versions of TLS (Transport Layer Security). This procedure shows you how
to set the recommended TLS version to avoid these vulnerabilities.
The PCI (Payment Card Industry) Data Security Standard and the FIPS 140-2
Standard require a minimum of TLS v1.1 and recommends TLS v1.2.
ThoughtSpot supports SSL v3, TLS v1.0, and TLS v1.1 for backwards compatibility.
However, the recommended version is TLS v1.2. Therefore, to set the
recommended TLS version:
1. Enable your web browser to support TLS v1.2. This can be done in your
browser's advanced settings.
2. Log in to the Linux shell using SSH.
3. Issue the following command:
tscli security set-min-version 1.2

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

16

Installation and setup

This will block all usage of older versions.

Configure SAML
ThoughtSpot can use Security Assertion Markup Language (SAML) to
authenticate users. You can set up SAML through the shell on ThoughtSpot using
a tscli based configurator.
Before configuring SAML, you will need this information:
• IP of the server where your ThoughtSpot instance is running.
• Port of the server where your ThoughtSpot instance is running.
• Protocol, or the authentication mechanism for ThoughtSpot.
• Unique service name that is used as the unique key by IDP to identify the
client.
It should be in the following format: urn:thoughtspot:callosum:saml
• Allowed skew time, which is the time after authentication response is rejected
and sent back from the IDP. It is usually set to 86400.
• The absolute path to the idp-meta.xml file. This is needed so that the
configuration persists over upgrades.
• This configurator also checks with the user if internal authentication needs to
be set or not. This internal authentication mechanism is used to authenticate
tsadmin, so set it to true if you do not know what it does.
Use this procedure to set up SAML on ThoughtSpot for user authentication. Note
that this configuration persists across software updates, so you do not need to
reapply it if you update to a newer release of ThoughtSpot.
1. Log in to the Linux shell using SSH.
2. Execute the command to launch the interactive SAML configuration:
tscli saml configure

3. Complete the configurator prompts with the information you gathered above.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

17

Installation and setup

4. When the configuration is complete, open a Web browser and go to the
ThoughtSpot login page. It should now show the Single Sign On option.

About LDAP integration
Some companies use LDAP (Lightweight Directory Access Protocol) to manage
user authentication. Using LDAP provides security and makes user management
more centralized.
ThoughtSpot can be configured to authenticate users against an LDAP
server. You can choose to authenticate users against an LDAP server, against
ThoughtSpot, or both.
ThoughtSpot supports both anonymous and non-anonymous LDAP integration.
Non-anonymous LDAP binding is more rigorous than anonymous authentication,
but it should help you track what your users are querying and keep a log trace
for auditing purposes.
If you have been using ThoughtSpot with users you created manually, and you
now want to transition to LDAP, please contact ThoughtSpot Support. They can
assist you in migrating existing users to their LDAP equivalents.
ThoughtSpot supports these types of LDAP servers:
• OpenLDAP
• Active Directory

Configure OpenLDAP
Use this procedure to set up integration with LDAP using OpenLDAP.
Before configuring OpenLDAP, you will need this information:
• URL to connect to OpenLDAP
For example, ldap://192.168.2.48:389
• Distinguished Name template

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

18

Installation and setup

The template for usernames, for example cn={0},ou=users,dc=thoughtspot,dc=com
• Automatically add LDAP users in ThoughtSpot?
If you choose 'yes' for this, when a user is authenticated against LDAP, if that
user does not exist in ThoughtSpot, the user is automatically created. When
users are created in this way, their passwords exist only in LDAP and are not
stored in ThoughtSpot.
If you choose 'no' for this, users who will authenticate against LDAP have to
be manually created with a dummy password as a placeholder in ThoughtSpot
before they can log in. In order to log in to ThoughtSpot, the user has to exist
in ThoughtSpot independent of whether that user is authenticated against
LDAP or against ThoughtSpot's internal authentication.
• Also use ThoughtSpot internal authentication?
If you choose 'yes' for this, when a user logs in, ThoughtSpot will first attempt
to authenticate the user against LDAP. If that attempt fails, it will then attempt
to authenticate the user against ThoughtSpot. If either of these succeed, then
the user is successfully logged in. This option is useful in scenarios where
some users are not in LDAP and are created only in ThoughtSpot.
You do not need to create a user called tsadmin on your LDAP server. Internal
authentication can be used for tsadmin. To configure LDAP for OpenLDAP:
1. Log in to the Linux shell using SSH.
2. Run the command to configure LDAP:
$ tscli ldap configure

3. Answer the prompts using the information you collected. For example:
Choose the LDAP protocol:
[1] Active Directory
[2] OpenLDAP
Option number: 2
Configuring Open LDAP

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

19

Installation and setup
URL to connect to OpenLDAP (Example: ldap://192.168.2.100:389):
ldap://192.168.2.48:389
Distinguished name template (Example: cn={0},ou=users,dc=thoughtspot,dc=com):
cn={0},ou=users,dc=thoughtspot,dc=com
Automatically add LDAP users in ThoughtSpot (y/n): n
Also use ThoughtSpot internal authentication (y/n): y

4. If you are using SSL, Add the SSL certificate for LDAP.
5. If you want to remove the LDAP configuration, issue:
$ tscli ldap purge-configuration

Configure LDAP for Active Directory
Use this procedure to set up integration with LDAP using Active Directory.
Before you configure LDAP for Active Directory, collect this information:
• URL to connect to Active Directory.
For example, ldap://192.168.2.48:389
• Default LDAP domain.
The default domain is the domain under which users who want to be
authenticated against Active Directory reside. When a user logs in with a
username, the default domain is added to the username before sending it to
the LDAP server. If users reside in multiple domains, you can still designate
one of them as the default. Users belonging to a non-default domain will
have to explicitly qualify their username when they log in, for example:
username@ldap1.thoughtspot.com.

• Whether you will use SSL.
If yes, you'll need the certificate from the issuing authority.
• LDAP search base.
This prompt adds the search base information that allows ThoughtSpot to find
user properties such as email and displayname from LDAP.
• Automatically add LDAP users in ThoughtSpot?

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

20

Installation and setup

If you choose 'yes' for this, when a user is authenticated against LDAP, if that
user does not exist in ThoughtSpot, then the user is automatically created.
When users are created in this way, their passwords exist only in LDAP and are
not stored in ThoughtSpot.
In order to log in to ThoughtSpot, the user has to exist in ThoughtSpot
independent of whether that user is authenticated against LDAP or against
ThoughtSpot's internal authentication. If you choose 'no' for this, users
who will authenticate against LDAP have to be manually created with a
dummy password as a placeholder in ThoughtSpot before they can log
in. The username you specify when creating the LDAP authenticated
user manually in ThoughtSpot has to be domain qualified, for example:
username@ldap1.thoughtspot.com.

• Also use ThoughtSpot internal authentication?
If you choose 'yes' for this, when a user logs in, ThoughtSpot will first attempt
to authenticate the user against LDAP. If that attempt fails, it will then attempt
to authenticate the user against ThoughtSpot. If either of these succeed, then
the user is successfully logged in. This option is useful in scenarios where
some users are not in LDAP and are created only in ThoughtSpot.
You do not need to create a user called tsadmin on your LDAP server. Internal
authentication can be used for tsadmin. To configure LDAP for OpenLDAP:
1. Log in to the Linux shell using SSH.
2. Run the command to configure LDAP:
$ tscli ldap configure

3. Answer the prompts using the information you collected. For example:
Choose the LDAP protocol:
[1] Active Directory
[2] OpenLDAP
Option number: 1
Configuring Active Directory

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

21

Installation and setup
URL to connect to Active Directory. (Example: ldap://192.168.2.100:389):
ldap://192.168.2.100:389
Default domain (Example: ldap.thoughtspot.com): ldap.thoughtspot.com
Use SSL (LDAPS) (y/n): n
LDAP search base (Example: cn=Users): cn=Users
Automatically add LDAP users in ThoughtSpot (y/n): y
Also use ThoughtSpot internal authentication (y/n): y

4. If you are using SSL, Add the SSL certificate for LDAP.
5. If you want to remove the LDAP configuration, issue:
$ tscli ldap purge-configuration

Add the SSL certificate for LDAP
When you set up LDAP, you specified whether or not to use SSL for LDAP
(LDAPS). If using SSL, you must install the LDAP SSL certificate.
Before you can add the SSL certificate, you must LDAP using one of these
procedures:
• Configure OpenLDAP
• Configure LDAP for Active Directory
You must have the SSL certificate before you start. For more information on
obtaining an SSL certificate, see About SSL (secure socket layers).
To add the SSL certificate for LDAP:
1. Follow the instructions from your certifying authority to obtain the certificate.
This is usually sent via email or available by download.
2. Copy the certificate to ThoughtSpot:
$ scp  admin@:

3. Log in to the Linux shell using SSH.
4. Change directories to where you copied the certificate:
$ cd 

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

22

Installation and setup

5. Run the command to configure SSL for LDAP, designating an alias for this
certificate using the  parameter:
$ tscli ldap add-cert  

Test the LDAP configuration
After configuring LDAP, you can test to make sure it is working by issuing a
command.
This procedure allows you to test the LDAP connection you created.
1. Log in to the Linux shell using SSH.
2. Issue the LDAP testing command, supplying the information for the LDAP
server you configured, as in this example:
$ ldapsearch -x -h 192.168.2.61 -p 389 -D "testuser@ldap.thoughtspot.com" -W
-b "dc=ldap,dc=thoughtspot,dc=com" cn

3. Supply the LDAP password when prompted.
4. If the connection works, you'll see a confirmation message.

Sync users and groups from LDAP
Use this procedure to synchronize your ThoughtSpot system with an LDAP
server.
Before synchronizing users and groups, you will need this information:
• IP address and port of the server where your ThoughtSpot instance is running.
This hostport is needed in the following format http(s)://: or
http(s)://.

• Administrator login username and password for your ThoughtSpot instance.
• URL of the LDAP server, or hostport.
For example, ldap://192.168.2.48:389
• Login username and password for the LDAP system.
An example username would be moo_100@ldap.thoughtspot.com

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

23

Installation and setup

• Distinguished Name (DN) for the base to start searching for users in the LDAP
system.
For example, DC=ldap,DC=thoughtspot,DC=com
There are two ways for you to fetch users and groups from LDAP and populate
them into your ThoughtSpot system:
• Run the synchronization script in interactive mode, which will walk you
through the process (shown here).
• Create your own Python script by using the ThoughtSpot Python APIs. If you
need details on the Python APIs, contact ThoughtSpot Support. If you choose
this method, you can run the script periodically using a cron job.
To run the LDAP sync script in interactive mode:
1. Log in to the Linux shell using SSH.
2. Run the command to start the script:
python syncUsersAndGroups.py interactive

3. Answer the prompts using the information you collected above. For example:
Complete URL of TS server in format "http(s)://:":
http://10.77.145.24:8088
Disable SSL authentication to TS server (y/n): y
Login username for ThoughtSpot system: admin
Login password for ThoughtSpot system: 12345
Complete URL of server where LDAP server is running in format ldap(s)://
:: ldap://192.168.2.48:389
Login username for LDAP system: moo_100@ldap.thoughtspot.com
Login password for LDAP system: 12345
Syncs user and groups between LDAP and TS systems (y/n): y
Delete entries in ThoughtSpot system that are not currently in LDAP tree
being synced (y/n): n
Distinguished name for the base to start searching groups in LDAP System:
DC=ldap,DC=thoughtspot,DC=com
Scope to limit the search to (choice number)
0:base Searching only the entry at the base DN
1:one Searching all entries on level under the base DN - but not including
the base DN
2:tree Searching of all entries at all levels under and including the
specified base DN: 2
Filter string to apply the search to: (|(CN=TestGroupAlpha)
(CN=TestGroupBeta))

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

24

Installation and setup

Answering this prompt is optional. If left blank, the default value of '(CN=*)'will
be used.
Apply sync recursively, i.e. Iterates through group members and creates
member groups, users and relationships in a recursive way. (y/n): n

This prompt is asking if you would like to include group members even if they
do not belong to the current sub tree that is being synced.
4. Alternatively, to input your own shorthand script commands:
1. Issue the Python script commands, supplying all of the above information,
following this format example:
python syncUsersAndGroups.py script \
–-ts_hostport  \
--disable_ssl \
--ts_uname  \
--ts_pass  \
--ldap_hostport '' \
--ldap_uname '' \
--ldap_pass '' \
--sync \
--purge \
--basedn 'DC=ldap,DC=thoughtspot,DC=com' \
--filter_str '(|(CN=TestGroupAlpha)(CN=TestGroupBeta))' \
--include_nontree_members

The bottom half of the above command example targets sub trees under
the DC called TestGroupAlpha and TestGroupBeta, and iterates through
them recursively to create/sync users, groups, and their relationships in
the ThoughtSpot system. It also deletes any other entities created in the
ThoughtSpot system from this LDAP system that are not currently being
synced.

Mount a NAS file system
Some operations, like backup/restore and data loading, require you to either
read or write large files. You can mount a NAS (network attached storage) file
system for these operations.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

25

Installation and setup

This procedure shows you how to mount a NAS file system for storing or
accessing large files. The file system will be mounted at the same location on
each node in the cluster automatically. When any node is restarted, the file
system will be mounted again automatically, if it can be found.
When supplying a directory for writing or reading a backup, you can specify
the mountpoint as the directory to use. Likewise, you can stage data there for
loading.
Note that backups are written by the Linux user "admin". If that user does not
have permission to write to the NAS file system, you could write the backups to
disk (for example /export/sdc1, /export/sdd1, /export/sde1, or /export/sdf1) and
then set up a cron job that executes as root user and copies the backup to the
NAS device every night, then deletes it from the directory.
Do not send the periodic backups or stage files on /export/sdb1 since it is a
name node. It is used internally by Hadoop Distributed File System (HDFS) and
if this drive fills up, it can cause serious problems. Do not allow backups or data
files to accumulate on ThoughtSpot. If disk space becomes limited, the system
will not function normally.
1. Log in to the Linux shell using SSH.
2. Mount the directory to the file system, by issuing the appropriate command:
• For an NFS (Network File System) directory:
tscli nas mount-nfs
--server 
--path_on_server 
--mount_point 

• For a CIFS (Common Internet File System) directory:
tscli nas mount-cifs
--server 
--path_on_server 
--mount_point 
--username 
--password 
--uid 
--gid 

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

26

Installation and setup

3. Use the mounted file system as you wish, specifying it by referring to its
mount point.
4. When you are finished with it, you may optionally unmount the NAS file
system:
tscli nas unmount --dir 

Add a custom support contact
You can designate a support contact (email and phone number) at your
organization for first level technical support. That person can answer questions
about data and searching, and submit any system and software-related
questions to ThoughtSpot Support.
After you set the custom support contact information, here's where your users
will see it:
• In the Help Center, when they click Contact Support.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

27

Installation and setup

Figure 1: Help Center support contact
• In error messages, when they click What Happened?

Figure 2: Error message support contact

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

28

Installation and setup

To designate the custom support contact:
1. Log in to the Linux shell using SSH.
2. Issue the tscli command to set the email address:
$ tscli support set-admin-email 

3. Issue the tscli command to set the phone number:
$ tscli support set-admin-phone 

4. If you need to reset both of these to the default (ThoughtSpot Support), issue:
$ tscli support rm-admin-email
$ tscli support rm-admin-phone

Set up monitoring
To configure monitoring of your cluster, set up the frequency of heartbeat and
monitoring reports and an email address to receive them.
Use tscli to set up monitoring. This is a one time operation.
1. Log in to the Linux shell using SSH.
2. Issue the tscli command to set up monitoring:
tscli monitoring set-config
--email 
--heartbeat_interval 
--report_interval 

The parameters are:
• --email  is a comma separated list (no spaces) of email addresses
where the cluster will send monitoring information.
• --heartbeat_interval  is the heartbeat email generation
interval in seconds. Must be greater than 0.
• --report_interval  sets the cluster report email generation
interval in seconds. Must be greater than 0.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

29

Installation and setup

3. To view your settings and verify that they have been applied, issue:
tscli monitoring show-config

You should see information like:
Monitoring Configuration:
Alert Email: dev-alerts@thoughtspot.com
Heartbeat Interval: 900 sec
Report Interval: 21600 sec

4. After the heartbeat interval has passed, check your email to verify that emails
are being delivered.
5. If you don't receive any emails, Verify that email is working.

Connect to the ThoughtSpot Support file server
ThoughtSpot Support uses a secure file server to provide new releases and
to receive logs and troubleshooting files that you upload. The secure server
connection is also required if you want to enable the optional statistics collection
using the call home feature.
Before you can upload a file to the secure file server, obtain your user name
and password for logging in to the secure file server. You can get these from
ThoughtSpot Support.
Configuring the connection to the file server is a one time operation. You do not
need to reconfigure the connection unless your password changes. Note that
you can do a one time override of the user and password you used to configure
the connection. This is done by passing a different user and password on the
command line when uploading or downloading a file.
To configure the connection to the secure file server:
1. Log in to the Linux shell using SSH.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

30

Installation and setup

2. Issue the command to configure the file server:
$ tscli fileserver configure --user  [--password ]

Note that if you do not use the optional --password parameter, you will be
prompted to enter the password.

Set up remote support access
You can set up a reverse tunnel to allow ThoughtSpot Support to get access to
your ThoughtSpot instance, to perform support-related activities. This setup is
a much simpler, more secure, and scalable than the alternative option of using a
virtual meeting room.
Before you can do this procedure, your networking team needs to open port 22
in your firewall outgoing rules.
Granting remote support access can streamline troubleshooting activities,
since it enables your support agent to work directly on your in a secure setting.
The remote tunnel enables SSH and HTTP access to your by ThoughtSpot
Support. This access can be granted and revoked easily, so you can enable it for
a troubleshooting session, and then disable it again. Before doing this procedure,
make sure it is allowed by your internal security policies.
To enable remote support:
1. Contact ThoughtSpot to open a support ticket for making the appropriate
reverse tunnel settings on our end. Provide the cluster name of the cluster for
which you want to enable remote support.
2. When the ticket has been completed, continue with the remaining steps in this
procedure to make the settings on your side.
3. Log in to the Linux shell using SSH.
4. Issue the command to configure the destination for the remote tunnel.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

31

Installation and setup

You only need to do this once, when you are enabling the tunnel for the very
first time. After that, this setting persists when you start and stop the remote
tunnel.
$ tscli support set-remote --addr tunnel.thoughtspot.com --user ubuntu

5. Test that the setting has been applied:
$ tscli support show-remote

6. Enable the remote tunnel:
$ tscli support start-remote

7. Contact ThoughtSpot again, so you can test the setup with your ThoughtSpot
Support contact.
8. After your remote session with ThoughtSpot Support, you should turn the
remote tunnel off, until you need to use it again:
$ tscli support stop-remote

You can repeat the steps to start and stop the remote tunnel as needed for
future support operations.
9. Test that the remote tunnel has been disabled:
$ tscli support show-remote

Enable the call home capability
The optional "call home" capability sends usage statistics to ThoughtSpot
Support once a day via a secure file server.
Before you can enable the call home feature:
1. Configure the connection to the file server.
2. Obtain the customer name as recognized by the file server.
The customer name is formatted like this example: Shared//stats. If
you do not know the customer name, contact ThoughtSpot Support.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

32

Installation and setup

This can be helpful when troubleshooting problems with ThoughtSpot Support,
because they will be able to see basic usage information over time for your
ThoughtSpot instance.
To set up the call home feature:
1. Log in to the Linux shell using SSH.
2. Enable the call home feature by issuing:
$ tscli callhome enable --customer_name 

3. If you want to disable call home in the future, you can do so by issuing:
$ tscli callhome disable

Network ports
For regular operations and for debugging, there are some ports you will need to
keep open to network traffic from end users. Another, larger list of ports must be
kept open for network traffic between the nodes in the cluster.
Required ports for operations and debugging
The following ports need to be opened up to requests from your user
population. There are two main categories: operations and debugging.
Table 3: Network ports to open for operations
Port

Protocol Service Name

Direction

22

SSH

bidirectional Administrators All nodes

SSH

Source

Destination Description

IP addresses

Secure shell access.
Also used for scp
(secure copy).

80

HTTP

HTTP

bidirectional All users IP
addresses

All nodes

Hypertext Transfer
Protocol for website
traffic.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

33

Installation and setup
Port

Protocol Service Name

Direction

443

HTTPS

bidirectional All users IP

HTTPS

Source

Destination Description
All nodes

Secure HTTP.

addresses
12345 TCP

Simba

bidirectional Administrators All nodes
IP addresses

Port used by ODBC
and JDBC drivers
when connecting to
ThoughtSpot.

Table 4: Network ports to open for debugging
Port

Protocol Service Name

Direction

2201

HTTP

bidirectional Administrator All nodes

Orion master

Source

HTTP
2101

HTTP

Oreo HTTP

Destination Description

IP addresses

the cluster manager.

bidirectional Administrator All nodes
IP addresses

4001

HTTP

Falcon worker

4251

HTTP

Sage master

IP addresses

Port used to debug
the data cache.

bidirectional Administrator All nodes

HTTP

Port used to debug
the node daemon.

bidirectional Administrator All nodes

HTTP

Port used to debug

IP addresses

Port used to debug
the search engine.

Required ports for inter-cluster operation
Internally, ThoughtSpot uses static ports for communication between services in
the cluster. Do not close these ports from inter-cluster network communications.
In addition, a number of ports are dynamically assigned to services, which
change between runs. The dynamic ports come from the range of Linux
dynamically allocated ports (20K+).
Table 5: Network ports to open between the nodes in the cluster
Port

Protocol Service Name

Direction

Source

Dest.

Description

80

TCP

inbound

All nodes

All nodes

Primary app HTTP

nginx

port (nginx)

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

34

Installation and setup
Port

Protocol Service Name

Direction

Source

Dest.

Description

443

TCP

inbound

All nodes

All nodes

Primary app

Secure nginx

HTTPS port
(nginx)
2100

RPC

Oreo RPC port

bidirectional All nodes

All nodes

Node daemon RPC

2101

HTTP

Oreo HTTP port

bidirectional Admin IP

All nodes

Node daemon

addresses

HTTP

and all nodes
2181

2200

RPC

RPC

Zookeeper servers

bidirectional All nodes

All nodes

Zookeeper servers

listen on this

listen on this

port for client

port for client

connections

connections

Orion master RPC

bidirectional All nodes

All nodes

port

Internal
communication
with the cluster
manager

2201

HTTP

Orion master

bidirectional Admin IP

HTTP port

2210

RPC

Cluster stats

All nodes

Port used to

addresses

debug the cluster

and all nodes

manager

bidirectional All nodes

All nodes

service RPC port

Internal
communication
with the stats
collector

2211

HTTP

Cluster stats

bidirectional Admin IP

service HTTP port

2230

RPC

Callosum stats

All nodes

Port used to

addresses

debug the stats

and all nodes

collector

bidirectional All nodes

All nodes

collector RPC port

Internal
communication
with the BI stats
collector

2231

HTTP

Callosum stats

bidirectional Admin IP

All nodes

Port used to

collector HTTP

addresses

debug the BI stats

port

and all nodes

collector

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

35

Installation and setup
Port

Protocol Service Name

Direction

2240

RPC

bidirectional All nodes

Alert manager

Source

Dest.

Description

All nodes

Port where
alerting service
receives alert
events

2888

3888

4000

RPC

RPC

RPC

Ports used by

bidirectional All nodes

All nodes

Ports used by

Zookeeper servers

Zookeeper servers

for communication

for communication

between

between

themselves

themselves

Ports used by

bidirectional All nodes

All nodes

Ports used by

Zookeeper servers

Zookeeper servers

for communication

for communication

between

between

themselves

themselves

Falcon worker RPC bidirectional All nodes

All nodes

port

Port used by
data cache for
communication
between
themselves

4001

HTTP

Falcon worker

bidirectional Admin IP

HTTP port

4021

RPC

Sage metadata

All nodes

addresses

debug the data

and all nodes

cache

bidirectional Admin IP

All nodes

HTTP

Port where search

service port

addresses

service contacts

(exported by

and all nodes

metadata service

Tomcat)
4201

Port used to

Sage auto

for metadata
bidirectional Admin IP

All nodes

Port used to

complete server

addresses

debug the search

HTTP interface

and all nodes

service

port

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

36

Installation and setup
Port

Protocol Service Name

Direction

4231

HTTP

bidirectional Admin IP

Sage index server
HTTP port

4232

RPC

Sage index

Source

Dest.

Description

All nodes

Port used to

addresses

debug the search

and all nodes

service

bidirectional All nodes

All nodes

Port used

server metadata

for search

subscriber port

service internal
communication

4233

RPC

Sage index server

bidirectional All nodes

All nodes

RPC port

Port used
for search
service internal
communication

4241

4242

HTTP

RPC

Sage auto

bidirectional Admin IP

All nodes

Port used to

complete server

addresses

debug the search

HTTP port

and all nodes

service

Sage auto

bidirectional All nodes

All nodes

Port used

complete server

for search

RPC port

service internal
communication

4243

RPC

Sage auto

bidirectional All nodes

All nodes

Port used for

complete server

search internal

metadata

communication

subscriber port
4251

RPC

Sage master RPC

bidirectional All nodes

All nodes

port

Port used
for search
service internal
communication

4405

RPC

Diamond

bidirectional All nodes

(graphite) port

All nodes

Port used for
communication
with monitoring
service

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

37

Installation and setup
Port

Protocol Service Name

Direction

4500

RPC

bidirectional All nodes

Trace vault service

Source

Dest.

Description

All nodes

Trace collection

RPC port

for ThoughtSpot
services

4501

HTTP

Trace vault service

bidirectional Admin IP

HTTP port

All nodes

addresses

Debug trace
collection

and all nodes
4851

RPC

Graphite manager

bidirectional All nodes

All nodes

RPC port

Communication
with graphite
manager

4852

HTTP

Graphite manager

bidirectional Admin IP

HTTP port

All nodes

addresses

Debug graphite
manager

and all nodes
4853

4853

RPC

HTTP

Elastic search

bidirectional All nodes

All nodes

Communication

stack (ELK)

with log search

manager RPC port

service

Elastic search

bidirectional Admin IP

stack (ELK)

addresses

manager HTTP

and all nodes

All nodes

Debug log search
service

port
5432

Postgres Postgres database

bidirectional All nodes

All nodes

server port

Communication
with Postgres
database

8020

RPC

HDFS namenode

bidirectional All nodes

All nodes

server RPC port

Distributed file
system (DFS)
communication
with clients

8080

HTTP

Tomcat

bidirectional All nodes

All nodes

BI engine
communication
with clients

8787

HTTP

Periscope (UI)

bidirectional All nodes

service HTTP port

ThoughtSpot Administration Guide

Copyright

All nodes

Administration UI
back end

©

2017 by ThoughtSpot. All Rights Reserved.

38

Installation and setup
Port

Protocol Service Name

Direction

8888

HTTP

bidirectional All nodes

HTTP proxy server

Source

Dest.

Description

All nodes

Reverse SSH

(tinyproxy)
11211

Memcached

12345

ODBC

tunnel

Memcached server bidirectional All nodes

All nodes

BI engine cache

All nodes

Port used for ETL

port
Simba server port

bidirectional All nodes

(extract, transform,
load)
50070 HTTP

HDFS namenode

bidirectional All nodes

All nodes

server HTTP port
50075 HTTP

HDFS datanode

Debug DFS
metadata

bidirectional All nodes

All nodes

Debug DFS data

server HTTP port

Required ports for inbound and outbound cluster access
ThoughtSpot uses static ports for inbound and outbound access to a cluster.
Table 6: Network ports to open for inbound access
Port

Protocol Service Name Direction

22

SCP

SSH

Source

Dest.

Description

bidirectional ThoughtSpot All nodes Secure shell access.
Support

80

HTTP

HTTP

bidirectional ThoughtSpot All nodes Hypertext Transfer
Support

Protocol for website
traffic.

443

HTTPS

HTTPS

bidirectional ThoughtSpot All nodes Secure HTTP.
Support

12345 TCP

Simba

bidirectional ThoughtSpot All nodes Port used by ODBC
Support

and JDBC drivers
when connecting to
ThoughtSpot.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

39

Installation and setup

Table 7: Network ports to open for outbound access
Port Protocol Service Name Direction

Source

Destination Description

443 HTTPS

All nodes

208.83.

HTTPS

outbound

110.20

For transferring files to
thoughtspot.egnyte.com
(IP address 208.83.110.20).

25

SMTP

or

SMTP or

outbound

Secure SMTP

587

All nodes

All nodes

Allow outbound access

and SMTP

for the IP address of

relay

whichever email relay

(provided by

server is in use. This is

customer)

for sending alerts to
ThoughtSpot Support.

389 TCP

LDAP or

or

LDAPS

outbound

636

All nodes

All nodes

Allow outbound access

and LDAP

for the IP address of the

server

LDAP server in use.

(provided by
customer)

Required ports for IPMI (Intelligent Platform Management Interface)
ThoughtSpot uses static ports for out-of-band IPMI communications between
the cluster and ThoughtSpot Support.
Table 8: Network ports to open for IPMI access
Port

Protocol Service Name Direction

80

HTTP

HTTP

Source

Dest.

bidirectional ThoughtSpot All
Support

nodes

Description
Hypertext Transfer
Protocol for website traffic.

About load balancing and proxies
A load balancer is needed in front of a server group in order to direct traffic to
individual servers in a way that maximizes efficiency. Here are some of the best
practices and guidelines for a typical implementation with ThoughtSpot. Your
experience may differ depending on your environment and preference.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

40

Installation and setup

Load balance across ThoughtSpot nodes
The load balancer is an appliance in your infrastructure that routes traffic
automatically to nodes to provide failover. You can also place a load balancer or
proxy in front of the ThoughtSpot appliance if you'd like external network users
to access the system.
The best way to load balance across all ThoughtSpot nodes in a cluster is to map
one domain name (FQDN) to all the IPs in the cluster in a round robin fashion.
For example, if you want to use a DNS server based load balancing, then you can
define multiple "A" resource records (RR) for the same name.
Below is an example of how you could set that up
thoughtspot.customer.com
thoughtspot.customer.com
thoughtspot.customer.com
thoughtspot.customer.com

IN
IN
IN
IN

A
A
A
A

69.9.64.11
69.9.64.12
69.9.64.13
69.9.64.14

The example indicates that IP addresses for the domain
thoughtspot.customer.com are 69.9.64.11, 69.9.64.12, 69.9.64.13, and 69.9.64.14.
Session Affinity
Session Affinity refers to directing requests to the same application server for
the time it takes to complete a task.
In order for session affinity to work on ThoughtSpot, HTTPS (an SSL certificate)
has to be installed on the load balancer level. If it is installed outside of the load
balancer, session affinity may not occur and the ThoughtSpot system will fail.
Web proxies
You can access ThoughtSpot through any standard web proxy server. Web
proxies are fairly universal regardless of the application they are proxying.
However, ThoughtSpot doesn't use any new protocols, like SPDY or HTTP/2,

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

41

Installation and setup

which may have a dependency on the proxy. Instead, ThoughtSpot is commonly
placed behind a web HTTP/HTTPS proxy.
Additionally, the proxy can round robin across multiple nodes in the
ThoughtSpot backend. You can essentially use the web proxy as a load balancer.
Therefore, your session will carry over if the proxy round robins between the
ThoughtSpot backends as long as the URL doesn’t change.
Network architectural diagram

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

42

Chapter 2: Load and manage data
Load and manage data

Topics:
•

About case configuration

•

Generate CSV files with

•

•

There are several methods of loading data into
ThoughtSpot. This section describes each method
and why you might choose it above the others.

the data to be loaded

The fastest and easiest way to load a new table is

Load data from a web

by importing it using the Web browser. This is best

browser

for one time data loads of small tables which do

Append data from a web
browser

not have complex relationships to other tables. This
method is limited to tables that are under 50 MB
(megabytes) in size.

•

Plan the schema

•

Build the schema

Using ThoughtSpot Loader, you can script recurring

•

Change the schema

loads and work with multi-table schemas.

•

Load data with

If your data already exists in another database with

ThoughtSpot Loader

the schema you want to use in ThoughtSpot, you can

•

Delete a data source

pull the schema and data in using the ODBC or JDBC

•

About the Schema Viewer

driver.
These are the methods you can use to load data,
along with the benefits of each method:
Table 9: Data loading methods
Method

Description

Load data from a

Use the ThoughtSpot Easy way to do

web browser

Web interface to

a one-time data

upload an Excel

load of a small file

or CSV (comma

(under 50MB).

separated values)
file from your local

Benefits

End users can
upload their own

machine.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

43

Load and manage data
Method

Description

Benefits
data and explore it
quickly.

Use ThoughtSpot

This is a premium

Easy way to

Data Connect. For

feature, available at

connect to multiple

details, see the

additional cost. Use

sources of data

ThoughtSpot Data ThoughtSpot Data

directly and set

Connect Guide

Connect to connect

up recurring

directly to external

loads. You won't

data sources and pull

need to define a

in tables and columns schema to accept
from them. You can

the data loads,

also set up recurring

because this is done

loads to keep the

automatically for

data fresh.

you.

Load data with

Use TQL and tsload

Best way to load

ThoughtSpot

to load data directly

large amounts of

Loader

into the back end

data or a schema

database that

with multiple tables.

ThoughtSpot uses.

Can be scripted and
used for recurring
data loads, such
as monthly sales
results or daily logs.
Can be integrated
with an ETL
solution for
automation.

Use the ODBC/

Use the ODBC or

Make use of an

JDBC driver

JDBC client with

established ETL

to connect to

your ETL tool. For

process and tool(s).

ThoughtSpot

information, see the
ThoughtSpot Data

Connect to
ThoughtSpot using

Integration Guide.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

44

Load and manage data
Method

Description

Benefits
third party tools like
SSIS.
You don't need to
define a schema
to accept the data
load.

Use the

Use the Informatica

Works with your

Informatica

Connector if

established data

Connector

you already use

migration processes

Informatica to

in Informatica.

connect to your
other data sources.
For information, see
the ThoughtSpot
Data Integration
Guide.

If you're uploading data through the Web interface,
you can use a native Excel file. If you want to use a
CSV (comma separated values) or delimited file, or
you are loading using ThoughtSpot Loader, you'll
need to Generate CSV files with the data to be
loaded first.
Note: End users will almost always work with
worksheets and data they upload.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

45

Load and manage data

About case configuration
You can set the type of case sensitivity you would like to see reflected in the
ThoughtSpot display.
Before you load your data, you should consider the type of casing you would like
your data to reflect. The case sensitivity for source data strings is preserved in
the display. So, the visual display of results is identical to the input case that is
loaded.
Note: The casing will remain lowercase in other parts of the application,
such as when you ask a question or filter.
It is important to note that string casings aren’t applied globally, but by column.
So datasets will have different string casings as long as they're in different
columns. Tables that are already compacted will keep their lowercase format.
In these cases, in order to get the specific string case that you want, you would
have to truncate related tables and reload them.
To take advantage of case configuration, you need to have ThoughtSpot Support
enable it on your cluster for you. In addition, title casing should be disabled for
string casing to properly work.

Generate CSV files with the data to be loaded
The first step in loading data is to obtain or create one or more flat files that
contain the data to be loaded into ThoughtSpot.
Your data should be in a CSV (comma separated values) or delimited flat file
before you load it. A CSV file is a text file made up of data fields separated by
a delimiter and optionally enclosed with an enclosing character. If your data
contains multiple tables, you'll have a separate CSV for each table.
A CSV file contains:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

46

Load and manage data

• A delimiter that marks the separation between fields in the data. The delimiter
is usually comma, but it can be any character.
• Fields optionally enclosed with double quotes.
Use these guidelines when creating the CSV file:
• Columns in the CSV file must be in the same order as defined in the target
table.
• If the CSV contains column headers, they must match the column names in
the database exactly.
• Often a | (pipe) or tab is used as the delimiter, because it may be less likely to
occur within the data values.
• When a field contains a double quote, it must be escaped with the character
specified in the escape character argument in tsload.
• When a field contains the delimiter, the field must be enclosed in double
quotes.
For more information about CSV files and the rules for creating them, check
http://en.wikipedia.org/wiki/Comma-separated_values.
1. If your source is another database:
a) Connect to the source database.
b) Export each of the tables you wish to import into ThoughtSpot as a CSV
file, specifying a delimiter of comma, | (pipe) or tab.
2. If your source is an Excel spreadsheet, save it as a CSV file.

Load data from a web browser
The simplest way to load data is to upload a CSV or Excel file from the
ThoughtSpot Web interface. This method is recommended for small, one time
data loads. Using this method, the data schema is created for you automatically.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

47

Load and manage data

Loading data from a Web browser requires your data to be in a CSV (comma
separated values) or a native Excel file.
Any user who belongs to a group that has the privilege Has administration
privileges or Can upload user data will be able to upload their own data from
the browser.
CSV is a common format for transferring data between databases. Your ETL
(extract, transform, load) process will typically generate CSV files. You can also
create a CSV file from a Microsoft Excel spreadsheet by opening the spreadsheet
in Excel, choosing Save As and selecting CSV.
ThoughtSpot supports a wide range of date and timestamp formats in the CSV
file.
Loading data through the Web browser is recommended for smaller tables
(under 50MB) with simple relationships between them. If you are loading a fact
table that joins to dimension tables, you must load the fact table first, and then
the dimension tables. The joining key must be a single column of unique values in
the dimension table. NULL values in the fact table will not be able to be joined.
Blank values in user uploaded CSV files are interpreted as NULL values. These
include the values (case insensitive):
• NULL
• \N
• NA
• N/A
• [space]
To load the CSV or Excel file into ThoughtSpot:
1. Log in to ThoughtSpot from a browser.
2. Click on Data, on the top navigation bar.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

48

Load and manage data

Figure 3: Data
3. Click the Actions button in the upper right corner, and select Upload Data.

Figure 4: Upload data
4. Upload the CSV or Excel file by doing one of these options:
• Click on Browse your files and select the file.
• Drag and drop the file into the drop area.
5. Answer the question Are the column names already defined in the file
header?
6. Answer the question Are the fields separated by? Click Next.
7. Click on the column header names to change them to more useful names, if
you'd like. Click Next.
8. Review the automatically generated data types for each column, and make
any changes you want. There are four data types: Text, Integer, Decimal, and
Date.
9. Click Import.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

49

Load and manage data

10.Click Link to Existing Data if you want to link the data you uploaded to the
data in another table or worksheet. Or click Search if you want to begin a new
search.

Append data from a web browser
You can append data to your existing system tables through the ThoughtSpot
application, even if the tables were initially loaded using Data Connect or tsload.
Loading data from a Web browser requires your data to be in a CSV (comma
separated values) or a native Excel file. The file must have the same structure as
the table it is being loaded into, including number and type of columns, in the
same order as the target table.
Any user who belongs to a group that has the privilege Has administration
privileges or Can upload user data will be able to upload their own data from
the browser.
CSV is a common format for transferring data between databases. Your ETL
(extract, transform, load) process will typically generate CSV files. You can also
create a CSV file from a Microsoft Excel spreadsheet by opening the spreadsheet
in Excel, choosing Save As and selecting CSV.
ThoughtSpot supports a wide range of date and timestamp formats in the CSV
file.
Loading data through the Web browser is recommended for smaller tables
(under 50MB) with simple relationships between them. If you are loading a fact
table that joins to dimension tables, you must load the fact table first, and then
the dimension tables. The joining key must be a single column of unique values in
the dimension table. NULL values in the fact table will not be able to be joined.
Blank values in user uploaded CSV files are interpreted as NULL values. These
include the values (case insensitive):
• NULL

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

50

Load and manage data

• \N
• NA
• N/A
• [space]
To append data into ThoughtSpot:
1. Log in to ThoughtSpot from a browser.
2. Click on Data, on the top navigation bar.

Figure 5: Data
3. Click the on the table you would like to append data to.
4. Click the Load data button.

Figure 6: Load data
5. Upload the CSV or Excel file by doing one of these options:
• Click on Browse your files and select the file.
• Drag and drop the file into the drop area.
6. Answer the question Are the column names already defined in the file
header?
7. Answer the question Do you want to append to the existing data or
overwrite it?
8. Answer the question Are the fields separated by? Click Next.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

51

Load and manage data

9. Click on the column header names to change them to more useful names, if
you'd like. Click Next.
10.Review the automatically generated data types for each column, and make
any changes you want. There are four data types: Text, Integer, Decimal, and
Date.
11. Click Import.
12.Click Link to Existing Data if you want to link the data you uploaded to the
data in another table or worksheet. Or click Search if you want to begin a new
search.

Plan the schema
Before you can load data with ThoughtSpot Loader, you must create a schema
to receive it, using the SQL command line interface (TQL).
The TQL syntax is similar to the SQL used in other relational databases, but
with some important differences. You'll use DDL (data definition language) to
create the schema into which you'll load the data. You'll probably want to put all
your DDL statements into a text file, which you'll use as a script for creating the
schema.
Before writing your TQL script, you need to understand some basic ThoughtSpot
concepts.
About databases and schemas
ThoughtSpot organizes objects in a hierarchical namespace. Databases contain
schemas, which contain tables.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

52

Load and manage data

Figure 7: Namespace diagram
ThoughtSpot can contain one or more databases, and each database can
have multiple schemas. If you do not specify a schema, the default schema
(falcon_default_schema) is used automatically. This makes it easier to add tables
to the database without the need to explicitly create a schema.
If you do create an additional schema, you must refer to its objects using the
syntax .. If you do not qualify the schema when
referencing its objects, the default schema (falcon_default_schema) will always
be assumed.
By default, ThoughtSpot creates an internal database to host tables
corresponding to data that is imported by users from a Web browser.
Review the structure of your data
The schema you create to hold the data needs to be a good fit for your data.
First, familiarize yourself with the tables you want to load, and understand their
structure. Make note of this information for each table:
• The column names and data types
• Type of table (fact or dimension)
• Primary key column(s)
• The size of the table on disk
• Any other tables it can be joined with (foreign keys)
Here's what you'll need to take into account in your TQL for creating each table,
based on these properties:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

53

Load and manage data

Table 10: Table properties that determine schema settings
Table type

Fact

Table

To be joined

Schema recommendations

size

with

Any

Small

•

Sharded.

dimension

•

Foreign key references the primary key in the

table(s)
Fact

Any

dimension table.

Large

•

dimension

Sharded on the same distribution key as the dimension
table it will be joined with.

table(s)

•

Foreign key references the primary key in the
dimension table.

Fact

Any

Another fact

•

table

Sharded on the same distribution key as the fact table
it will join with.

•

Many-to-many relationship defines how the tables will
be joined.

Dimension

under

Fact table(s)

50MB
Dimension

over

Fact table(s)

•

Replicated (not sharded).

•

Has a primary key.

•

Distributed dimension table, sharded on the same

50MB

distribution key as the fact table it will be joined with.
•

Primary key must be the same as the distribution key.

Data types
ThoughtSpot supports the common data types. Compare these with the data
types you want to load, and do any necessary conversion ahead of loading the
data.
Supported data types
The tables you create to receive the data must have the same number of
columns and data types as the data you will be loading. Choose a data type for
each column from the list of supported data types:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

54

Load and manage data

Table 11: Supported data types
Kind of

Supported data

data

types

Character

•

VARCHAR(n)

Details

Specify the maximum number of characters, as in
VARCHAR(255). The size limit is 1GB for VARCHAR values.

Floating

•

DOUBLE

DOUBLE is recommended.

point

•

FLOAT

Boolean

•

BOOL

Can be true or false.

Integer

•

INT

INT holds 32 bits.

•

BIGINT

BIGINT holds 64 bits.

Date or

•

DATE

DATETIME, TIMESTAMP, and TIME are stored at the granularity

time

•

DATETIME

of seconds.

•

TIMESTAMP

TIMESTAMP is identical to DATETIME, but is included for syntax

•

TIME

compatibility.

Note: There is a 1GB limitation on the number of characters for VARCHAR.
If you have any VARCHAR data that exceeds this limit, the entire load will
fail.
Geographical data types
For geographical data types, use VARCHAR. For latitude and longitude, you
can use either VARCHAR or DOUBLE. After loading the data, designate it as
a geographical data type when you Model data in bulk in the modeling file.
Wherever abbreviations or codes are used, they are the same as what the USPS
(United States Postal Service) recognizes.
These types of data can be designated as geographical data, which enables
them to be visualized using the Geo chart types:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

55

Load and manage data

Table 12: Data that uses geo charts
GeoType

Description

Type: Example

COUNTRY_REGION

Countries

•

name: United States

•

long name: United States

•

name_sort: United States of America

•

abbreviation: U.S.A.

•

adm0_a3: USA

•

adm0_a3_is: USA

•

adm0_a3_us: USA

•

admin: United States of America

•

brk_a3: USA

•

brk_name: United States

•

formal_en: United States of America

•

iso_a2: US

•

iso_a3: USA

•

iso_n3: 840

Counties in the

•

santa clara county

United States

•

pike county, ohio

•

pike county, OH

States in the United

•

name: California

States

•

US Postal Service abbreviation: CA

Must be used with

•

37.421023

LONGITUDE

•

1.282911

Must be used with

•

−122.142103

LATITUDE

•

103.848865

Zip codes and zip

•

po_name: MT MEADOWS AREA

codes +4 in the

•

ZIP: "00012"

•

zip2: 12

•

bremen

COUNTY

STATE_PROVINCE

LATITUDE

LONGITUDE

ZIP_CODE

United States
Other Sub-nation Regions

Administrative
regions found in

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

56

Load and manage data
GeoType

Description

Type: Example

countries other

•

normandy

than the United

•

west midlands

States

Note: You cannot upload your own custom boundaries.

Constraints
Constraints include primary keys, foreign keys, and relationships. Relationships
allow you to create a generic relationship for use when you want to join tables
that don't have a primary key/foreign key relationship.
Primary keys
When a primary key is selected for a table, it impacts data loading behavior.
When a new row is added:
• If another row already exists with same primary key, it is updated with the
values in the new row.
• If a row with the same primary key does not exist already, the new row is
inserted into the table.
This behavior is referred to as “upsert” because it does an INSERT or an UPDATE,
depending on whether a row with the same primary key already exists.
Note that ThoughtSpot does not check for primary key violations across
different shards of the table. Therefore, you need to shard the table on the
primary key columns if you require this “upsert” behavior.
Foreign key relationships
Foreign key relationships help ThoughtSpot with default schema modeling
by indicating a connection between two tables. These relationships are used
for joining the tables, and not for referential integrity constraint checking. The

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

57

Load and manage data

foreign key relationship is defined on the fact table and references the primary
key(s) in the dimension table.
If you use primary and foreign keys, when users search the data from the search
bar, tables are automatically joined. For example, assume there are two tables:
• revenue, which is a fact table
• region, which is a dimension table
There is a foreign key on the fact table on regionid which points to the id in the
region dimension table. When a user types in "revenue by region", the two tables
will be joined automatically.
Foreign keys have to match the primary key of the target table they refer to. So if
there are multiple columns that make up the primary key in the target table, the
foreign key must include all of them, and in the same order.
Generic relationships (many-to-many)
You may have a schema where there is a fact table that you want to join with
another fact table. If there isn't a primary key/foreign key relationship between
the tables, you can use many-to-many to enable this. You can do this by using
the RELATIONSHIP syntax to add a link between them, that works similarly to
the WHERE clause in a SQL join clause.
This is a special kind of relationship, that applies to specific data models and use
cases. For example, suppose you have a table that shows wholesale purchases
of fruits, and another table that shows retail fruit sales made, but no inventory
information. In this case, it would be of some use to see the wholesale purchases
that led to sales, but you don't have the data to track a single apple from
wholesale purchase through to sale to a customer.
In a many-to-many relationship, the value(s) in a table can be used to join to
a second table, using an equality condition (required) and one or more range
conditions (optional). These conditions act like the WHERE clause in a SQL JOIN

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

58

Load and manage data

clause. They are applied using AND logic, such that all conditions must be met
for a row to be included.
To use a many-to-many relationship, you need to follow a few rules:
• There must be one equality condition defined between the two tables.
• Each table must be sharded on the same key that will be used for the equality
condition.
• There can optionally be one or more range conditions defined.
This example shows the TQL statements that create the two fact tables and the
relationship between them.
TQL> CREATE TABLE "wholesale_buys" (
"order_number" VARCHAR(255),
"date_ordered" DATE,
"expiration_date" DATE,
"supplier" VARCHAR(255),
"fruit" VARCHAR(255),
"quantity" VARCHAR(255),
"unit_price" DOUBLE
)

PARTITION BY HASH (96) KEY ("fruit");

TQL> CREATE TABLE "retail_sales" (
"date_sold" DATE,
"location" VARCHAR(255),
"vendor" VARCHAR(255),
"fruit" VARCHAR(255),
"quantity" VARCHAR(255),
"sell_price" DOUBLE
)

PARTITION BY HASH (96) KEY ("fruit");

TQL> ALTER TABLE "wholesale_buys" ADD RELATIONSHIP WITH "retail_sales"
AS "wholesale_buys"."fruit" = "retail_sales"."fruit" and
("wholesale_buys"."date_ordered" < "retail_sales"."date_sold" and
"retail_sales"."date_sold" < "wholesale_buys"."expiration_date");

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

59

Load and manage data

Note that this many-to-many implementation does not protect from over
counting in some searches. If you plan to use it, make sure your searches don't
include aggregation or count searches that will count one value multiple times,
because it satisfies the join condition for multiple rows.

Sharding
For the best performance, you should split (or shard) very large tables across
nodes. If you have a large dimension table, you might choose to co-shard it with
the fact table it will be joined with.
Sharding a fact table
Use sharding to split large tables into parts for distribution across nodes. This is
typically done with large fact tables, to provide optimal performance.
When sharding, you'll choose a column to be the distribution key. This column
should contain a value that has a good distribution (roughly similar number of
rows with each value in that column). This is typically the primary key, but it can
be any single column or a set of columns.
The recommended number or shards depends upon the number of nodes in your
cluster:
Table 13: Recommended number of shards based on number of nodes
Number of Nodes

Number of Shards

3

96

4-12

128

13-24

256

25-36

384

37-48

512

49-60

640

61-72

768

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

60

Load and manage data

To specify the number of shards, add PARTITION BY HASH ( ) to your CREATE
TABLE statement, specifying the number of shards and the sharding key. For
example:
TQL> CREATE TABLE ...
...PARTITION BY HASH (96) KEY ("customer_id");

If no sharding is specified or the number of shards specified is one, the table is
assumed to be unsharded (i.e. the table physically exists on each node).
If no sharding key is specified, but the number of shards is greater than one, the
table is assumed to be sharded randomly. The system does not use primary keys
as sharding keys by default.
If you want to use the primary key for sharding, you must specify that the table is
to be partitioned by hash on the primary key, as in this example:
TQL> CREATE TABLE "supplier" (
"s_suppkey" BIGINT,
"s_name" VARCHAR(255),
"s_address" VARCHAR(255),
"s_city" VARCHAR(255),
"s_phone" VARCHAR(255),
CONSTRAINT PRIMARY KEY ("s_suppkey")
)

PARTITION BY HASH (96) KEY ("s_suppkey");

Sharded (distributed) dimension tables
In a typical schema, you'd have a sharded fact table with foreign keys to
replicated dimension tables (which exist on every node). This works best where
dimension tables are small (under 50MB). So if your dimension tables are small,
you should shard the fact tables and not shard the dimension tables they will be
joined with.
If you have a large dimension table, you can distribute it the same way as the
fact table it joins to. You might choose to use distributed dimension tables if:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

61

Load and manage data

• The dimension table is large (over 50MB).
• The tables are always joined using the same columns.
If all of these requirements for the distributed dimension table are met, the
tables are automatically co-sharded for you:
• The tables are related by a primary key and foreign key.
• The tables are partitioned on the same primary key/foreign key.
• The tables have the same number of regions (or shards).
If the dimension table will be joined to multiple fact tables, all of the fact tables
must be sharded in the same way as the dimension table. Self-joins are not
supported.
When a fact and its dimension table are co-sharded:
• The two tables will always be joined on the sharding key.
• Data skew can develop if a very large proportion of the rows have the same
sharding key.
This example shows the CREATE TABLE statements that meet the criteria for cosharding of a fact table and a distributed dimension table:
TQL> CREATE TABLE products_dim (
"id" int,
"prod_name" varchar(30),
"prod_desc" varchar(100),
PRIMARY KEY ("id")
)
PARTITION BY HASH (96) KEY ("id")
;
TQL> CREATE TABLE retail_fact (
"trans_id" int,
"product_id" int,
"amount" double,
FOREIGN KEY ("product_id") REFERENCES products_dim ("id")
)
PARTITION BY HASH (96) KEY ("product_id")
;

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

62

Load and manage data

Sharded (distributed) face tables
You can also join two sharded fact tables with different shard keys, otherwise
know as non co-sharded tables. It may take a while to join two tables sharded
on different keys since a lot of data redistribution is required. Therefore,
ThoughtSpot recommends that you use a common shard key for two fact tables.

Chasm traps
In a complex schema, you may have a fact table with no relationship to another
fact table, except that each contains a foreign key to a shared dimension table.
This is known as a chasm trap, and ThoughtSpot can handle it!
A chasm trap in a data schema can introduce problems of over counting if you
join the two fact tables through their shared dimension table. This diagram
shows a typical complex schema with several tables that are related over a
chasm trap:

Figure 8: Complex schema with chasm traps
Examples of use cases where a chasm trap could occur are:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

63

Load and manage data

• Attribution analysis comparing campaign data with purchase data, where all
they have in common is that both contain a customer identifier which is a
foreign key to a customer dimension table.
• Cost of sales analysis when the wholesale orders data is only related to the
retail sales data through a shared products dimension table.
In many databases, joining tables across a chasm trap creates a Cartesian
product, such that each row from the first fact table is joined to each row from
the second table. This produces over counting when computing counts and
aggregates. But ThoughtSpot protects you from this kind of over counting.
There are still just a few things to look out for when using a schema that contains
chasm traps:
• The tables need to be joined to the dimension table via an equi-join (i.e. a
primary key/foreign key relationship). They cannot be joined using a range of
values.
• Review the column setting called Attribution Dimension. You may need to
change this setting if some of the columns in the shared dimension table
should not be used for attribution when combining fact tables.
• Tables that will be joined across a chasm trap do not need to be co-sharded.
They will be joined appropriately automatically in the most efficient way.

Chasm trap limitations
If your database schema contains any chasm traps, you may encounter these
limitations.
Operations that produce an error for chasm traps
The following limitations on chasm trap schemas will produce a red bar error in
the ThoughtSpot application:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

64

Load and manage data

• Show underlying data does not work for chasm trap searches, whether the
search is on a worksheet containing a chasm trap or on base tables that are
related over a chasm trap.
• When using the ThoughtSpot APIs, you cannot pass filter values via the URL
if the relevant searches occur on a worksheet containing a chasm trap or on
base tables that are related over a chasm trap.
• Legacy Row Level Security does not work with chasm trap schemas. The
newer Rule-Based Row Level Security must be used.
Behavior that is different for chasm traps
The following behavior is different for chasm traps than for schemas that do not
contain a chasm trap:
• There are no headlines (single facts based on the data) shown when a search
contains a worksheet containing a chasm trap or base tables that are related
over a chasm trap.
• Join information in What am I Looking At? does not appear for searches on
a worksheet containing a chasm trap or on base tables that are related over a
chasm trap.
• There are cases when attempting to configure certain charts on chasm trap
worksheets or tables will not work. If this happens, you will see the error Your
search needs to have unique y-axis values for each series of data shown on
the x-axis. The workaround is to remove all columns from the search, except
for those used in your chart.
Workarounds
In some cases, there is a workaround of saving an answer as a worksheet
(Aggregated Worksheet). See the ThoughtSpot User Guide for details on
how to do this. If you save a chasm trap search as a worksheet, it becomes a
materialized view of the answer. Effectively, it is then just a regular table (no

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

65

Load and manage data

chasm trap). As such, most of the issues above will not affect searches on the
saved worksheet.

Build the schema
Before you can load data into ThoughtSpot, you must build a database schema
to receive it. You will do this by writing a SQL script, which creates the objects in
your schema.
Your SQL script can use any SQL statements that are supported in ThoughtSpot
SQL Command Line (TQL). The TQL syntax is similar to the SQL used in other
relational databases, but with some important differences. You'll use DDL (data
definition language) to create the schema into which you'll load the data. You'll
probably want to put all your DDL statements into a text file, which you'll use as
a script for creating the schema.
Uploading the SQL script through the browser
You can upload an existing SQL script directly through the browser in the
ThoughtSpot application. You can edit the script or add to it right within the
browser, too.
The steps to build a schema through the browser are:
1. Write a SQL script to create the schema
2. Import a schema (use the SQL editor)
Using TQL to create the schema
You can choose to run your SQL script within the Linux shell, instead of
uploading it through the browser.
The steps to build a schema using TQL include:
1. Connect to the database with the ThoughtSpot SQL Command Line (TQL).
2. Write a SQL script to create the schema.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

66

Load and manage data

About TQL, the SQL command line interface
ThoughtSpot provides the ThoughtSpot SQL Command Line (TQL) for creating,
viewing, and managing a schema using SQL. You can run TQL in interactive
command line mode, or you can write a script and use TQL to run it.
TQL basics
The SQL syntax in ThoughtSpot is called TQL for ThoughtSpot SQL. The
ThoughtSpot SQL Command Line (TQL) runs in an interactive mode. To invoke
TQL Log in to the Linux shell using SSH and type tql. At the prompt, type h or
help

to see a list of supported commands.

Type your SQL commands on the command line, terminating each command
with a semicolon (;). Commands can span multiple lines. ThoughtSpot supports
a limited number of SQL commands, plus some custom SQL extensions. For
example, you can specify the number of shards and the distribution key as part
of the CREATE TABLE syntax. A full list of supported SQL in TQL is available in
the TQL reference.
You can also Write a SQL script to create the schema in TQL.

Connect to the database with the ThoughtSpot SQL Command Line (TQL)
To perform administrative tasks directly in the database, you will use the
ThoughtSpot SQL Command Line (TQL). TQL supports many, but not all,
common SQL commands.
Before connecting with TQL, you will need:
• Access to your ThoughtSpot instance Linux shell from a client machine.
• The administrator OS login.
To connect to TQL:
1. Log in to the Linux shell using SSH.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

67

Load and manage data

2. Invoke TQL:
$ tql
TQL>

3. Enter your SQL command, followed by a semicolon (;).

Create the schema in TQL
Having examined the structure of the data to be loaded and become familiar
with the ThoughtSpot SQL Command Line (TQL), you are now ready to create
the schema.
This method is a good way to get familiar with TQL and how to create database
objects, but when creating a schema in a production system, you will most likely
Write a SQL script to create the schema.
To create the schema directly in TQL:
1. Connect to the database with the ThoughtSpot SQL Command Line (TQL).
2. If the database you will be using does not exist, create it now:
TQL> CREATE DATABASE my_database;

3. Connect to the database:
TQL> USE my_database;

4. If you wish to use a schema other than the default one, create it now:
TQL> CREATE SCHEMA my_schema;

5. Issue a CREATE TABLE command for each table you will create, using the
information in Plan the schema.
Note: Foreign key declaration within a CREATE TABLE will show the
table created even if there are problems with the foreign key. Therefore,
it is good practice to also issue a separate ALTER TABLE ADD CONSTRAINT
FOREIGN KEY

command.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

68

Load and manage data

Write a SQL script to create the schema
Using a SQL script to create your schema is a recommended best practice. This
makes it easier to adjust the schema definitions and recreate the schema quickly,
if needed.
The schema creation script is a text file that contains all the SQL commands to
create your schema. Comments should be enclosed in the comment tags /* and
*/.
It is recommended to enclose all object names (schema, table, and column) in
double quotes and any column values in single quotes in your scripts. Object
names that are also reserved words in SQL, or that contain special characters
(any character other than alphanumeric or _ ), must be surrounded by double
quotes. If you see the error message "Error parsing SQL. Check SQL input.", you
should check for object names without double quotes in your script.
If you are working in a schema other than the default schema, object names
must be fully qualified, as in ""."".
If your schema includes constraints to define relationships between tables
(foreign key, or the RELATIONSHIP syntax), it is recommended that your script
first creates all the tables, and then at the end, creates the relationships between
them using the ADD CONSTRAINT syntax. This makes it easier to troubleshoot
the script and make changes.
If TQL is run using the flag --allow_unsafe, your statements will always execute
without this warning. Note that when running TQL from a script, you will
need to decide what behavior you want if the script contains changes that
affect dependent objects. If you want the script to run even if objects with
dependencies are afffected, run it using this flag, for example:
cat safest_script_ever.sql | tql --allow_unsafe

1. Open a new file in a text editor.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

69

Load and manage data

2. Type in the command to create the database, if it does not already exist:
CREATE database ;

3. Type in the command to specify the database to use:
USE database ;

4. Type in the command to create the schema, if you don't want to use the
default schema:
5. Type in each of the CREATE TABLE statements, with its column definitions,
primary key constraints, and sharding specification (if any).
6. At the end of your script, optionally type in the ALTER TABLE statements to
add foreign keys to use in joining the tables.
7. Save the file.
8. Run the script using one of these methods:
• Import a schema (use the SQL editor).
• Log in to the shell, copy your script to your ThoughtSpot instance using
scp, and pipe it to TQL:
$ cat create_schema.sql | tql

Schema creation examples
These examples demonstrate the steps involved in creating a schema using the
ThoughtSpot SQL Command Line (TQL). After the schema is created, you can
load data into it with ThoughtSpot Loader.
Simple schema creation example
The example creates a database (tpch) with two tables (customer, transaction).
The example does not create a schema explicitly. So it will use the default
schema (falcon_default_schema).
In this example:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

70

Load and manage data

• The table customer has a primary key called customer_id. The table
customer_transactions

has a primary key called transaction_id.

• The customer table is unsharded.
• The customer_transactions table is sharded into 96 shards using the
transaction_id

column.

• Both tables have referential integrity on customer_id.
$tql
TQL> CREATE DATABASE tpch;
TQL> USE tpch;
TQL> CREATE TABLE customer (
name

VARCHAR(100),

address VARCHAR(255),
zipcode INT,
customer_id INT,
CONSTRAINT PRIMARY KEY (customer_id)
);
TQL> CREATE TABLE customer_transactions (
transaction_id INT,
customer_id INT,
amount DOUBLE,
transaction_date DATETIME,
CONSTRAINT PRIMARY KEY (transaction_id),
CONSTRAINT FOREIGN KEY (customer_id) REFERENCES
customer(customer_id)
)

PARTITION BY HASH (96) KEY (transaction_id);

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

71

Load and manage data

More complex schema creation example
The example uses a custom schema called sample_schema to hold the tables.
Because of this, every table reference has to be schema qualified.
$ tql
TQL> CREATE DATABASE "sample_db";
TQL> USE "sample_db";
TQL> CREATE SCHEMA "sample_schema";
TQL> CREATE TABLE "sample_schema"."customer" (
"c_custkey" BIGINT,
"c_name" VARCHAR(255),
"c_address" VARCHAR(255),
"c_city" VARCHAR(255),
"c_nation" VARCHAR(255),
"c_region" VARCHAR(255),
"c_phone" VARCHAR(255),
CONSTRAINT PRIMARY KEY ("c_custkey")
);
TQL> CREATE TABLE "sample_schema"."supplier" (
"s_suppkey" BIGINT,
"s_name" VARCHAR(255),
"s_address" VARCHAR(255),
"s_city" VARCHAR(255),
"s_nation" VARCHAR(255),
"s_region" VARCHAR(255),
"s_phone" VARCHAR(255),
CONSTRAINT PRIMARY KEY ("s_suppkey")
);
TQL> CREATE TABLE "sample_schema"."lineorder" (
"lo_orderkey" BIGINT,
"lo_linenumber" BIGINT,

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

72

Load and manage data
"lo_custkey" BIGINT,
"lo_partkey" BIGINT,
"lo_suppkey" BIGINT,
"lo_orderdate" DATE,
"lo_orderpriority" VARCHAR(255),
"lo_shippriority" VARCHAR(255),
"lo_quantify" BIGINT,
"lo_extendprice" BIGINT,
"lo_ordtotalprice" BIGINT,
"lo_discount" BIGINT,
"lo_commitdate" DATE,
CONSTRAINT PRIMARY KEY ("lo_orderkey","lo_linenumber"),
CONSTRAINT FOREIGN KEY ("lo_custkey") REFERENCES
"sample_schema"."customer" ("c_custkey"),
CONSTRAINT FOREIGN KEY ("lo_suppkey") REFERENCES
"sample_schema"."supplier" ("s_suppkey")
)

PARTITION BY HASH (96) KEY (lo_orderkey);

Import a schema (use the SQL editor)
You can run a SQL script to create your database schema through the browser,
without having to log in to the shell on the ThoughtSpot instance. You can edit
the script and run it directly in the browser to create the schema.
Importing a schema through the Web browser makes it possible to run your SQL
script without needing to have a Linux login. You can use this capability in any of
these ways:
• Create the SQL script ahead of time, and use the browser to run it.
• Use the editor to type your SQL directly into the browser.
• Use the browser SQL interface as an interactive SQL editor, for example to
test an existing script or make changes to an existing schema.
1. Log in to ThoughtSpot from a browser.
2. Click on Data, on the top navigation bar.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

73

Load and manage data

Figure 9: Data
3. Click Actions and select Upload schema.

Figure 10: Upload schema
4. Drag and drop your SQL file into the browser, or choose Browse Your Files to
locate it.
5. You're now in the SQL editor. Use it to view your script and make any changes.

Figure 11: Import schema
6. When ready, run your script by clicking the Execute button.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

74

Load and manage data

7. If there are any errors, correct them and run the script again.

Change the schema
After you've created a schema and loaded data, you may find yourself wishing
you'd set things up a little differently. You can make changes to the schema, such
as changing the primary key, relationships to other tables, and sharding.
Making changes to a schema after data has been loaded and users have created
worksheets or pinboards on the tables requires care, so that you don't lose the
relationship between the objects created in ThoughtSpot and the underlying
tables. If you follow the procedures here, your tables will retain their relationships
to the objects created on top of them:
• Change the primary key for a table.
• Change a relationship between tables.
• Change sharding on a table.

Change the primary key for a table
Use this procedure to change the primary key for a table. But use it with caution,
particularly if you are changing to a primary key for which values are not unique.
You should always take a snapshot of your database before making any schema
changes. This will allow you to revert back to the prior state if you make an error,
or something doesn't work as you expected after the schema change.
You can change the primary key of a table without having to TRUNCATE it first
and reload the data. However, changing the primary key could result in data
deletion. This is because of the upsert behavior which is applied when multiple
rows have the same primary key. This is very important to understand ahead of
time, if you are considering changing to a primary key for which values are not
unique.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

75

Load and manage data

To change the primary key, first remove any existing primary key, and then define
a new one (if any). You do not need to truncate the tables to do this operation
beginning in version 3.2. Any dependent objects (pinboards or worksheets) will
remain intact.
To change the primary key of a table:
1. Take a snapshot.
2. Connect to the database with the ThoughtSpot SQL Command Line (TQL).
3. Drop the existing primary key (if any), by issuing a command like this example:
TQL> ALTER TABLE "cart"
DROP CONSTRAINT
PRIMARY KEY;

4. Add a new primary key, if desired:
TQL> ALTER TABLE "cart"
ADD CONSTRAINT
PRIMARY KEY ("owner_id");

5. Test that any dependent objects (pinboards, worksheets, etc.) are still working
correctly.
6. Delete the snapshot you created earlier using the command:
tscli snapshot delete 

Change a relationship between tables
Use this procedure to remove a relationship between tables or define a new
one. This operation works for both kinds of relationships: foreign key or generic
relationship.
You should always take a snapshot of your database before making any schema
changes. This will allow you to revert back to the prior state if you make an error,
or something doesn't work as you expected after the schema change.
To change a relationship between two tables, first remove any existing
relationship, and then define the new relationship (if any). You do not need to

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

76

Load and manage data

truncate the tables to do this operation. Any dependent objects (pinboards or
worksheets) will remain intact.
To change the relationship between tables:
1. Take a snapshot.
2. Connect to the database with the ThoughtSpot SQL Command Line (TQL).
3. Issue the command to drop the existing relationship, by issuing a command
like one of these examples:
• Drop a foreign key by name, if it was given a name when it was defined:
TQL> ALTER TABLE
"sales_fact"
DROP CONSTRAINT
"FK_PO_number";

• Drop a relationship by name, if it was given a name when it was defined:
TQL> ALTER TABLE "fruit_dim"
DROP RELATIONSHIP "REL_dates";

• Drop the foreign key relationship explicitly, if it doesn't have a name, by
referencing the two tables that are joined. This drops all foreign keys
between the two tables:
TQL> ALTER TABLE "shipments"
DROP CONSTRAINT
FOREIGN KEY "orders";

• Drop all generic relationships between two tables:
TQL> ALTER TABLE "wholesale_buys"
DROP RELATIONSHIP
WITH "retail_sales";

4. Define a new relationship, if you want to, using ALTER TABLE...ADD
CONSTRAINT...
5. Test that any dependent objects (pinboards, worksheets, etc.) are still working
correctly.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

77

Load and manage data

6. Delete the snapshot you created earlier using the command:
tscli snapshot delete 

Change sharding on a table
You can change the sharding on a table or remove it altogether (creating a
replicated table) using this procedure. This procedure preserves the data within
the table.
You should always take a snapshot of your database before making any schema
changes. This will allow you to revert back to the prior state if you make an error,
or something doesn't work as you expected after the schema change.
This procedure reshards a table. This is also called redistributing or
repartitioning. You can use this method to reshard a table without losing its data
or metadata. This means that worksheets and pinboards built on top of the table
will continue to work.
You can use these steps to do any of these operations:
• shard a table that was previously replicated.
• change a replicated table to a sharded table.
• change the number of shards to use for a sharded table.
To change the sharding on a table:
1. Take a snapshot.
2. Connect to the database with the ThoughtSpot SQL Command Line (TQL).
3. Issue the command to change the sharding using this syntax:
TQL> ALTER TABLE 
[SET DIMENSION | SET FACT
[PARTITION BY HASH
[()]
[KEY()]]]

For example:

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

78

Load and manage data

• To make a sharded table into a dimension table (replicated on every node),
use:
ALTER TABLE "products"
SET DIMENSION;

• To make a dimension table into a sharded (fact) table or change the
number of shards, use:
ALTER TABLE "sales"
SET FACT PARTITION BY HASH (96)
KEY ("productID");

4. Test that any dependent objects (pinboards, worksheets, etc.) are still working
correctly.
5. Delete the snapshot you created earlier using the command:
tscli snapshot delete 

About data type conversion
You can convert the data in a column from one data type to another by issuing
a TQL command. There are some details you should be aware of when doing a
data type conversion.
Data type conversion behavior
When converting from one data type to another, any values that can not be
converted will be set to NULL. If errors occur during data type conversion, the
operation is aborted. However, you may choose to force the conversion despite
the errors. You can start TQL in allow_unsafe mode to continue with the data
conversion, at your own risk, of course! To start TQL in unsafe mode, issue this
command:
tql --allow_unsafe

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

79

Load and manage data

Multiple columns of a single table can be converted using a single TQL
command. The behavior is transactional. So for example, you would issue a
command like this example:
ALTER TABLE products
MODIFY COLUMN product_id int,
MODIFY COLUMN supplier VARCHAR(4);

Also note that changing data type has implications on the primary key and
sharding enforcement. For example, changing the data type of a column that is
part of the sharding key would lead to a redistribution of data. Then imagine that
the sharding key column contained the text values "00100", "0100", and "100",
which all map to same integer value. If this type of a column is changed from a
VARCHAR to an INT, then it would be subject to the upsert behavior on primary
keys. So in this example, only one of the three rows would be preserved.
Be aware that data type conversion will preserve the data in the underlying
database table, but there is no guarantee that any objects built on top of it
(worksheets or pinboards) will be preserved. This is because you might make
a data type change that makes a chart built on top of the table invalid (for
example a growth chart would be invalidated if the date column it depends on
were changed to a varchar column).
Supported data type conversions
In general, the data type conversions that make logical sense are supported. But
there are a few nuances you should be aware of:
• When you convert from INT to BOOL, zero is converted to false, and all nonzero values are converted to true.
• When you convert from BOOL to INT, true gets converted to 1, and false gets
converted to 0.
• When you convert from DOUBLE to INT, the value gets rounded.
• When you convert from INT to DOUBLE, the value gets rounded.

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

80

Load and manage data

• When you convert from DATETIME to DATE, the date part of value is
preserved and the time part is dropped.
• When you convert from DATE to DATETIME, the time gets added as 00:00:00.
The date part of the value is preserved.
• When you convert from DATETIME to TIME, the time part of the value is
preserved.
• Conversion from TIME to DATETIME is not supported.
Date and time conversions
Some data type conversion require a format string. These include:
• conversion from DATE/TIME/DATETIME
• conversion to DATE/TIME/DATETIME
For these types of conversions, you'll use a special syntax using parsinghint and
the date format specifications supported in the strptime library function.
For the example, first create a table with a timestamp stored as a VARCHAR:
CREATE TABLE fruit_sales
(time_of_sale VARCHAR(32));
INSERT INTO fruit_sales
VALUES ('2015-12-29 13:52:39');

Now, convert the column from a VARCHAR to DATETIME, using the format %Y%m-%d %H:%M:%S:
ALTER TABLE fruit_sales
MODIFY COLUMN time_of_sale DATETIME
[parsinghint="%Y-%m-%d %H:%M:%S"]

Finally, convert the column back to VARCHAR:
ALTER TABLE fruit_sales
MODIFY COLUMN time_of_sale VARCHAR(32);

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

81

Load and manage data

Boolean to string conversions
Boolean to string conversions have format strings, too. You'll use parsinghint as
you do for date and time conversions. You can choose among these approaches:
• Option 1: Specify string values for both true and false. Any non-matching
values get converted to null. In this example, "100" gets converted to true, and
"0" gets converted to false. "-1" gets converted to null.
ALTER TABLE db
MODIFY COLUMN s bool [parsinghint="100_0"];

• Option 2: Specify a string value for true. Any non-matching value gets
converted to false. In this example, "100" gets converted to true, "-1" and "0"
get converted to false.
ALTER TABLE db
MODIFY COLUMN s bool [parsinghint="100_"];

• Option 3: Specify a string value for false. Any non-matching value get
converted to true. In this example, "-1" and "100" get converted to true, and
"0" gets converted to false.
ALTER TABLE db
MODIFY COLUMN s bool [parsinghint="_0"];

String to boolean conversions
When converting from a string to a boolean, you must specify a string for true
and false. By default, a string to boolean conversion generates "true" for true,
"false" for false.
ALTER TABLE db
MODIFY COLUMN b varchar(32);

But you may override the default strings that get generated by using parsinghint,
as in this example:
ALTER TABLE db
MODIFY COLUMN b varchar(32) [parsinghint="tr_fa"];

ThoughtSpot Administration Guide

Copyright

©

2017 by ThoughtSpot. All Rights Reserved.

82

Load and manage data

Change the Data Type of a Column
When you issue the TQL command to convert a column from one data type to
another, the conversion is handled automatically. However, you'll need to ensure
that any visualizations built on top of the table display correctly.
You should always take a snapshot of your database before making any schema
changes. This will allow you to revert back to the prior state if you make an error,
or something doesn't work as you expected after the schema change.
When changing a data type in an existing table, be aware that answers and
pinboards created on top of that table (or worksheets that include it) may
change. This is because charts and aggregations depend upon the data type. So
for example changing from INTEGER to VARCHAR could break charts that used
the numeric data type INTEGER to calculate an average or a total. Because of
this, use caution, and check all dependent objects before and after changing the
data type, to ensure that they display as intended.
To change the data type of a column:
1. Connect to the database with the ThoughtSpot SQL Command Line (TQL).
2. Issue the command to change the data type using this syntax:
TQL> ALTER TABLE 
MODIFY COLUMN ; For example: ALTER TABLE fact100 MODIFY COLUMN product_id int; Load data with ThoughtSpot Loader ThoughtSpot Loader (tsload) is a common way to import data. When using tsload, you can load larger datasets and make the loading process repeatable through scripting. There are several steps to loading data using ThoughtSpot Loader: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 83 Load and manage data 1. Generate CSV files with the data to be loaded. 2. Build the schema. 3. Import CSV files with ThoughtSpot Loader. Import CSV files with ThoughtSpot Loader Use ThoughtSpot Loader (tsload) to load data from a CSV text file into an existing table in ThoughtSpot. tsload accepts flags that enable you to specify column and row separators, date or timestamp formats, null value representations, and similar parameters. Many of these options have defaults that you can override. Before importing data, you need to Build the schema. To use ThoughtSpot Loader, type the command tsload followed by the appropriate flags. You can see the list of the flags it accepts in the ThoughtSpot Loader flag reference or by issuing tsload -help. tsload supports both full and incremental data loads. For incremental loads, an upsert (insert or update) is performed. If an incoming row has the same primary key as an existing row, it updates the existing row with the new values. You can integrate tsload into your ETL environment for more automated data loads. Most ETL tools provide the ability to write target data into files and support scripted post-transformation actions that can include loading data into ThoughtSpot. This procedure describes manually loading data, but the tsload commands could be saved as a script: 1. Log in to the Linux shell using SSH. 2. Change to the directory where your CSV files are staged. 3. Invoke tsload, specifying the appropriate flags and your data source file: $ tsload --target_database my_database --target_table my_table --alsologtostderr --empty_target --source_file my_file.csv --v 1 --field_separator "separator_char" ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 84 Load and manage data 4. Repeat the data load for each of your CSV files. This example imports the CSV file "ssbm_customer.csv" into the table CUSTOMER: $ tsload --target_database SAMPLE_DB --target_table CUSTOMER --alsologtostderr --empty_target --source_file ssbm_customer.csv --v 1 --field_separator "|" Use a script to load data If you need to load data from multiple CSV files, create a script to automate the process. You can also use a similar script to automate recurring data feeds. The data loading script is a text file that contains all the calls to tsload for loading the data from your CSV files. The example script shown here uses the cat command to read the data file, and pipes it to tsload. When creating and testing your script, you may wish to replace each cat with cat -10, to load only the first ten lines of each file. This allows you to quickly run a test of your script. When the test succeeds for all the data files, you can then remove each -10, so the complete files will load when you run the script again. 1. Log in to the Linux shell using SSH. 2. Navigate to the directory that contains your CSV files and open a new file in a text editor. 3. Type in the commands to load the data. This example shows commands to load three files: cat Players.csv | tsload --target_database baseball --target_table "players" --empty_target --field_separator "," --max_ignored_rows 10 --bad_records_file bad_records.txt --has_header_row --alsologtostderr --null_value "" cat AllstarFull.csv | tsload --target_database baseball --target_table "allstarfull" --empty_target --field_separator "," --max_ignored_rows 10 --bad_records_file bad_records.txt ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 85 Load and manage data --has_header_row --alsologtostderr --null_value "" cat Appearances.csv | tsload --target_database baseball --target_table "appearances" --empty_target --field_separator "," --max_ignored_rows 10 --bad_records_file bad_records.txt --has_header_row --alsologtostderr --null_value "" 4. Save the file. 5. Run the script: $ ./load_baseball_data.sh Bulk load files in parallel If you have a very large data file that takes a long time to load, you can reduce the load time by splitting it up into multiple files and loading them in parallel using multiple invocations of tsload. If the size of any of your data files is greater than 50 million rows, running tsload in parallel can reduce the load time significantly. First, you'll split up your large data file into multiple smaller files. Then create a script to load the files. You will make your script multi-threaded by invoking multiple loader threads (between 1 and 5 are recommended). Stage the data files in a location accessible to the node on which you'll run the script. Usually you'll use an NAS mounted file system. Then run the script to load the files. If you want to optimize the load time even further, determine what the bottleneck is and adjust your process accordingly: • If the disk I/O for reading the data files is the bottleneck, you can stage the data files on separate NAS mounted file systems and reference them accordingly in your script. • If the CPU on the machine you're using to run the load script is the bottleneck, you can split the load script into the same number of parts as you have nodes in your ThoughtSpot instance, place one script on each node, and run them in parallel. Make sure the other nodes are able to access the data files where ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 86 Load and manage data they are staged. Note that running the load script on separate nodes will put the data on all the nodes, just as when you run the script on a single node. Running the script on all the nodes at the same time just lets you take advantage of CPU power of each node for hashing data files. Let's say you have 30 days of data in 30 files, one for the data collected on each day. Each day's data file contains 10 million rows, for a total of 300 million rows of data. You want to load the whole month of data. For this example we'll have 5 loader processes, each one handling 6 days of data. Here is a sample script you could use the load the data files in parallel: /* Script load_script.sh, loads 30 days of data in parallel */ #!/bin/bash pidlist="" cat day1.csv day2.csv day3.csv day4.csv day5.csv day6.csv | tsload --target_database sales --target_table SALES_FACT -max_ignored_rows 10 --bad_records_file ./SALES_FACT.bad --date_format %Y-%m-%d --date_time_format "%Y-%m-%d %H:%M:%S" --source_data_format delimited --field_separator "|" --null_value "" --enclosing_character "\"" --boolean_representation 1_0 & pidlist="$pidlist $!" & cat day7.csv day8.csv day9.csv day10.csv day11.csv day12.csv | tsload --target_database sales --target_table SALES_FACT -max_ignored_rows 10 --bad_records_file ./SALES_FACT.bad --date_format %Y-%m-%d --date_time_format "%Y-%m-%d %H:%M:%S" --source_data_format delimited --field_separator "|" --null_value "" --enclosing_character "\"" --boolean_representation 1_0 & pidlist="$pidlist $!" & cat day13.csv day14.csv day15.csv day16.csv day17.csv day18.csv | tsload --target_database sales --target_table SALES_FACT -max_ignored_rows 10 --bad_records_file ./SALES_FACT.bad --date_format %Y-%m-%d --date_time_format "%Y-%m-%d %H:%M:%S" --source_data_format delimited ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 87 Load and manage data --field_separator "|" --null_value "" --enclosing_character "\"" --boolean_representation 1_0 & pidlist="$pidlist $!" & cat day19.csv day20.csv day21.csv day22.csv day23.csv day24.csv | tsload --target_database sales --target_table SALES_FACT -max_ignored_rows 10 --bad_records_file ./SALES_FACT.bad --date_format %Y-%m-%d --date_time_format "%Y-%m-%d %H:%M:%S" --source_data_format delimited --field_separator "|" --null_value "" --enclosing_character "\"" --boolean_representation 1_0 & pidlist="$pidlist $!" & cat day25.csv day26.csv day27.csv day28.csv day29.csv day30.csv | tsload --target_database sales --target_table SALES_FACT -max_ignored_rows 10 --bad_records_file ./SALES_FACT.bad --date_format %Y-%m-%d --date_time_format "%Y-%m-%d %H:%M:%S" --source_data_format delimited --field_separator "|" --null_value "" --enclosing_character "\"" --boolean_representation 1_0 & pidlist="$pidlist $!" & wait $pidlist Call your script using a command like: nohup bash ./load_script.sh > master_log.txt & tail -f master_log.txt Constructing your script in this way will execute all the commands in the background, and output to the file master_log.txt. You'll see a running status as the commands in the script execute. After the script completes, you can check the log file for detailed information, such as the number of rows that loaded successfully. Delete a data source When you want to delete a data source, you first need to handle any dependent objects that have been built on top of it. You can easily see these dependencies, and choose how to handle them before deleting the data source. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 88 Load and manage data There are two separate ways to delete a data source. Both of these methods will check for dependencies and warn if any are found: • Delete or change a table in TQL describes the dependency checking that occurs when deleting or changing a table using TQL. • Delete a data source from the browser describes the dependency checking that occurs when deleting a data source through the ThoughtSpot application. You can also Check dependencies in the browser before attempting to delete a data source. Delete a data source from the browser You can delete data sources from the browser, as long as they were not created by an administrator through tsload or Data Connect. You can delete data sources from the browser if they were created from the browser. These types of data sources include: • Data imported from the browser. • Worksheets. ThoughtSpot checks for dependencies whenever you try to delete a table or worksheet. 1. Click on Data, on the top navigation bar. Figure 12: Data 2. Check the box next to the name of the data source you want to delete. 3. Click the delete icon. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 89 Load and manage data Figure 13: Delete a data source 4. If you attempt to delete a data source with dependent objects, the operation will be blocked. You will see a list of dependent objects with links. Figure 14: Warning message when trying to delete an object with dependencies 5. Click on a dependent object to modify or delete it. If you want to remove the dependency by modifying the dependent object, you'll need to remove all search terms or columns that refer back to the data source you are trying to delete. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 90 Load and manage data 6. When all dependencies have been removed, you will be able to go back and delete the data source. Check dependencies in the browser You can see all of the dependencies for any data source (worksheet or table) on the Manage Data page. To view dependent objects for a data source: 1. Click on Data, on the top navigation bar. Figure 15: Data 2. Click the name of the data source whose dependencies you want to view. 3. Click Dependents. You will see a list of the names of the dependent objects (worksheets and pinboards), and the columns they use from that data source. You can use this information to determine the impact of changing the structure of the data source or to see how widely it is used. Figure 16: List of dependent objects 4. Click on a dependent object to modify or delete it. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 91 Load and manage data If you want to remove the dependency by modifying the dependent object, you'll need to remove all search terms or columns that refer back to the data source you are trying to delete. 5. When all dependencies have been removed, you will be able to go back and delete the data source. Delete or change a table in TQL Just as attempting to delete an object in the web browser warns of any dependencies, making changes using ThoughtSpot SQL Command Line (TQL) that modify or delete tables warns of dependencies. When you enter a TQL statement, you will be warned of possible dependency consequences with a prompt asking if you’d like to proceed. This should make you feel safe issuing TQL commands, even commands like dropping a table. If TQL is run using the flag --allow_unsafe, your statements will always execute without this warning. Note that when running TQL from a script, you will need to decide what behavior you want if the script contains changes that affect dependent objects. If you want the script to run even if objects with dependencies are afffected, run it using this flag, for example: cat safest_script_ever.sql | tql --allow_unsafe If you do not run the script using the flag, it will fail if any of its commands might cause problems with dependent objects. TQL actions with possible dependency consequences include: • Change, add, or remove a primary key. • When changing or adding a primary key, if the key in question is not unique in the data it may cause deletion of rows, because of upserts occurring when duplicate primary keys are found. • When changing or removing a primary key, incoming foreign key relationships will be broken. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 92 Load and manage data • Change a column datatype. • Add a relationship or foreign key. • Drop a relationship or foreign key constraint. • Change or add a sharding key. • Drop a table, schema, or database. When issuing one of the above commands, you will see a warning message similar to this: TQL> ALTER TABLE table1 DROP CONSTRAINT PRIMARY KEY; WARNING: This operation will break the Foreign Key relationship “products” with table “sales”, which will break 34 user-visible visualizations and 2 Worksheets. We recommend taking a snapshot before performing this operation. Do you wish to proceed? (yes/no). About the Schema Viewer There is a schema viewer in ThoughtSpot which lets you see your database schema in the web browser. You can see tables and worksheets and their relationships. The Schema Viewer is interactive, so you can configure it to show just what you want to see. Bringing up the Schema Viewer You can access the Schema Viewer from the Manage Data screen by clicking Actions, and selecting View Schema. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 93 Load and manage data Figure 17: Access the Schema Viewer When viewing the schema, you can filter the tables shown similarly to how you filter data sources. The list of tables, worksheets, and imported data on the left includes only those objects you want to see. Clicking on one of the objects brings it to the middle of the viewer and highlights it. You can drag the objects around in the viewer. Figure 18: Schema Viewer filters ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 94 Load and manage data How to use the Schema Viewer You can use the Schema Viewer to find out information like: • What is the relationship between two tables? • What tables make up this worksheet, and how are they joined? The schema viewer shows joins between tables, join directionality, and join type (whether they are Foreign Key > Primary Key, relationship joins, or joins defined by users through the web interface). Use the Table list to find a specific table or worksheet. Worksheet view For worksheets, you can also click on one to view the worksheet. The worksheet view shows the following information: • All tables in the worksheet, and the relationships between these tables. • Source columns for all columns of a worksheet. • Keys and definitions for each relationship, as well as join paths and types. • Columns that are derived from formulas. • Correct join paths for newly created chasm trap worksheets. Existing chasm trap worksheets will not show the correct join paths. Figure 19: Worksheet view example ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 95 Load and manage data The worksheet view does not work for aggregated worksheets, but does works for worksheets built on top of aggregated worksheets. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 96 Chapter 3: Model, link, and tag your data for searching Model, link, and tag your data for searching Topics: • Model the data for searching • • Modeling, tagging, and adding links between your data sources can make the data even easier to search. Link tables using You can start searching your data without doing relationships any modeling, creating relationships, or tagging. But About stickers putting some thought into these will make the data even easier and more intuitive to search for your end users. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 97 Model, link, and tag your data for searching Model the data for searching After loading data, you will create the semantic data model that makes the data easy to search. You can make these settings either through the Web interface or by using the model file. About data modeling Data modeling is a very lightweight process compared to what you may have experienced in other tools. It enables you to change some of the settings (or properties) of the data so that it becomes more searchable. You can change these settings in two ways, both of which have the same effect: 1. Model data in the Web interface. 2. Model data in bulk in the model file. Choose the model file method if you want to make many changes in one bulk operation. Modeling the data allows you to give the columns more search friendly names or predefine how they can be explored and aggregated. When you load data, most of this data modeling metadata is set up for you automatically. However, since you know your data best, you can adjust the modeling settings to improve the experience for your users. Model data in the ThoughtSpot application To make modeling settings for a table you've just loaded, or to make a quick change to existing settings, use the ThoughtSpot web interface. You can adjust the Columns settings from the data management listing. You can change all the same data model settings here as in the model file. This method is easier and faster, unless you need to make many settings in bulk. In that case, using the model file is recommended. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 98 Model, link, and tag your data for searching Click on the Data icon, to get to the data management listing. Then click on the name of your data source. Figure 20: Select a data source This brings up the Columns screen, where you'll make your modeling settings. Modify the column settings by clicking on them, typing or selecting the setting you want, and then saving your changes. Descriptions of the different settings are listed in Data modeling settings. Figure 21: Edit modeling settings in the Columns screen ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 99 Model, link, and tag your data for searching Model data in bulk in the modeling file Properties of your data such as Column Names, Column Visibility, Column and Data Definition, Column Rank, etc. are defined in the modeling file. Use the model file when you want to edit these settings in bulk. If you just want to make one or two quick changes, it will be faster to Model data in the ThoughtSpot application instead. Data modeling is a three steps process: 1. Download the model file. 2. Change settings in the model file. 3. Upload the edited model file. You can edit the data modeling file using Microsoft Excel, vi/vim, or a compatible tool. In each row of the modeling file, all the data properties corresponding to a column from your data are listed. You can modify many of these properties by typing in the new value. Remember these important guidelines when editing the model file: • Do not modify any value in a column which contains DoNotModify in the field under the column heading. • Make sure to keep the file in the same format as it had when you downloaded it (CSV text file). Download the model file Before you can make changes to the model file, you need to download it. Then you can edit it using Microsoft Excel, vi/vim, or a similar text editing tool. First, you'll download the model file, and then make changes to the appropriate Data modeling settings. To obtain the model file: 1. Log in to ThoughtSpot from a browser as an Administrator user. 2. Click on the Admin icon, on the top navigation bar. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 100 Model, link, and tag your data for searching Figure 22: The Admin icon 3. Click on Business Data Model. 4. Click Download model.xls. Figure 23: Download the model file Change settings in the model file After downloading the model file, you'll make changes to the settings using this procedure. Then you will upload the file again to apply your changes. You can edit any of the values in the model file, except for those where the words DoNotModify appear below the column header. To make changes in the model file: 1. Open the model file you downloaded (model.xls) in Excel, vi/vim, or a text editor. • If you are using Excel, you may see a warning message. Click Yes to proceed. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 101 Model, link, and tag your data for searching Figure 24: Warning when opening the model file • If your model file includes multi-byte characters, edit the file using vi or vim. This is because model files containing multi-byte characters must be saved as UTF-8 encoded. Otherwise you won't be able to upload them after making your edits. 2. Find the column you want to modify. Descriptions of the meanings of the columns are listed in Data modeling settings. 3. Select the value you want to change. 4. Type in the new value. 5. After making all your changes, save the model file. • If you are using Excel, you will see a message. Click Yes to save the file. Figure 25: Warning when saving the model file • If your model file includes multi-byte characters, edit the file using vi or vim. This is because model files containing multi-byte characters must be saved as UTF-8 encoded. Otherwise you won't be able to upload them after making your edits. Upload the edited model file After you have made changes to the modeling file, you must upload it back to ThoughtSpot before the changes will take effect. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 102 Model, link, and tag your data for searching Save the model file in the same format as it was when you downloaded it. If you are using Microsoft Excel to edit the file, you will see a warning when attempting to save it. Click Yes and save the file. Figure 26: Warning message when saving the model file To upload the model file: 1. Log in to ThoughtSpot from a browser as an Administrator user. 2. Click on the Admin icon, on the top navigation bar. Figure 27: The Admin icon 3. Click on Business Data Model. 4. Click BROWSE YOUR FILES to upload the model.xls file, or drag and drop it in the zone. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 103 Model, link, and tag your data for searching Figure 28: Upload the model file If you receive an error message upon uploading the file, check that it does not include any multi-byte characters (i.e. Japanese or other multi-byte language characters). If it does, you'll need to download the file again and make your edits using vi or vim. Note that you may choose to remove all the rows you have not changed from the model file before uploading it. If you upload a model file that includes only the changed rows, you won't lose any of the pre-existing model file settings. This is a good option if your model file is causing an error on upload, but you aren't sure where in the model file the problem is. As soon as the file is uploaded, ThoughtSpot performs any necessary reindexing for you automatically. Your new settings will be reflected within a few minutes. Data modeling settings Whether you are changing data modeling settings using the modeling file or the Web interface, the settings and their accepted values are the same. Modeling settings You can change these settings in two ways, both of which have the same effect: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 104 Model, link, and tag your data for searching 1. Model data in the Web interface. 2. Model data in bulk in the model file. Choose the model file method if you want to make many changes in one bulk operation. This index lists the editable data modeling settings: Table 14: Modeling settings Setting name Description Can be modeled in a Worksheet Column Name Sets the name of the column to Yes be used in searches. Description Adds a text description of Yes what the column contains. Data Type Read only. Shows the column's data type. Column Type Sets the type of column, either ATTRIBUTE or MEASURE. Additive Controls the type of aggregations that will be available for a column. Aggregation Sets the default aggregation type for MEASURE columns. Hidden Sets the visibility of a column. Synonyms Adds synonyms that can be Yes used in the search bar to refer to a column. Index Type Sets the type of index that will be created for a column. Geo Type Enables a column to be used in GeoMap visualizations. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 105 Model, link, and tag your data for searching Setting name Description Can be modeled in a Worksheet Priority Changes the priority of a column in search suggestions. Number Format Specifies the format to use when showing a numeric value in the column. Date Format Specifies the format to use when showing the dates in a column. Currency Format Specifies the format to use when showing the currencies in a column. Attribution Dimension Only applies to tables that join over a Chasm Trap. Designates whether the tables depend on this column for attribution. Data modeling for worksheets For worksheets, only some of the settings can be modified, whether you are using the modeling file or the Web interface. The editable settings for worksheets are: • Name • Description • Synonyms If you want to change any of the settings that cannot be modified in a worksheet, you need to make your changes to the underlying table instead, and they will be reflected in all worksheets that use the table. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 106 Model, link, and tag your data for searching Change the column name You can change the text that is shown for the column names in ThoughtSpot to make the names more meaningful to users. The column name is what they will type to add that column to their search. Column Name is the name that will be displayed to users for that column in ThoughtSpot. The default is the name you gave the column when you defined the table in the database or imported the CSV file from the browser. 1. Find the Column Name for the column whose name you want to change. 2. Type in the new column name, as you want it to appear in ThoughtSpot. 3. Save your changes. Add a column description You can provide a description for a specific column, to provide additional information for users about the data it contains. Description contains an optional description for the corresponding column. When a user hovers over the column, a tooltip will show this description. To create a column description: 1. Find the column you want to add a description for, and select its Description. 2. Type in the column description, as you want it to appear in ThoughtSpot. 3. Repeat for all columns where you want to add a description. 4. Save your changes. Change the column type (ATTRIBUTE or MEASURE) Columns have a Column Type based on the kind of data they store. This is set automatically upon defining the table, but in some cases, you may want to change the type. There are two types of columns: • ATTRIBUTE contains a property, like name, address, or id number. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 107 Model, link, and tag your data for searching • MEASURE contains a numeric value that can be compared in a meaningful way using math, such as a count or measurement. When a new table is created, the default column type is set according to the data type defined for each column. By default, columns with the numeric data types (FLOAT, DOUBLE, INT, or BIGINT) are assigned the type MEASURE. Columns with VARCHAR, BOOL, or date/time data types are assigned the type ATTRIBUTE. Usually the default setting for column type works fine. But occasionally you'll need to change a MEASURE to an ATTRIBUTE. Examples of numeric values for which mathematical operations are not meaningful include: • ID numbers • Key values that are primarily used for joining tables • Product number or SKU • Sports team member jersey number • Year, when separate from a date (e.g. 1999, 2000) You can change the column type by modifying the Column Type setting. 1. Find the Column Type for the column whose type you want to change. If you are editing the model file, use the setting Type. 2. Change it to either MEASURE or ATTRIBUTE. 3. Save your changes. Change the additive setting for a column Your data may contain a column with a numeric data type that you have defined as an ATTRIBUTE rather than a MEASURE (such as age). You can allow aggregations on the values by changing the value of the Additive setting. The setting Additive only applies to columns that have both: • A numeric data type (FLOAT, DOUBLE or INTEGER) or a date data type (DATE, DATETIME, TIMESTAMP, or TIME). ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 108 Model, link, and tag your data for searching • The type ATTRIBUTE in the modeling file. When an result is returned from a search, you also get a list of options for each column on the left side of the screen. For numeric columns with the Additive setting of NO, the count listing includes: • UNIQUE COUNT OF • TOTAL COUNT OF Figure 29: Default options with Additive set to “NO” If you want aggregates to be added to the options for these columns, you need to change the Additive setting to YES. After you make this change, these options will be offered: • TOTAL OF • AVG OF • STD DEVIATION OF • VARIANCE OF • TOTAL COUNT OF • UNIQUE COUNT OF • MIN OF • MAX OF ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 109 Model, link, and tag your data for searching Figure 30: Options for numeric columns with type ATTRIBUTE and Additive set to YES To change the Additive setting: 1. Find the column whose additive setting you want to change and select its Additive value. 2. Change the value to one of these: • YES or NO, if using the Web interface. • TRUE or FALSE, if using the model file. 3. Save your changes. Change the aggregation setting for a column All types of aggregations can be performed on MEASURE columns, and some aggregations can be done on ATTRIBUTE columns. You can change the default aggregation type to make combining data more intuitive and faster. To aggregate a column without having to enter the aggregation type explicitly in your searches every time, you can set a default Aggregation for that column. Note that any non-numeric columns (columns of type ATTRIBUTE) will have a ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 110 Model, link, and tag your data for searching default aggregate type of NONE, which you can change to one of the supported aggregation types. Table 15: Supported aggregate types Aggregate type Description NONE Does no aggregation. This is the default for ATTRIBUTE type columns. SUM Adds the values together and returns the total. This is the default for MEASURE type columns. AVERAGE Calculates the average of all the values. MIN Calculates the minimum value. MAX Calculates the maximum value. STD_DEVIATION Calculates the standard deviation of all the values. VARIANCE Calculates the variance of all the values. COUNT Calculates the total number of values. COUNT_DISTINCT Calculates the total number of distinct values. 1. Find the column whose default aggregation type you want to change, and select its Aggregation. If using the modeling file, use the AggregationType setting. 2. Select the new default aggregation type. 3. Save your changes. Supposed there is a table containing data about athletes on a sports team. The data contains some numerical values, including points scored, salaries, and jersey numbers for each of the players. Because jersey number is an INTEGER, it would become a column of type MEASURE (not ATTRIBUTE). So it will aggregate, by default. But you may want to make its aggregation type NONE ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 111 Model, link, and tag your data for searching instead. This ensures that search results that include jersey number will not attempt to compare or aggregate those values in a way that is not meaningful. Hide a column You can hide columns from users in ThoughtSpot without dropping them from the database. This is common when the data contains identifier columns that are used to join tables, but which do not have any meaning to users. By default, all columns in a data source will be shown in ThoughtSpot. To hide the column names, change the Hidden setting. As the number of columns in the dataset increases, the search experience requires more effort. Users have to navigate through larger numbers of columns to choose the correct one. There might also be some columns in the dataset that you don’t want to expose to the users. To hide these columns, you can set the value of the Hidden column to YES. 1. Find the Hidden setting for the column you want to hide, and set its value to YES. If you are editing the model file, use the Hide setting, and set it to TRUE. 2. Save your changes. Create synonyms for a column If you want to allow searching using more than one name for a column, you can create synonyms for it. This is helpful when different departments refer to the data using different terminology, for example. When users search a data source, they might try typing different words to try to retrieve a particular column. This could be due to different groups in your organization using different terms for the same data. Or maybe your users just intuitively use different words when searching for that item. Using synonyms ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 112 Model, link, and tag your data for searching allows them to access the data even if the term they choose isn't the same as the actual column name. You can set column synonyms for columns in tables, user imported data, and worksheets. Note that the returned table or chart uses the actual column name, but the search bar will still reflect the term the user typed in (the synonym). To create a column description: 1. Find the column for which you want to add synonyms, and select its Synonyms. 2. Type in a comma separated list of the synonyms you want to add. If a synonym is more than one word, it must be enclosed in double quotes. If you are using the Web interface, you would type: profit,"gross profit" If you are using the model file, the list of synonyms must be enclosed in square brackets. For example: [profit, "gross profit"] 3. Save your changes. Change the index type for a column ThoughtSpot indexes column names and unique column values. The indexes are used to dynamically generate suggestions in the search bar when typing a search. You can change the way a column is indexed by modifying its Index value in the modeling file, to influence the suggestions that will appear for that column. The default behavior of indexing is as follows: • All column names are indexed using their ColumnName value. • Values for columns with the column type of MEASURE are not indexed. • Values for columns with the data type of DATE are not indexed. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 113 Model, link, and tag your data for searching • Columns that contain a large amount of free-form text (i.e. the number of characters in more than a few of the fields is more than 50) are indexed as PREFIX_ONLY by default. • Short strings (like a firstname column) are indexed using PREFIX_AND_SUBSTRING by default, which indexes both prefix and substrings. It is not recommended to change the indexing for columns with very large free text values. These should not to be indexed, because indexing on these values is not useful and may generate confusing suggestions. You can override the default behavior by editing the modeling file to change the Index value for any columns that should be indexed differently. There are several different supported index types: Table 16: Supported index types Index type Description DEFAULT This is the default value. The default indexing behavior will apply to the column values, depending on their type. PREFIX_AND SUBSTRING for short values and PREFIX_ONLY for long values and free-form text. DONT_INDEX Prevents indexing on the column values. PREFIX_AND_SUBSTRING Allows full indexing such that prefix and substring search both work for the column values. PREFIX_ONLY Allows indexing such that only prefix search works for the column values. PREFIX_AND_WORD_SUBSTRING Allows indexing such that only prefix search works for each word of a multi-word string, for the column values. 1. Find the column whose index type you want to modify, and set its Index Type. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 114 Model, link, and tag your data for searching If you are using the model file, double click in the Index cell, and type in the index type you want to use. 2. Save your changes. Consider a column in which there are four values “ThoughtSpot”, “Thought”, “Spot” and “Thought Spot”. If you search for “sp”, depending on the setting for indexing, the column value search result suggestions will vary: Table 17: Example of the effects of indexing on search bar suggestions Index field value Search bar suggestions DEFAULT “ThoughtSpot”, “Spot” and “Thought Spot” DONT_INDEX No suggestions. PREFIX_AND_SUBSTRING “ThoughtSpot”, “Spot” and “Thought Spot” PREFIX_ONLY “Spot” PREFIX_AND_WORD_SUBSTRING “Spot” and “Thought Spot” Add a geographical data setting for a column Certain attribute columns that contain location data can be used to create GeoMaps. ThoughtSpot supports Latitude, Longitude, Zip Code, US States, US Counties, Countries, and select international sub-nation regions. You can designate a column as "Geo" by editing the GeoType column in the modeling file or the Columns setting screen. Columns that can be designated as "Geo" columns need to contain text (VARCHAR) data unless they contain latitude/longitude data. Latitude and longitude columns can contain numeric data (DOUBLE) or text. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 115 Model, link, and tag your data for searching If you are using a column with the data type DOUBLE for latitude and longitude, you will also need to change the following settings for those columns: • Set Column Type to ATTRIBUTE. • Set Additive to NO. • Set Aggregation Type to NONE. 1. Find the GeoType for the column that contains the geographical data. 2. Change the value to the appropriate GeoType, depending on the kind of data the column contains. Table 18: Data that uses geo charts GeoType Description Type: Example COUNTRY_REGION Countries • name: United States • long name: United States • name_sort: United States of America • abbreviation: U.S.A. • adm0_a3: USA • adm0_a3_is: USA • adm0_a3_us: USA • admin: United States of America • brk_a3: USA • brk_name: United States • formal_en: United States of America • iso_a2: US • iso_a3: USA • iso_n3: 840 Counties in the • santa clara county United States • pike county, ohio • pike county, OH States in the • name: California United States • US Postal Service abbreviation: CA COUNTY STATE_PROVINCE ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 116 Model, link, and tag your data for searching GeoType Description Type: Example LATITUDE Must be used with • 37.421023 LONGITUDE • 1.282911 Must be used with • −122.142103 LATITUDE • 103.848865 Zip codes and zip • po_name: MT MEADOWS AREA codes +4 in the • ZIP: "00012" • zip2: 12 • bremen • normandy • west midlands LONGITUDE ZIP_CODE United States Other Sub-nation Regions Administrative regions found in countries other than the United States 3. If your data includes latitude and/or longitude columns that are stored as a numeric data type (DOUBLE), make these changes for those columns: a) Change the Type or ColumnType to ATTRIBUTE. b) Change Additive to NO/FALSE. 4. Save your changes. Set the search suggestions priority for a column You can change the priority that determines which columns are shown in search suggestions and the order in which they appear. The value of Priority determines the priority order (rank) in which a particular column (and its values) appears in the search dropdown. You can push a column up in the order (increase the rank) by increasing its Priority value. A higher value (like 2) will cause the corresponding column and its values to appear higher up in the search dropdown than columns with lower value (like 1). By default, the Priority value is set to “1” for all columns. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 117 Model, link, and tag your data for searching You should only use numbers between 1-100 in this field. It is recommended to assign values in increments of 5 or 10, to allow room to assign values to other columns that may be inserted later on. 1. Change the value to a number between 1 and 100. Use a higher value if you want to boost the search priority, and a lower value if you want to lower it. 2. Save your changes. Set the format to use when showing numbers You can set a format for how numbers are displayed in tables and charts. For example, you can display numbers with a different number of digits after the decimal point, based on the data modeling setting Format Pattern. You can use any of the supported number formats for delimiters and number of digits to show using Java Decimal Notation. Currency symbols are not supported. The default values are: • #,### for integer data types (INT, BIGINT). • #,###.00 for decimal data types (DOUBLE and FLOAT). These are some examples of formats you can use: Table 19: Number format examples Stored Value Format Pattern Display Value 12345.6789 #,##0.## 12,345.68 12345.6789 #,##0.### 12,345.68 12345.6789 #,##0.00000 12,345.68 12345.6789 #,##0 12,345 12345.6789 #,##0.00 12,345.68 12345 #,##0.## 12,345 12345 #,##0.00 12,345.00 ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 118 Model, link, and tag your data for searching To change the date format used to display a column's values: 1. Find the Format Pattern for the column whose display format you want to change. 2. Change it to the format you want to use. 3. Save your changes. Locale-based number formatting Number formatting is set by default based on your browser locale setting. This has been set in order to accommodate users in various geographical locations, primarily in the US and Europe regions. For example, if you are using ThoughtSpot in the US, the number formatting should look like this: xxx,xxx.xx. And in Europe, it should look like this: xxx.xxx,xx. Set the format to use when showing dates You can set a format for how dates are displayed in tables and charts. For example, you can display dates in a standard European or US format based on the data modeling setting Format Pattern. You can use any of the supported date formats listed in the Date and time formats reference. These are some examples of formats you can use: • MM/dd/yyyy • MMM (for abbreviated month format) • DD/mm/yyyy • MM/dd/yyyy HH:mm • DD/mm/yyyy HH:mm To change the date format used to display a column's values: 1. Find the Format Pattern for the column whose display format you want to change. 2. Change it to the format you want to use. 3. Save your changes. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 119 Model, link, and tag your data for searching Set the format to use when showing currencies You can set a format for how currencies are displayed in tables and charts when using the ThoughtSpot Data API or Embedding. For example, you can display currencies in a standard European Euro or US Dollar format based on the data modeling setting Currency Type. When you specify the currency type of your data on the Manage Data page, your currency data will only display the correct format and currency code in the embedded use case. Currency specific symbols are available in the nonembedded use case as well, but they are not localized. All users are treated as if they are in en-US locale unless they are in embed mode and their browser configuration tells ThoughtSpot that they are in some other locale. For example, 100 Polish Zloty appears as 100zł to a user in Poland, but without localization enabled, it appears as PLN 100. This subtle difference can be seen when you use the REST API. See the ThoughtSpot Application Integration Guide for more information on the API. To change the currency format used to display a column's values: 1. Find the Currency Type for the column whose display format you want to change. 2. Click on it to open the Specify Currency Type menu. Figure 31: Specify Currency Type 3. Select one of the following ways you would like to change the format. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 120 Model, link, and tag your data for searching • Infer From Browser Your currency data will be modeled upon the locale of your browser setting. • From a column Your currency data will be modeled upon the existing currency information in the selected column. Note: This option is disabled if there is no VARCHAR column to choose from. • Specify ISO Code Your currency data will be modeled upon your selection from the available currency code choices. 4. Click Ok to save your changes. Change the Attribution Dimension setting of a column The Attribution Dimension setting applies only to tables that are related through a chasm trap. If your schema does not include these, you can ignore this setting. The Attribution Dimension setting only applies to tables that join over a Chasm Trap. By default, the attribution dimension setting will be set to Yes, but you can override that by setting the column's attribution dimension property to No, as described here. In the classic chasm trap two fact tables are related through a shared dimension table. When the two fact tables are joined, the shared column(s) in the dimension table are used to attribute rows in one fact table to match with rows in the other fact table. Usually, all goes well using this method. But sometimes an incorrect or illogical attribution can result, because the column chosen is not meaningful for performing this attribution. If you are seeing unexpected results in searches that include tables across a chasm trap, this setting is for you. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 121 Model, link, and tag your data for searching Here is an example of a column that is not an attribution dimension. Suppose you have two fact tables, Wholesale Purchases and Retail Sales, that share a common dimension Date. In this case, the date column in the Date dimension should not be used for attribution, since unrelated rows in both of the fact tables could share the same row in the Date table. If I bought oranges wholesale on April 25, 2005 and made a retail sale of apples on the same day, there is no logical relationship between those two events. Combining the two events using the date they share will not create any meaningful information. Figure 32: Example where a column is not an attribution dimension If matching rows in two fact tables over a chasm trap depends on the values in a column contained in a dimension table, that column is known as an attribution dimension. In this example, the Product ID column in the Products dimension table is an attribution dimension. For rows where the Product ID in the Wholesale Purchases and in the Retail Sales tables is a match, those rows are logically related in a meaningful way. They can be combined in charts and reports to produce a logical, expected outcome. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 122 Model, link, and tag your data for searching Figure 33: Attribution dimension example To designate a column as not being an attribution dimension (i.e. not producing any meaningful attribution across a chasm trap): 1. Find the column that is not an attribution dimension and select its Attribution Dimension. 2. Set the value to No. If you're using the modeling file, set it to FALSE. 3. Save your changes. Link tables using relationships You can link tables by creating relationships between their columns. Linked tables can be searched together or combined into a worksheet for easy searching. Tables that have no relationship between their columns can not be combined in a single search. There are two ways to create relationships between tables: 1. Create a constraint using TQL. 2. Create a relationship through the web interface. The two methods create the same kind of relationship both from an end user perspective and an administrative perspective. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 123 Model, link, and tag your data for searching Both types of relationships exist within the database. You can also generate a script through TQL that contains all relationships, whether create via the web interface or in TQL. Relationships created through either method can be managed either via TQL or by going to the Relationships page when viewing data in the Date Modeling section in the ThoughtSpot application. You can view, modify, or delete relationships in either place. You may create relationships using a mixture of TQL and the web interface, but the relationships you create cannot form a circular relationship, or "cycle". If you attempt to create a relationship that would complete a cycle, you will see a message saying that the relationship could not be added because it conflicts with another existing relationship. Create a relationship You can quickly create a relationship (or link) between tables that allows you to combine them in a single search. Choose a column to join on that both tables contain (e.g. employee ID or product key). You must have either administration privilege or modify access permission to the columns to create a relationship. When creating a link between the columns in two data sources, the columns being linked must have the same data type, with the same meaning. That is, they must represent the same data. Normally, you'll make this kind of link from a fact table column to a column in a dimension table that uniquely identifies a logical entity in your data such as Employee ID for a person, Product ID for a product, or Date Key for a specific date in a date lookup table. To create a relationship through the Web interface: 1. Click on Data, on the top navigation bar. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 124 Model, link, and tag your data for searching Figure 34: Data 2. Click on the name of the data source you want to link from. 3. Select Relationships. Figure 35: Select Relationships 4. If there are already some existing relationships, scroll down and click Add Relationship. Otherwise, continue to the next step. 5. Click on Source Column and select the column you want to link in the source table. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 125 Model, link, and tag your data for searching Figure 36: Select a Source Column 6. Under Destination Table find and select the table that you want to link to. Figure 37: Find and select a Destination Table 7. Click on Destination Column and select the column you want to link to in the destination table. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 126 Model, link, and tag your data for searching Figure 38: Select the Destination Column 8. Click Add Key to add the link. 9. Name your relationship and optionally give it a description. Figure 39: Name the Relationship 10.Click Add Relationship. 11. Repeat these steps for creating a link until all the links you want to make for your table have been created. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 127 Model, link, and tag your data for searching Delete a relationship If you created a relationship (link) between tables using the Web interface, you can also delete it from the Web interface. But if the relationship was created using TQL, you must also use TQL to delete it. You must have either administration privilege or modify access permission to the columns to delete a relationship. To delete a relationship using TQL, use an ALTER TABLE...DROP FOREIGN KEY... statement. To delete a relationship from the Web interface: 1. Click on the Data icon on the top navigation bar and then on Worksheets. Figure 40: Go to the worksheet list 2. Click on the name of the data source you from which you want to remove the relationship. 3. Select Relationships. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 128 Model, link, and tag your data for searching Figure 41: Select Relationships 4. Find the relationship you want to delete, and click Delete. Figure 42: Delete a relationship About stickers You can create stickers to make it easier for people to find data sources and pinboards. About stickers Stickers enable you to create categories for classification of objects, including pinboards, answers, data sources, and worksheets. Only administrators can create stickers, and they are global in scope. This means that everyone can see the stickers and use them to tag objects. They can also filter lists of objects by ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 129 Model, link, and tag your data for searching sticker. Stickers are often used to designate subject areas, such as sales, HR, and finance, but you can use them any way you like. This is the workflow for using stickers: 1. Only administrators can create stickers. 2. Anyone can Apply a sticker. 3. Anyone can Filter by a sticker. Create stickers You can create stickers for use in tagging pinboards, worksheets, and data sources. Only administrator users can create stickers. Anyone can apply the stickers you create, or use them as filters when selecting from a list of sources or pinboards. To create a sticker: 1. Navigate to the Manage Data or Pinboards screen using the icons in the top navigation bar. 2. Choose Select sticker, scroll to the bottom of the list, and click + Add. Figure 43: Add a sticker 3. Type the name for the new sticker. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 130 Model, link, and tag your data for searching 4. You can change the name or color of a sticker by clicking the edit icon next to its name. Figure 44: Edit a sticker Apply a sticker Apply a sticker whenever you want to tag a data source, worksheet, or pinboard to make it easier to find. Only administrators create stickers, but anyone with edit privileges can tag an object with a sticker. To tag an object with a sticker: 1. From the top menu, choose Answers, Pinboards, or Data. Figure 45: Choose Answers, Pinboards, or Data 2. Find the item(s) you want to tag in the list, and check the box next to its name. 3. Click the apply sticker icon and choose one from the list. You can apply as many stickers as you like to an object. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 131 Model, link, and tag your data for searching Figure 46: Choose a sticker to apply Filter by a sticker Whenever you are selecting objects from a list, you can filter by sticker to find what you're looking for. Anyone can use stickers to filter lists of pinboards or data sources. You can also filter by sticker when selecting data sources. To filter by sticker: 1. From the top menu, choose Answers, Pinboards, or Data. Figure 47: Choose Answers, Pinboards, or Data 2. Click on Select sticker, and select a sticker to filter by. Click on its name. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 132 Model, link, and tag your data for searching Figure 48: Filter by a sticker ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 133 Simplify searching with worksheets Chapter 4: Simplify searching with worksheets Simplify searching with worksheets Topics: After modeling the data, create worksheets to make searching easier. A worksheet groups multiple related • Create a new worksheet • Edit a worksheet • Delete a worksheet or Worksheets are flat tables created by joining columns table from a set of one or more tables or imported tables together in a logical way. datasets. You might use a worksheet for these reasons: • To pre-join multiple tables together. • To give a user or group access to only part of the underlying data. • To include a derived column using a formula. • To rename columns to make the data easier to search. • To build in a specific filter or aggregation. Users are often unfamiliar with tables and how they are related to one another. For example, a sales executive might need to search for information about retail sales. The required data could be contained in several tables (sales, customers, products, stores, etc.), with foreign key relationships between them. An administrator who is familiar with the data model can create a retail sales worksheet, that combines all of the related fact and dimension tables into a single, easy-to-use view, and share it with the sales executive. This provides access to the data without requiring an understanding of how it is structured. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 134 Simplify searching with worksheets You will typically create one worksheet for each set of fact and dimension tables. For example, you may have a sales fact table and an inventory fact table. Each of these fact tables shares common dimensions like date, region, and store. In this scenario, you would create two worksheets: sales and inventory. The following diagram depicts the workflow for creating the sales worksheet. Figure 49: Workflow for creating a sales worksheet The process for creating a worksheet is: 1. Decide which tables to use for the worksheet. 2. Create a new worksheet. 3. Add sources (tables) to the worksheet. 4. Choose the inclusion rule to apply. 5. Choose the worksheet join rule. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 135 Simplify searching with worksheets 6. Select the columns to include. 7. Create formulas, if needed. 8. Save the worksheet. 9. Share the worksheet with groups or users. An alternative way to create a worksheet is to do a search and save it as a worksheet. See the ThoughtSpot User Guide for details on how to do this. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 136 Simplify searching with worksheets Create a new worksheet Create a worksheet to make the data easy for users to search. This process includes adding a new worksheet, after which you will choose the data sources to include in it. To create a new worksheet: 1. Click on Data, on the top navigation bar. Figure 50: Data 2. Click the Actions icon from the upper right side of the screen, and select Create worksheet. Figure 51: The Create worksheet icon Add sources and columns to a worksheet After creating a worksheet, you need to add the sources that contain the data. Sources is another name for tables. The sources you choose are typically related to one another by foreign keys. To add the sources to the worksheet: 1. Click on the Choose Sources link. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 137 Simplify searching with worksheets Figure 52: The Choose Sources link 2. Check the box next to each of the sources you want to include in the worksheet. Note that the list of sources only shows the tables on which you have view privileges. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 138 Simplify searching with worksheets Figure 53: Choose sources for a worksheet 3. If you want to see what the data inside the sources looks like, click Explore all data. 4. Choose the inclusion rule. 5. Choose the worksheet join rule. 6. Click Done to save your changes. 7. Expand the table names under Columns to select the columns to add to the worksheet. a) To add all of the columns from a table, click on the table name and click + Add Columns. b) To add a single column, double click on its name. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 139 Simplify searching with worksheets c) To add multiple columns, Ctl+click on each column you want to add and click + Add Columns. Note that once you add a column, non-related tables (i.e. those without a primary/foreign key relationship) become hidden. If you are working with two tables that should be related, but are not, you can add a relationship between them. 8. Click on the worksheet title to name it, and then Save it. 9. Click on each column name to give it a more user-friendly name for searching. You can tab through the list of columns to rename them quickly. 10.If you want to add a prefix to the name of several columns, select them, click the Add Prefix button, and type in the prefix. Figure 54: Add a prefix to column names 11. Click Actions and select Save. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 140 Simplify searching with worksheets Figure 55: Save a worksheet 12.Share your worksheet, if you want other people to be able to use it. How the inclusion rule works Use the inclusion rule to specify which data to include in a worksheet where two or more tables are joined. If you are familiar with SQL, you might think of it as a JOIN condition. In the case where some of the rows in the fact table do not have a match in one of the dimension tables, the inclusion rule determines whether or not they will be shown. Because of this, the inclusion rule can affect the number of rows the worksheet will contain. Setting the inclusion rule differently will affect the number of rows in the worksheet if some of the values in a table are empty (or NULL) or if some primary key column values in a fact table do not have a match to a foreign key in the dimension table. Only rows in the fact table (also known as the LEFT table) are affected by the inclusion rule. It works like this: • If you choose Apply full outer join (Full Outer Join), the results of both the left and right outer joins are combined, and all matched or unmatched rows from the tables on both sides are shown. • If you choose Apply left outer join (Left Outer Join), all possible rows in the fact table are shown, regardless of whether they have a match in the dimension tables. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 141 Simplify searching with worksheets • If you choose Apply right outer join (Right Outer Join), all possible rows in the second table are shown, regardless of whether they have a match in the dimension tables. • If you choose Exclude empty rows (Inner Join), any rows that do not have a match in one of the dimension tables, won't be shown in search results. When using Exclude empty rows (Inner Join), the number of rows in the resulting worksheet can differ from the number of rows in the table when accessing it directly, because of the join condition. The worksheet acts like a materialized view. This means that it contains the results of a defined query in the form of a table. If you find that the charts and tables built on a worksheet contain a large number of null values (which display as {blank} in the web browser), you can fix this by changing the inclusion rule for the worksheet. The answer returned when searching using a worksheet as the source can be different from the answer you get when using the table directly as a source. When using a worksheet as a source, even if you were to select fields that come from only one table in your search, any underlying joins to other tables will still be active. When using the table directly as the source, you will see every value. This is best understood through an example. A typical sales fact table contains a column with the employee ID of the person who made the sale. The employee ID column has a foreign key in the employee dimension table. This is the relationship used to join the two tables. Sometimes a sale has been made directly or through a reseller, without involving a sales person. In this case, the employee ID value for the sale will be empty in the fact table. If you wanted the worksheet to include all sales, regardless of whether or not they were associated with a sales person, you would choose Include all ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 142 Simplify searching with worksheets rows (Left Outer Join). If you only want the worksheet to contain sales made by employees, you would choose Exclude empty rows (Inner Join). How the worksheet join rule works Use the worksheet join rule to specify when to apply joins when a search is done on a worksheet. You can either apply joins progressively, as each search term is added (recommended), or apply all joins to every search. Often, a worksheet includes several dimension tables and a fact table. With progressive joins, if your search only includes terms from the fact table, you'll see all of the rows that satisfy your search. But as you add terms from dimension tables, the total number of rows shown may be reduced, as the joins to each dimension table are applied. It works like this: • If you choose Apply joins progressively (recommended for most cases), joins are only applied for tables whose columns are included in the search. • If you choose Apply all joins, all possible joins are applied, regardless of which tables are included in the search. When using Apply joins progressively, the number of rows in a search using the worksheet depends on which tables are part of the search. The worksheet acts like a materialized view. This means that it contains the results of a defined query in the form of a table. So if a particular dimension table is left out of the search, its joins are not applied. About the worksheet join rule with Rule-Based Row Level Security When working with worksheets and row level security, you need to understand how joins are applied. This is especially important if your schema includes any chasm traps. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 143 Simplify searching with worksheets This section applies only to the newer Rule-Based Row Level Security. If you are using the older, Legacy Row Level Security (not recommended), see About the worksheet join rule with Legacy Row Level Security. Rule-Based Row Level Security with worksheets In the past, if you used the Legacy Row Level Security, you could depend on the worksheet join rule to protect sensitive data, based on the row level security settings on a single table. But now, with Rule-Based Row Level Security, you need to protect every table that contains any sensitive data. To do this, you'll grant access by creating explicit row level security rules on each of the underlying tables which contain data that row level security should apply to. When creating the row level security rules for a table that's part of a worksheet, you aren't limited to referencing only the columns in that table. You can specify columns from other tables in the worksheet as well, as long as the tables are joined to the table you're creating the rule on. Then, when creating a worksheet on top of them, the behavior is consistent regardless of the worksheet join rule you choose. Users will never be able to see data they should not, regardless of what their search contains. Example of using Rule-Based Row Level Security to secure a table Imagine you have a worksheet that contains a "Sales" fact table, and "Customer" and "Product"dimensions that are joined on "Customer SSN" and "Product Code" columns. In order to secure the "Sales" table, you can use "Customer Name" from the "Customer" column to create a row level secuirty rule. Chasm Trap This is particularly important with chasm trap schemas. For chasm trap schemas, if row level security is only set on one of the tables, people could see data they ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 144 Simplify searching with worksheets should not see if the scope of their search does not include that table. (this protects the from having people see the wrong things if they have chasm trap). For any worksheets that include a chasm trap, you need to use the new RuleBased Row Level Security. In fact, starting in release 3.3, if you have existing Legacy Row Level Security built on a chasm trap schema, you'll need to migrate to the new row level security before you can use them anymore. If you were still using Legacy Row Level Security, after upgrading to 3.3.x, you would not be able to access any of those worksheets. You’d see a message advising you to migrate to the newer Rule-Based Row Level Security. Note also that for chasm trap worksheets, progressive and non-progressive joins do not apply. There is an entirely different methodology for how worksheet joins on a chasm trap schema work with row level security. So you can safely ignore that setting. About the worksheet join rule with Legacy Row Level Security When working with worksheets and Legacy Row Level Security, you need to understand how joins are applied. This section gives some examples to explain the interaction between these two concepts. This section applies only to the older Legacy Row Level Security. If you are using the newer, Rule-Based Row Level Security (recommended), see About the worksheet join rule with Rule-Based Row Level Security. If your schema includes any chasm traps, you must use the newer Rule-Based Row Level Security. Worksheet schemas and the root table To understand how the worksheet join rule is applied, you first need to understand worksheet schemas and the concept of root tables. When you create a worksheet, you're effectively creating a self-contained schema made up of the tables in the worksheet and the relationships (joins) between the tables. The joins (represented by arrows in the diagram) reflect the primary key/foreign key relationships between the tables in the underlying database schema. The ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 145 Simplify searching with worksheets concept of the "root" table in the worksheet schema becomes important for understanding how the joins are applied when searching. In this context, the root table is specific to the schema structure defined for that worksheet. Suppose you created a worksheet with a schema like the example in the diagram: Figure 56: Example of a worksheet data schema Imagine your schema as a tree, with a root, branches, and leaves. In this diagram, the root table is the fact table Sales. The root table is typically a fact table. It is the table that does not have a foreign key pointing to it. So if you draw out the schema like the diagram, the tables at the bottom can be referred to as the "leaves" in the schema. If the worksheet only included the tables Products, Departments, and Suppliers, then the root table would be the Products table. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 146 Simplify searching with worksheets Similarly, if the worksheet only included Inventory and Warehouses, the root table would be Inventory. Apply all joins When you choose Apply all joins when creating a worksheet, all joins between the tables get pre-applied, whether or not there is row level security present. This is the simplest case. Apply joins progressively without row level security What if you choose Apply joins progressively when creating a worksheet, and none of the tables in ThoughtSpot have row level security applied? In this case, the joins will be applied using the worksheet schema as in these examples: Example 1: Progressive join with tables from one branch Joins are applied from the root table of the worksheet down to the lowest leaf table involved in the search. If the worksheet includes all of the tables in the diagram, but when doing a search we choose only columns from Products and Departments, the joins get applied starting at the root table and moving down to all of the tables included in the search. That is, joins from Sales to Products to Departments (Join 1 and Join 2) will be used. Example 2: Progressive join with tables from different branches Joins are applied from the root table of the worksheet, moving down each branch, to the lowest leaf tables involved in the search. If we searched on columns from only the Suppliers and Warehouses tables, the join path would traverse down the tree to reach the lowest leaf table in each branch. That is, the joins applied would be: • Join 1 ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 147 Simplify searching with worksheets • Join 3 • Join 4 • Join 5 Apply joins progressively with row level security If any of the tables in ThoughtSpot have level security applied to them, the joins used in your worksheet will be affected like this: • If the row level security is applied only outside the scope of the worksheet schema, the join behavior is the same as when there is no row level security in the system. • If the worksheet contains even a single table with row level security, nonprogressive joins (Apply all joins) will be used if the join path includes the table with row level security. Remember that the join path begins at the root table and moves down to each of the leaf tables included in the search. So a table with row level security may occur in the join path even if its columns are not included in the search. Example 3: Progressive joins when a table outside of the join path has row level security Assume the table Departments has row level security applied, so that department managers can only see the department they manage. If we did a search on the tables Suppliers and Products, progressive joins would be used. The join path would be Join 1, Join 3. The Department table row level security would not apply, since the Departments table is not in the join path. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 148 Simplify searching with worksheets Example 4: Progressive joins when a table in the join path has row level security Assume now that the table Products has row level security applied, so that buyers could only see the products they order. If we did a search on the tables Departments and Warehouses, Apply all joins would be used, so the Products row level security would apply. The join path would be Join 1, Join 2, Join 4, and Join 5. This join path takes us through the Products table, which explains why its row level security would affect the search results, even though no columns from the Products table are included in the search. Chasm trap If you have a worksheet that includes a chasm trap, you cannot use the Legacy Row Level Security. You must migrate your row level security settings to use Rule-Based Row Level Security. About aggregated worksheet and table joins You have the ability to join an aggregated worksheet with a table. Previously you could only join aggregated worksheets with other aggregated worksheets. Now, you can join an aggregated worksheet with a system table by creating a relationship. This means aggregated worksheets now behave similar to tables, and they can be used in the same way as a table, excluding any TQL manipulation. You also have the capability to create a worksheet on top of an aggregated worksheet. So aggregated worksheets can be included as tables in regular worksheets. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 149 Simplify searching with worksheets About formulas in worksheets You can define formulas and use them to create derived columns in worksheets. You create formulas by combining standard functions and operators, column names, and constant values. Anyone who can create a worksheet can add a formula to it. Formulas are not reusable; the formula you create is associated only with the worksheet it belongs to. A complete list of available formulas and examples of each is available in the Formula reference. Create a formula in a worksheet You can create a formula in a worksheet by using the Formula Builder. When you do this, the result of the formula gets added to the worksheet as a column. Use these steps to create a formula: 1. Create a new worksheet, or edit an existing one. 2. Click the + button next to Formulas. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 150 Simplify searching with worksheets Figure 57: Create a new formula in a worksheet 3. Type your formula in the Formula Builder. Figure 58: Use the Formula Builder ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 151 Simplify searching with worksheets Note: Formulas elements are color coded by type and can include the formula operators and functions (blue), the names of columns (purple), and/or constants (black). 4. If you want to change what your formula returns, use the Advanced settings. Depending on your formula, you may be able to change: • Data type • ATTRIBUTE or MEASURE • Aggregation type Figure 59: Advanced settings in the Formula Builder 5. You can see a list of formula operators with examples by clicking on Formula Assistant. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 152 Simplify searching with worksheets Figure 60: Examples in the Formula Assistant 6. Name the formula by clicking on its title and typing the new name. Click Save. Edit a worksheet As long as you have permissions to edit a worksheet, you can always go into it and make changes, such as adding sources and columns, adding or editing formulas, and changing column names. To edit a worksheet: 1. Click on the Data icon on the top navigation bar and then on Worksheets. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 153 Simplify searching with worksheets Figure 61: Go to the worksheet list 2. Click on the name of the worksheet you want to edit from the list. 3. Click the Edit button in the upper right hand side of the screen. 4. Make your changes to the worksheet. 5. Click Actions and select Save. Figure 62: Save a worksheet Rename a worksheet or table You can change a worksheet or table name from the ThoughtSpot application. To change the name of a worksheet or table: 1. Click on Data, on the top navigation bar. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 154 Simplify searching with worksheets Figure 63: Data 2. Find the worksheet or table you want to rename and click on its name. 3. On the right hand side, click the current name, and enter a new name. Figure 64: Enter a new name You can also edit column names and other details in the same way. 4. Click Done and Save. Change the inclusion or join rule for a worksheet As long as you have permissions to edit a worksheet, you can always go into it and set a different inclusion rule or join rule. If you find that the charts and tables built on a worksheet contain a large number of null values (which display as {blank} in the web browser), you can fix this by changing the inclusion rule for the worksheet. To change the inclusion or join rule of a worksheet: 1. Click on the Data icon on the top navigation bar and then on Worksheets. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 155 Simplify searching with worksheets Figure 65: Go to the worksheet list 2. Click on the name of the worksheet you want to edit from the list. 3. Click the Edit button in the upper right hand side of the screen. 4. Click on the Choose Sources link. Figure 66: The Choose Sources link 5. Choose the inclusion rule and/or the worksheet join rule. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 156 Simplify searching with worksheets Figure 67: The worksheet join rule and inclusion rule 6. Click Done. 7. Click Actions and select Save. Figure 68: Save a worksheet Delete a worksheet or table When you try to delete a worksheet or table, you'll see a message listing any dependent objects that must be removed first. ThoughtSpot checks for dependencies whenever you try to remove a table or worksheet. A list of dependent objects is shown, and you can click on them to delete them or remove the dependency. Then you'll be able to remove the table or worksheet. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 157 Simplify searching with worksheets To delete a worksheet or table: 1. Click on Data, on the top navigation bar. Figure 69: Data 2. Find the worksheet or table you want to remove in the list, and check the box next to its name. 3. Click the Delete icon. Figure 70: The Delete icon 4. If you are attempting to delete a data source with dependent objects, the operation will be blocked. You will see a warning, with a list of dependent objects with links. Click on the link for an object to modify or delete it. When all its dependencies are removed, you will be able to delete the data source. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 158 Simplify searching with worksheets Figure 71: Dependent objects warning 5. You can also click on the name of a worksheet or table and then click Dependents, to see a list of dependent objects with links. The Dependents list shows the names of the dependent objects (worksheets and pinboards), and the columns they use from that source. You can use this information to determine the impact of changing the structure of the data source or to see how widely used it is. Click on a dependent object to modify or delete it. Figure 72: Dependent objects message ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 159 Manage users, groups, and privileges Chapter 5: Manage users, groups, and privileges Manage users, groups, and privileges Topics: • About privileges • Add a group and set security privileges • Edit or delete a group • Add a user • Add multiple users to a group Before people can log in and use ThoughtSpot, you need to create a username, a password, and a membership in one or more groups for them. Creating groups and assigning users to them makes privilege management easier. Ways of managing users and groups This section describes manual creation of users, groups, and privileges, but you can also manage • Edit or delete a user users through LDAP or SAML. For information on • Forgotten password setting up SAML authentication, see the ThoughtSpot Application Integration Guide. The "All" group There is a default group called All, which includes every user in ThoughtSpot. When you create a new user, they will be added to the All group automatically. You cannot delete the All group or remove members from it. Privileges Privileges determine what kinds of actions users are allowed to do. Plan your groups so that you can use them to assign a common set of privileges to multiple users. Privileges are set at the group level. For more information on the privileges you can assign and how to assign them, see About privileges. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 160 Manage users, groups, and privileges Nested groups (groups within groups) You can also have a hierarchy of groups. That is, groups can belong to (i.e. be children of) other groups. When using group hierarchies, permissions are inherited from the parent group. So if you're a member of a sub-group, you would automatically have the privileges of the parent group. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 161 Manage users, groups, and privileges About privileges You can assign privileges at the group level. Then you create users and assign them to groups. This is how you grant users access to different capabilities in ThoughtSpot. Each group includes a set of privileges for its users. Good planning when creating groups and assigning privileges will pay off in ease of administration and a better search experience. The privileges a group has determine the actions that its members are allowed to do. If a user belongs to more than one group, they will have the highest level of the privileges from all the groups they belong to. Here are the different privileges, and the capabilities they enable: Table 20: Group Permissions Privilege Description Has administration privileges Can manage Users and Groups and has view and edit access to all data. Can upload user data Can upload their own data from the browser using Import Data. Can download data Can download data from search results and pinboards. Can share with all users Can see the names of and share with users outside of the groups the user belongs to. Can manage data Can create a worksheet. Can also create an aggregated worksheet from the results of a search by selecting Save as worksheet. Can also use ThoughtSpot Data Connect, if it is enabled on your cluster. Can schedule pinboards Can create pinboard schedules and edit their own scheduled jobs. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 162 Manage users, groups, and privileges Privileges are additive, meaning that if a user belongs to more than one group, they will have the highest level of privileges from among the groups they are a member of. They are also inherited from the parent, so that a sub-group gets all the same privileges of its parent, all the way up the group hierarchy. If you add the privilege Has administration privileges to a group, note that all users in that group will be able to see all the data in ThoughtSpot. Administrators can see all data sources, and Row level security does not apply to them. There is a default group called All, which includes every user in ThoughtSpot. When you create a new user, they will be added to the All group automatically. You cannot delete the All group or remove members from it. If you have a common set of privileges you want every user to have (typically Can upload user data and/or Can download data), add those privileges to the All group. Permissions to see and edit tables, worksheets, and pinboards are set when you share them with users and groups, as described in the topic Data security. Add a group and set security privileges Before adding users, create the groups they will belong to. Each group includes a set of privileges for its users. Good planning when creating groups and assigning privileges will pay off in ease of administration and a better search experience. To create a group and add privileges for the group: 1. Log in to ThoughtSpot from a browser. 2. Click on the Admin icon, on the top navigation bar. Figure 73: The Admin icon ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 163 Manage users, groups, and privileges 3. In the Admin panel, click on User Management and Groups. Figure 74: Manage Groups 4. Click the + Add Group button on the upper right hand side of the list of groups. Figure 75: Add a new Group 5. Enter the details for the new group: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 164 Manage users, groups, and privileges Figure 76: Enter Group details a) Enter a unique name for the group. b) Optionally enter a description. c) Check the privileges you want to grant to the group. If you add the privilege Has administration privileges to a group, note that all users in that group will be able to see all the data in ThoughtSpot. Administrators can see all data sources, and Row level security does not apply to them. d) Click the Manage Groups tab if you want to add sub-groups. Find the groups you want to add in the list, or search for them by name. Check the box next to each group you want to add to the group. e) Click the Manage Users tab if you want to add users. Find the users you want to add in the list, or search for them by name. Check the box next to each user you want to add to the group. 6. Click Create to create the group. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 165 Manage users, groups, and privileges Edit or delete a group After adding a group, you can always go in and change its settings to add or revoke privileges. The new settings will apply to all the group members. To edit or delete an existing group: 1. Log in to ThoughtSpot from a browser. 2. Click on the Admin icon, on the top navigation bar. Figure 77: The Admin icon 3. In the Admin panel, click on User Management and Groups. Figure 78: Manage Groups 4. Find the group you want to edit in the list and click its name, or the edit icon . If you don't see the name of the group, try searching for it. You can also delete a group from this page by clicking the Delete icon. Deleting a group does not delete its users. 5. Make your changes and click Update. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 166 Manage users, groups, and privileges Add a user You will create a user account for each unique person who will access ThoughtSpot, either manually or through LDAP. This procedure shows how to creating a user manually. When you create a user, you can assign group memberships. The group's privileges and permissions apply to all of its members. Any user you create will be added to the group All automatically. 1. Log in to ThoughtSpot from a browser. 2. Click on the Admin icon, on the top navigation bar. Figure 79: The Admin icon 3. In the Admin panel, click on User Management and Users. Figure 80: Manage Users 4. Click the + Add User button on the upper right hand side of the list of groups. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 167 Manage users, groups, and privileges Figure 81: Add a new User 5. Enter the details for the new user: Figure 82: Create a user manually a) Name: A unique name for the user (usually their first and last name). b) Username: A login name for the user. Note: Usernames must be unique and lowercase. If you are using Active Directory to authenticate users, and your LDAP configuration requires users to be created manually (i.e. they are not created automatically in ThoughtSpot upon authentication), the username you specify has to be domain qualified (e.g. username@ldap.thoughtspot.com), and you must enter a dummy password. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 168 Manage users, groups, and privileges c) Password and Confirm Password: A temporary password. d) E-mail Address: The user's email address. This is used for notification when another user shares something with them. e) Add to group: Select all the groups the user will belong to. If you add the user to a group that has the privilege Has administration privileges, note that they will be able to see all the data in ThoughtSpot. Administrators can see all data sources, and Row level security does not apply to them. 6. Click Save to create the user. When you create a new user, the groups they belong to define the user's: • Privileges - the actions they are allowed to do, which are defined when you Add a group and set security privileges. • Permissions - the data they can access and view, which is defined when you Data security. Add multiple users to a group You can add multiple users to a group using one button. To add multiple users to a group: 1. Log in to ThoughtSpot from a browser. 2. Click on the Admin icon, on the top navigation bar. Figure 83: The Admin icon 3. In the Admin panel, click on User Management and Users. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 169 Manage users, groups, and privileges Figure 84: Manage Users 4. Select the users you would like to add to the same group from the list. 5. Click the Add Users to Groups button on the top of the list of users. Figure 85: Add a new User Edit or delete a user After a user has been created, you can always go back and change their settings, for example to change their group memberships or change their password. As an administrator, you can edit a user and change the groups the user belongs to. You can also edit a user to reset a user’s password by entering and confirming the new password. This is useful if a user has forgotten their password, or to effectively disable an account. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 170 Manage users, groups, and privileges To edit an existing user: 1. Log in to ThoughtSpot from a browser. 2. Click on the Admin icon, on the top navigation bar. Figure 86: The Admin icon 3. In the Admin panel, click on User Management and Users. Figure 87: Manage Users 4. Find the user you want to edit in the list and click on its name or the edit icon . If you don't see the name of the user, try searching for it. You can also delete a user from this page by clicking the Delete icon. 5. Make your changes and click Save. Forgotten password If a user forgets their password, you can reset it by editing the user. To reset a user's password, follow the procedure in Edit or delete a user. If you have forgotten the admin password, please call ThoughtSpot Support. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 171 Manage jobs Chapter 6: Manage jobs Manage jobs Topics: • About scheduled pinboards The Jobs Management page found on the Admin section in the ThoughtSpot web application allows you to create and manage jobs, namely scheduled pinboards. All jobs on your cluster will appear on the Jobs Management page. You can also view jobs for individual pinboards under the pinboard Actions dropdown. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 172 Manage jobs About scheduled pinboards You can get pinboards emailed to you on a regular basis and do analysis offline. This introduces an additional format for you to consume and share pinboards with others, including those who don’t have a ThoughtSpot account. Scheduled pinboards should help with preparing for recurrent meetings, when reviewing the same pinboard is necessary. They should also be useful when you have metrics you want to monitor at a consistent interval, like daily or monthly sales targets. Contact ThoughtSpot Support if scheduled pinboards is not enabled on your cluster, or you can run the command tscli scheduled-pinboards to enable it yourself. Scheduled pinboard creators Administrators and users with can schedule pinboard privilege can schedule and manage pinboard jobs. These scheduled pinboard creators must have at least edit-only and view-only rights to the pinboard they want to share. Caution: It is recommended that admins carefully choose who to give can schedule pinboard privilege to, since there is a possible security hole where a user with limited access can get a pinboard email with all access data. Row level security The scheduled pinboards respect row level security rules. This means if the recipients are users in ThoughtSpot, then they can only see data based on their own access to the pinboard. If the user does not have at least view-only access to the pinboard, then they will not see anything in the email. However, if the recipients are from outside of the cluster, then they will have access to the dataset of the pinboard based on the sender’s permissions. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 173 Manage jobs Scheduled pinboard formats The pinboard visualizations are attached to the scheduled email as CSV or PDF files. Saved configurations such as pinboard filters are applied to the attachments. Refer to the table to see how the pinboard data is represented in each file format. Table 21: CSV - PDF report comparison CSV PDF The CSV file gets data only for table The PDF file gets data for all visualizations. visualizations. The email has n CSV attachments, where there The email has only one attachment file, which are n table visualizations in the pinboard. includes every visualization on its own page. Table visualizations have all data rows that Table visualizations include only the first 100 they're supposed to have. rows. In the case of a corrupted pinboard: no email In the case of a corrupted pinboard: the PDF is sent. An error message indicating failure attachment has empty/error screenshots. to export data is visible on the Admin Jobs Management page. In the case of a corrupted visualization: an In the case of a corrupted visualization: the email with the visualizations whose data can be PDF attachment has empty/error slots for the exported is sent. An error message indicating corrupted visualizations. visualization export error is visible on the Jobs Management page. The size of each email is limited to 25 MB, which matches most email services size limitations. And the total number of recipients for a scheduled pinboard job cannot exceed the default of 1000. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 174 Manage jobs Schedule a pinboard job You can schedule a pinboard job for any pinboard by using the Add a schedule prompt page. You can add multiple schedules with different configurations for a single pinboard. However, each job is limited to one pinboard schedule. In order to add a schedule, you must have administrator or can schedule pinboard privilege, and at least edit-only and view-only access to the pinboard. To schedule a pinboard: 1. Log in to ThoughtSpot from a browser. 2. Click on Pinboards, on the top navigation bar. Figure 88: Pinboards 3. Select the pinboard you would like to create a schedule for. 4. Click Actions and select Manage schedules to view all of the schedules set for the pinboard. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 175 Manage jobs Figure 89: View pinboard schedules 5. Click New schedule to add a new pinboard schedule. Figure 90: Add a new pinboard schedule 6. On the Add a schedule prompt page, set the times you would like to schedule the pinboard for. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 176 Manage jobs Figure 91: Set the pinboard schedule Scheduled pinboards can be set to repeat every n minutes, hourly, daily, weekly, or monthly. For some of these, you can also choose specific times of the day or days of the week. Make sure to note the Server time zone which is the timezone which will be used. 7. Enter the name and description of your schedule. Then select whether you'd like to send the pinboard as CSV or PDF attachments. Figure 92: Set the pinboard type ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 177 Manage jobs CSV files provide all data for tables, with one attachment per table. Use CSV files to perform further analysis offline. PDF files show all visualizations in the pinboard. Each chart takes up a whole page in the file, while only the first 100 rows of a table are included. Use PDF files to skim the data. 8. Add recipients by entering ThoughtSpot users or groups. Suggestions are shown as you type. Click Add to add the selected users or groups. You can also add recipients that are not ThoughtSpot users by entering their email addresses. To do so, you must first set the whitelist domains. Contact ThoughtSpot Support to set your whitelist domains. Figure 93: Set the pinboard recipients You are limited to 1000 recipients per job. 9. Click Schedule to save your scheduled pinboard. Scheduled pinboards management You can manage all scheduled pinboards on the Jobs Management page under Admin. Users who are not admins, but have can schedule pinboard privilege, can only view pinboard schedules they've created. You can select specific jobs and ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 178 Manage jobs choose to pause, resume, edit, or delete them. You can have up to 50 scheduled jobs on your cluster at time. Figure 94: View Jobs Management page Bulk actions Select the scheduled pinboards and use the Delete, Resume, and Pause buttons to perform these bulk actions. Deleting a pinboard will also delete any schedules linked to it. Job statuses Clicking on the row of a job will open a detailed view of every generated update of that job. You can see the start and end times of the job, as well as the status. Clicking on a job will show more information about the status updates. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 179 Manage jobs Figure 95: Success status updates Figure 96: Failed status updates ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 180 Manage jobs Pinboard links Click the scheduled pinboard name link to jump to a Edit schedule page, where you can edit the schedule configurations. You can also click on the pinboard link provided in the scheduled pinboard emails to jump to the pinboard in ThoughtSpot. In order to have the link direct you to the correct URL, you must first configure front end host and port access. Contact ThoughtSpot Support to configure these settings. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 181 About security Chapter 7: About security About security Topics: • System security • Data security • Network security This section is an overview of security in ThoughtSpot. There are several aspects of security, including access and permissions, data security and privacy, and security from an IT perspective. System Security System security refers to the following: • Audit logs • Security policies Network Security Network security refers to the following: • Network ports Data Security Data security refers to which users can see which data in the ThoughtSpot application, and includes: • Users and Groups • Privileges • Table and columns sharing • Row level security • Worksheet sharing • Pinboard sharing ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 182 About security System security ThoughtSpot includes a number of management tools, monitoring applications, and automated processes to support system security. System security includes managing access and privileges, audit logs, security policies, and Linux OS installed package updates. Audit logs There are several ways you can view audit log information in ThoughtSpot. You can see recent events in the Control Center or view more detailed audit logs using tscli. Administrators can view audit logs of configuration changes users have made to ThoughtSpot in these ways: • View events from the Control Center. • Generate audit log reports through tscli. Security policies Security policies are the principles and processes ThoughtSpot uses in development to ensure a product that conforms to security standards. Get audit logs You can access an audit log of cluster events through tscli. You can also access information on cluster updates, configurations, data loading and metadata events. Use the tscli event list command to return an audit list of events from the cluster. The syntax is: tscli event list [--include ] [--since | --from --to ] [--detail] [--summary_contains <'string1'| 'string2' ...>] [--detail_contains <'string1'| 'string2' ...>] ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 183 About security [--attributes ] Optional parameters are: • --include specifies the type of events to include, and can be all, config, or notification. • --detail returns the events in a detail format rather than a tabular summary, which is the default. • --summary_contains <'string1'| 'string2' ...> specifies a string to check for in the event summary. Enclose strings in single quotes, and separate multiple strings with |. Events that match all specified strings will be returned. • --detail_contains <'string1'| 'string2' ...> specifies a string to check for in the detail. Enclose strings in single quotes, and separate multiple strings with |. Events that match all specified strings will be returned. • --attributes specifies attributes to match as key=value pairs. Separate multiple attributes with |. Events that match all specified key/value pairs will be returned. Put single quotes around the value(s). And a time window made up of either: • --since is a time in the past for where the event audit begins, ending at the present time. Specify a human readable duration string, e.g. 4h (4 hours), 30m (30 minutes), 1d (1 day). Or both: • --from is a timestamp for where to begin the event audit. It must be of the form: yyyymmdd-HH:MM. • --to is a timestamp for where to end the event audit. It must be of the form: yyyymmdd-HH:MM. To get audit logs: 1. Log in to the Linux shell using SSH. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 184 About security 2. Issue the tscli event list command, with the desired parameters, for example: $ tscli event list --include config --since 24 hours Security policies ThoughtSpot has security policies in place to ensure a secure product with each release. When a release is in development, each build is tested using Qualys Network Security and Vulnerability Management Suite. Issues and vulnerabilities are fixed proactively, based on the results. The ThoughtSpot Engineering and ThoughtSpot Support teams are notified of Common Vulnerabilities and Exposures (CVEs), so they can patch OS packages proactively as well. You can View installed packages along with their version numbers at any time, in order to see if you require an update to ThoughtSpot. Whenever a CVE is identified, and an OS package needs to be updated, the next patch release will include the patch or update. View installed packages You can view installed Linux packages at any time, along with the version numbers of the installed packages. Use the command tscli os list-packages to see installed packages and their version: 1. Log in to the Linux shell using SSH. 2. Issue the tscli os list-packages command to list installed packages: $ tscli os list-packages |Package |Version | |------------------------------------------------------------------| |accountsservice |0.6.15-2ubuntu9.7 | |acpid |1:2.0.10-1ubuntu3 | |adduser |3.113ubuntu2 | |apache2 |2.2.22-1ubuntu1.10 | |apache2-mpm-prefork |2.2.22-1ubuntu1.10 | |apache2-utils |2.2.22-1ubuntu1.10 | ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 185 About security |apache2.2-bin |2.2.22-1ubuntu1.10 | ... Data security Sharing and security settings govern what data a user can access and what they can do with the data. Admins can use these settings to regulate access to information and provide a personalized user experience. Users, groups, and privileges Data security applies to users and groups. Users can be managed manually or through LDAP. Each user can have membership in one or more groups. Admins can make security settings that determine what users are allowed to do in ThoughtSpot. These settings are applied at the group level. Security model for sharing objects You can share with groups and with individual users. Sharing of tables can be defined at the table, column, or row level. This provides flexibility in modeling your data security policy. Security and sharing settings apply to several different types of objects, each of which has its own security default settings and rules. Table 22: Security model by object type Object type Description Default security model Sharing procedure Tables The source data Administrator users have access to source Share tables that have tables. They can share a table with other tables and been loaded using users or groups. columns The columns in Administrator users have access to Share the source data columns in the source tables. They can tables and tables that have share selected columns with other users or columns ThoughtSpot Loader. Columns groups. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 186 About security Object type Description Default security model Sharing procedure been loaded using ThoughtSpot Loader. Rows The rows in the All rows in the source tables are shared Define source data tables with all users by default. You can hide rows Legacy that have been from groups based on the values they Row Level loaded using contain. Security Data that was Only the user who imported the data (and Share imported using a any user with administrator privileges) tables and Web browser. has access to it by default. They can share columns ThoughtSpot Loader. Imported data a table (or selected columns) with other users or groups. Worksheets A worksheet created Only the creator of the worksheet (and using a Web browser. any user with administrator privileges) has Share worksheets access to it by default. They can share a worksheet with other users or groups. Pinboards A pinboard of saved Anyone who can view a pinboard can Share a search results. share it. pinboard Row level security ThoughtSpot includes robust row level security, which allows you to filter all objects users see based on conditions you set at the level of row values in base data tables. Share tables and columns Administrators and owners can share Can View or Can Edit privileges on tables with other users, who can further share them with others. By default, when data is loaded using the ThoughtSpot Loader, ODBC, or JDBC, it is only visible to administrators. Data imported from a Web browser is visible ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 187 About security to administrators and the user who uploaded it. You can share an entire table, or only some of its columns. Use caution when sharing tables, because any objects created from them will have dependencies on the tables and their underlying structure. Objects created from tables can include worksheets, answers, and pinboards. This means that if a user wants to drop or modify a table, any object that depends upon it must be edited or removed first, to remove the dependency. For this reason, it is a best practice to only grant the Edit permission on tables to a small number of users. Share a table or imported data by following these steps: 1. Click on the Manage Data icon in the top navigation bar. 2. Click on Tables. Figure 97: Tables in Manage Data 3. Select one or more tables to share, and click the Share icon. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 188 About security Figure 98: Select tables to share 4. Select Entire Table or Specific Columns. Figure 99: Configure table sharing settings 5. If you selected Specific Columns, select the column to share. 6. Click + and select the users and groups that you want to share with. 7. Configure the level of access by selecting from the dropdown list. You can select: • Can View to provide read-only access. This enables viewing the table data and defining worksheets on the table. • Can Edit to allow modification. This enables renaming, modifying, or deleting the entire table and adding or removing its columns. 8. Click Add and Save. 9. Click Done. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 189 About security Share worksheets You can share worksheets with users or with groups. Sharing a worksheet allows users to select it as a data source and search it. A worksheet can be shared by the owner of the worksheet, or by an administrator. Users can start searching a worksheet as soon as the worksheet is shared with them. When you share a worksheet, all of its columns are shared. Sharing a worksheet does not share the underlying tables. If you want to share the underlying tables, see Share tables and columns. 1. Click on the Data icon on the top navigation bar and then on Worksheets. Figure 100: Go to the worksheet list 2. Select one or more worksheets to share, and click the Share icon. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 190 About security Figure 101: Select worksheets to share 3. Click + Add users or groups and select users or groups that you want to share with. Figure 102: Configure worksheet sharing settings 4. Configure the level of access by selecting from the dropdown list. You can select: • Can View to provide read-only access. Enables viewing the worksheet and searching on it. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 191 About security • Can Edit to allow modification. Enables renaming, modifying filters, or deleting the worksheet and adding or removing its columns. To add columns to a worksheet a user needs access to the underlying table. 5. Click Add and Save. 6. Click Done. Share a pinboard You do not have to be an administrator or the owner to share saved pinboards. Any user can share them, based on the access levels the user has. Whenever you view a pinboard you have the option of sharing it with others. What you are really sharing is a live link to the pinboard, when you click Share with.... So whenever someone else views it, they will see the most recently saved version with the most recent data. 1. Configure the pinboard to look as you'll want it to appear when shared. 2. Click the Share icon. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 192 About security Figure 103: Share with option 3. Click + Add users or groups and select users or groups that you want to share with. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 193 About security Figure 104: Configure sharing settings 4. Configure the level of access by selecting from the dropdown list. You will only see options available, based on your own access level. For example, if you have only View access, you cannot share as Edit. You can select: • Can View to provide read-only access. If the person doesn't have access to the underlying data, they can only view a shared pinboard. If they change anything on the pinboard, their changes are not saved. In order to persist the changes, the user would need to make a copy of the modified pinboard. • Can Edit to allow modification. Enables renaming or deleting the shared pinboard. If a person with edit privileges modifies a shared pinboard, their changes will be saved to it. 5. Click Add and Save. 6. Click Add Permissions. Revoke access (unshare) You may need to revoke access to an object (table, worksheet, or pinboard) that you have previously shared. Unsharing an object is very similar to sharing it. To unshare one or more objects: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 194 About security 1. Go to the area where the object(s) you want to unshare is located. From the top menu bar: • If the object is a table or worksheet, click Data. • If the object is a pinboard, click Pinboards. • If the object is an answer, click Answers. 2. Find the object(s) in the list, and check the corresponding box(es). 3. Click the Share icon. Figure 105: The Share icon 4. Click the X next to the users and groups that you want to remove from sharing. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 195 About security Figure 106: Click the X to unshare 5. Click Done. Row level security Row level security allows you to define which groups can see individual rows in a table, based on the values in one of its columns. Using row level security, you can effectively customize search results and pinboards for each group that views them. How row level security works Row level security works at the group level, not the individual user level. By default, all groups can see all rows for any table they can view. You can limit the rows a group can see by setting conditions on column values. The row level security rules you define on a table also apply to any worksheets and pinboards based on that table. There are several reasons you might want to use row level security: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 196 About security Table 23: Row level security examples Reason Example Hide sensitive data from groups who should not In a report with customer details, hide potential see it. customers (those who have not yet completed their purchase) from everyone except the sales group. Filter tables to reduce their size, so that only Reduce the number of rows that appear in a the relevant data is visible. very large table of baseball players, so that players who are no longer active are not shown except to historians. Enable creation of a single pinboard or Create one sales pinboard that shows only visualization, which can display different data the sales in the region of the person who depending on the group who is accessing it. views it. This effectively creates a personalized pinboard, depending on the viewer's region. Rules-Based Row Level Security vs. Legacy Row Level Security There are two types of row level security in ThoughtSpot: • Rule-based row level security This is the newer, preferred method of setting row level security. It can handle thousands of groups, and allows you to set up flexible rules that are selfmaintaining. • Legacy row level security This is the legacy method, where you download a security file, which you edit in Excel to set up rules for row level security. This method is limited to 65 groups. You should not use this method, if you are setting up row level security for the first time. In fact, if you are using legacy row level security, you should consider migrating to the improved rule-based row level security for its better scale, flexibility, and ease of maintenance. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 197 About security Row level security and administrators If a user is a member of a group that has the privilege Has administration privileges, that user will be able to see all the data in ThoughtSpot. Administrators can see all data sources, and no type of row level security applies to them. About Rule-Based Row Level Security Rule-Based Row Level Security is the preferred way to protect your data so that users see only those rows they are allowed to see based on their group membership. Beginning in ThoughtSpot version 3.3, you can set row level security directly in the ThoughtSpot application. In past versions, you had to use the security file (see About Legacy Row Level Security). If you are setting up row level security for the first time, use the method shown in this section. How Rule-Based Row Level Security works Row level security works at the group level, not the individual user level. By default, all groups can see all rows for any table they can view. You can limit the rows a group can see by setting rules based on the data values contained in one or more columns. The row level security rules you define on a table also apply to any worksheets and pinboards based on that table. For each data source (table or imported data), you will define one or more rules that govern which groups can see which data. The rules take the form of an expression which is evaluated for each row and group combination, to decide if that group can see that row. If the expression evaluates to "true", for a particular group, they will be able to see that row. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 198 About security Row level security operators and functions For a list of operators and functions you can use to build these expressions see Row level security rules reference. Best practices for using Rule-Based Row Level Security Use these best practices for Rule-Based Row Level Security: 1. Contact ThoughtSpot Support to have them disable search suggestions based on data values. These are not hidden from users when you set row level security, so if you don't want them to ever see a search suggestion from a row they are not allowed to see, you'll need to disable the data value search suggestions. 2. Set up row level security on every table to which it applies. It is always a possibility that a particular search will only include data from a single table, and a user will see something they shouldn't. So protect your data by setting row level security wherever you want to keep data secure. 3. Give users access to worksheets instead of tables. This is a general best practice in all implementations. It makes things easier for end users, because they only need to choose among a few sources, rather than every table. Also they won't have to choose five separate tables to get meaningful results. They can choose the single worksheet that combines the tables. 4. Explicitly grant access for users that should see all rows. As soon as you create a row level security definition on a table for one group, all other groups are then blocked from seeing any rows in the table. You have to specifically grant other groups access in order for them to see any rows. If you want to ensure that a group can always see all rows in a table, use a rule that always evaluates to "true" for that group. For example: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 199 About security • if ts_groups = supergroup then 'true' Row level security with multiple conditions When multiple row level security rules apply, the permissions are additive. That is, when there are multiple row level security conditions specified on a table, they are applied using an OR operator. If any of the rules applied allow a user to see a particular row, the row will be shown to that user. If a user is a member of multiple groups, the user will be able to see all the rows that are visible to all of the groups, so the most permissive policy is used. Workflow for setting Rule-Based Row Level Security Setting row level security is a three steps process: 1. Disable data value suggestions 2. Access the Rule Builder 3. Define Rule-Based Row Level Security rules Disable data value suggestions When you set Rule-Based Row Level Security, you need to first turn off search suggestions on data values. When Rule-Based Row Level Security is set, it protects users from seeing data they shouldn't in worksheets and pinboards. However, the search suggestions are not filtered using Rule-Based Row Level Security, so it is possible someone could see a data value they should not in a search suggestion. Disabling suggestions on data values corrects this. Be sure and do this procedure before setting up Rule-Based Row Level Security. 1. Contact ThoughtSpot Support, and tell them that you will be setting up RuleBased Row Level Security. Ask them to disable data value suggestions for you. 2. When this setting has been made, continue with Access the Rule Builder. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 200 About security Access the Rule Builder To set up rule-based row level security, you first need to access the Rule Builder. To get to the Rule Builder: 1. Click on Data, on the top navigation bar. Figure 107: Data 2. Select a table from the list by clicking on it. Row level security can only be set on tables and imported data, not on worksheets. The settings you make to the tables will automatically apply to any worksheets or pinboards created on top of them. 3. Click Row security at the top right side of the page. 4. Click the + Add row security button. Figure 108: Add row security ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 201 About security 5. The Rule Builder will appear, where you can define row level security rules. Figure 109: The Rule Builder Define Rule-Based Row Level Security rules You define row level security by creating an expression that gets evaluated for every row and group combination. This powerful feature can be used with up to thousands of groups. To define a row level security rule: 1. Access the Rule Builder. 2. Use the Rule Builder window to enter a row security rule. You'll type in an expression, which gets evaluated for every row and group combination. If the rule evaluates to true for a particular row and group, that group will be able to see that row. Use the variable ts_groups to refer to the group name. For example, the expression ts_groups = location would allow users to only see rows where the value in the location column was the same as their group name. Notice how this type of security rule is self-maintaining. If you were to later add additional locations, the rule will still work, as long as users are placed in the group that matches their location. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 202 About security Figure 110: Enter an expression 3. Use formulas if you want to define more complex expressions. You can see a list of available operators by clicking on Rule Assistant. Figure 111: The Rule Assistant 4. Use the suggestions to build a valid expression. When the expression is valid, you'll see a green indicator at the bottom of the Rule Builder. As you type, you'll see suggestions for formula syntax, variables, and column names. Using these suggestions can make defining an expression easier, particularly when it comes to specific syntax, like enclosing parameters in ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 203 About security parenthesis. And if you can't remember the exact column name or variable you want to use, the suggestions can help. 5. Click Save. 6. The rule you created will be shown in the list of rules. Click on its name to view or edit the rule. You can also add more rules by clicking + Add. 7. To test your rule, create a search that includes the column(s) you used in your expression, save it to a pinboard, and share it with all users. Log in as users in different groups, and make sure they are seeing the appropriate rows in the pinboard you created. About Legacy Row Level Security Legacy Row Level Security is no longer used. This documentation is retained to support implementations that are already using this method. If you're setting up row level security for the first time, see About Rule-Based Row Level Security. That is the preferred method. It supports thousands of groups, and is easier to set up and maintain then Legacy Row Level Security. This legacy version of row level security allowed you to define which groups could see individual rows in a table, based on the values in a column. Legacy Row Level Security settings You define the rules for row level security by editing the security file in Excel. Each row in the file is a rule, consisting of: Table 24: Columns in the security file Value Security file heading Details Group GroupName Group of users for which to apply this setting. Table LogicalTableName TableName from the model file. Table LogicalTableGUID TableGUID from the model file. identifier ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 204 About security Value Security file heading Details Column ColumnName ColumnName from the model file. Column ColumnGUID ColumnGUID from the model file. Operation Supply one of these operators: identifier Operation • BEGINS_WITH • ENDS_WITH • CONTAINS • EQ (equals) • NE (not equals) • GE (greater than or equal to) • GT (greater than) • LE (less than or equal to) • LT (less than) • BW (between) • BW_INC (between with both boundaries included) • BW_INC_MIN (between with the lower boundary included) • BW_INC_MAX (between with upper boundary included) Column Value value(s) Column value(s) to apply the operator to. If multiple values are used, as with the between operators, they are separated by a pipe (|) character. Best practices for using Legacy Row Level Security Use these best practices when using Legacy Row Level Security: 1. Give users access to worksheets instead of tables. For each table that uses row level security, you should create a worksheet and share it, so that users can search it instead of the underlying table. The worksheet must include the column(s) where the security rule(s) are defined. Then remove access to the table where you applied the row level security. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 205 About security This is the safest way to ensure that users never see any data from rows they shouldn't see. Row level security only applies in searches that include the column on which the security rule is defined. So if a user has access to the tables directly, they could do a search that omits the column that has the security rules applied, and then see the data in the other columns. But a worksheet that includes the column with the row level security definition will always hide all values from rows that don't satisfy the row level security condition. Because of this, it is a best practice to always use worksheets with row level security, and remove access to the underlying tables. This will ensure that groups can only see the data that is defined as visible to them. 2. Explicitly grant access for users that should see all rows. As soon as you create a row level security definition on a table for one group, all other groups are then blocked from seeing any rows in the table. You have to specifically grant other groups access in order for them to see any rows. If you want to ensure that a group can always see all rows in a table, grant them NE (not equals) privileges on a dummy value that does not appear in that row. For example, these settings would enable users in the group Managers to see all rows in the Sales table: • Managers Sales Region NE xxx 3. The operations GE (greater than or equal to), GT (greater than), LE (less than or equal to), and LT (less than) are meant to be used with measures (numeric values) and not with attributes. Though they technically can be used with attributes, they would be applied based on alphabetical order, which could return confusing results. Row level security with multiple conditions When multiple row level security rules apply, the permissions are additive. That is, when there are multiple row level security conditions specified on a table, they ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 206 About security are applied using an OR operator. These examples will help you understand how this works in practice: In this example, the group Analysts has two separate security entries for the Region column in the Customer table. One entry allows Analysts to see rows where the Region is equal to West. The other allows them to see rows where the Region is equal to East. So users in the Analysts group will be able to see all the rows in either Region. Figure 112: Multiple equality conditions in security file In this example, a user is a member of two groups: Analysts and Sales. The security file lists different permissions for each group based on the column Category. Analysts can see rows where Category is equal to Consumer and Sales can see rows where Category is equal to Enterprise. So the user who is a member of both groups will be able to see all the rows where the Category is equal to either Consumer or Enterprise. Figure 113: Multiple group membership in security file It works the same way with ranges and between operators as with equality. The user will be able to see any row that fulfills any of the conditions. Workflow for setting Legacy Row Level Security Setting row level security is a three steps process: 1. Obtain the security file. 2. Define Legacy Row Level Security. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 207 About security 3. Upload the edited security file. Obtain the security file Before you can make changes to the security file, you need to download it using the ThoughtSpot Web interface to access ThoughtSpot. Then you will be able to edit it using Microsoft Excel or a similar spreadsheet editing tool. To obtain the security file: 1. Log in to ThoughtSpot from a browser as the admin user. 2. Click on the Admin icon, on the top navigation bar. Figure 114: The Admin icon 3. Click on Data Security. 4. Click Download security.xls. Figure 115: Download the security file 5. Save it to your machine. 6. Open the security file in Excel or a text editor. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 208 About security • If you are using Excel, you may see a warning message. Click "Yes" to proceed. Figure 116: Warning when opening the security file • If your security file includes multi-byte characters, edit the file using vi or vim. This is because security files containing multi-byte characters must be saved as UTF-8 encoded. Otherwise you won't be able to upload them after making your edits. Define Legacy Row Level Security Edit the security file to add row level security to tables. Before you can add row level security: • Download the model file. • Obtain the security file. • Create the appropriate groups within ThoughtSpot, if they don't already exist. You can create groups and add users to them by following the procedures in Manage users, groups, and privileges. You can edit the security file using Microsoft Excel or a compatible tool. You will need to copy and paste some of the required information from the model file. 1. In the model file, find the table for which you want to add row level security. Then find the entry for the column you will use to define visibility criteria. You will copy some of the values from this entry into the security file. 2. Enter the appropriate values into the security file in the first empty row. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 209 About security Table 25: Columns in the security file Value Security file heading Details Group GroupName Group of users for which to apply this setting. Table LogicalTableName TableName from the model file. Table LogicalTableGUID TableGUID from the model file. Column ColumnName ColumnName from the model file. Column ColumnGUID ColumnGUID from the model file. Operation Supply one of these operators: identifier identifier Operation • BEGINS_WITH • ENDS_WITH • CONTAINS • EQ (equals) • NE (not equals) • GE (greater than or equal to) • GT (greater than) • LE (less than or equal to) • LT (less than) • BW (between) • BW_INC (between with both boundaries included) • BW_INC_MIN (between with the lower boundary included) • BW_INC_MAX (between with upper boundary included) Column Value value(s) Column value(s) to apply the operator to. If multiple values are used, as with the between operators, they are separated by a pipe (|) character. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 210 About security If the condition made up of the column, operator, and value combination for a particular row is true, that row will be visible to the members of the designated group. 3. Repeat this for each group, column, and condition you want to define. 4. Save the security file in the same format as it was when you downloaded it (as a Microsoft Excel file with the name security.xls). You will see a warning when attempting to save the file. Click "Yes" and save the file. Figure 117: Warning message when saving the security file Upload the edited security file After you have made changes to the security file, you must upload it back to ThoughtSpot before the changes will take effect. To upload the security file: 1. Log in to ThoughtSpot from a browser as the admin user. 2. Click on the Admin icon, on the top navigation bar. Figure 118: The Admin icon 3. Click on Data Security. 4. Click BROWSE YOUR FILES to upload the security.xls file, or drag and drop it in the zone. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 211 About security Figure 119: Upload the security file If you receive an error message upon uploading the file, check that it does not include any multi-byte characters (i.e. Japanese or other multi-byte language characters). If it does, you'll need to download the file again and make your edits using vi or vim. After the file is uploaded, ThoughtSpot is updated and the row level security changes are applied. Network security This section discusses setting up network security, both for external traffic and traffic within the cluster. Allowing External Requests Some ports must remain open for handling network requests from outside the ThoughtSpot instance. To see a list of network ports that must remain open to outside traffic, and for inter-cluster communication, review the information in Network ports. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 212 Chapter 8: System administration System administration Topics: • System monitoring • Generate and send a log bundle • Send logs to the administrator • System administration includes applying upgrades, backing up and restoring the cluster, snapshotting, and adding or removing nodes. Administration tools Use these tools to perform administrative actions: Set up recording for • tscli: an administrative command line interface. Replay Search • tsload: a command for loading data directly into the database. • TQL: a command line SQL interface to interact with databases. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 213 System administration System monitoring System monitoring tools in ThoughtSpot include the Control Center, system log files and out-of-the-box system monitoring pinboards. System monitoring notifications You can configure ThoughtSpot to send emails to addresses you specify with monitoring reports and a cluster heartbeat. Follow these steps to Set up monitoring. System Health Center The ThoughtSpot application includes a System Health Center, for easy monitoring of usage and cluster health, including alerts. You can view the System Health Center by clicking on the Admin icon and then clicking System Health. Figure 120: Getting to the System Health Center The System Health Center shows the following information: The Overview section shows a summary of overall cluster status, usage and capacity information, configuration changes, and critical alerts. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 214 System administration Figure 121: Partial view of the System Health Center: Overview The Data section shows all the stored tables with details on the last update time, time taken for auto-indexing, number of rows, etc. The status columns have these possible values and meanings: Table 26: System Health Center: Data section statuses and meanings Section Status Meaning Database READY The data has been loaded. IN PROGRESS The data is still being loaded. STALE The data is not up to date. ERROR The table is invalid. Call ThoughtSpot Support. READY The data is up to date and searchable. NOT READY The data is not ready to be searched. DELETING INDEX The table has already been deleted, but the index still exists due Search to the latency between the database and search engine. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 215 System administration Section Status Meaning INDEXING Either too many tokens exist in a column for it to be indexed, or DISABLED indexing has been disabled manually. CREATING INDEX The index is being created. UPDATING INDEX A change has been made to indexing or the data, and the index is being updated to reflect it. Figure 122: Partial view of the System Health Center: Data The Cluster Manager section show detailed information about a cluster including latency over time, snapshot status, installed release, node functions, and logs. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 216 System administration Figure 123: Partial view of the System Health Center: Cluster Manager The Events and Alerts section shows notifications, alerts, and an audit trail of cluster configuration changes.. Figure 124: Partial view of the System Health Center: Events and Alerts ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 217 System administration System monitoring pinboards There are several system monitoring pinboards in ThoughtSpot that include information about the system status and resource usage. The information in these pinboard is updated hourly from an internal database that collects monitoring statistics. Figure 125: Example of a system monitoring pinboard The monitoring pinboards can only be viewed by users with the administrator privilege. They are based on worksheets, which administrators can view, but not modify. The underlying tables are protected system tables that cannot be accessed directly. However, you can create new monitoring reports from the worksheets. The worksheets for system monitoring are: • Ts: bi server ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 218 System administration • Ts: database • Ts: loader • Ts: search • Ts: service resources Here is a list of the system monitoring pinboards: Table 27: Monitoring pinboards Pinboard Description Examples TS Stats: Usage Helps you see how much the • Questions asked by user system is being used. Shows • Questions asked by date • Pinboard impressions • Suggestion volume over search and pinboard activity by user and by date. TS Stats: Suggestions Helps you monitor the performance statistics for the suggestions provided in the time • search bar. Shows the number Suggestion latency over time and latency of suggestions given over time. TS Stats: Queries TS Stats: Resource Usage Helps you monitor database • response showing query volume, latency, • and any errors. Average vs. maximum query • Database queries and errors • CPU usage per component Helps you monitor cluster resources by showing memory and CPU usage per • Service • • Database • Search • Host • Aggregate (all) © Memory usage per component over time • Copyright latency over time component: ThoughtSpot Administration Guide Query latency by size of performance over time by Aggregate memory usage over time 2017 by ThoughtSpot. All Rights Reserved. 219 System administration Log files Many of the administration commands output logging information to log files. The logs get written into the fixed directory /export/logs, with a sub-directory for each subsystem. The individual log directories are: • /export/logs/orion • /export/logs/oreo • /export/logs/hadoop • /export/logs/zookeeper About the Space Utilization chart The Space Utilization chart is one of the available charts for you to use when checking the cluster overview. You can find the chart in the Overview section of the System Health center. This line chart displays the total used space, which consists of raw uncompressed data, including replication. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 220 System administration Figure 126: Space Utilization chart example The x-axis is by time. It allows you to zoom in and see daily or hourly data. The yaxis measures the size in GB. So in the Space Utilization chart above, the green line shows the amount of capacity in use in the system, while the red line shows the total capacity. The increase in the red line at the end of the period indicates the addition of extra hardware, resulting in increased capacity. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 221 System administration Generate and send a log bundle Use theses steps to generate a log bundle, which you can then send to ThoughtSpot Support. Before you can send a log bundle to ThoughtSpot Support, you must Connect to the ThoughtSpot Support file server. This is a one-time setup operation. To generate a log bundle: 1. Log in to the Linux shell using SSH. 2. Issue the command to generate the log bundle: tscli callhome generate-bundle --d --since Note: Don't forget to include d after your specified number of days. For example, 30d. 3. Change directories to the directory where you wrote the log bundle. 4. Issue the command to send the log bundle to ThoughtSpot Support: tscli fileserver upload --file_name --server_dir_path Send logs to the administrator Alternately, you can easily send log files directly to your administrator with a single click. When ThoughtSpot encounters a problem, a red bar displays in the browser with an error message. You can use the Report Problem option to complete this task. Click Report Problem in the bottom right corner of the error message. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 222 System administration Figure 127: Report problem The logs will be sent to your administrator as an email attachment from your email account. Your administrator then has the option to followup with ThoughtSpot, if necessary. Set up recording for Replay Search Only administrator users can record the Replay Search using the ThoughtSpot application. You can use the recording to create training for your users on how to search your own data. Recording a search replay requires administrator privileges and a Firefox browser. You must override some of your browser security settings in order to use the ThoughtSpot application to make the recording. This is a one time setup operation. If you do not wish to do this, you can replay the search and record it using QuickTime, Camtasia, or another screencam recording tool. To record a search replay using ThoughtSpot: 1. While viewing a chart or table in ThoughtSpot, click the Replay Search icon. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 223 System administration Figure 128: The Replay Search icon 2. Click the Record Replay button. If you do not see the button, you must log in as a user with administrator privileges. Figure 129: Record Replay button 3. A message will display, showing a URL and a domain or an IP address. Make note of both of these items. Open a new browser tab and go to the URL shown in the message ( for example, "about:config"). Depending on which browser and version you are using, you may need to access the browser configurations through a menu or by typing in a different ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 224 System administration URL. Check your own browser help section for information on how to access the browser configuration settings, if necessary. Figure 130: Record Search message 4. You may see a message warning that you are about to override the browser settings. If you trust yourself, click "I'll be careful, I promise!". Figure 131: Browser warning message 5. Find the setting for media.getusermedia.screensharing.allowed_domains, and add the domain used by ThoughtSpot. This domain will be the same one you made note of from the Cannot record screen message. Figure 132: Set the domain for screensharing ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 225 System administration 6. If you see a message asking if you'd like to share your screen with the IP address or domain name of ThoughtSpot, select Entire screen. Figure 133: Select entire screen 7. When the search replay has been recorded, you'll see a confirmation. Select Download. Figure 134: Download the recorded search 8. Save the recording on your computer by selecting Save File and clicking OK. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 226 System administration Figure 135: Record Search save file ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 227 Backup and restore Chapter 9: Backup and restore Backup and restore Topics: • About backups • About restore operations ThoughtSpot provides a full set of backup and restore features to protect your data from disasters and human error. You can use these for disaster recovery, for migrating a cluster, or for recreating a cluster on another appliance or virtual appliance. There are two ways to save a cluster. Snapshots run faster and are taken on a running cluster, but in some cases you'll need to pull a snapshot out as a backup. Snapshots A snapshot is a point-in-time capture of a cluster persisted on disk in HDFS. You can take a snapshot at any time, and it takes about 20 seconds. Snapshots are both taken on and restored to a cluster while it is running. A snapshot may only be restored to the same cluster on which it was taken. In addition, the cluster software release version must match the snapshot release version. If you need to move data between clusters or restore to a cluster that has been updated to a new release, contact ThoughtSpot Support. Backups A backup is similar to a snapshot, except that it is self-contained and portable, because it is stored on disk in the file system. A backup is created by pulling an existing snapshot out as a backup. It ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 228 Backup and restore is recommended to pull a backup periodically, to protect you from losing data and any work that has been done by users. You can set up ThoughtSpot to take these backups periodically at intervals and modes you define. Disaster recovery There is an even more robust strategy for backup and recovery that involves having a backup cluster offline that is kept in sync with the production cluster. Then if the production cluster fails, the backup cluster can be drafted to take its place with minimal loss of work and disruption to operations. Details on this architecture and instructions on setting it up are available in the ThoughtSpot Disaster Recovery Guide, which you can request from ThoughtSpot. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 229 Backup and restore About backups You can use a backup to restore a cluster to a prior state, a differently configured appliance, or move it to from an appliance to a virtual cluster or vice versa. Some advanced administrative operations also use backups. When to use a backup Backups are created from an existing snapshot, but they differ from snapshots in the following ways: • Backups are stored on disk in a directory, while snapshots are stored in HDFS. • You can use a backup to recover from data loss or corruption, even if your cluster has been destroyed. Snapshots can be lost if the HDFS name node fails, you lose multiple disks at once, or the entire cluster is destroyed. • If you need to move data between two appliances, you must use a backup. Snapshots may only be used to restore to the cluster they were taken from. Backups and disk space Backups can be quite large, so make sure you have enough disk space before taking one. Backups are usually stored using a NAS (network attached storage) file system. In the case of periodic backups, you must ensure enough space to store all of the backups you want to save, plus some extra space. Use a calculation like the example shown here to ensure enough disk space. Suppose a single backup takes up 6 GB of space, and that you specify three backups when setting up periodic backup, using the flag -num_backup 3. This means that three copies will be stored at the most. But the amount of space required will be larger than the simple calculation 6GB X 3. When doing a backup, ThoughtSpot copies the release tarball and several other files and then checks if those files already exist before beginning the backup. If the files are already present, then they are removed from the backup copy. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 230 Backup and restore Storing these files takes about 5 GB of extra space. So in this example, where you want to store three backups, you need about 18 + 5 = 23 GB of free disk space. Remember that the data to be backed up can grow over time, so you should occasionally check to make sure you are not running out of space to store backups. Types of backups Backups can be full, lightweight, or dataless. Full backups Full backups are entire backups of the cluster with all data, whether loaded from the web interface or from tsload. They are written to a directory, which may be moved between clusters, even if the cluster configuration is different. Full backups can be very large, so before taking one, you should make sure there is enough disk space in the directory where it will be stored. NAS (network attached storage) is recommended for storing backups. Lightweight backups Lightweight backups contain everything that makes up a cluster, except for any data loaded through ThoughtSpot Loader (tsload), ODBC/JDBC drivers, and Data Connect. Any data loaded via tsload can be re-loaded after the cluster has been restored, using the same scripts or remote connections you used to load it initially. tsload, ODBC/JDBC, and Data Connect data may be part of lightweight backups if they're used to load data into tables created through CSV import (i.e. userdata). The lightweight backups contain the following: • Cluster configuration (SSH, LDAP, etc.) • In-memory data cache • All data that is stored in HDFS ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 231 Backup and restore • Data uploaded by users • Metadata for the data store • Users, groups and permissions • Objects created by users (pinboards, worksheets, and formulas) with their shares and permissions. • Data model and row-level security rules. Dataless backups Dataless backup saves a backup of the schema, with no data. This is provided mainly for support purposes, to enable you to send a copy of your cluster metadata to ThoughtSpot Support for troubleshooting, without compromising data security and privacy. When restoring from a dataless backup, you must supply the correct release tarball, since this type of backup does not include the software release. The size of a dataless backup is usually within 10's of megabytes if you do not have customized binaries. Take a snapshot A snapshot takes a point-in-time image of your running cluster. You should create a snapshot before making any changes to the environment, loading a large amount of new data, or changing the structure of a table. Taking a snapshot is fast, and happens invisibly in the background while the cluster is running. Snapshots take about 20 seconds. Snapshot names must be 44 characters or fewer. You can have up to 20 manual snapshots at a time, after which, you have to clear one before you are able to create another. If you would like to restore from a snapshot, contact ThoughtSpot Support. To create a snapshot: 1. Log in to the Linux shell using SSH. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 232 Backup and restore 2. Initiate a snapshot, providing a name and reason for creating it: $ tscli snapshot create 3. Check that the snapshot was created: $ tscli snapshot ls Configure periodic snapshots You can configure ThoughtSpot to take snapshots for you automatically at specified times you define, or keep the default snapshot policy. The default snapshot policy is enabled for each cluster. A periodic snapshot is taken every hour, and the old snapshots are retained based on the following time-based pattern: • 3 * 1-hour snapshots • 2 * 4-hour snapshots This means you will always have a snapshot that was taken 1, 2, 3, 4, and 8 hours ago. You can create up to 20 automatic snapshots, but this is dependent on the garbage collection policy. The periodic snapshot policy can be adjusted by executing the tscli command, tscli snapshot update-policy. This will bring up an editor. However, it is not recommended that you change it. You cannot delete the snapshot policy. However, you can disable the policy by executing the tscli command, tscli snapshot disable-policy. And you can reenable it by running, tscli snapshot enable-policy. Caution: Disabling periodic snapshots will prevent periodic backups from working. To check your current periodic snapshot policy: 1. Log in to the Linux shell using SSH. 2. Issue tscli snapshot show-policy ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 233 Backup and restore Your policy will be displayed as such: admin@dogfood1:~$ tscli snapshot show-policy schedule { period { number: 1 unit: HOUR } retention_policy { bucket { time { number: 1 unit: HOUR } capacity: 3 } bucket { time { number: 4 unit: HOUR } capacity: 2 } } offset_minutes_from_sunday_midnight: 0 } enabled: true period denotes how often periodic snapshots are taken. number is the frequency in the chosen time unit. So in the example above, a snapshot is taken every one hour. retention_policy capacity denotes which snapshots are available at any given moment. sets the number of snapshots retained for a certain time policy. Therefore, you will always have the number of snapshots set in capacity for that time policy. So for the first bucket in the example above, a snapshot is available for each hour over the past three hours, which totals three snapshots. For the second bucket in the example above, a snapshot is available for every four hours over the past eight hours, which totals two snapshots. offset_minutes_from_sunday_midnight lets you set the minute within the hour you'd like to start your periodic snapshot. Setting it to zero, as depicted in the example above, will have the snapshot begin at midnight. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 234 Backup and restore Take a backup of a snapshot Taking a backup pulls a snapshot out and dumps it to persistent storage on disk, on a network mounted directory. Use this procedure when you want to create a backup. Before creating a backup, you must identify an existing snapshot to use or take a new snapshot. The time required to take a backup depends on the data size. Taking a backup does not take long, and happens in the background while the cluster is running. Choose the mode of backup you want to create, either full, lightweight, or dataless. The directory to which you will write the backup must not already exist. You specify the directory to use, and it will be created for you when taking the backup. If you would like to restore from a backup, contact ThoughtSpot Support. Use this procedure to make a backup of a snapshot. 1. Log in to the Linux shell using SSH. 2. Find out the name and size of the snapshot you want to back up. $ tscli snapshot ls --------------------------------------------------------------------------Name : pre330 Reason : pre3.3.0 Hdfs snapshot : pre330 Start : Wed May 4 18:07:32 2016 End : Wed May 4 18:08:23 2016 Size(Full) : 13.24 GB Size(LW) : 4.96 GB Size(Dataless): 39.76 MB --------------------------------------------------------------------------... 3. Make sure you have enough room on the disk where you will save the backup. In addition to the size of the snapshot, you will need 10 to 12 GB of disk space. This is because the process requires space for temporary files. $ df -h ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 235 Backup and restore 4. Create the backup, designating the type of backup, the snapshot name, and a directory: $ tscli snapshot backup [--mode {full|light|dataless}] or $ tscli.par backup create [-h] [--mode {full|light|dataless}] [--type {full|incremental}][--base BASE] [--storage_type {local|nas}][--remote] 5. Check that the backup was created: $ tscli backup ls Configure periodic backups You can configure ThoughtSpot to take backups for you automatically at specified times you define, following crontab format. Old backups are discarded automatically, using FIFO (first in, first out). Periodic backup takes a full, lightweight, or dataless backup of the cluster. It goes through the same steps as creating a backup manually, first taking a snapshot and then pulling it out into a backup on disk. Make sure you have adequate space to store the number of backups you want to archive. You can mount a NAS (network attached storage) file system to hold the periodic backups or use a local option. ThoughtSpot supports multiple periodic backups with different modes. And for each policy, you can have different types of backups. When choosing times and frequencies for periodic backups, you should choose a reasonable frequency. Do not schedule backups too close together, since a backup cannot start when another backup is still running. Avoid backing up when the system is experiencing a heavy load, such as peak usage or a large data load. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 236 Backup and restore When choosing the number of backups to store, keep in mind that there is no automatic checking to validate that a backup is valid for restoring a cluster. When the maximum number of backups has been reached, the next backup operation deletes the oldest stored backup and the corresponding snapshot before starting the new backup. This means that if you set the number of backups to one, that backup will be deleted before a new backup can be taken. For this reason, you should always set the number of backups to be greater than two, to ensure you have at least one backup available in the case of a failure while taking a backup. It is recommended to keep as many backups as you can reasonably store, to ensure you have a good backup available if you need to restore a cluster. The garbage collection policy for periodic backups align with the following policy: • 7 * 1-day backups • 4 * 1-week backups This means you will always have a backup that was created every day for the past seven days, and a backup for every week for the past four weeks. To configure periodic backups: 1. Log in to the Linux shell using SSH. 2. Find a directory with enough disk space to hold the number of backups you want to archive. Hint: You can use df -h to see free disk space and tscli snapshot ls to view existing snapshots and their size on disk. 3. Use the tscli backup list-policy command to see a list of present backup policies. 4. Use the tscli backup create-policy command to set the times to back up, directory to store backups, and number of backups to archive. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 237 Backup and restore The command will prompt an editor for you to edit the parameters of the backup policy. 5. After setting up the backup configurations, use tscli backup periodic-config to check the policy details and that it is ready. 6. You can also run tscli backup periodic-status at any time to check whether the periodic backup is running fine. The command will provide information on the periodic backup and when it was created. 7. If you would like to edit the policy, you can use tscli backup update-policy . This prompt will open an editor for you to edit the policy. 8. Use tscli backup disable-periodic to turn off the periodic backup. You can also use tscli backup disable-policy to disable a specific policy. And you can use tscli backup enable-policy to reenable it. Additionally, you can delete a policy with tscli backup delete-policy . About restore operations When restoring to a running cluster that has not been updated, you'll usually use a snapshot. But in the case where you've updated the cluster to a new release, the configuration has been changed significantly, or you're restoring to a different cluster, you'll need to restore from a backup. Snapshots are restored onto a running cluster, while backups require deleting the existing cluster first. Snapshot restore is not supported for cross versions. As such, if you’re running version 3.4, you can only restore to a snapshot taken in 3.4. Changes to a cluster that require restoring from a backup instead of a snapshot include: • Removal of a node. • Restoring to a different cluster from the one where the snapshot/backup was taken. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 238 Backup and restore • Restoring to a cluster running a different release from the one where the snapshot/backup was taken. Note: To perform a restore from a snapshot or backup, contact ThoughtSpot Support. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 239 About troubleshooting Chapter 10: About troubleshooting About troubleshooting Topics: The information here provides very basic troubleshooting. • Get logs • Network connectivity For more detailed troubleshooting, Contact issues ThoughtSpot. • Change the timezone • Browser untrusted connection error • Characters not displaying correctly • Clear the browser cache • Cannot open a saved answer that contains a formula • Data loading too slowly • Search results contain too many blanks ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 240 About troubleshooting Get logs For troubleshooting on specific incidents or cluster problems, getting a log bundle can help. How to get logs There are two ways to get logs: • When ThoughtSpot encounters a problem, a red bar displays in the browser with an error message. You can click on What Happened? in the error message for more details. To download related logs, click Download Trace. Send the logs as an email attachment to the support contact that is provided. Clicking Report Problem will also send the logs as an email attachment to your administrator. Figure 136: Download log trace • You can generate a log bundle using the tscli command tscli logs collect if you are comfortable with Linux. The command lets you specify which logs to collect and from what time periods. Usage for this command is: tscli logs collect --include [--exclude ] [--since | --from --to ] [--out ] [--maxsize ] [--sizeonly] ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 241 About troubleshooting The full list of all selectors is: • all collects all of the logs listed from the system and the ThoughtSpot application. • system collects all system logs, e.g. syslog, upstart, mail logs, etc. • ts collects all logs from the ThoughtSpot application. This includes falcon, sage, orion core (cluster management), etc. • orion collects all orion logs including cluster management, hdfs, zookeeper, etc. Detailed syntax and options are listed in the tscli command reference. Examples Here are some examples of usage for the command tscli logs collect: To collect all logs from the past day to the default location (/tmp/ logs.tar.gz): $ tscli logs collect --include all --since 1d In this example, all is a selector for all the available logs. In most cases, you'll probably use the selector ts to only capture logs for the ThoughtSpot application: $ tscli logs collect --include ts --since 2d ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 242 About troubleshooting For debugging cluster management issues, use a command like this one, which collects logs for system and orion from the past 2 hours. The output is written to /tmp/debug.tar.gz as specified using --out: $ tscli logs collect --include system,orion --since 2h --out /tmp/ debug.tar.gz This command collects logs from a specific time window: $ tscli logs collect --include system,orion --from 20150520-12:00:00 --to 20150522-12:30:00 Advanced usage alert! You can also use --include and --exclude to specify filesystem paths as a glob pattern. This works like the Linux find(1) command. Pass all the entries in --include starting with / to find(1), and all entries in --exclude which are not selectors to find(1) using the -not -path flag. $ tscli logs collect --include system,orion --exclude *hadoop*,*zookeeper* --since 2h The above command collects all system and all orion logs, but excludes hadoop (hdfs) and zookeeper logs. Upload logs to ThoughtSpot Support ThoughtSpot Support uses a secure file sever to collect log files or other files needed for troubleshooting. You can easily send log files to this file serve directly from the ThoughtSpot instance. Before you can upload a file to the secure file server: 1. Configure the connection to the file server. 2. Obtain the directory path on the file server. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 243 About troubleshooting The server directory path for uploading a file is formatted like this example: / Shared/support/. If you do not know the customer name, contact ThoughtSpot Support. You can upload files directly to the file server using this procedure: 1. Log in to the Linux shell using SSH. 2. Navigate to the directory where the file to be uploaded is located. 3. Issue the command to upload the file, specifying the file name and directory path: $ tscli fileserver upload --file_name --server_dir_path When your upload succeeds, you will see a confirmation message. Network connectivity issues If network connectivity to and from ThoughtSpot is not working, try using these steps to find and correct the issue. To troubleshoot network connectivity for ThoughtSpot: 1. Make sure that the network cables are connected correctly. 2. Check that the network cable is connecting the nodes to the network switch. 3. Try replacing the cable with a cable from a known working system to rule out a bad cable or switch connectivity issues. 4. Make sure the eth0 interface is connected to the network by issuing: ethtool eth0 The port that's currently connected will have "link detected" in the last line of the output. 5. If the networking settings have been reconfigured, reboot each of the nodes. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 244 About troubleshooting Change the timezone ThoughtSpot comes configured with the timezone where it is to be installed. If you need to change the timezone, for example when moving a ThoughtSpot appliance to another data center: Contact ThoughtSpot Support, and they will change the timezone for you. Note that if the timezone was not set correctly when shipped, it may be necessary to have ThoughtSpot Support reset it. This can be true even if the file /etc/timezone lists the correct timezone. Sometimes the timezone that is listed is not the active timezone and it needs to be reset. Browser untrusted connection error If you are not using a SSL certificate for authentication, users will see an untrusted connection error in their browser when accessing ThoughtSpot. The error looks slightly different depending upon the Web browser being used. ThoughtSpot uses secure HTTP (the HTTPS protocol) for communication between the browser and ThoughtSpot. By default there is no SSL certificate for authentication. This must be added by the site administrator. If the site administrator has not added the certificate, the browser warns the user. Table 28: Example of browser SSL certificate warnings Browser Warning Google Chrome The site's security certificate is not trusted! Mozilla Firefox This Connection is Untrusted If you see the warning message, choose one of the following options: • Ask the site administrator to install the certificate. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 245 About troubleshooting • Ask the site administrator to turn off SSL using this command in the shell on the ThoughtSpot instance: $ tscli ssl off • You can choose to ignore the message, and access ThoughtSpot without SSL. Characters not displaying correctly Your CSV files are more likely to load smoothly if they are encoded with UTF-8. If you're having problems with some characters rendering incorrectly, you can convert the files to UTF-8 encoding before loading the data. You might see unexpected characters in your data, especially characters whose ASCII values are at the high and low end of possible values. Some examples of characters that can appear incorrectly are: æ, ñ, ä, í, ö. If this happens, your data will look like this: Figure 137: Incorrectly displaying characters Instead of displaying correctly like this: To encode your data as UTF-8: Figure 138: Correctly displaying characters ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 246 About troubleshooting 1. On Windows, open your CSV file in Notepad. Save the file as CSV with the Unicode option. 2. On Linux or MacOS, issue a command like: $ iconv -f -t UTF-8 .csv > .csv 3. Reload the data. 4. Attempt to import it again. Clear the browser cache You might occasionally see unexpected behavior that is due to the Web browser caching information from ThoughtSpot. In this case, clearing the browser cache and reloading the page should resolve the problem. You can usually resolve these situations by clearing the browser cache: • During a ThoughtSpot session, the browser suddenly displays a white screen and reloading does not fix the problem. This is due to a self-signed SSL certificate that has timed out during the session. • When accessing the Help Center, you see a login screen. This is due to a problem during automatic authentication in the Help Center, after which the bad login gets cached by the browser. To resolve any of these situations, clear the browser cache: 1. Clear the browser cache. This works a little differently on individual browser versions and operating systems. For example, when using Chrome, to get to the browser cache settings, navigate to: chrome://settings/clearBrowserData ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 247 About troubleshooting Figure 139: Clear the browser cache 2. Click Clear browsing data. This is the name of the button on Chrome. The name may vary slightly on other browsers. 3. Reload the page. For example, on Chrome you would click the Reload icon: Figure 140: Reload the page 4. Now the problem should be fixed, and the page will appear as expected. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 248 About troubleshooting Cannot open a saved answer that contains a formula When working with formulas, keep in mind the data types they return. You may occasionally see unexpected results, or even be unable to open a saved answer, due to problems with data types and formulas. In this scenario, "data type" refers the data type as defined in the column definition when creating the schema (INT, TIMESTAMP, VARCHAR, etc.). When you define a formula, both the data type it returns is set automatically. This can lead to problems, if you build another formula that uses the output of the first formula as input. This can be hard to understand, so an example will be helpful. Suppose you have created a worksheet that contains a formula called "weekday" defined as: day_of_week(date) The output of that formula is the day of the week (Monday, Tuesday, etc.) returned as a text string (VARCHAR, ATTRIBUTE). Then suppose you create an answer using the worksheet as a source. And in the answer, you create another formula on top of the formula column in the worksheet. This formula is supposed to return the day of the week that is two days after the given day of the week: weekday + 2 In this case, you have effectively created a formula on top of another formula. This works fine, so long as the data types in the worksheet formula can work in the answer formula. If not, you may not be able to save the answer, or open it once it has been saved. Here, the second formula you created does not work, because it is invalid. It is trying to subtract a number from a text string. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 249 About troubleshooting If you encounter this issue, you will need to open the worksheet and edit its formula so that it returns the type expected by the formula that was built on top of it. In this case, a numeric data type. You must change the underlying worksheet column to use day_number_of_week instead of day_of_week. This is because day_number_of_week returns a numeric data type. Here are the steps to resolve an issue like this: 1. Open the underlying worksheet that contains the formula whose output data type you need to change. 2. Click on the formula name to edit the formula. Figure 141: Edit the formula 3. In the Formula Builder, modify the formula, so that it returns the expected data type. There are data type conversion formulas available to make this easier. To view them and their syntax, open the Formula Assistant, and expand the section called Conversion. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 250 About troubleshooting Figure 142: Changing the formula to return a different data type 4. Make your changes, and saving the formula by clicking Save. 5. Save the worksheet by clicking Save. 6. Now you will be able to open the answer that was created on top of the worksheet. Data loading too slowly Some tables may take an unusually long time to load due to a high data version issue. This issue normally arises when ThoughtSpot completes an upgrade or the system is recovering after a crash. The data version is the number of loads that have been historically applied to a table. Every completed load increments the version number of the table by one. ThoughtSpot would need to process each version of the table during restoration, which could increase the time it takes to build the table. There are a few steps you can take to check for a high data version issue and fix it. To improve data loading speed: 1. Run the following command to find the number of tables that are building and their names tscli cluster status --mode table ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 251 About troubleshooting 2. You may notice that a few small tables are taking up a lot of time to be built. However, this could simply be due to the deceivingly large size of the table. There is also the chance it could be due to a high data version issue. In order to determine if this is a high data version issue, check the size of the table by running the following command: echo 'show statistics for server;' | tql 3. If there is a large number of rows in the table, proceed to shard the table. 4. If the table has a small number of rows, then the slow loading speed is caused by a high data version issue, and you do not have to shard the table. Use the compact table functionality to trim the table down to its actual size: tql> compact table
; Search results contain too many blanks If you find that your search results contain too many blanks when your data source is a worksheet, there is a simple adjustment you can make to fix this. If you find that the charts and tables built on a worksheet contain a large number of null values (which display as {blank} in the web browser), you can fix this by changing the inclusion rule for the worksheet. An inclusion rule that specifies Exclude Empty Rows (Inner Join) will reduce the number of null values in the result. Imagine a worksheet that includes data about a retail grocery store. There are rows in the worksheet from these source tables: Table 29: Tables in worksheet inclusion rule example Table Name Description sales Fact table with sales made per product per store. products Dimension table with information about every product. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 252 About troubleshooting Table Name Description stores Dimension table with information about every store. When creating the worksheet, suppose you choose Include Empty Rows (Left Outer Join) for the inclusion rule and Progressive Joins for the join rule. In this case, if you type "product name" in your search, you'll see a list of all the products that exist. Suppose you then add "store name" to your search. You will see a lot of null ({blank}) values in the result. This happens because the columns "store name" and "product name" are joined through the fact table, "sales". So for every product that has never been sold in a particular store, you'll see {blank} in the "store name" column. This may be what you want to see, in which case, you can leave the worksheet as is, and choose Exclude for the {blank} values in your table or chart, whenever you don't want to see them. However, in may cases, including all the {blank} values could confuse end users. So if you encounter this problem, you can edit the worksheet, and change the inclusion rule to Exclude Empty Rows (Inner Join). Now when searching for "store name" and "product name" on the worksheet, users will not be overwhelmed by null values. They'll only see the rows where a particular product has been sold in a particular store. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 253 Reference Chapter 11: Reference Reference Topics: • TQL reference • ThoughtSpot Loader flag This Reference section contains the commands and their syntax for all the command line tools in ThoughtSpot. reference Included in this guide are: • tscli command reference • TQL reference lists the SQL commands that are • Formula reference • Date and time formats reference • Row level security rules reference supported in TQL. • ThoughtSpot Loader flag reference lists the options for loading data with tsload. • tscli command reference lists the ThoughtSpot Command Line Interface commands. • Formula reference lists the available formula operators and functions. These are also listed in the Formula Assistant, which is available from the place in ThoughtSpot where you build formulas. • Date and time formats reference lists the accepted date, time, and timestamp formats that you can use when uploading data through the Web interface or using the ThoughtSpot Loader. • Row level security rules reference lists the operators for building row level security rules. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 254 Reference TQL reference TQL is the ThoughtSpot language for entering SQL commands. This reference lists TQL commands you can use to do things like creating a schema or verifying a data load. You can use TQL either through the ThoughtSpot application's web interface or the command line interface in the Linux shell. TQL commands Note that the worksheets and pinboards in ThoughtSpot are dependent upon the data in the underlying tables. Use caution when modifying tables directly. If you change or remove a schema on which those objects rely, the objects could become invalid. You can use TQL to view and modify schemas and data in tables. Remember to add a semicolon after each command. Commands are not case sensitive but are capitalized here for readability. When referring to objects using fully qualified object names, the syntax is: "database"."schema"."table" As a best practice, you should enclose object names (database, schema, table, and column) in double quotes, and column values in single quotes. Table 30: TQL basic commands Syntax help ThoughtSpot Administration Guide Description Examples Displays command help. Copyright © TQL> help 2017 by ThoughtSpot. All Rights Reserved. 255 Reference Table 31: TQL commands for viewing schemas and data Syntax Description Examples SHOW DATABASES Lists all available USE Switches the context to databases. the specified database. TQL> SHOW DATABASES; TQL> USE "fruit_database"; This is required if queries do not use fully qualified names (database.schema.table) for specifying tables. SHOW SCHEMAS Lists all schemas within SHOW TABLES Lists all tables within the current database. the current database by TQL> SHOW SCHEMAS; TQL> SHOW TABLES; schema. SHOW TABLE
Lists all the columns for SHOW VIEWS Lists all views within a table. the current database by TQL> SHOW TABLE "locations"; TQL> SHOW VIEWS; schema. SCRIPT SERVER Generates the TQL schema for all tables TQL> SCRIPT SERVER; in all databases on the server. SCRIPT DATABASE Generates the TQL schema for all tables in a TQL> SCRIPT DATABASE "fruit_database"; database. SCRIPT TABLE
Generates the TQL SELECT Shows specified set of ThoughtSpot Administration Guide schema for a table. table data. Copyright © TQL> SCRIPT TABLE "vendor"; TQL> SELECT TOP 10 "quantity" 2017 by ThoughtSpot. All Rights Reserved. 256 Reference Syntax FROM [WHERE ] [GROUP BY ] [ORDER BY ] Description Examples If you do not specify the TOP number of rows to select, the top 50 rows will be returned by default. The number of rows to return can be set using the TQL command line flag: --query_results _apply_top_row_count You can use the following aggregation FROM "sales_fact"; TQL> SELECT COUNT(*) FROM "vendor"; TQL> SELECT "vendor", SUM("quantity") FROM "sales_fact" GROUP BY "vendor"; TQL> SELECT "vendor", SUM("amount") FROM "vendor", "sales_fact" WHERE "sales_fact"."vendorid" = "vendor"."vendorid" AND "amount" > 100 GROUP BY "vendor" ORDER BY "amount" DESC; TQL> SELECT "vendor", SUM("quantity") FROM "sales_fact" GROUP BY "vendor" LIMIT 10; functions: • sum • count • count distinct • stddev • avg • variance • min • max You can use the following date functions: ThoughtSpot Administration Guide • absyear • absmonth • absday • absquarter • date • time Copyright © 2017 by ThoughtSpot. All Rights Reserved. 257 Reference Table 32: TQL commands for creating schemas Syntax Description Examples CREATE DATABASE Creates a database. TQL> CREATE DATABASE "fruit_database"; CREATE SCHEMA Creates a schema within the TQL> CREATE SCHEMA "fruit_schema"; CREATE TABLE
Creates a table with the ( []) [PARTITION BY HASH () [KEY ("")]]) current database. specified column definitions and constraints. Use PARTITION BY HASH to shard a table across all nodes. If no KEY is specified, the table will be randomly sharded. Note that you can specify relationship constraints (FOREIGN KEY or RELATIONSHIP) in the CREATE TABLE statement. But it is recommended to define TQL> CREATE TABLE "vendor" ("vendorid" int, "name" varchar(255)); TQL> CREATE TABLE "sales_fact" ("saleid" int, "locationid" int, "vendorid" int, "quantity" int, "sale_amount" double, "fruitid" int, CONSTRAINT PRIMARY KEY("saleid")) PARTITION BY HASH(96) KEY ("saleid"); these using ALTER TABLE statements at the end of your TQL script, after creating your tables. This works better in scripts, because it guarantees that tables are created before they are referenced in the constraint definitions. Table 33: TQL commands for modifying schemas Syntax DROP DATABASE ThoughtSpot Administration Guide Description Examples Drops a database and all of its Copyright © TQL> DROP DATABASE "fruit_database"; 2017 by ThoughtSpot. All Rights Reserved. 258 Reference Syntax Description Examples schemas and tables. DROP SCHEMA Drops a schema within the current TQL> DROP SCHEMA "fruit_schema"; database, and drops all of the tables in the schema. DROP TABLE
Drops a table. TQL> DROP TABLE "location"; ALTER TABLE
ADD|DROP|RENAME COLUMN Alters a table TQL> ALTER TABLE "cart" ADD COLUMN "nickname" varchar(255) DEFAULT 'no nickname'; to add, drop, or rename a column. When you add a column to an existing table, you must provide a TQL> ALTER TABLE "cart" DROP COLUMN "nickname"; TQL> ALTER TABLE "cart" RENAME COLUMN "nickname" TO "shortname"; default value to use for existing rows. TRUNCATE TABLE
Removes all data from a table, TQL> TRUNCATE TABLE "location"; but preserves its metadata, including all GUIDs, relationships, etc. This can be used to force a new schema for a table without losing the metadata. However, this operation removes ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 259 Reference Syntax Description Examples all existing data from the table and must be used with caution. You must reload the data following a TRUNCATE, or all dependent objects (worksheets and pinboards) in ThoughtSpot will become invalid. ALTER TABLE
DROP CONSTRAINT PRIMARY KEY; Drops the primary key from a table. Note that if you then add a new primary TQL> ALTER TABLE "sales" DROP CONSTRAINT PRIMARY KEY; TQL> ALTER TABLE "sales" ADD CONSTRAINT PRIMARY KEY ("PO_number"); key, the same upsert behavior will be applied as with adding any primary key. This can result in data deletion, so make sure you understand how the upsert will affect your data ahead of time. ALTER TABLE
DROP [FOREIGN KEY| RELATIONSHIP] ; Drops the named foreign key or relationship between two tables. ThoughtSpot Administration Guide Copyright © TQL> ALTER TABLE "sales_fact" DROP FOREIGN KEY "FK_PO_number"; TQL> ALTER TABLE "fruit_dim" 2017 by ThoughtSpot. All Rights Reserved. 260 Reference Syntax Description Examples DROP RELATIONSHIP "REL_dates"; ALTER TABLE
DROP [CONSTRAINT FOREIGN KEY [] | RELATIONSHIP [WITH ]; You must use this syntax when dropping relationships between tables created before ThoughtSpot version 3.2. This is because TQL> ALTER TABLE "shipments" DROP CONSTRAINT FOREIGN KEY "orders"; TQL> ALTER TABLE "wholesale_buys" DROP RELATIONSHIP WITH "retail_sales"; /* */ Drops all relationships that have wholesale_buys as a source. not be named in TQL> ALTER TABLE "wholesale_buys" DROP RELATIONSHIP; older versions. /* Drops the foreign */ relationships could key or relationship between two tables where you Drops all foreign keys from wholesale_buys. TQL> ALTER TABLE "wholesale_buys" DROP CONSTRAINT FOREIGN KEY; cannot reference it by relationship name. If the relationship was created without a name, use: • the name of the referenced table, for a foreign key. • the name of the related table, for a relationship. If you drop a foreign key without specifying ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 261 Reference Syntax Description Examples the referenced table, all foreign keys from the table you are altering will be dropped. ALTER TABLE
[SET DIMENSION | SET FACT [PARTITION BY HASH [()] [KEY()]]] Changes the partitioning on a table by doing one of: • re-sharding a TQL> ALTER TABLE "sales_fact" SET FACT PARTITION BY HASH (96) KEY ("PO_number"); TQL> ALTER TABLE "fruit_dim" SET DIMENSION; sharded table. • changing a replicated table to a sharded table. • changing a sharded table to a replicated table. To change the partitioning on a table, or to change a dimension table to a sharded table, use ALTER TABLE...SET FACT PARTITION BY HASH...; To make a sharded table into a dimension table (replicated on every node), ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 262 Reference Syntax Description Examples use ALTER TABLE...SET DIMENSION; ALTER TABLE
MODIFY COLUMN ; Changes the data type of a column. This can have TQL> ALTER TABLE fact100 MODIFY COLUMN product_id int; implications on sharding and primary key behavior. See About data type conversion. Table 34: TQL commands for modifying data Syntax INSERT INTO
VALUES ... Description Examples Inserts values into a table. Only use this for testing. Do not use INSERT on a production system. UPDATE
... SET ... [WHERE ...] Updates rows in a table that match optionally provided predicates. Predicates have the form column = value connected TQL> INSERT INTO "vendor" VALUES 'helen rose', 'jacob norse', 'eileen ruff', 'manny gates'; TQL> UPDATE "location" SET "borough" = 'staten island', "city" = 'new york' WHERE "borough" = 'staten isl' AND city = 'NY'; by the AND keyword. Sets the column values to the specified values. DELETE FROM
[WHERE...] Deletes rows from a table that match optionally provided predicates. Predicates have the form TQL> DELETE FROM "vendor" WHERE "name" = 'Joey Smith' AND "vendorid" = '19463'; column = value connected by the AND keyword. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 263 Reference Constraints and relationships Constraints and relationships in ThoughtSpot are used to define the relationships between tables (i.e. how they can be joined). However, constraints are not enforced, as they would be in a transactional database. You can define the following constraints when creating a table with CREATE TABLE, or add them to an existing table using the ADD CONSTRAINT syntax: Table 35: Constraints in TQL Constraint Description Example PRIMARY KEY Designates a unique, non-null value as the primary key for a table. This can be one column or a combination of columns. If values are not unique, an upsert will be performed if a row CREATE TABLE "schools" ( "schoolID" varchar(15), "schoolName" varchar(255), "schoolCity" varchar(55), "schoolState" varchar(55), "schoolNick" varchar(55), CONSTRAINT PRIMARY KEY ("schoolID") ) ; TQL> ALTER TABLE "cart" ADD CONSTRAINT PRIMARY KEY ("cart_id"); TQL> ALTER TABLE "cart" DROP CONSTRAINT PRIMARY KEY "cart_id"; includes a primary key that is already present in the data. FOREIGN KEY Defines a relationship TQL> ALTER TABLE "batting" ADD CONSTRAINT "FK_player" FOREIGN KEY ("playerID") REFERENCES "players" ("playerID"); where the value(s) in the table are used to join to a second TQL> ALTER TABLE "batting" ADD CONSTRAINT "FK_lg_team" FOREIGN KEY ("lgID" ,"teamID") REFERENCES "teams" ("lgID" ,"teamID"); table. Uses an equality operator. The foreign key must match the primary key of the table that is referenced in number, column type, and order of columns. TQL> ALTER TABLE "shipment" ADD CONSTRAINT "FK_PO_vendor" FOREIGN KEY ("po_number", "vendor") REFERENCES "orders" ("po_number", "vendor"); TQL> ALTER TABLE "shipment" ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 264 Reference Constraint Description Example DROP CONSTRAINT "FK_PO_vendor"; When creating a foreign key, give it a name. You can reference the foreign key name later, if you want to remove it. RELATIONSHIP Defines a relationship TQL> ALTER TABLE "wholesale_buys" ADD RELATIONSHIP "REL_fruit" WITH "retail_sales" AS "wholesale_buys"."fruit" = "retail_sales"."fruit" AND ("wholesale_buys"."date_order" < "retail_sales"."date_sold" AND "retail_sales"."date_sold" < "wholesale_buys"."expire_date"); where the value(s) in the table can be used to join to a second table, using an equality condition (required) and one or more range conditions (optional). TQL> ALTER TABLE "wholesale_buys" DROP RELATIONSHIP "REL_fruit"; These conditions act like a WHERE clause when the two tables are joined. They are applied using AND logic, such that all conditions must be met for a row to be included. You may add multiple relationships between tables. When creating a relationship, give it a name. You can reference the relationship name later, if you want to remove it. Flags The following flag can be used with TQL: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 265 Reference --query_results_apply_top_row_count limits to the number of result rows returned by a query. Example: $ tql --query_results_apply_top_row_count 100 Data types ThoughtSpot supports a simplified list of data types: Table 36: Supported data types Kind of Supported data data types Character • VARCHAR(n) Details Specify the maximum number of characters, as in VARCHAR(255). The size limit is 1GB for VARCHAR values. Floating • DOUBLE DOUBLE is recommended. point • FLOAT Boolean • BOOL Can be true or false. Integer • INT INT holds 32 bits. • BIGINT BIGINT holds 64 bits. Date or • DATE DATETIME, TIMESTAMP, and TIME are stored at the granularity time • DATETIME of seconds. • TIMESTAMP TIMESTAMP is identical to DATETIME, but is included for syntax • TIME compatibility. ThoughtSpot Loader flag reference For recurring data loads and for scripting loads, use the ThoughtSpot Loader (tsload). This reference section lists all the flags that can be used to modify the behavior of tsload. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 266 Reference General tsload flags Table 37: General tsload flags Flag Description Notes --target_database Specifies the pre- existing target database into which tsload should load the data. --target_schema Specifies the target Default is "falcon_default_schema". schema. --target_table
Specifies the tables that The tables must exist in the database you want to load into the specified by --target_database. database. --empty_target Specifies that any data in If supplied, any rows that exist in the the target table is to be table specified by --target_database removed before the new and --target table will be deleted data is loaded. before this data load. To perform an "upsert" on the existing data, omit this flag or specify --noempty_target. --max_ignored_rows Specifies the maximum If the number of ignored rows exceeds number of rows that can this limit, the load will be aborted. be ignored if they fail to load. --bad_records_file Specifies the file to use Input rows that do not conform to the / for storing rows that defined schema in ThoughtSpot will be failed to load. ignored and inserted into this file. --date_format Specifies the format The default format is yearmonthday e.g. string for date values. "Dec 30th, 2001" and is represented as 20011230. Use the date format specifications supported in the strptime library function. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 267 Reference Flag Description Notes --date_time_format Specifies the format The default is yearmonthday string for datetime hour:minute:second e.g. Dec 30th, 2001 values. 1:15:12 and is represented as 20011230 01:15:12. Use the datetime format specifications supported in the strptime library function. --time_format Specifies the format The default is hour:minute:second. string for time values. Use the time format specifications supported in the strptime library function. --v=[0|1|2|3|4|5|6] Specifies the verbosity Provide a values from 0 to 6, to of log messages. designate the verbosity level. By default, verbosity is set to the minimum, which is 0. --skip_second_fraction Skips fractional seconds If supplied, the upserts logic may be when loading data. affected, especially if the date time being loaded is a primary key, and the data has millisecond granularity. Load the data twice, once as a string with a primary key, and again with second granularity date time. There is no support to store fractional seconds in the ThoughtSpot system. File loading tsload flags The following flags are used when loading data from an input file: Table 38: File loading tsload flags Flag Description --source_file Specifies the location of / the file to be loaded. ThoughtSpot Administration Guide Copyright © Notes 2017 by ThoughtSpot. All Rights Reserved. 268 Reference Flag Description Notes --source_data_format [csv| Specifies the data file Optional. The default is csv. delimited] format. --field_separator Specifies the field "" delimiter used in the input file. -- trailing_field_separator Specifies that the field Example row with trailing field separator appears after separator: a,b,c, every field, including the The default is false. last field per row. --null_value Specifies how null values These values will be converted to "" are represented in the NULL upon loading. input file. --date_converted_to_epoch Specifies whether the [true|false] “date” or “datetime” values in the input file are represented as epoch values. --boolean_representation Specifies the format in The default is T_F. You can also [true_false | 1_0 | T_F | which boolean values use this flag to specify other Y_N] are represented in the values. For example, if your data input file. used Y for true and NULL for false, you could specify: --boolean_representation Y_NULL --has_header_row Indicates that the input If supplied, the first row of the file file contains a header is ignored. If not supplied, the first row. row of the file is loaded as data. --escape_character Specifies the escape If no value is specified, the default "" character used in the is "(double quotes). input file. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 269 Reference Flag Description Notes --enclosing_character Specifies the enclosing If the enclosing character is "" character used in the double quotes, you need to input file. escape it, as in this example: -enclosing_character "\"" --use_bit_boolean_values = Specifies how boolean If supplied, the input CSV file [true | false] values are represented uses a bit for boolean values, i.e. in the input file. the false value is represented as 0x0 and true as 0x1. If omitted or set to false, boolean values are assumed to be T_F, unless you specify something else using the flag --boolean_representation [true_false | 1_0 | T_F | Y_N]. tscli command reference The tscli command line interface is an administration interface for the ThoughtSpot instance. Use tscli to take snapshots (backups) of data, apply updates, stop and start the services, and view information about the system. tscli syntax Usage for tscli is: tscli [-h] [--helpfull] [--verbose] [--noautoconfig] [--autoconfig] [--yes] [--cluster ] [--zoo ] {alert, backup, callhome, cluster, command, etl, event, fileserver, firewall, ldap, logs, monitoring, nas, node, os, release, saml, security, smtp, snapshot, ssl, storage, support} ... ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 270 Reference tscli commands Table 39: tscli commands Command Description tscli [] -h Shows help, optionally for the specified command. tscli alert count Shows counts of generated alerts by type. tscli alert info Lists all alerts. tscli alert list Lists the generated alerts. tscli alert off Disables all alerts from the cluster. tscli alert on Enables alerts from the cluster. tscli alert silence --name Silences the alert with . For example, DISK_ERROR. Silenced alerts are still recorded in postgres, however emails are not sent out. tscli alert status Shows the status of cluster alerts. tscli alert unsilence-name Unsilences the alert with . For example, tscli backup create [--mode {full|light| dataless}] Same as the tscli snapshot backup command. DISK_ERROR. Pulls a snapshot and saves it as a backup where: • is the name of the snapshot to pull out as a backup. • is the new directory that will be created for the backup. This directory must not already exist. Use --mode to specify the type of backup. tscli.par backup create [-h] [--mode {full|light| dataless}] ThoughtSpot Administration Guide Same as the tscli snapshot backup command. This backup creation method make backups a part of Copyright © 2017 by ThoughtSpot. All Rights Reserved. 271 Reference Command Description [--type {full|incremental}] [--base BASE] [--storage_type {local|nas}] [--remote] the cluster so that they can be managed using tscli commands. Pulls a snapshot and saves it as a backup where: • Use --mode to specify the type of backup. • --type should always be full. • --base should not be set. This parameter is reserved for incremental backups. • --remote is a switch to use the cluster-linked backup method. The backup taken in this way will be on the node where the orion master is running. It therefore requires the cluster to be running. • --storage_type is only valid if you turn on [-remote] switch. You can choose from local or nas. For local, the backup will be taken on the node where the active orion master is runing. • is the name of the snapshot to pull out as a backup. • is the new directory that will be created for the backup. This directory must not already exist. tscli backup create-policy Prompts an editor for you to edit the parameters of tscli backup delete Deletes the named backup. tscli.par backup delete [-h] This command deletes backups generated with the backup policy. tscli.par backup create [-h]. Use tscli backup ls to find the backup id for each backup. The backup ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 272 Reference Command Description metadata is removed from the cluster, and the backup directory is deleted from the node. tscli backup delete-policy Deletes the backup policy with . tscli backup disable-periodic Disables periodic backups. Restarts the cluster when tscli backup disable-policy Disables the policy . tscli backup enable-policy Enables the policy . tscli backup list-policy Shows a list of present backup policies. tscli backup ls Lists available backups and their size. tscli backup mirror-status Checks whether the current cluster is running in mirror tscli backup periodic-config Shows the periodic backup configuration. tscli backup periodic-status Shows the periodic backup status. tscli backup set-periodic --at --directory [--num_backups ] [--mode {full|light| dataless}] Enables or updates a periodic full backup applied. mode or not. configuration where: • is the list of times at which to take backups daily. Comma separated string of hour of day specified as HH using a 24 hour clock (e.g. 01, 13, 23). • is the directory where backups are to be written. • is the number of backups to keep archived. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 273 Reference Command Description Use --mode to specify the type of backup. tscli backup start-mirror Starts a mirror cluster which will continuously pull backups generated from a primary cluster where: • is the directory where backups from the primary cluster are written (usually a SAN or NFS mounted drive). • is a comma separated list of IP addresses of the nodes in the mirror cluster. • is the cluster name of the mirror cluster. • is the ID of the mirror cluster. Used only in systems specifically architected for disaster recovery. tscli backup stop-mirror Stops mirroring on the local cluster. Used only in tscli backup update-policy Prompts an editor for you to edit the policy . tscli callhome disable Turns off the periodic call home feature. tscli callhome enable --customer_name Enables the "call home" feature, which sends usage systems specifically architected for disaster recovery. statistics to ThoughtSpot Support every six hours via the secure file server. Before using this command for the first time, you need to set up the file server connection using tscli fileserver configure. The parameter takes the form Shared//stats. Contact ThoughtSpot if you do not know the customer name to specify. tscli callhome generate-bundle ThoughtSpot Administration Guide Generates a tar file of the cluster metrics and writes it to the specified directory where: Copyright © 2017 by ThoughtSpot. All Rights Reserved. 274 Reference Command Description --d --since is how far back you'd like to generate the tar file from in days. For example, <30d>. If this parameter is not specified, the command will collect stats from the last 15 days by default. Once the stats are collected, the data gets backfilled in dogfood for all the missing dates. tscli cluster start Starts the cluster. tscli cluster status Gives the status of the cluster, including release number, date last updated, number of nodes, pending tables time, and services status. tscli cluster stop Pauses the cluster (but does not stop storage tscli command run Runs a command on all nodes. tscli etl change-password --admin_username --username Changes the Informatica Cloud account password services). used by ThoughtSpot Data Connect. Required parameters are: • --admin_username specifies the Administrator username for ThoughtSpot. • --username specifies the username for the Informatica Cloud. tscli etl disable-lw Disables ThoughtSpot Data Connect. tscli etl download-agent Downloads the ThoughtSpot Data Connect agent to tscli etl enable-lw --admin_username --username --thoughtspot_url [--org_id ] [--pin_to ] [--proxy_host ] Enables ThoughtSpot Data Connect. Some parameters ThoughtSpot Administration Guide the cluster. are given below, but you should contact ThoughtSpot Support for assistance in setting this up. Required parameters are: • Copyright --admin_username specifies the Administrator username for ThoughtSpot. © 2017 by ThoughtSpot. All Rights Reserved. 275 Reference Command Description [--proxy_port ] • --username specifies the username for the Informatica Cloud. • --thoughtspot_url specifies the URL for ThoughtSpot • --org_id specifies the Informatica id of the organization (company). NOTE: org_id shouldn't include the prefix "Org". For example, if on Informatica cloud, the orgid is Org003XYZ, then use only 003XYZ. Optional parameters are: • --pin_to specifies the IP address to pin to. If you specify an IP to pin to, that node becomes sticky to the Informatica agent, and will always be used. Defaults to the public IP address of the localhost where this command was run. • --proxy_host and -proxy_port specifies the proxy details. tscli etl show-lw Shows the status of ThoughtSpot Data Connect. It also tscli event list [--include ] [--since | --from --to ] [--detail] [--summary_contains <'string1'| 'string2' ...>] [--detail_contains <'string1'| 'string2' ...>] [--attributes ] Optional parameters are: returns the Informatica username and OrgId. • --include specifies the type of events to include, and can be all, config, or notification. • --detail returns the events in a detail format rather than a tabular summary, which is the default. • --summary_contains <'string1'| 'string2' ...> specifies a string to check for in the event summary. Enclose strings in single quotes, and separate multiple strings with |. Events that match all specified strings will be returned. • --detail_contains <'string1'| 'string2' ...> specifies a string to check for in the detail. Enclose strings in single quotes, and ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 276 Reference Command Description separate multiple strings with |. Events that match all specified strings will be returned. • --attributes specifies attributes to match as key=value pairs. Separate multiple attributes with |. Events that match all specified key/value pairs will be returned. Put single quotes around the value(s). And a time window made up of either: • --since is a time in the past for where the event audit begins, ending at the present time. Specify a human readable duration string, e.g. 4h (4 hours), 30m (30 minutes), 1d (1 day). Or both: • --from is a timestamp for where to begin the event audit. It must be of the form: yyyymmdd-HH:MM. • --to is a timestamp for where to end the event audit. It must be of the form: yyyymmdd-HH:MM. tscli feature get-all-config Gets the configured features in a cluster. The command will return a list of features, such as custom branding, Data Connect, and call home, and tell you whether they are enabled or disabled. tscli fileserver configure --user [--password ] Configures the secure file server username and password for file upload/download and the call home feature. You only need to issue this command once, to set up the connection to the secure file server. You only need to reissue this command if the password changes. The parameter is optional. If a ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 277 Reference Command Description password is not specified, you will be prompted to enter it. tscli fileserver download-release [--user --password ] Downloads the specified release file and its checksum. Specify the release by number, to the second decimal point (e.g. 3.1.0, 3.0.5, etc.). You may optionally specify the --user and --password to bypass the credentials that were specified when configuring the file server connection with tscli fileserver configure. Before using this command for the first time, you need to set up the file server connection using tscli fileserver configure. tscli fileserver purge-config Removes the file server configuration. tscli fileserver show-config Shows the file server configuration. tscli fileserver upload --file_name --server_dir_path [--user --password ] Uploads the file specified to the directory specified on the secure file server. The parameter specifies the directory to which you want to upload the file. It is based on your customer name, and takes the form /Shared/support/. If you don't know the path to specify, Contact ThoughtSpot. You may optionally specify the --user and --password to bypass the credentials that were specified when configuring the file server connection with tscli fileserver configure. Before using this command for the first time, you need to set up the file server connection using tscli fileserver configure. tscli firewall close-ports Closes given ports through firewall on all nodes. Takes a list of ports to close, comma separated. Only closes ports which were previously opened using ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 278 Reference Command Description "open-ports". Ignores ports which were not previously opened with "open-ports" or were already closed. tscli firewall disable Disable firewall. tscli firewall enable Enable firewall. tscli firewall open-ports Opens given ports through firewall on all nodes. Takes a list of ports to open, comma separated. Ignores ports which are already open. Some essential ports are always kept open (e.g. ssh), they are not affected by this command or by "close-ports". tscli firewall status Shows whether firewall is currently enabled or tscli ldap add-cert Adds an SSL certificate for LDAP. Use only if LDAP has disabled. been configured without SSL and you wish to add it. Use to supply an alias for the certificate you are installing. Configures LDAP using an interactive script. You can tscli ldap configure see detailed instructions for setting up LDAP in About LDAP integration. tscli ldap purge-configuration Purges (removes) any existing LDAP configuration. tscli logs collect --include [--exclude ] [--since | --from --to ] [--out ] [--maxsize ] [--sizeonly] Extracts logs from the cluster. Does not include any logs that have been deleted due to log rotation. Required parameters are: • --include is a comma separated list of logs to include. Each entry is either a selector (one of all, orion, system, or ts) or a glob for matching files. Anything starting with / is assumed to be a glob pattern and interpreted via find(1). Other entries are ignored. TIP: put single quotes around the parameter value to prevent undesired glob expansion. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 279 Reference Command Description And a time window made up of either: • --since is a time in the past for where log collection begins, ending at the present time. Specify a human readable duration string, e.g. 4h (4 hours), 30m (30 minutes), 1d (1 day). Or both: • --from is a timestamp for where to begin log collection. It must be of the form: yyyymmdd-HH:MM. • --to is a timestamp for where to end log collection. It must be of the form: yyyymmdd-HH:MM. Optional parameters are: • --exclude is a comma separated list of logs to exclude. Each entry is either a selector (one of orion, system, or ts) or a glob for matching files. Anything starting with / is assumed to be a glob pattern and interpreted via find(1). • --out is the location where log tarball is written. If not specified, the tarball will be written in /tmp. • --maxsize is the maximum size to allow. Only fetches logs if the total size is smaller that this value. Can be specified in megabytes or gigabytes, e.g. 100MB, 10GB. • --sizeonly means do not collect logs. Just report the size. tscli logs runcmd -cmd command> [--include selector | glob>] [--exclude selector | glob>] [-since hours,minutes,days> | --from yyyymmdd-HH:MM> ThoughtSpot Administration Guide Runs a Unix command on logs in the cluster matching the given constraints. Results are reported as text dumped to standard out, the specified output file, or as tarballs dumped into the specified directory. Copyright © 2017 by ThoughtSpot. All Rights Reserved. 280 Reference Command Description --to yyyymmdd-HH:MM>] [--outfile ] [--outdir ] [--cmd_infmt [C | U]] [--cmd_outfmt [C | U]] Required parameters are: • --cmd is a Unix command to be run on the selected logs. Use single quotes to escape spaces, etc. Language used to specify the command has following rules. 1. A logfile and its corresponding result file can be referenced using the keywords SRCFILE and DSTFILE. E.g. 'cp SRCFILE DSTFILE' 2. If there is no reference to DSTFILE in the command, '> DSTFILE' will be appended to the command for output redirection. E.g. 'du -sch SRCFILE' gets automatically translated to 'du sch SRCFILE > DSTFILE' 3. If there is no reference to SRCFILE, the content of the log is streamed tot he command using a pipe. E.g. 'tail -n100 | grep ERROR' gets automatically translated to 'cat SRCFILE | tail n100 | grep ERROR > DSTFILE' • --include is a comma separated list of logs to include. Each entry is either a selector (one of all, orion, system, or ts) or a glob for matching files. Anything starting with / is assumed to be a glob pattern and interpreted via find(1). Other entries are ignored. TIP: put single quotes around the parameter value to prevent undesired glob expansion. And a time window made up of either: • --since is a time in the past for where log collection begins, ending at the present time. Specify a human readable duration string, e.g. 4h (4 hours), 30m (30 minutes), 1d (1 day). Or both: ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 281 Reference Command Description • --from is a timestamp for where to begin log collection. It must be of the form: yyyymmdd-HH:MM. • --to is a timestamp for where to end log collection. It must be of the form: yyyymmdd-HH:MM. Optional parameters are: • --exclude is a comma separated list of logs to exclude. Each entry is either a selector (one of orion, system, or ts) or a glob for matching files. Anything starting with / is assumed to be a glob pattern and interpreted via find(1). • --outfile is the file path for printing all the results. By default, results get printed to stdout. • --outdir is the directory path for dumping results from each node, with their original directory structure. This may be used as an alternative to printing output to outfile/stdout. • --cmd_infmt [C | U] specifies if the input file should be compressed (C) or uncompressed (U) before running the command. Don't use this flag if the command works on both. • --cmd_outfmt [C | U] specifies if the output file generated by the command will be compressed (C) or uncompressed (U). Don't use this flag if the output file will be of the same format as the input file. tscli map-tiles enable Enables ThoughtSpot's map tiles, which are used when constructing geomap charts. If you don't have interest access, you must download the map tiles tar and md5 files. Then you must append the following to the tscli command --offline -- --md5 ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 282 Reference Command Description , where and are the locations of the two files. Sets the monitoring configuration. tscli monitoring set-config --email --clear_email --heartbeat_interval --heartbeat_disable --report_interval --report_disable Parameters are: • --email is a comma separated list (no spaces) of email addresses where the cluster will send monitoring information. • --clear_email disables emails by clearing the email configuration. • --heartbeat_interval is the heartbeat email generation interval in seconds. Must be greater than 0. • --heartbeat_disable disables heartbeat email generation. • --report_interval sets the cluster report email generation interval in seconds. Must be greater than 0. • --report_disable disables cluster report email generation. tscli monitoring show-config Shows the monitoring configuration. tscli nas mount-cifs --server --path_on_server --mount_point --username --password --uid --gid Mounts a CIFS device on all nodes. Parameters are: • --server is the device address. • --path_on_server is the path on the server to mount (source). • --mount_point is the location where the CIFS device should be mounted (target). • --username is the username with which to connect to the CIFS device. • ThoughtSpot Administration Guide Copyright --password is the CIFS password. © 2017 by ThoughtSpot. All Rights Reserved. 283 Reference Command Description • --uid is the uid that will own all files or directories on the mounted filesystem, when the server does not provide ownership information. See the man page for mount.cifs for more details. • --gid is the gid that will own all files or directories on the mounted filesystem, when the server does not provide ownership information. See the man page for mount.cifs for more details. tscli nas mount-nfs --server --path_on_server --mount_point Mounts a NFS device on all nodes. Parameters are: • --server is the device address. • --path_on_server is the path on server to mount (source). • --mount_point is the location where the NFS device should be mounted (target). tscli nas unmount --dir Unmounts all devices from the specified directory tscli node ls Lists all nodes in the cluster. tscli os find-package Lists all packages and versions included in the product location. image whose name contains . Package names you can specify are: ThoughtSpot Administration Guide • alert • backup • callhome • cluster • event • fileserver • firewall • ldap • logs Copyright © 2017 by ThoughtSpot. All Rights Reserved. 284 Reference Command Description • monitoring • node • os • release • smtp • snapshot • ssl • storage • support tscli os list-packages Lists all packages and versions included in the product tscli release info Prints information about the release contained in the tscli saml configure Configures SAML. tscli saml purge-configuration Purges any existing SAML configuration. tscli scheduled-pinboards Enables scheduled pinboards, which is disabled in tscli security clear-min-tls-version Clears any customizations for the minimum TLS tscli security set-min-tls-version {1.0|1.1|1.2} Sets the minimum SSL version to be supported by the ThoughtSpot Administration Guide image. specified release file. prod clusters by default. version to support. ThoughtSpot application. Please ensure that client Copyright © 2017 by ThoughtSpot. All Rights Reserved. 285 Reference Command Description browsers are enabled for this version or newer. It is recommended that you set the version to 1.2. tscli security ssl-off Disables SSL authentication for the ThoughtSpot tscli security ssl-on Enables SSL authentication for the ThoughtSpot tscli security ssl-status Shows whether of SSL authentication is enabled or tscli security tls-status Prints the status of TLS support. tscli smtp reset-canonical-mapping Deletes the current postmap mapping. tscli smtp set-canonical-mapping Sets a new postmap mapping. tscli smtp set-mailfromname Sets the mailname, from which email alerts are sent, tscli smtp set-mailname Sets the mailname, where email alerts are sent, for the tscli smtp Sets the relayhost for SMTP (email) sent from the ThoughtSpot Administration Guide application. application. disabled for the ThoughtSpot application. for the cluster. cluster. cluster. Copyright © 2017 by ThoughtSpot. All Rights Reserved. 286 Reference Command Description set-relayhost tscli smtp show-canonicalmapping Shows the current postmap mapping. tscli smtp show-mailfromname Shows the mailname, from which email alerts are sent, tscli smtp show-mailname Shows the mailname, where email alerts are sent, for tscli smtp show-relayhost Shows the relayhost for SMTP (email) sent from for the cluster. the cluster. the cluster. If there is no relayhost configured, the command shows "NOT FOUND". Same as the tscli backup create command. tscli snapshot backup [--mode {full|light| dataless}] Pulls a snapshot and saves it as a backup where: • is the name of the snapshot to use. • is the new directory that will be created for the backup. This directory must not already exist. Use --mode to specify the type of backup. Creates a new snapshot with the name and reason tscli snapshot create provided. This command does not accept periods (.), but does accept dashes (-). tscli snapshot delete Deletes the named snapshot. tscli snapshot disable-policy Disables the automatic snapshot policy. tscli snapshot enable-poicy Enables the automatic snapshot policy. tscli snapshot ls Lists available snapshots and their size. tscli snapshot update-policy Updates the automatic snapshot policy. tscli ssl add-cert Adds SSL certificate, key pair. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 287 Reference Command Description tscli ssl off Disables SSL. Disabling SSL will stop users from seeing a security warning when accessing ThoughtSpot from a browser if there is no SSL certificate installed. tscli ssl on Enable SSL. If SSL is enabled and there is no certificate, users will see a security warning when accessing ThoughtSpot from a browser. tscli ssl rm-cert Removes the existing SSL certificate, if any. tscli ssl status Shows whether SSL authentication is enabled or tscli storage gc [--log_age ] [--localhost_only>] Before issuing this command, you must stop the disabled. cluster using tscli cluster stop. After garbage collection has completed, you can restart the cluster with tscli cluster start. Garbage collects unused storage. Accepts these optional flags: --log_age Specifies the number of elapsed hours after which logs will be deleted. Default is 24 hours. --localhost_only If used, only the logs on the localhost will be removed. If not specified, the command acts on the entire cluster. The command frees space in these directories: /tmp /usr/local/scaligent/logs/ /export/logs/orion /export/logs/oreo /export/logs/hadoop /export/logs/zookeeper ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 288 Reference Command Description cores tscli support restart-remote Restarts remote support. tscli support rm-admin-email Removes the email address for contacting the customer administrator. Replaces it with the default ThoughtSpot Support email address. tscli support rm-admin-phone Removes the phone number for contacting the customer administrator. Replaces it with the default ThoughtSpot Support phone number. tscli support set-admin-email Sets the email address for contacting the customer administrator. If you would like to display a blank email address, issue the command tscli support setadmin-email ' '. tscli support set-admin-phone Sets the phone number for contacting the customer administrator. Specify a phone number using any value (e.g. +1 800-508-7008 Ext. 1). If you would like to display a blank phone number, issue the command tscli support set-admin-phone ' '. tscli support set-remote --addr --user Configures the cluster for remote support through SSH tunneling, where is the address of support, e.g. tunnel.thoughtspot.com, and is the support username. tscli support show-admin-email Shows the email address for customer administrator, if tscli support show-admin-phone Shows the phone number for customer administrator, tscli support show-remote Prints the status and configuration of remote support. tscli support Starts remote support. ThoughtSpot Administration Guide set. if set. Copyright © 2017 by ThoughtSpot. All Rights Reserved. 289 Reference Command Description start-remote Stops remote support. tscli support stop-remote Table 40: tscli optional arguments Argument Description --verbose Turns on verbose logging to the console. By default, logs are written to both the console and log files, with the log files containing the highest verbosity log messages. --helpfull Shows the full help text for all command sections. --noautoconfig Will prompt for a y or n for each configuration question. --autoconfig Automatically configure properties of the cluster when possible. User may still be prompted for certain inputs. --yes Same as --autoconfig. Automatically configure properties of the cluster when possible. User may still be prompted for certain inputs. --cluster Optional. Used to designate the cluster to operate on by name, if it is not detected automatically. --zoo Optional. Used to provide a comma-separated list of Zookeeper servers, when a cluster is not detected automatically. Formula reference ThoughtSpot allows you to create derived columns in worksheets using formulas. This reference lists the various operators and functions you can use to create formulas. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 290 Reference You can also see this list of operators and examples from within the Formula Builder by selecting Formula Assistant. Aggregate functions These functions can be used to aggregate data. Table 41: Mixed functions for use in formulas Function Description Examples average Returns the average of all the • average (revenue) • count (product) • cumulative_average (revenue, values of a column. count Returns the number of rows in the table containing the column. cumulative_average Takes a measure and one or more attributes. order date, state) Returns the average of the measure, accumulated by the attribute(s) in the order specified. cumulative_max Takes a measure and • cumulative_max (revenue, state) • cumulative_min (revenue, one or more attributes. Returns the maximum of the measure, accumulated by the attribute(s) in the order specified. cumulative_min Takes a measure and one or more attributes. campaign) Returns the minimum of the measure, accumulated by the attribute(s) in the order specified. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 291 Reference Function Description Examples cumulative_sum Takes a measure and • one or more attributes. cumulative_sum (revenue, order date) Returns the sum of the measure, accumulated by the attribute(s) in the order specified. group_average Takes a measure and one • or more attributes. Returns group_average (revenue, customer region, state) the average of the measure grouped by the attribute(s). group_count Takes a measure and one or • more attributes. Returns the group_count (revenue, customer region) count of the measure grouped by the attribute(s). group_max Takes a measure and one or • more attributes. Returns the group_max (revenue, customer region) maximum of the measure grouped by the attribute(s). group_min Takes a measure and one • or more attributes. Returns group_min (revenue, customer region) the minimum of the measure grouped by the attribute(s). group_stddev Takes a measure and one • or more attributes. Returns group_stddev (revenue, customer region) the standard deviation of the measure grouped by the attribute(s). group_sum Takes a measure and one or more attributes. Returns the • group_sum (revenue, customer region) sum of the measure grouped by the attribute(s). ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 292 Reference Function Description Examples group_unique_count Takes a measure and one or • more attributes. Returns the group_unique_count (product , supplier) unique count of the measure grouped by the attribute(s). group_variance Takes a measure and one • or more attributes. Returns group_variance (revenue, customer region) the variance of the measure grouped by the attribute(s). max Returns the maximum value of • max (sales) • min (revenue) • moving_average (revenue, 2, 1, a column. min Returns the minimum value of a column. moving_average Takes a measure, two integers to define the window to customer region) aggregate over, and one or more attributes. The window is (current - Num1...Current + Num2) with both end points being included in the window. For example, "1,1" will have a window size of 3. To define a window that begins before Current, specify a negative number for Num2. Returns the average of the measure over the given window. The attributes are the ordering columns used to compute the moving average. moving_max Takes a measure, two integers to define the window to • moving_max (complaints, 1, 2, store name) aggregate over, and one or more attributes. The window ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 293 Reference Function Description Examples is (current - Num1...Current + Num2) with both end points being included in the window. For example, "1,1" will have a window size of 3. To define a window that begins before Current, specify a negative number for Num2. Returns the maximum of the measure over the given window. The attributes are the ordering columns used to compute the moving maximum. moving_min Takes a measure, two integers • to define the window to moving_min (defects, 3, 1, product) aggregate over, and one or more attributes. The window is (current - Num1...Current + Num2) with both end points being included in the window. For example, "1,1" will have a window size of 3. To define a window that begins before Current, specify a negative number for Num2. Returns the minimum of the measure over the given window. The attributes are the ordering columns used to compute the moving minimum. moving_sum Takes a measure, two integers to define the window to • moving_sum (revenue, 1, 1, order date) aggregate over, and one or more attributes. The window is (current - Num1...Current + ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 294 Reference Function Description Examples Num2) with both end points being included in the window. For example, "1,1" will have a window size of 3. To define a window that begins before Current, specify a negative number for Num2. Returns the sum of the measure over the given window. The attributes are the ordering columns used to compute the moving sum. stddev Returns the standard deviation • stddev (revenue) • sum (revenue) • unique count (customer) • variance (revenue) of all values of a column. sum Returns the sum of all the values of a column. unique count Returns the number of unique values of a column. variance Returns the variance of all the values of a column. Conversion functions These functions can be used to convert data from one data type to another. Conversion to or from date data types is not supported. Table 42: Conversion functions for use in formulas Function Description Examples to_bool Returns the input as a boolean • to_bool (0) = false (true or false). • to_bool (married) Accepts a date represented • to_date (date_sold, '%Y-%m-%d') to_date as an integer or text string, and a second string parameter ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 295 Reference Function Description Examples that can include strptime date formatting elements. Replaces all the valid strptime date formatting elements with their string counterparts and returns the result. Does not accept epoch formatted dates as input. to_double to_integer to_string Returns the input as a double. • to_double ('3.14') = 3.14 • to_double (revenue * .01) • to_integer ('45') + 1 = 46 • to_integer (price + tax - cost) Returns the input as a text • to_string (45 + 1) = '46' string. • to_string (revenue - cost) Returns the input as an integer. Date functions Table 43: Date functions for use in formulas Function Description Examples add_days Returns the result of adding • the specified number of days date add_days (01/30/2015, 5) = 02/04/2015 to the given date. • add_days (invoiced, 30) Returns the date portion of a • date (home visit) Returns the number (1-31) of • day (01/15/2014) = 15 the day for the given date. • day (date ordered) Returns the number (1-7) • day_number_of_week given date. day day_number_of_week of the day in a week for the given date with 1 being Monday and 7 being Sunday. ThoughtSpot Administration Guide Copyright © (01/30/2015) = 6 • day_number_of_week (shipped) 2017 by ThoughtSpot. All Rights Reserved. 296 Reference Function Description Examples day_number_of_year Returns the number (1-366) of • day_number_of_year the day in a year for the given (01/30/2015) = 30 date. • day_number_of_year (invoiced) day_of_week Returns the day of the week • for the given date. diff_days Friday Subtracts the second date • day_of_week (serviced) • diff_days (01/15/2014, from the first date and returns the result in number of days, 01/17/2014) = -2 • rounded down if not exact. diff_time Subtracts the second date the result in number of diff_days (purchased, shipped) • from the first date and returns diff_time (01/01/2014, 01/01/2014) = -86,400 • seconds. hour_of_day day_of week (01/30/2015) = diff_time (clicked, submitted) Returns the hour of the day • hour_of_day (received) • is_weekend (01/31/2015) = for the given date. is_weekend Returns true if the given date falls on a Saturday or Sunday. month month_number true • is_weekend (emailed) Returns the month from the • month (01/15/2014) = January given date. • month (date ordered) Returns the number (1-12) of • month_number (09/20/2014) = the month for the given date. now Returns the current 9 • month_number (purchased) • now () timestamp. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 297 Reference Function Description Examples start_of_month Returns the date for the first • day of the month for the start_of_quarter ( 01/31/2015 ) = Jan FY 2015 given date. • start_of_month (shipped) Returns the date for the first • start_of_quarter day of the quarter for the start_of_week ( 09/18/2015 ) = Q3 FY 2015 given date. • start_of_quarter (sold) Returns the date for the first • start_of_week ( 06/01/2015 ) day of the week for the given start_of_year = 05/30/2015 Week date. • start_of_week (emailed) Returns the date for the first • start_of_year ( 02/15/2015 ) day of the year for the given time = FY 2015 date. • start_of_year (joined) Returns the time portion of a • time (3/1/2002 10:32) = given date. year start_of_month 10:32 • time (call began) Returns the year from the • year (01/15/2014) = 2014 given date. • year (date ordered) Mixed functions These functions can be used with text and numeric data types. Table 44: Mixed functions for use in formulas Function Description Examples != Returns true if the first value is • 3 != 2 = true not equal to the second value. • revenue != 1000000 Returns true if the first value is • 3 < 2 = false less than the second value. • revenue < 1000000 < ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 298 Reference Function Description Examples <= Returns true if the first value is • 1 <= 2 = true less than or equal to the second • revenue <= 1000000 Returns true if the first value is • 2 = 2 = true equal to the second value. • revenue = 1000000 Returns true if the first value is • 3 > 2 = true greater than the second value. • revenue > 1000000 Returns true if the first value • 3 >= 2 = true is greater than or equal to the • revenue >= 1000000 • greatest (20, 10) = 20 • greatest (q1 revenue, q2 revenue) • least (20, 10) = 10 • least (q1 revenue, q2 revenue) value. = > >= second value. greatest least Returns the larger of the values. Returns the smaller of the values. Number functions Table 45: Number functions for use in formulas Function Description Examples * Returns the result of • 3 * 2 = 6 multiplying both numbers. • price * taxrate + - Returns the result of adding • 1 + 2 = 3 both numbers. • price + shipping Returns the result of • 3 - 2 = 1 subtracting the second • revenue - tax number from the first. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 299 Reference Function Description Examples / Returns the result of • 6 / 3 = 2 dividing the first number by • markup / retail price Returns the first number • 3 ^ 2 = 9 raised to the power of the • width ^ 2 • abs (-10) = 10 • abs (profit) Returns the inverse cosine • acos (0.5) = 60 in degrees. • acos (cos-satellite-angle) Returns the inverse sine • asin (0.5) = 30 (specified in degrees). • asin (sin-satellite-angle) Returns the inverse tangent • atan (1) = 45 in degrees. • atan (tan-satellite-angle) Returns the inverse tangent • atan2 (10, 10) = 45 in degrees. • atan2 (longitude, latitude) Returns the cube root of a • cbrt (27) = 3 number. • cbrt (volume) Returns the smallest • ceil (5.9) = 6 following integer. • ceil (growth rate) Returns the cosine of an • cos (63) = 0.45 the second. ^ second. abs acos asin atan atan2 cbrt ceil cos cube exp Returns the absolute value. angle (specified in degrees). • cos (beam angle) Returns the cube of a • cube (3) = 27 number. • cube (length) Returns Euler's number • exp (2) = 7.38905609893 (~2.718) raised to a power. • exp (growth) ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 300 Reference Function Description Examples exp2 Returns 2 raised to a power. • exp2 (3) = 8 • exp2 (growth) floor ln log10 log2 mod Returns the largest previous • floor (5.1) = 5 integer. • floor (growth rate) Returns the natural • ln (7.38905609893) = 2 logarithm. • ln (distance) Returns the logarithm with • log10 (100) = 2 base 10. • log10 (volume) Returns the logarithm with • log2 (32) = 5 base 2 (binary logarithm). • log2 (volume) Returns the remainder of • mod (8, 3) = 2 first number divided by the • mod ( revenue , quantity ) Returns the first number • pow (5, 2) = 25 raised to the power of the • pow (width, 2) Returns a random number • random ( ) = .457718 between 0 and 1. • random ( ) Returns the first number • round (35.65, 10) = 40 rounded to the second • round (battingavg, 100) Returns the result of • safe_divide (12, 0) = 0 dividing the first number by • safe_divide (total_cost, second number. pow second number. random round number (the default is 1). safe_divide the second. If the second number is 0, returns 0 units) instead of NaN (not a number). ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 301 Reference Function Description Examples sign Returns +1 if the number is • sign (-250) = -1 greater than zero, -1 if less • sign (growth rate) Returns the sine of an angle • sin (35) = 0.57 (specified in degrees). • sin (beam angle) Returns the distance in km • spherical_distance (37.465191, than zero, 0 if zero. sin spherical_distance between two points on -122.153617, 37.421962, Earth. -122.142174) = 4,961.96 • spherical_distance (start_latitude, start_longitude, start_latitude, start_longitude) sq sqrt tan Returns the square of a • sq (9) = 81 numeric value. • sq (width) Returns the square root. • sqrt (9) = 3 • sqrt (area) • tan (35) = 0.7 Returns the tangent of an angle (specified in degrees). • tan (beam angle) Operators Table 46: Operators for use in formulas Operator Description Examples and Returns true when both conditions • (1 = 1) and (3 > 2) = true are true, otherwise returns false. • lastname = 'smith' and state ='texas' if...then...else Conditional operator. • if (3 > 2) then 'bigger' else 'not bigger' ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 302 Reference Operator Description Examples • if (cost > 500) then 'flag' else 'approve' ifnull Returns the first value if it is not • ifnull (cost, 'unknown') null, otherwise returns the second. isnull Returns true if the value is null. • isnull (phone) not Returns true if the condition is • not (3 > 2) = false false, otherwise returns false. • not (state = 'texas') Returns true when either condition • (1 = 5) or (3 > 2) = true is true, otherwise returns false. • state = 'california' or state or ='oregon' Text functions Table 47: Text functions for use in formulas Function Description Examples concat Returns the two values as a • concatenated text string. contains concat ( 'hay' , 'stack' ) = 'haystack' Returns true if the first string • concat (last_name , first_name) • contains ('broomstick', 'room') = contains the second string, true otherwise returns false. • contains (product, 'trial version') edit_distance Accepts two text strings. Returns • edit_distance ('attorney', the edit distance (minimum 'atty') = 4 number of operations required to transform one string into the • edit_distance (color, 'red') other) as an integer. Works with strings under 1023 characters. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 303 Reference Function Description Examples edit_distance_with_cap Accepts two text strings and • an integer to specify the upper limit cap for the edit distance edit_distance_with_cap ('pokemon go', 'minecraft pixelmon', 3) = 4 • (minimum number of operations edit_distance_with_cap (event, 'burning man', 3) required to transform one string into the other). If the edit distance is less than or equal to the specified cap, returns the edit distance. If it is higher than the cap, returns the cap plus 1. Works with strings under 1023 characters. similar_to Accepts a document text string • and a search text string. Returns true if relevance score (0-100) similar_to ('hello world', 'hello swirl') = true • of the search string with respect similar_to (current team, drafted by) to the document is greater than or equal to 20. Relevance is based on edit distance, number of words in the query, and length of words in the query which are present in the document. similarity Accepts a document text string • and a search text string. Returns the relevance score (0-100) of the search string with respect similarity ('where is the burning man concert', 'burning man') = 46 • similarity (tweet1, tweet2) to the document. Relevance is based on edit distance, number of words in the query, and length of words in the query which are present in the document. If the two strings are an exact match, returns 100. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 304 Reference Function Description Examples spells_like Accepts two text strings. • Returns true if they are spelled similarly and false if they are not. spells_like ('thouhgtspot', 'thoughtspot') = true • spells_like (studio, distributor) • strlen ('smith') = 5 • strlen (lastname) • strpos ('haystack_with_needles', Works with strings under 1023 characters. strlen strpos Returns the length of the text. Returns the numeric position (starting from 0) of the first occurrence of the second string 'needle') = 14 • strpos (complaint, 'lawyer') • substr ('persnickety', 3, 7) = in the first string, or -1 if not found. substr Returns the portion of the given string, beginning at the location specified (starting from 0), and of the given length. snicket • substr (lastname, 0, 5) Date and time formats reference This is a list of all the date and time formats you can load into ThoughtSpot, whether using data upload from the browser or tsload. Using ThoughtSpot Loader For date data types, the default format is yearmonthday e.g. "Dec 30th, 2001" and is represented as 20011230. Use the date format specifications supported in the strptime library function. For time and datetime data types, the default is yearmonthday hour:minute:second e.g. Dec 30th, 2001 1:15:12 and is represented as 20011230 01:15:12. Use the datetime format specifications supported in the strptime library function. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 305 Reference Using data upload from a browser These date and time formats are supported in an Excel or CSV file when uploading via the browser: • 1/30/2014 • 2014-01-30 • 2014-1-9 • 30-Jan-2014 • 2014-Jan-13 • 2014-01-30 10:32 AM • 2014-01-30 14:52 • 2014-01-30 10:32:22 • 2014-01-30 10:32:22 AM • 2014-01-30 10:32:22.0 • 2014-01-30 10:32:22.0 AM • 2014-01-30 10:32:22.000 • 2014-01-30 10:32:22.000 AM • 1/9/2014 • 30-Jan-14 • 01-Mar-02 (assumes 2002) • 3/1/2002 10:32 AM • 3/1/2002 14:52 • 3/1/2002 10:32:22 • 3/1/2002 10:32:22 AM • 3/1/2002 10:32:22.0 • 3/1/2002 10:32:22.0 AM • 3/1/2002 10:32:22.000 • 3/1/2002 10:32:22.000 AM • 30-Jan-14 10:32 AM ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 306 Reference • 30-Jan-14 14:52 • 30-Jan-14 10:32:22 • 30-Jan-14 10:32:22 AM • 30-Jan-14 10:32:22.0 • 30-Jan-14 10:32:22.0 AM • 30-Jan-14 10:32:22.000 • 30-Jan-14 10:32:22.000 AM • Fri Oct 04 2013 3:26 PM • Fri Oct 04 2013 13:46 • Fri Oct 04 2013 10:32:22 • Fri Oct 04 2013 10:32:22 AM • Fri Oct 04 2013 10:32:22.0 • Fri Oct 04 2013 10:32:22.0 AM • Fri Oct 04 2013 10:32:22.000 • Fri Oct 04 2013 10:32:22.000 AM • 14:52 • 10:32 AM • 10:32:22 • 10:32:22 AM • 10:32:22.0 • 10:32:22.000 • 10:32:22.0 AM • 10:32:22.000 AM Row level security rules reference ThoughtSpot allows you to create row level security rules using expressions. If an expression evaluates to "true" for a particular row and group combination, that ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 307 Reference group will be able to see that row. This reference lists the various operators and functions you can use to create rules. For information on how to use the row level security functions and operators, see About Rule-Based Row Level Security. There is a special variable called ts_groups, which you can use when creating row level security rules. It fetches a list of the groups that the currently logged in user belongs to. For each row, f the expression in the rule evaluates to 'true' for any one of these groups, that row will be shown to the user. You can also see this list of operators and examples from within the Rule Builder by selecting Rule Assistant. Conversion functions These functions can be used to convert data from one data type to another. Conversion to or from date data types is not supported. Table 48: Conversion functions for use in rules Function Description Examples to_bool Returns the input as a boolean • to_bool (0) = false (true or false). to_double Returns the input as a double. • to_double ('3.14') = 3.14 to_integer Returns the input as an integer. • to_integer ('45') + 1 = 46 to_string Returns the input as a text • to_string (45 + 1) = '46' string. • to_string (revenue - cost) ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 308 Reference Date functions Table 49: Date functions for use in rules Function Description Examples add_days Returns the result of adding • the specified number of days date add_days (01/30/2015, 5) = 02/04/2015 to the given date. • add_days (invoiced, 30) Returns the date portion of a • date (home visit) Returns the number (1-31) of • day (01/15/2014) = 15 the day for the given date. • day (date ordered) Returns the number (1-7) • day_number_of_week given date. day day_number_of_week of the day in a week for the given date with 1 being (01/30/2015) = 6 • Monday and 7 being Sunday. day_number_of_year day_number_of_week (shipped) Returns the number (1-366) of • day_number_of_year the day in a year for the given (01/30/2015) = 30 date. • day_number_of_year (invoiced) day_of_week Returns the day of the week • for the given date. diff_days Subtracts the second date Friday • day_of_week (serviced) • diff_days (01/15/2014, from the first date and returns the result in number of days, 01/17/2014) = -2 • rounded down if not exact. diff_time Subtracts the second date seconds. ThoughtSpot Administration Guide Copyright diff_days (purchased, shipped) • from the first date and returns the result in number of day_of week (01/30/2015) = diff_time (01/01/2014, 01/01/2014) = -86,400 • diff_time (clicked, submitted) © 2017 by ThoughtSpot. All Rights Reserved. 309 Reference Function Description Examples hour_of_day Returns the hour of the day • hour_of_day (received) • is_weekend (01/31/2015) = for the given date. is_weekend Returns true if the given date falls on a Saturday or Sunday. month month_number true • is_weekend (emailed) Returns the month from the • month (01/15/2014) = January given date. • month (date ordered) Returns the number (1-12) of • month_number (09/20/2014) = the month for the given date. now Returns the current 9 • month_number (purchased) • now () • start_of_month (01/31/2015) timestamp. start_of_month Returns the epoch for the first day of the month for the start_of_quarter =1420099200 given date. • start_of_month (shipped) Returns the epoch for the • start_of_quarter first day of the quarter for the given date. start_of_week • time start_of_quarter (sold) Returns the epoch for the first • start_of_week (05/30/2015) = day of the week for the given 1432450800 date. start_of_year (09/18/2015) = 1441090800 • start_of_week (paid) Returns the epoch for the first • start_of_year (02/15/2015) = day of the fiscal year for the 1420099200 given date. • start_of_year (joined) Returns the time portion of a • time (3/1/2002 10:32) = given date. 10:32 • ThoughtSpot Administration Guide Copyright © time (call began) 2017 by ThoughtSpot. All Rights Reserved. 310 Reference Function Description Examples year Returns the year from the • year (01/15/2014) = 2014 given date. • year (date ordered) Mixed functions These functions can be used with text and numeric data types. Table 50: Mixed functions for use in rules Function Description Examples != Returns true if the first value is • 3 != 2 = true not equal to the second value. • ts_groups != public Returns true if the first value is • 3 > 2 = true • 3 >= 2 = true Returns true if the first value is • 2 = 2 = true equal to the second value. • ts_groups = region Returns true if the first value is • 3 < 2 = false • 3 >= 2 = true > greater than the second value. >= Returns true if the first value is greater than or equal to the second value. = < less than the second value. <= Returns true if the first value is less than or equal to the second value. Number functions Table 51: Number functions for use in rules Function Description Examples * Returns the result of • 3 * 2 = 6 multiplying both numbers. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 311 Reference Function Description Examples • + - Returns the result of adding • 1 + 2 = 3 both numbers. • price + shipping Returns the result of • 3 - 2 = 1 subtracting the second • revenue - tax Returns the result of • 6 / 3 = 2 dividing the first number by • markup / retail price Returns the first number • 3 ^ 2 = 9 raised to the power of the • width ^ 2 • abs (-10) = 10 • abs (profit) Returns the inverse cosine • acos (0.5) = 60 in degrees. • acos (cos-satellite-angle) Returns the inverse sine • asin (0.5) = 30 (specified in degrees). • asin (sin-satellite-angle) Returns the inverse tangent • atan (1) = 45 in degrees. • atan (tan-satellite-angle) Returns the inverse tangent • atan2 (10, 10) = 45 in degrees. • atan2 (longitude, latitude) Returns the cube root of a • cbrt (27) = 3 number. • cbrt (volume) Returns the smallest • ceil (5.9) = 6 following integer. • ceil (growth rate) number from the first. / the second. ^ second. abs acos asin atan atan2 cbrt ceil price * taxrate Returns the absolute value. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 312 Reference Function Description Examples cos Returns the cosine of an • cube exp exp2 floor greatest cos (63) = 0.45 angle (specified in degrees). • cos (beam angle) Returns the cube of a • cube (3) = 27 number. • cube (length) Returns Euler's number • exp (2) = 7.38905609893 (~2.718) raised to a power. • exp (growth) Returns 2 raised to a power. • exp2 (3) = 8 • exp2 (growth) Returns the largest previous • floor (5.1) = 5 integer. • floor (growth rate) Returns the larger of the • greatest (20, 10) = 20 values. • greatest (q1 revenue, q2 revenue) least ln log10 log2 mod Returns the smaller of the • least (20, 10) = 10 values. • least (q1 revenue, q2 revenue) Returns the natural • ln (7.38905609893) = 2 logarithm. • ln (distance) Returns the logarithm with • log10 (100) = 2 base 10. • log10 (volume) Returns the logarithm with • log2 (32) = 5 base 2 (binary logarithm). • log2 (volume) Returns the remainder of • mod (8, 3) = 2 first number divided by the • mod ( revenue , quantity ) Returns the first number • pow (5, 2) = 25 raised to the power of the • pow (width, 2) second number. pow second number. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 313 Reference Function Description Examples random Returns a random number • random ( ) = .457718 between 0 and 1. • random ( ) Returns the first number • round (35.65, 10) = 40 rounded to the second • round (battingavg, 100) Returns +1 if the number is • sign (-250) = -1 greater than zero, -1 if less • sign (growth rate) Returns the sine of an angle • sin (35) = 0.57 (specified in degrees). • sin (beam angle) Returns the distance in km • spherical_distance (37.465191, round number (the default is 1). sign than zero, 0 if zero. sin spherical_distance between two points on -122.153617, 37.421962, Earth. -122.142174) = 4,961.96 • spherical_distance (start_latitude, start_longitude, start_latitude, start_longitude) sq sqrt tan Returns the square of a • sq (9) = 81 numeric value. • sq (width) Returns the square root. • sqrt (9) = 3 • sqrt (area) • tan (35) = 0.7 Returns the tangent of an angle (specified in degrees). • ThoughtSpot Administration Guide Copyright © tan (beam angle) 2017 by ThoughtSpot. All Rights Reserved. 314 Reference Operators Table 52: Operators for use in rules Operator Descriptions Examples if...then...else Conditional operator. • if (3 > 2) then 'bigger' else 'not bigger' • if (country = 'USA') then 'US_group' = ts_groups else false isnull Returns true if the value is null. • isnull (phone) ifnull Returns the first value if it is not • ifnull (cost, 'unknown') Returns true if the condition is • not (3 > 2) = false false, otherwise returns false. • not (region contains 'west') Returns true when either condition • (1 = 5) or (3 > 2) = true is true, otherwise returns false. • (ts_groups = region) or null, otherwise returns the second. not or (ts_groups =state) Text functions Table 53: Text functions for use in rules Function Description Examples concat Returns the two values as a • concatenated text string. concat ('hay', 'stack') = 'haystack' • concat (region , 'sales') = ts_groups contains Returns true if the first string • contains the second string, otherwise returns false. ThoughtSpot Administration Guide Copyright true • © contains ('broomstick', 'room') = contains (country, 'US') 2017 by ThoughtSpot. All Rights Reserved. 315 Reference Function Description Examples strlen Returns the length of the text. • strlen ('smith') = 5 strpos Returns the numeric position • strpos ('haystack_with_needles', (starting from 0) of the first 'needle') = 14 occurrence of the second string in the first string, or -1 if not found. substr Returns the portion of the given • string, beginning at the location substr ('persnickety', 3, 7) = 'snicket' specified (starting from 0), and of the given length. Variables These variables can be used in your expressions. Table 54: Mixed functions for use in formulas Function Description Examples ts_groups Returns the list all the groups • ts_groups = east the current logged in user belongs to. For any row, if the expression evaluates to true for any of the groups, the user can see that row. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 316 Appendix Contact ThoughtSpot You can contact ThoughtSpot by phone, mail, email, or by filing a support ticket. File a support ticket If you encounter a technical issue, file a support ticket using the Support Portal ticket filing system at: http://support.thoughtspot.com/ Please provide as much detail as possible about your issue, to help us resolve it quickly. You need a Support Portal login to file a ticket. Please contact ThoughtSpot to get an account, if necessary. Address ThoughtSpot, Inc. 1 Palo Alto Square, Building 1, Suite 200 Palo Alto, CA 94306 Phone numbers Table 55: Phone numbers Phone Number Description 1-800-508-7008 ThoughtSpot Support ext 1 ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 317 Contact ThoughtSpot Phone Number Description 1-800-508-7008 Toll free number for ThoughtSpot headquarters. Email Table 56: Email addresses Reason for contacting Email For sales inquiries. sales@thoughtspot.com For customer support and support@thoughtspot.com software update inquiries. For other inquiries. ThoughtSpot Administration Guide Copyright © hello@thoughtspot.com 2017 by ThoughtSpot. All Rights Reserved. 318 Appendix Open source software ThoughtSpot uses open source libraries. You may request the machine-readable source code from ThoughtSpot and may use the source code subject to the terms and conditions of the applicable open source license. Refer to our third party software page for a complete list of our software library and licenses. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 319 Copyright Appendix Copyright Copyright for ThoughtSpot publications. ©2017 ThoughtSpot, Inc. All rights reserved. ThoughtSpot, Inc. 1 Palo Alto Square, Building 1, Suite 200, Palo Alto, CA 94306 All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. ThoughtSpot is a trademark of ThoughtSpot, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies. ThoughtSpot Administration Guide Copyright © 2017 by ThoughtSpot. All Rights Reserved. 320
Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
Linearized                      : No
Page Count                      : 320
Profile CMM Type                : lcms
Profile Version                 : 2.1.0
Profile Class                   : Display Device Profile
Color Space Data                : RGB
Profile Connection Space        : XYZ
Profile Date Time               : 1998:02:09 06:49:00
Profile File Signature          : acsp
Primary Platform                : Apple Computer Inc.
CMM Flags                       : Not Embedded, Independent
Device Manufacturer             : IEC
Device Model                    : sRGB
Device Attributes               : Reflective, Glossy, Positive, Color
Rendering Intent                : Perceptual
Connection Space Illuminant     : 0.9642 1 0.82491
Profile Creator                 : lcms
Profile ID                      : 0
Profile Copyright               : Copyright (c) 1998 Hewlett-Packard Company
Profile Description             : sRGB IEC61966-2.1
Media White Point               : 0.95045 1 1.08905
Media Black Point               : 0 0 0
Red Matrix Column               : 0.43607 0.22249 0.01392
Green Matrix Column             : 0.38515 0.71687 0.09708
Blue Matrix Column              : 0.14307 0.06061 0.7141
Device Mfg Desc                 : IEC http://www.iec.ch
Device Model Desc               : IEC 61966-2.1 Default RGB colour space - sRGB
Viewing Cond Desc               : Reference Viewing Condition in IEC61966-2.1
Viewing Cond Illuminant         : 19.6445 20.3718 16.8089
Viewing Cond Surround           : 3.92889 4.07439 3.36179
Viewing Cond Illuminant Type    : D50
Luminance                       : 76.03647 80 87.12462
Measurement Observer            : CIE 1931
Measurement Backing             : 0 0 0
Measurement Geometry            : Unknown
Measurement Flare               : 0.999%
Measurement Illuminant          : D65
Technology                      : Cathode Ray Tube Display
Red Tone Reproduction Curve     : (Binary data 2060 bytes, use -b option to extract)
Green Tone Reproduction Curve   : (Binary data 2060 bytes, use -b option to extract)
Blue Tone Reproduction Curve    : (Binary data 2060 bytes, use -b option to extract)
Format                          : application/pdf
Language                        : en
Date                            : 2017:03:10 15:11:00-08:00
Producer                        : Apache FOP Version 1.1
PDF Version                     : 1.4
Creator Tool                    : Apache FOP Version 1.1
Metadata Date                   : 2017:03:10 15:11:00-08:00
Create Date                     : 2017:03:10 15:11:00-08:00
Page Mode                       : UseOutlines
Creator                         : Apache FOP Version 1.1
EXIF Metadata provided by EXIF.tools

Navigation menu