ChatScript System Functions Manual

Bruce Wilcox, gowilcox@gmail.com www.brilligunderstanding.com

Revision 2/18/2018 cs8.1

• Topic Functions
• Marking Functions
• Input Functions
• Number Functions
• Output Functions
• Control Flow Functions
• External Access Functions
• JSON Functions
• Word Manipulation Functions
• Multipurpose Functions
• Facts Functions

System functions are predefined and can be intermixed with direct output. Generally they are used from the output side of a rule, but in many cases nothing prevents you from invoking them from inside a pattern. When used in a pattern, they do not write any text output to the user, but their output is tested the same as it would be in an if statement, meaning 0 and false are failures.

You can write them with or without a ^ in front of their name. Using the ^ is clearer, but it is optional. The only time it is required is when the first thing you want to do in a gambit is call a function (unlikely).

t: name(xxx)

This is ambiguous: is it a function call, or a label and pattern? It is treated as a label and pattern. You can force it to be a function call in either of these ways:

t: ^name(xxx) # explicitly say it is a function

t: () name(xxx) # explicitly add an empty pattern

Rule Tags

Some functions return or take "rule tags". All rules have an internal label consisting of ~topic.toplevelindex.rejoinderindex. E.g.,

~introductions.0.5

stands for the 0th top level rule in the ~introductions topic, rejoinder #5.

Topic Functions

^addtopic ( topicname )

Adds the named topic as a pending topic at the head of the list. Typically you don't need to do this, because finding a reaction from a topic which is not a system, disabled, or nostay topic will automatically add the topic to the pending list. Never returns a fail code even if the topic name is bad.

^available ( ruletag optionalfail )

Reports whether the named rule is available (1) or used up (0). If you supply the optional argument, the function will fail if the rule is not available.
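As a sketch of how the 1/0 result can drive script logic, the rule below checks a rule tag before promising more content. The ~news topic and the tag ~news.0.0 are hypothetical names chosen only for illustration.

```chatscript
# Hypothetical: ~news.0.0 is the first top level rule of an assumed ~news topic.
u: ( anything new )
    if ( ^available( ~news.0.0 ) ) { Yes, ask me about the news. }
    else { No, you have already heard my news. }
```

Because 0 counts as a failure in an if test, the else branch should run once that rule has been used up.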

^cleartopics()

Empty the pending topics list.

^counttopic ( topic what )

For the given topic, return how many rules match what, where what is one of gambit, available, rules, used. These give, respectively, how many gambits exist, how many available gambits exist (not erased), how many top level rules (gambits + responders) exist, and how many top level rules have been erased.
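A minimal sketch of using the count in output, assuming a hypothetical topic named ~pets; the variable name $left is likewise just for illustration.

```chatscript
# Hypothetical topic ~pets; "available" counts gambits that are not erased.
u: ( what else can we discuss )
    $left = ^counttopic( ~pets available )
    if ( $left > 0 ) { I still have $left things to say about pets. }
    else { I have run out of things to say about pets. }
```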

^gambit ( value value ... )

If value is a topic name, runs the topic in gambit mode to see if any gambits arise. If none arise from the first value, it will try the second, and so on. It does not fail unless a rule forces it to fail or the named topic doesn't exist or is disabled. You can supply an optional last argument FAIL, in which case it will return FAILRULE_BIT if it didn't fail but it didn't generate any new output either.

The value may be ~, which means use the current topic you are within. It can also be PENDING, which means pick a topic from the pending topics stack (they are all pending being returned to, but not including the current topic). Or it can be any other word, which will be a keyword of some topic to pick. E.g.,

^gambit(~ PENDING ~mygeneraltopic FAIL)

^getrule ( what label )

For the given rule label or tag, return some fragment of the rule. what can be tag, type, label, pattern, output, topic, or usable. The type will be t, ?, s, a, etc.

If a rule label is involved, an optional third argument, if given, means only find enabled rules with that label. For usable, returns 1 if the rule can be used or null if it has been erased. The label ~ means the current rule. The label 0 means the top level rule above us (if we are a rejoinder; otherwise it is the same as ~).

^hasgambit ( topic )

Fails if topic does not have any gambits left unexecuted. Even if it does, they may not execute if they have patterns and the patterns don't match. With an optional second argument of any, it returns normally if topic has any gambits (executed or not) and fails the rule if topic has no gambits (a reactor topic).

^keep()

Do not erase this top level rule when it executes its output part (you could declare a topic to be this, although it wouldn't affect gambits). Doing keep() on a gambit is quite risky since gambits after it may not ever fire.

^lastused ( topic what )

Given a topic name, get the volley of the last what, where what is GAMBIT, RESPONDER, REJOINDER, or ANY. If it has never happened, the value is 0.

^next ( what {label} )

Given what of GAMBIT, RESPONDER, REJOINDER, or RULE and a rule label or tag, find the next rule of that what. Fails if none is found. REJOINDER will fail if it reaches the next top level rule. If label is ~, it will use the last call's answer as the starting point, enabling you to walk rules in succession.

There is also ^next(FACT @xxx); see the fact manual.

For ^next(INPUT) the system will read the next sentence and prep the system with it. This means that all patterns and code executing thereafter will be in the context of the next input sentence. That sentence is now used up, and will not be seen next when the current revised sentence finishes.

Sample code might be:

t: Do you have any pets
    a: ( ~yes ) refine()
        b: ( %more ) ^next(input) refine()
            c: ( ~pets ) ... # react to pet
            c: () ^retry(SENTENCE) # return to try input from scratch
        b: () What kind do you have?
            c: ( ~pets ) ... # react to pet

If label is LOOP, the system will stop processing code in the current loop and return to the next iteration of it, like continue in C++/Java, except that it will stop all code and return to however high up the loop really is, exiting topics and functions willy-nilly if need be.

^poptopic ( topicname )

Removes the named topic as a pending topic. The intent is not to automatically

return here in future conversation. If topicname is omitted, removes the current

topic AND makes the current topic fail execution at this point.

^refine ( ? )

This is like a switch statement in C. It tests, in order, the rejoinders attached to its rule. When the pattern of one matches, it executes that output and is done, regardless of whether or not the output fails or generates nothing. It does not "fail" unless you add an optional FAIL argument. You can also provide a rule tag. Normally it uses the rule the refine is executing from, but you can direct it to refine from any rule.
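The switch-like behavior can be sketched as follows. The wording of the outputs is invented for illustration; ~animals is the same concept name used in other examples in this manual.

```chatscript
# The rejoinders act like the cases of a switch; only the first match runs.
?: ( do you like _~animals ) ^refine()
    a: ( cat ) I adore cats.
    a: ( dog ) Dogs are fine company.
    a: () I have no strong opinion about that animal.
```

The final rejoinder has an empty pattern, so it serves as the default case when neither earlier pattern matches.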

^rejoinder ( {tag/label} )

Without an argument, see if the prior input ended with a potential rejoinder rule, and if so test it on the current sentence. If we match and don't fail on a rejoinder, the rejoinder is satisfied. If we fail to match on the 1st input sentence, the rejoinder remains in place for a second sentence. If that doesn't match, it is canceled. It is also canceled if output matching the first sentence sets a rejoinder.

You can give an optional tag or label to pretend the named rule had been the one to set a rejoinder, and so execute its rejoinders explicitly.

^respond ( value value ... )

Tests the sentence against the named value topic in responder mode to see if any rule matches (executes the rule when matched). It does not fail (though it may not generate any output), unless a rule forces it to fail or the topic requested does not exist or is disabled.

This rule will not erase but the responding rule might. If the first value fails to generate an answer, it tries the second, and so on. You can supply an optional last argument FAIL, in which case it will return FAILRULE_BIT if it didn't fail but it didn't generate any new output either. You could instead supply an optional last argument TEST, in which case a topic is executed to see if a rule will match. If so, the tag is returned and no output is made from the topic (and no rule is used up).

If a value designates a labelled or tagged rule (e.g., ~mytopic.mylabel or ~mytopic.1.0) then the system will skip over all rules until it reaches that rule, then begin linear scanning, even if the topic is designated random.

The value may be ~, which means use the current topic you are within. It can also be PENDING, which means pick a topic from the pending topics stack (they are all pending being returned to, but not including the current topic). Or it can be any other word, which will be a keyword of some topic to pick.

^retry ( item )

If item is RULE, reexecute the current rule. It will automatically try to match one word later than its first match previously.

If item is TOPIC, it will try the topic over again.

If item is SENTENCE, it will retry doing the sentence again. To prevent infinite loops, it will not perform more than 5 retries during a volley. SENTENCE is particularly useful with changing the tokenflags to get input processing done differently. If item is INPUT, it will retry all input again.

^retry(TOPRULE) will return back to the top level rule (not of the topic but of a rejoinder set) and retry. It's the same if the current rule was a top level rule, but if the current rule is from ^refine(), then it returns to the outermost rule to restart. If the current rule is not from ^refine(), then TOPRULE means the lexically placed toprule above the current rule and a ^reuse() will be performed to go to it.

^reuse ( rule-label optional-enable optional-FAIL )

Uses the output script of another rule. The label can be a simple rule label within the current topic, a dotted pair of a topic name and a label within that topic, or a rule tag. ^reuse stops at the first correctly labeled rule it can find and issues a RULE fail if it cannot find one. Assuming nothing fails, it will return 0 regardless of whether or not any output was generated.

When it executes the output of the other rule, that rule is credited with matching and is disabled if it is allowed. If not allowed, the calling rule will be disabled if it can be.

t: NAME () My name is Bob.

?: ( << what you name >> ) ^reuse(NAME)

?: ( << what you girlfriend name >> ) ^reuse(~SARAH.NAME)

Normally reuse will use the output of a rule whether or not the rule has been disabled. But if you supply a 2nd argument (whatever it is), then it will ignore disabled ones and try to find one with the same label that is not disabled. You can also supply a FAIL argument (as either 2nd or 3rd), which indicates the system should issue a RULE FAIL if it doesn't generate any output.

If you want to use a common rule to hold an answer and ONLY fire when reused, perhaps with rejoinders, the most efficient way to do that is with a rule whose pattern can never match. E.g.,

s: COMMON (?) some answer
    a: () some rejoinder...

You make ^reuse calls go to COMMON (or whatever you name it), or even do ^setrejoinder on it. The rule itself can never trigger because it only considers its pattern when the input is a statement, but the pattern says the input must be a question. So this rule never matches on its own.

There are also a variety of functions that return facts about a topic, but you

have to read the facts manual to learn about them.

^sequence ( ? )

This is like ^refine, except instead of only executing the first rejoinder that matches, it executes all matching rejoinders in order. If one of the rule outputs fails, it stops by failing the calling rule.

Normally ^sequence uses the rejoinders of the rule that it is executing from, but you can direct it to ^sequence the rejoinders of any rule.

^setrejoinder ( {kind} tag )

Force the output rejoinder to be set to the given tag or rule label. It's as though that rule had just executed, so the rules beneath it will be the rejoinders to try. If kind is input, then the input rejoinder is set. If kind is output or is omitted, then it sets the output rejoinder.

^setrejoinder does not jump anywhere. It establishes the context for ^rejoinder.

When you do:

t: what is your name
    a: ATX(_~propernoun) Hi, '_0

the outputrejoinder is set to ATX. You can change that if you want. When the next volley comes in, the outputrejoinder is now the inputrejoinder and is used for ^rejoinder. You can modify that as well. Both can exist simultaneously: you have the input context and you set an output context before having used up the inputrejoinder.

Setting a rejoinder on a rule means starting with the rejoinder immediately after it. If you were trying to copy a rejoinder that had already been established and redo it later, e.g.,

^setrejoinder(output %inputrejoinder)

this would be problematic, because it would set it to the rule after, which would be wrong. For this, use the kind copy, which does not have this issue:

^setrejoinder(copy %inputrejoinder)

If kind is output or copy and no tag is given or the tag is null, the output rejoinder is cleared (analogous to ^disable).

If the kind is input and no tag is given or the tag is null, the input rejoinder is cleared.

^topicflags ( topic )

Given a topic name, return the control bits for that topic. The bits are mapped

in dictionary_system.h as TOPIC_*.

^sleep ( milliseconds )

This stalls the engine for that many milliseconds. If this is a server, the server

is unavailable until sleep is done. Use with care. A good use is when starting

up a server instance and the boot process involves reading from an API. If your

machine runs 30 instances of ChatScript launched at once (to use max CPU),

then all of them hitting the same API at once may be bad for the API and

forcing a randomized sleep based on processid is a good use.

Marking Functions

^mark ( {"SINGLE"} word location )

Marking and unmarking words and concepts is fundamental to the pattern matching mechanism, so the system provides both an automatic marking mechanism and manual override abilities. You can manually mark or unmark something.

Automatic system marking marks all concepts implied by chasing up membership in other concepts, as does a call to ^mark. word can be any word, which also means you can mark something with a concept name whether or not the concept is actually defined anywhere.

There are two mechanisms supported using ^mark and ^unmark: specific and generic.

With specific, you name words or concepts to mark or unmark, either at a particular point in the sentence or throughout the sentence.

With generic, you disable or reenable all existing marks on a word or words in the sentence. In fact, you go beyond that, because during pattern matching words you disable are entirely invisible, and matching proceeds as if they do not exist.

Specific: effects are permanent for the volley and cross over to other rules. In the documentation below, use of _0 symbolizes use of any match variable.

^mark ( ~meat _0 )

This marks ~meat as though it has been seen at whatever sentence location _0 is bound to (start and end).

^mark ( ~meat n )

Assuming n is within 1 and sentence word limit, this marks ~meat at the nth word location. If n was gotten from ^position of a match variable, it is the range of that match variable.

^mark ( tomboy _0 )

This marks the word tomboy as visible at the location designated, even though this word is not actually in the sentence. While patterns will react to its presence, it will not show up in any memorizations using _.

While usually you mark a concept, you can also mark a word (though you should generally use the canonical form of the word to trigger all its normal concept hierarchy markings as well).

Although ^conceptlist (see Facts manual) normally only reports concepts marked at a word, if you explicitly mark using a word and not a concept, that will also be reported in ^conceptlist.

^mark ( ~meat )

With location omitted, this marks ~meat as though it has been seen at sentence start (location 1).

^mark()

Clears all global unmarks. Restores a global ^unmark(0) exactly as it was before the global unmark.

^unmark ( word _0 )

The inverse of specific ^mark, this takes a match variable that was filled at the position in the sentence you want erased and removes the mark on the word or concept set or topic name given. Pattern matching for it in that position will now fail. But it is not symmetric to ^mark because it does not remove all implied marks that mark may have set.

^unmark ( * n )

Assuming n is within 1 and sentence word limit, this unmarks all concepts at the nth word location. If n was gotten from ^position of a match variable, it is the range of that match variable.

^unmark ( word all )

All references to word (or ~concept if you named one) are removed from anywhere in the sentence.

Generic: effects are transient if done inside a pattern, and last the volley if done in output. When you are trying to analyze pieces of a sentence, you may want to have a pattern that finds a kind of word, notes information, then hides that kind of word and reanalyzes the input again looking for another of that ilk. Being able to temporarily hide marks can be quite useful, and this means typically you use ^unmark of some flavor to hide words, and then ^mark later to reenable access to those hidden words.

^unmark ( * _0 )

Says turn off ALL matches on this location temporarily. The word becomes invisible. It disables matching at any of the words spanned by the match variable. This unmark will also block subsequent specific marking using ^mark at their locations.

^mark ( * _0 )

Restores all marks to some location.

^unmark ( * )

Turns off matching on all words of the sentence.

^mark ( * )

Restores all marks of the sentence.

Reminder: If you do a generic unmark from within a pattern, it is transient and will be turned off when the pattern match finishes (so you don't ruin later rules), whereas when you do it from output, the change persists for the rest of the volley. Furthermore, it is handy to flip specific collections of generic unmarks on and off.

^mark() memorizes the set of all * unmarks (generic unmarks) and then turns them off so normal matching will occur.

^unmark() will restore the set of generic unmarks that were flipped off using ^mark().
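The hide-then-restore idea can be sketched as below. This is only an illustration under assumptions: ~colortalk is a hypothetical topic, and the sketch relies on the behavior described in this section, namely that a generic unmark issued from output lasts until it is reversed or the volley ends.

```chatscript
# Hypothetical topic ~colortalk is assumed to hold rules that should not see the animal word.
u: ( _~animals )
    ^unmark( * _0 )          # done in output, so the word stays hidden for the volley
    ^respond( ~colortalk )   # rules run here cannot see the hidden word
    ^mark( * _0 )            # restore all marks at that location
```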

^position ( how matchvariable )

This returns the integer representing where the named match variable is located. how can be START, END, or BOTH. BOTH means an encoding of where the start and end of the match was. See @_n in pattern matching to set a position, or the ^setposition function.

^marked ( word )

Returns 1 if word is marked, and returns FAILRULE_BIT if the given word is not currently marked from the current sentence.

^setposition ( _var start end )

Sets the match location data of a match var to the number values given. Alternatively you can do ^setposition ( _var _var1 ), which is redundant with just doing _var = _var1.

^setcanon ( wordindex value )

Changes the canonical value for this word.

^settag ( wordindex value )

Changes the pos tag for the word.

^setoriginal ( wordindex value )

Changes the original value for this word.

^setrole ( wordindex value )

Changes the parse role for this word. These are used in conjunction with $cs_externaltag to replace the CS inbuilt English postagger and parser with one from outside. See end of ChatScript PosParser manual.

^savesentence ( label ) / ^restoresentence ( label )

These two functions save and restore the current entire sentence preparation context. That means everything that pattern matching depends upon from the current sentence can be saved, you can go on to a new sentence (either via ^next(INPUT) or ^analyze() or whatever), and then rapidly flip back to some previous sentence analysis. Label is a value used to label the saved analysis. This only works during the current volley. Cannot be used in document mode. ^savesentence returns the number of 4-byte words the save took.
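A minimal sketch of the save, switch, and restore sequence, assuming a hypothetical label before and a hypothetical topic ~keywords:

```chatscript
# Hypothetical label "before" and topic ~keywords, used only for illustration.
u: ( what do you think of _* )
    ^savesentence( before )        # remember the current sentence analysis
    ^analyze( '_0 )                # treat the captured words as the sentence
    ^respond( ~keywords )          # rules here match against the captured words
    ^restoresentence( before )     # flip back to the original analysis
```

Restoring afterwards means later rules in the same volley still see the user's full original sentence.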

Input Functions

^analyze ( stream )

The stream generates output (not printed to user) and then prepares the content as though it were the current input sentence. This means the current sentence flagging and marking are all replaced by this one's. It does not affect any pending input still to be processed. If the stream is a quoted string, the quotes are removed. This would be common, for example, when analyzing output from the chatbot gotten via grabbing facts with "chatoutput" as the verb.

Note that the stream is considered a single sentence. If you want to supply multiple sentences, you need to call ^tokenize and then loop on the facts created.

Note that ^analyze does not call any prepass topic you may have, but you can invoke that topic directly afterwards yourself.

^tokenize ( {WORD SENTENCE} stream )

WORD or SENTENCE are optional parameters (SENTENCE is the default). With SENTENCE, it splits the stream into sentences and creates facts of each like this: (sentence ^tokenize ^tokenize). With WORD, it splits the stream entirely into words, paying no attention to sentence boundaries.

^capitalized ( n )

Returns 1 if the nth word of the sentence starts with a capital letter in user input, else returns 0. If n is alphabetic, it returns whether or not it starts with a capital letter. Illegal values of n return failrule.

^input ( ... )

The arguments, separated by spaces, are injected back into the input stream as the next input, processed before any pending additional input. Typically this command is then followed by ^fail(SENTENCE) to cancel current processing and move on to the revised input.

Since the sentence is fed in immediately after the current input, if you want to feed in multiple sentences, you must reverse the order so the last sentence to be processed is submitted via input first. You can detect that the current sentence comes from ^input and not from the user by %revisedInput (bool) being true (1).
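The inject-then-fail idiom described above might look like this. The rewritten wording is an invented example, not taken from any shipped bot.

```chatscript
# Rewrite a request into a question and process that instead.
u: ( tell me about _* )
    ^input( what do you know about '_0 ? )
    ^fail( SENTENCE )    # abandon this sentence and move on to the injected one
```

The injected sentence no longer contains tell me about, so this rule should not fire again on the revised input.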

^original ( _n )

The argument is the name of a match variable. Whatever it has memorized will be used to locate the corresponding series of words in the original raw input from the user that led to this match.

E.g., if the input was I lick ice crem, the converted input became I lick ice_cream, and you'd memorized the food onto a match variable, then you could do ^original(_0) and get back ice crem.

Another example:

# get foreign language proper name, without any CS standard processing.
u: what's your first name?
    #! Anna Lisa
    a: ( _* )
        $firstname = ^original ( _0 )
        Nice to meet you, $firstname

^position ( which _var )

If which is start, this returns the starting index of the word matched in the named _var. If which is end, this returns the ending index. E.g., if the value of _var was the fox, it might be that start was 3 and end was 4 in the sentence it was the fox.

^removetokenflags ( value )

Removes these flags from the tokenflags returned from the preprocessing stage.

^settokenflags ( value )

Adds these flags to the tokenflags returned from the preprocessing stage. Particularly useful for setting the #QUESTIONMARK flag indicating the input was perceived to be a question. For example, I treat tell me about cars sentences as questions by marking them as such from script (equivalent to what do you know about cars?).
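One way the tell me about case could be scripted is sketched below. It assumes a hypothetical topic ~questions containing ?: rules, and it assumes those rules consult the flag at the time they are tested.

```chatscript
# An s: rule only considers statements, so this fires before the flag is set.
s: ( tell me about )
    ^settokenflags( #QUESTIONMARK )
    ^respond( ~questions )    # hypothetical topic whose ?: rules can now fire
```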

^setwildcardindex ( value )

Tells the system to start at value for future allocations of wildcard slots. This is only useful inside some pattern where you are trying to protect data from some previous match. E.g.,

u: (_~animals) refine()
    a: ( ^setwildcardindex(_1) _~color)

_0 is set to an animal. Normally the rejoinder would set a color onto _0 and clobber it, but the call to ^setwildcardindex forces it to use _1 instead, so both _0 and _1 have values.

^isnormalword ( value )

Fails if value has a character that is not alphabetic, numeric, a hyphen, an underscore, or an apostrophe.

Number Functions

^compute ( number operator number )

Performs arithmetic and puts the result into the output stream. Numbers can be integer or float and will convert appropriately. There are a range of operators that have synonyms, so you can pass in directly what the user wrote. The answer will be ? if the operation makes no sense and infinity if you divide by 0.

~numberOperator recognizes these operations:

operator symbol              description
+ plus add and               addition
- minus subtract deduct      subtraction
* x time multiply            multiplication
/ divide quotient            float division
% remainder modulo mod       integer only - modulo
root square_root             square root
^^ power exponent            exponent
<< and >>                    shift (limited to shifting 31 bits or less)
random                       0 random 7 means 0,1,2,3,4,5,6 - integer only

Basic operations can be done directly in assignment statements like:

$var = $x + 43 - 28
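Passing the user's own words straight to ^compute can be sketched as follows. This assumes the standard ~number concept is available alongside ~numberOperator.

```chatscript
# ~number and ~numberOperator are matched from what the user typed.
u: ( what is _~number _~numberOperator _~number )
    The answer is ^compute( _0 _1 _2 )
```

For an input such as what is 6 times 7, the three match variables would hold the left operand, the operator word, and the right operand in that order.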

^timefromseconds ( seconds {offset} )

This converts time in seconds (Unix epoch time) from the given time in whatever timezone, to a string like %time returns. You can compute a difference in times by merely doing a subtraction of the two times. %fulltime will give you the current time that you could plug in here. The optional second argument will displace that time by the hours offset (can be plus or minus).

^timeinfofromseconds ( seconds )

This converts time in seconds (Unix epoch time) into its component bits, spread across 7 match variables. Starting by default at _0, if you assign it like this:

_3 = ^timeinfofromseconds(%fulltime)

it will start at _3. The items you get are: seconds, minutes, hours, date in month, month name, year, day name of week, month index (jan==0), dayofweek index (sun==0).
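Reading the components back out might look like this, a sketch that simply follows the item order given above with the assignment starting at _3:

```chatscript
# _3 = seconds, _4 = minutes, _5 = hours, following the documented item order.
u: ( what time is it )
    _3 = ^timeinfofromseconds( %fulltime )
    It is _5 hours and _4 minutes.
```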

^timetoseconds ( seconds minutes hours date-of-month month year )

This converts time data into seconds since 1970 (Unix epoch time), analogous to %fulltime, which returns the current time in seconds. Month can be a number 1-12, the name of the month, or an abbreviation of the month. Date-of-month must be 1 or more. Year must be 1970 or later and less than 2100. An optional 7th argument indicates whether the time is within daylight savings or not; values can be 1 or 0, t or f, T or F. Default is false.
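A round trip between the two conversion functions can be sketched as below. The date is this manual's revision date, and the variable names are illustrative only.

```chatscript
# Arguments: seconds minutes hours date-of-month month year
u: ( when was this revised )
    $when = ^timetoseconds( 0 0 12 18 February 2018 )
    $later = ^timefromseconds( $when 2 )    # same moment displaced by +2 hours
    It was revised around $later
```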

^isnumber ( value )

Fails if value is not an integer, float, or currency.

Output Functions

The following functions cannot be used during postprocessing since output has been finished in theory and you can now analyze it.

^flushoutput()

Takes any current pending output stream data and sends it out. If the rule later fails, the output has been protected and will still go out (though the rule will not erase itself).

^insertprint ( where stream )

The stream will be put into output, but it will be placed before output number where or before output issued by the topic named by where. The output is safe in that even if the rule later fails, this output will go out. Before the where, you may put in output control flags as either a simple value or a value list in parens.

^keephistory ( who count )

The history of either BOT or USER (values of who) will be cut back to the count given. This affects detecting repeated input on the part of the user or detecting repeating output by the chatbot.

^lastsaid ()

Returns what the bot said last volley.

^print ( stream )

Sends the results of outputting that stream to the user. It is isolated from the normal output stream, and goes to the user whether or not one later generates a failure code from the rule. Before the output you may put in output control flags as either a simple value without a # (e.g., OUTPUT_EVALCODE) or a value list in parens.

Flags include:

Flag                      description
OUTPUT_EVALCODE           is automatic, so not particularly useful. Useful ones would control how print decides to space things
OUTPUT_RAW                does not attempt to interpret ( or { or [ or "
OUTPUT_RETURNVALUE_ONLY   does not go to the user, is merely returned as an answer. Print normally stores directly into the response system, meaning failing the rule later has no effect. Print normally does not return a value so you can't store it into a variable. And print has a number of flags that can affect its formatting that don't exist with normal output. This flag converts print into an ordinary function returning a value, reversing all those differences
OUTPUT_NOCOMMANUMBER      don't add commas to numbers
OUTPUT_NOQUOTES           remove quotes from strings
OUTPUT_NOUNDERSCORE       convert underscores to blanks

These flags apply to output as it is sent to the user:

Flag                             description
RESPONSE_NONE                    turn off all default response conversions
RESPONSE_UPPERSTART              force 1st character of output to be uppercase
RESPONSE_REMOVESPACEBEFORECOMMA  as the name says
RESPONSE_ALTERUNDERSCORES        convert underscores to spaces
RESPONSE_REMOVETILDE             remove leading ~ on class names
RESPONSE_NOCONVERTSPECIAL        don't convert escaped \n, \r, and \t into ascii direct characters
RESPONSE_CURLYQUOTES             change simple quotes to curly quotes (starting and ending)

^preprint ( stream )

The stream will be put into output, but it will be placed before all previously generated outputs instead of after, which is what usually happens. The output is safe in that even if the rule later fails, this output will go out. Before the output you may put in output control flags as either a simple value or a value list in parens.

^repeat ()

Allows this rule to generate output that may repeat what has been said recently by the chatbot.

^reviseOutput ( n value )

Allows you to replace a generated response with the given value. n is one-based and must be within range of given responses. One can use this, for example, to alter output to create accents. Using ^response to get an output, you can then use ^substitute to generate a revised one and put it back using this function.

Output Access

These functions allow you to find out what the chatbot has said and why.

^response ( id )

What the chatbot said for this response. Id 1 will be the first output.

^responsequestion ( id )

Boolean 1 if response ended in ?, null otherwise.

^responseruleid ( id )

The rule tag generating this response, from which you can get the topic. May be a joined pair of rule tags if the rule was relayed (reuse) from a different rule. The final rule will be first and the relay second, e.g., ~keywordless.30.0.~control.3.4.

If the id is -1, then all output generated will be included, analogous to what happens in the log file for why in the entries.

PostProcessing Functions

These functions are only available during postprocessing.

^postprintbefore ( stream )

It prints the stream prepended to the existing output. You will not be able to analyze or retrieve information about this, like you would from a normal print, because it generates no facts representing it. This is useful for adding out-of-band messages [ ] to the front of input for controlling avatars and such, or for adding transitional phrases or other personality coloring before the main output.

^postprintafter ( stream )

It prints the stream appended to the existing output. You will not be able to analyze or retrieve information about this, like you would from a normal print, because it generates no facts representing it. This is useful for adding summarizing data after output, e.g., when running the document reader.

Control Flow Functions

^argument ( n )

Retrieves the nth argument of the calling outputmacro (1-based).

^argument ( n ^fn )

Looks backward in the callstack for the named outputmacro, and if found returns the nth argument passed to it. Failure will be reported for n out of range or ^fn not in the call path.

This is an alternative access to function variable arguments, useful in a loop instead of having to access by variable name.

If n is 0, the system merely tests whether the caller exists and fails if the caller is not in the path of this call.

ˆcallstack ( @n )

Generates a list of transient facts into the named factset. The facts represent the callstack and have as subject the critical value (the verb is callstack and the object is the rule tag responsible for this entry). Items include function calls (^xxxx), topic calls (~xxxx), and internal calls (no prefix).

ˆcommand ( args )

Execute this stream of arguments through the command processor. You can execute debugging commands through here. E.g.,

^command(:execute ^print("Hello") )

Note that it is hard to turn on :trace this way, because the system resets it internally at various points. The correct way to manipulate trace is to set $cs_trace = -1 in regular script, outside of ^command.

ˆend ( code )

Takes 1 argument and returns a code which will stop processing. Any data pending in the output stream will be shipped to the user. If ^end is contained within the condition of an if, it merely stops it. An end rule inside a loop merely stops the loop. All other codes propagate past the loop. The codes are:

code      description
CALL      stops the current outputmacro w/o failing it.
...       ... more from this user's volley. Does not cancel pending output. It's the same as END(INPUT)

Output that has been recorded via ^print, ^preprint, etc. is never canceled. Only pending output is.

ˆload ( name )

Normally CS takes all the data you have compiled as :build 0 and :build whatever as layers 0 and 1, and loads them when CS starts up. They are then permanently resident. However, you can also compile files named filesxxx2.txt which will NOT be loaded automatically.

You can write script that calls ^load, naming the xxx part, and they will be dynamically loaded, for that user only, and stay loaded for that user across all volleys until you call ^load again. Calling ^load again with a different name will load that new name. Calling ^load(null) will merely unload the dynamic layer previously loaded.

WARNING

It's erroneous (you get whatever happens to you) if you call ^load from within topics you have loaded via ^load.

ˆclearmatch()

This clears all match variables to empty.

ˆmatch ( what )

This does a pattern match using the contents of what (usually a variable reference).

It fails if the match against current input fails. It operates on the current analyzed

sentence which is usually the current input, but since you can call ˆnext(input)

or ˆanalyze() it is whatever the current analysis data is.

if (%more AND ^match(^"(< ![~emocurse ~emothanks] ~interjections >)" ) )

{FAIL(SENTENCE)}

$$newrule = GetRule(pattern $$newtag)

$$newtype = GetRule(type $$newtag)

if ($$newtype == $$type AND match($$newrule)) # we would match this rule

^match can also take a rule tag for what, in which case it uses the pattern of the rule given it.

^match will normally take your pattern and compile it with the script compiler during execution. If you have discarded the script compiler in your build, it will run your pattern directly and pray. In that case every token should be separated by a space, e.g. not this: [my you] but this: [ my you ]. Relational tests won't work either, so you can't do _0>5 or _0? or things like that.

If you know your pattern in advance, you can put it on a rule and then execute

that since it will have been compiled. E.g.

s: TEST (some fancy pattern)

and later

^match(~mytopic.test)

You can also just say ^match(~someconcept) and it will test the current input for that concept.

$$csmatch_start and $$csmatch_end are assigned to provide the range of words that ^match used.

ˆmatches ()

Returns a string of indices of words that matched the most recent pattern match. The indices are in order, so you can know the range of the match or the specific word indices that were seen. Currently matches only include the words/concepts that were matched, not things like (sag*) where the word is not fully named.

ˆnofail ( code ... script ... )

The antithesis of ^fail(). It takes a code and any number of script elements, executes the script and removes all failure codes through the listed code.

This is important when calling ^respond and ^gambit from a control script. You would want a control script to pass along codes at the sentence level, but if the respond call generated a fail-rule return, you don't want that to stop all the code of a control script responder.

The nofail codes are:

code      description
RULE      a rule failure within the script does not propagate outside of nofail
LOOP      a loop failure or end within the script does not propagate outside of nofail
TOPIC     a topic or rule failure within the script does not propagate outside of nofail
SENTENCE  a topic or rule or sentence failure within the script does not propagate outside of nofail
INPUT     no failure propagates outside of the script

^notnull ( stream )

Execute the stream and if it returns no text value whatsoever, fail this code.

The text value is not used anywhere, just tested for existence. Useful in IF

conditions.

ˆnorejoinder ()

Prevents this rule from assigning a rejoinder.

ˆnotrace ( ... )

Suppresses normal tracing if :trace all is on, for the duration of evaluation of the contents of the parens. It does not block explicit traces of functions or topics.

ˆreturn ( ... )

Evaluates its data and returns any output from the most recent calling outputmacro. It is nominally equivalent to:

here is some outputting

^end(CALL)

My personal coding convention is to use ^return when the function is supposed to return a value to a caller who will assign it somewhere, and not to use it if the function is directly creating output to the user or is just being executed for side effects.

You can return the contents of a variable ^return($$myvar), the name of a factset ^return(@19), or just some literal value ^return(test). Returning a factset just returns its name. But if you have

@0 = ^myfunc()

and ^myfunc returns a factset name, you have done the equivalent of

@0 = @19

which means copy the elements of set 19 into set 0.

Note that ^return() and ^return(null) are treated the same. An empty string is returned. This is similar to assigning a variable by saying

$var = null

which assigns the empty string.

ˆaddcontext ( topic label )

Sets a topic and context name for use by ˆincontext.

The label doesn't have to correspond to any real label.

The topic can be a topic name or ~ meaning current topic.

ˆauthorized ()

Use same authorizedIP.txt ﬁle and rules that debug commands use, to validate

current user.

ˆclearcontext ()

Erases all context data (see ˆaddcontext).

ˆincontext ( label )

label can be a simple text label or a topicname.textlabel. The system tracks rule labels that generated output to the user, or rules starting with the label CX_ whether or not the rule generates output, as long as the rule didn't fail during output. ^incontext will return how many volleys have happened since the referenced rule (normal return) if the label has output within the 5 prior volleys, and will fail if not. It's like an extension of rejoinders. Rejoinders have a 1 volley context and must be placed immediately after a rule. This has a 5 volley context and is used in normal rule patterns.

u: (^incontext(PLAYTENNIS) why) because it was fun.

External Access Functions

ˆenvironment ( variablename )

Access environment variables of the operating system. E.g.

^environment(path)

ˆsystem ( any number of arguments )

The arguments, separated by spaces, are passed as a text string to the operating

system for execution as a command. The function always succeeds, returning

the return code of the call. You can transfer data back and forth via ﬁles by

using ˆimport and ˆexport of facts.

ˆpopen ( commandstring 'function )

The command string is a string to pass to the OS shell to execute. That will return output strings (some number of them), each of which will have any \r or \n changed to blanks and then be stripped of leading and trailing blanks.

The string is then wrapped in double quotes so it looks like a standard ChatScript single argument string, and sent to the declared function, which must be an output macro or system function name, preceded by a quote.

The function can do whatever it wants. Any output it prints to the output buffer will be concatenated together to be the output from ChatScript. If you need a doublequote in the command string, use a backslash in front of each one. They will be removed prior to sending the command. E.g.,

outputmacro: ^myfunc(^arg)

^arg \n

topic: ~test( testing )

u: () popen( "dir *.* /on" '^myfunc)

output this:

Volume in drive C is OS

Volume Serial Number is 24CB-C5FC

Directory of C:\ChatScript

06/15/2013 12:50 PM <DIR> .

06/15/2013 12:50 PM <DIR> ..

12/30/2010 02:50 PM 5 authorizedIP.txt

06/15/2013 12:19 PM 10,744 changes.txt

05/08/2013 03:29 PM <DIR> DICT

...( additional lines omitted)

49 File(s) 29,813,641 bytes

24 Dir(s) 566,354,685,952 bytes free

'Function can be null if you are not needing to look at output.

ˆtcpopen ( kind url data 'function )

Analogous in spirit to popen.

You name the kind of service (POST or GET), the url (not including http:// but including any subdirectory), the text string to send as data, and the quoted function in ChatScript you want to receive the answer.

The answer will be read as strings of text (newlines separate and are stripped oﬀ

with carriage returns) and each string is passed in turn to your function which

takes a single argument (that text).

:trace TRACE_TCP can be enabled to log what happens during the call.

Likely you will prefer ^jsonopen, which can deal with more complex web communication scenarios and returns structured data so you don't have to write script yourself to parse the text.

'function can be null if you are not needing to look at output.

The system will set $$tcpopen_error with error information if this function fails.

When you look at a webpage you often see its url looking like this:

http://xml.weather.com/weather/local/4f33?cc=*&unit ="+vunit+"&dayf=7"

There are three components to it.

The host: xml.weather.com.

The service or directory: /weather/local/4f33.

The arguments: everything AFTER the ?.

The arguments are URL-encoded, so spaces have been replaced by + and special characters have been converted to %xx hex numbers.

If there are multiple values, they will be separated by &, and the left side of an = is the argument name and the right side is the value.

When you call ^tcpopen, normally you provide the host and service as a single argument (everything to the left of ?) and the data as another argument (everything to the right of ?).

Since ChatScript URL-encodes the data for you, you don't have to. If you don't know the unencoded form of the data or you don't think CS will get it right, you can provide URL-encoded data yourself, in which case make your first argument either POSTU or GETU, meaning you are supplying url-encoded data so CS should not do anything to your arguments.
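
To make the three URL components and the encoding rules concrete, here is a small Python sketch using the standard urllib.parse module. It is an illustration of generic URL handling, not of ChatScript internals; the example.com URL and the helper names split_url and tcpopen_arguments are made up for the example.

```python
from urllib.parse import urlsplit, parse_qsl, quote_plus


def split_url(url):
    """Split a URL into host, service/directory, and decoded (name, value) arguments."""
    parts = urlsplit(url)
    args = parse_qsl(parts.query, keep_blank_values=True)
    return parts.netloc, parts.path, args


def tcpopen_arguments(url):
    """Return (host+service, data): the text left of '?' without the scheme, and the text right of it."""
    parts = urlsplit(url)
    return parts.netloc + parts.path, parts.query


# Hypothetical URL, used only to show the pieces.
url = "http://example.com/weather/local/4f33?city=san+francisco&unit=m%2Fs"
print(split_url(url))
print(tcpopen_arguments(url))
print(quote_plus("san francisco"))  # spaces become +, special characters become %xx
```

The second helper returns the query still encoded, which corresponds to what you would pass when using POSTU or GETU; with plain POST or GET you would pass the unencoded form instead.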

Below is sample code to find current conditions and temperature in San Francisco if you have an api key to the service. It calls the service, gets back all the JSON formatted data from the request, and line by line passes it to ^myfunc.

This, in turn, calls a topic to hunt selectively for fragments and save them, and when all the fragments we want have been found, ^myfunc outputs a message and stops further processing by calling ^END(RULE).

Note that in this example there is no data to pass, everything is in the service named, so the data value is "".

outputmacro: ^myfunc (^value)

$$tmp = ^value

nofail(RULE respond(~tempinfo))

if ($$currentCondition AND $$currentTEMP)

{

print( It is $$currentCondition. )

print(The temperature is $$currentTemp. )

^END(RULE)

}

topic: ~tempinfo system repeat keep()

u: (!$$currentCondition)

$$start = findtext($$tmp $$pattern1 0)

$$findtext_start = findtext($$tmp ^"\"" $$start)

$$currentCondition = extract($$tmp $$start $$findtext_start )

u: ($$currentCondition)

$$start = findtext($$tmp $$pattern2 0)

$$findtext_start = findtext($$tmp , $$start)

$$currentTemp = extract($$tmp $$start $$findtext_start)

topic: ~INTRODUCTIONS repeat keep (~emogoodbye ~emohello ~emohowzit name )

t: ^keep() Ready. Type "weather" to see the data.

u: (weather)

$$pattern1 = ^"\"weather\":\""

$$pattern2 = ^"\"temp_f\":"

if ( tcpopen(GET api.wunderground.com/api/yourkey/conditions/q/CA/San_Francisco.json "" '^myfunc))

{ hi }

else

{ $$tcpopen_error }

There is a subtlety in the ^myfunc code in that it uses ^print to put out the result. Just writing:

if ($$currentCondition AND $$currentTEMP)

{

It is $$currentCondition.

The temperature is $$currentTemp.

^END(RULE)

}

will not work, because that output is being generated by the call to ^tcpopen, which is in the test part of the if, so everything it does is purely for effect of testing a condition. The generated output is discarded.

If you moved the output generation to the { } of the if, things would be fine. E.g.,

if ( tcpopen(GET api.wunderground.com/api/yourkey/conditions/q/CA/San_Francisco.json "" '^myfunc) )

{

It is $$currentCondition.

The temperature is $$currentTemp.

}

else { $$tcpopen_error }

Doing the output without using ^print is my preferred style; it is easier to see what is going on for output if it is not hidden deep inside some if test.

ˆexport ( name from )

From must be a fact set to export. Name is the file to write them to. An optional 3rd argument append means to add to the file at the end, rather than recreate the file from scratch.

Obviously, you must first have done something like ^query to populate the fact set. E.g.

^query(direct_sv item label ? -1 ? @3)

^export(myfacts.txt @3)

If the name includes the substring “ltm”, then the ﬁle will not be appendable, but

will be encryptable and routes to databases if the ﬁlesystem has been overridden

by Mongo, Postgres, or MySQL.

ˆimport ( name set erase transient )

name is the file to read from. set is where to put the read facts.

erase can be erase, meaning delete the file after use, or keep, meaning leave the file alone.

transient can be transient, meaning mark facts as temporary (to self erase at end of volley), or permanent, meaning keep the facts as part of user data. E.g. ^import(myfacts.txt @3).

If set is null, then facts are created but not stored into any fact-set and the subject

of the ﬁrst fact is returned as the answer (presumed to be a json structure).

If the name includes the substring “ltm”, then the ﬁle will be decryptable and

routes to databases if the ﬁlesystem has been overridden by Mongo, Postgres, or

MySQL.

Debugging Function

^debug ()

As a last ditch, you can add this function call into a pattern or the output and it will call DebugCode in functionExecute.cpp, so you know exactly where you are and can use a debugger to follow code thereafter, if you can debug C code.

Logging Function

^log ( ... )

This allows you to print something directly to the user's log file. If you want it echoed to the console as well, you can do ^log(OUTPUT_ECHO This is my message).

You can actually append to any ﬁle by putting at the front of your output the

word FILE in capital letters followed by the name of the ﬁle. E.g.,

^log(FILE TMP/mylog.txt This is my log output.)

Logging appends to the ﬁle. If you want to clear it ﬁrst, issue a log command

like this:

^log(FILE TMP/mylog.txt NEW This is my log output)

The new tells it to initialize the ﬁle to empty.

Additionally you can optimize log ﬁle behavior. If you expect to write to a ﬁle a

lot during a volley (eg during :document mode), you can leave the ﬁle open by

using

^log(OPEN TMP/mylog.txt This is my log output.)

which caches the ﬁle ptr. After which you can write with OPEN or FILE

equivalently. To close the ﬁle use

^log(CLOSE TMP/mylog.txt)

By default, ^log acts like output to user, converting escaped \n, \r, and \t into their actual ascii characters. The flag RESPONSE_NOCONVERTSPECIAL passed in will block this.

ˆmemorymark ()

Reading a document consists of performing a single volley of the entire document.

This can tie up a lot of memory in keeping facts, dictionary entries, user variables,

etc. If you are careful in what you do, you can make the memory burden go away.

^memorymark() notes where memory is currently at, and is best done within the document_pre topic. Then you can release memory after every sentence of the document, so it doesn't accumulate.

ˆmemoryfree ()

This releases memory back to the last ^memorymark(). It is best done after your main control of the document bot has finished processing a sentence, partly because the analysis of the sentence is lost and so no later rules can pattern match to it (though you can call ^analyze to reacquire your sentence). E.g.,

topic: ~document_pre system repeat()

t: ^memorymark() # note start

Log(OUTPUT_ECHO \n Begin $$document ) # instant display

topic: ~main_control system repeat () # executed each sentence of document

u: (%document)

respond(~filter)

^memoryfree()

Some caveats and warnings about how this works: whenever you free memory, the system will clear all fact sets. It will clear all user variables set after the memory mark (leaving the ones before alone). It will then release facts, text, and dictionary nodes created after the mark.

The only data you can pass out from a memorymark/memoryfree zone is data stored on match variables (which have size limitations) or on the count field of a preexisting dictionary word.

ˆmemorygc ()

This can function in either document mode or chat mode. It does what it can to release unused memory. It has restrictions: it does not work if you have facts with facts as fields or are in planning mode. It also discards saved sentence data, all of your analysis data for the current sentence, and all data in factsets.

JSON Functions

JSON functions and JSON are described more fully in the ChatScript JSON

manual.

ˆjsonarrayinsert ( arrayname value )

Given the name of a json array and a value, it adds the value to the end of the array. SAFE protects any nested JSON data from being deleted. See JSON manual.

ˆjsonarraydelete ( [INDEX, VALUE] arrayname value {ALL} )

This deletes a single entry from a JSON array. It does not damage the thing deleted, just its membership in the array. If the first argument is INDEX, then value is a number which is the array index (0 ... n-1). If the first argument is VALUE, then value is the value to find and remove as the object of the json fact. You can delete every matching VALUE entry by adding the optional 4th argument ALL.

If there are numbered elements after this one, then those elements immediately renumber downwards so that the array indexing range is contiguous.
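
The INDEX, VALUE, and ALL variants can be pictured with an ordinary Python list, where deleting an element also shifts later elements down so the indexes stay contiguous. This is only an analogy under that assumption: json_array_delete is a hypothetical helper, and what the real function does when the value is not found is not modelled here (the list is simply left unchanged).

```python
def json_array_delete(array, kind, value, delete_all=False):
    """Model of array deletion on a Python list; returns the same list, modified in place."""
    if kind == "INDEX":
        del array[int(value)]            # later elements renumber downwards
    elif kind == "VALUE":
        if delete_all:
            array[:] = [item for item in array if item != value]
        elif value in array:
            array.remove(value)          # removes only the first match
    else:
        raise ValueError("kind must be INDEX or VALUE")
    return array


print(json_array_delete(["a", "b", "c"], "INDEX", 1))
print(json_array_delete(["x", "y", "x"], "VALUE", "x"))
print(json_array_delete(["x", "y", "x"], "VALUE", "x", delete_all=True))
```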

ˆjsoncreate ( type )

Type is either array or object and a json composite with no content is created

and its name returned.

ˆjsondelete ( factid )

Deprecated in favor of ˆdelete.

ˆjsongather ( {factset} jsonid )

Takes the facts involved in the json data (as returned by ^jsonparse or ^jsonopen) and stores them in the named factset. This allows you to remove their transient flags or save them in the user's permanent data file.

You can omit fact-set as an argument if you are using an assignment statement:

@1 = ^jsongather(jsonid)

^jsongather normally gathers all levels of the data recursively. You can limit how far down it goes by supplying level. Level 0 is all. Level 1 is the top level of data. Etc.

ˆjsonlabel ( label )

Assigns a text sequence to add to jo- and ja- items created thereafter. E.g. ^jsonlabel(x) generates jo-x1 and ja-x1. You can turn it back off again with ^jsonlabel("").

This allows you to create json namespaces which will not conflict. E.g., you may load a bunch of json during a system bootup (^csboot) under one naming and then use a different naming for user json created later, and code can determine the source of the data.

ˆjsonreadcvs ( TAB filepath )

Reads a tsv (tab delimited spreadsheet file) and returns a JSON array representing it. The lines are all objects in an array. Each line is an object where non-empty fields are given as field indexes. The first field is 0. Empty fields are skipped over and their number omitted.
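
The shape of the resulting structure can be sketched in Python as follows. This models only the description above (one object per line, keys are field indexes starting at 0, empty fields omitted); the function name read_tsv_as_json is hypothetical, the keys are assumed to be strings as in JSON, and it works on text rather than a file path to stay self-contained.

```python
import json


def read_tsv_as_json(text):
    """Turn tab-delimited text into a list of objects keyed by field index."""
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        # Keep only non-empty fields; skipped fields leave a gap in the numbering.
        rows.append({str(i): f for i, f in enumerate(fields) if f != ""})
    return rows


sample = "apple\t\tred\nbanana\tyellow\t"
print(json.dumps(read_tsv_as_json(sample)))
```

In the first sample line the middle field is empty, so the object has keys 0 and 2 but no key 1.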

ˆjsonundecodestring ( string )

Removes all json escape markers back to normal for possible printout to a user. This translates \n to newline, \r to carriage return, \t to tab, and \" to a simple quote.

ˆjsonobjectinsert ( {DUPLICATE} objectname key value )

Inserts the key value pair into the object named. The key does not require quoting. Inserting a json string as value requires a quoted string. Duplicate keys are ignored unless the optional 1st argument DUPLICATE is given. SAFE protects any nested JSON data from being deleted. See JSON manual.

ˆjsonopen ( {UNIQUE} kind url postdata header )

This function queries a website and returns a JSON datastructure as facts. It

uses the standard CURL library, so its arguments and how to use them are

generally deﬁned by CURL documentation and the website you intend to access.

See ChatScript JSON manual for details.

ˆjsontree ( name )

name is the value returned by ^JSONparse, ^JSONopen, or some query into such structures. It prints out a tree of elements, one per line, where depth is represented as more deeply indented. Objects are marked with { } as they are in JSON. Arrays are marked with [].

ˆjsonwrite ( name )

name is the name from a json fact set (returned by ^JSONparse, ^JSONopen, or some query into such structures). Result is the corresponding JSON string (as a website might emit), without any linefeeds.

ˆjsonparse ( {UNIQUE} string )

string is a json text string (as might be returned from a website) and this parses it into facts exactly as ^jsonopen would do, just not retrieving the string from the web. It returns the name of the root node. One use for this is to pass JSON data as a quoted string within out-of-band data, and have the system parse that into facts you can use.

You can add NOFAIL before the string argument, to tell it to return null but not fail if a dereference path cannot be found.

^jsonparse(transient NOFAIL "{ a: $var, b: _0.e[2] }")

^jsonparse automatically converts any \unnnn into the corresponding utf8 character.

ˆjsonkind ( something )

If something is a JSON object, the function returns

object

. If it is a JSON

array it returns array. Otherwise it fails.

ˆjsonpath ( string id )

string is a description of how to walk JSON. id is the name of the node you want to start at (typically returned from ^jsonopen or ^jsonparse).

Array values are accessed using typical array notation like ja-1[3] and object fields using dotted notation like jo-7.id.

A simple path access might look like this: [1].id, which means take the root object passed as id, e.g., ja-1, and get the 2nd index value (arrays are 0-based in JSON). That value is expected to be an object, so return the value corresponding to the id field of that object. In more complex situations, the value of id might itself be an object or an array, which you could continue indexing like [1].id.firstname.

^jsonpath can also return the actual factid of the match, instead of the object of the fact. This would allow you to see the index of a found array element, or the json object/array name involved. Or you could use ^revisefact to change the specific value of that fact (not creating a new fact). Just add * after your final path, e.g.

^jsonpath(.name* $$obj)

^jsonpath(.name[4]* $$obj)

If you need to handle the full range of legal keys in json, you can use text string

notation like this

^jsonpath(."st. helen".data $tmp)

You may omit the leading . of a path and CS will by default assume it

^jsonpath("st. helen".data $tmp)
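
The path notation (array indexes in brackets, dotted field names, quoted keys, optional leading dot) can be illustrated with a small Python walker over nested lists and dicts. This is a sketch of the notation only: json_path is a hypothetical helper, it operates on ordinary Python data rather than ChatScript facts, and the trailing * form that returns a fact id is not modelled.

```python
import re

# One path step: [index], an optionally dotted "quoted key", or an optionally dotted plain key.
TOKEN = re.compile(r'\[(\d+)\]|\.?"([^"]*)"|\.?([^.\[\]"]+)')


def json_path(path, node):
    """Walk node (nested dicts/lists) following path and return the value reached."""
    pos = 0
    while pos < len(path):
        match = TOKEN.match(path, pos)
        if match is None:
            raise ValueError("cannot parse path at offset %d" % pos)
        index, quoted, plain = match.groups()
        if index is not None:
            node = node[int(index)]      # 0-based array access
        else:
            node = node[quoted if quoted is not None else plain]
        pos = match.end()
    return node


people = [{"id": 7}, {"id": {"firstname": "Ann"}}]
print(json_path("[1].id.firstname", people))

places = {"st. helen": {"data": 5}}
print(json_path('."st. helen".data', places))
print(json_path('"st. helen".data', places))   # leading dot omitted
```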

Word Manipulation Functions

ˆburst ( {count once} data-source burst-character-string )

Takes the data source text and hunts within it for instances of the burst-character-string. If it is being dumped to the output stream then only the first piece is dumped.

If it is being assigned to a fact set (like @2) then a series of transient facts are created for the pieces, with the piece as the subject and ^burst ^burst as the verb and object.

If it is being assigned to a match variable, then pieces are assigned starting at that variable and moving on to successively higher ones.

If burst does not find a separator, it puts out the original value. For assignment to match variables, it also clears the next match variable so the end of the list will be a null match variable.

If burst_character is omitted, it is presumed to be BOTH _ (which joins composite words and names) and " ", which separates words.

If burst_character is the null string "", it means burst into characters.

^burst takes an optional first parameter count, which tells it to return how many items it would return if you burst, but not to do the burst.

^burst takes an optional first parameter once, which says split only into the first burst and then the leftover rest.

^burst has a special burst value digitsplit which will split a number-text thing or a text-number thing into two pieces (text thing and number thing). This is good for splitting a currency thing like USD25 or 25$.
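
The splitting rules above can be summarised in a short Python model. It is a sketch of the described behavior, not the engine's implementation: the function burst and its mode parameter are hypothetical, the omitted separator is assumed to mean underscore and space, and digitsplit is assumed to handle only plain digit runs.

```python
import re


def burst(source, separator=None, mode=None):
    """Model of bursting: returns a list of pieces, or a count when mode == 'count'."""
    limit = 1 if mode == "once" else 0          # 'once' splits off only the first piece
    if separator == "digitsplit":
        match = re.fullmatch(r"(\d+)(\D+)|(\D+)(\d+)", source)
        pieces = [g for g in match.groups() if g is not None] if match else [source]
    elif separator == "":
        pieces = list(source)                   # null string: burst into characters
    elif separator is None:
        pieces = re.split(r"[_ ]", source, maxsplit=limit)   # assumed default separators
    else:
        pieces = source.split(separator, limit if limit else -1)
    return len(pieces) if mode == "count" else pieces


print(burst("I_love you"))
print(burst("a,b,c", ",", "once"))
print(burst("a,b,c", ",", "count"))
print(burst("USD25", "digitsplit"))
```

When the separator is not found, the model returns the original value as the only piece, matching the description above.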

ˆwords ( someword )

Looks up the given word and returns all words matching it. Matching includes the lower case form of it and any number of uppercase forms of it. E.g., you might say ^words(ted) and get back facts for ted, Ted, TED.

The answers are a series of facts of the form (someword words words). In addition to case switching, the system will automatically switch words with underscores or blanks into words with changes in them to the other (since CS stores phrases with underscores). So ^words("I love you") can match phrases already in the dictionary of: I_love you, I_love_you, I love you, I LOVE You, etc., depending on which words are actually there (for example because they are parts of a fact).

ˆcanon ( word canonicalform )

Same as :canon during a :build from a table. Fails during normal execution not involving compiling.

ˆexplode ( word )

Convert a word into a series of facts of its letters.

ˆextract ( source start end )

Return the substring with the designated oﬀset range (exclusive of end location).

Useful for data extraction using ˆpopen and ˆtcpopen when combined with

ˆfindtext.

In addition to absolute unsigned values, start and end can take on oﬀsets or

relative values. A signed end is a length to extract plus a direction or shift in

start:

^extract($$source 5 +2) # to extract 2 characters beginning at position 5

^extract($$source 5 -2) # to extract 2 characters ending at position 5

A negative start is a backwards oﬀset from end.

^extract($$source -1 +1) # from end, 1 character before and get 1 character

^extract($$source -5 -1) # from end, 5 characters before and get 1 character before. i.e. the 6th char from end.
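
The offset rules can be checked against a small Python model. This is a sketch under stated assumptions rather than the engine's code: positions are assumed to be 0-based character offsets, arguments are passed as strings so the + or - sign survives, and the function name extract simply mirrors the ChatScript one.

```python
def extract(source, start, end):
    """Model of the offset rules: absolute, signed-length, and from-the-end forms."""
    start, end = str(start), str(end)
    # A negative start counts backwards from the end of the source.
    begin = len(source) + int(start) if start.startswith("-") else int(start)
    if end.startswith("+"):
        stop = begin + int(end[1:])      # take N characters beginning at start
    elif end.startswith("-"):
        stop = begin                     # take N characters ending at start
        begin = stop - int(end[1:])
    else:
        stop = int(end)                  # absolute end, exclusive
    return source[max(begin, 0):stop]


text = "abcdefghij"
print(extract(text, "2", "5"))
print(extract(text, "5", "+2"))
print(extract(text, "5", "-2"))
print(extract(text, "-1", "+1"))
print(extract(text, "-5", "-1"))
```

With a ten-character source, the last call yields the 6th character from the end, in line with the comment on the corresponding ChatScript example.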

ˆfindtext ( source substring offset {insensitive} )

Find case sensitive substring within source+offset and return offset starting immediately after match. Useful for data extraction using ^popen and ^tcpopen when combined with ^extract.

$$findtext_start is bound to the actual start of the match. $$findtext_word is bound to the word index in which the match was found, where one or more blanks separate words. Indexing starts at 1 (same as sentence positional notation).

An optional fourth argument insensitive will match insensitively.

Failing to match will generate a rule failure. If the source or substring contains _, these will be converted to blanks before execution, to allow that or the space notation to be considered equivalent (unless your source or substring is literally an underscore only).
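
The three values involved (the returned offset, the match start, and the word index) can be modelled in Python as below. This is a hedged sketch: the function returns a dict with hypothetical keys after, start, and word instead of setting variables, offsets are assumed to be 0-based and absolute within the source, and a failed match is represented by None rather than a rule failure.

```python
def findtext(source, substring, offset=0, insensitive=False):
    """Model of the search: returns offsets and the 1-based word index, or None on failure."""
    # Underscores are treated as blanks unless the text is literally a lone underscore.
    if source != "_":
        source = source.replace("_", " ")
    if substring != "_":
        substring = substring.replace("_", " ")
    hay, needle = (source.lower(), substring.lower()) if insensitive else (source, substring)
    start = hay.find(needle, offset)
    if start < 0:
        return None
    prefix = source[:start]
    word = len(prefix.split())
    if not prefix or prefix.endswith(" "):
        word += 1                        # match begins a new word
    return {"after": start + len(substring), "start": start, "word": word}


print(findtext("the quick brown fox", "brown"))
print(findtext("Hello World", "world"))
print(findtext("Hello World", "world", 0, True))
```

The after value is what you would feed back in as the next offset, as the weather example earlier in this manual does when it chains two searches.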

ˆflags ( word )

Gets the 64bit systemflags of a word.

ˆintersectwords ( arg1 arg2 optional )

Given two "sentences", finds words in common in both of them. Output facts will go to the set assigned to, or @0 if not an assignment statement. If the optional third argument is canonical, it will match the canonical forms of each word.

ˆjoin ( any number of arguments )

Concatenates them all together, putting the result into the output stream. If the first argument is AUTOSPACE, it will put a single space between each of the joined arguments automatically.

ˆactualinputrange ( start end )

Given the starting and ending word positions of an original input (what CS had

after tokenization but before adjustments), this returns the range of where the

words arose in the actual input. The return is a range whose start is shifted 8

bits left and ORed with the end position.

ˆoriginalinputrange ( start end )

Given the starting and ending word positions of an actual input (what CS sees

after adjustments and what you normally pattern match on), this returns the

range of where the words came from in the original input. The return is a range

whose start is shifted 8 bits left and ORed with the end position.
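
Both range functions return a single number that packs two positions. A minimal Python sketch of that packing follows; the helper names are hypothetical, and it assumes the end position fits in the low 8 bits, as the description implies.

```python
def pack_range(start, end):
    """Pack a range the way it is described: start shifted 8 bits left, ORed with end."""
    return (start << 8) | end


def unpack_range(value):
    """Recover (start, end) from a packed range value."""
    return value >> 8, value & 0xFF


packed = pack_range(3, 5)
print(packed)
print(unpack_range(packed))
```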

ˆproperties ( word )

Returns the 64bit properties of a word or fail-rule if the word is not already in

the dictionary.

ˆpos( part-of-speech word supplemental-data )

Generates a particular form of a word in any form and puts it in the output

stream. If it cannot generate the request, it issues a RULE failure. Most

combinations of arguments are obvious. Here are the 1st & 3rd choices. For

verbs with irregular pronoun conjugation, supply 4th argument of pronoun to

use.

part-of-speech

word/verb/number(+

supplement-data argument) action

conjugate pos-integer(as returned from

ˆpartofspeech)

returns the word with

that part of

speech (eg conjugate go

#VERB_PAST_PARTICIPLE)

raw integer 1 .. %length (returns the original

word in sentence)

syllable word tells you how many

syllables a word has

hex64 integer-word converts a number to

64bit hex

hex32 integer-word converts a number to

32 bit hex

ismodelnumber word return 1 if it is (mixed

alpha/numeric). Fails

otherwise.

isinteger word return 1 if it is all

digits, fails otherwise

isfloat word return 1 if it is ﬂoat,

fails otherwise

isuppercase word return 1 if it begins

with an uppercase

letter, fails otherwise

part-of-speech

word/verb/number(+

supplement-data argument) action

isalluppercase word return 1 if it starts

uppercase, and consists

of entirely uppercase

letters, hyphen,

underscore and

ampersand, fails

otherwise

type word returns concept,

number, word, or

unknown

common word returns level of

commonness of the

word

verb verb

given verb in any form,

return requested form

present_participle verb

past_participle verb

infinitive verb

past verb

present3ps verb

present verb

verb match noun returns noun form

matching verb

(sing./plural).e.g.

(walk match dog) ->

walks

aux auxverb pronoun returns verb form

matching pronoun

supplied.for do,have,be

pronoun word flip changes person form

for 1st and 2nd person

adjective word more writes the adjective in

its comparative form:

fast -> faster

most word the superlative form.

beautiful -> most

beautiful

adverb word more writes comparative

form: strong ->

strongly

noun word proper return word as a

proper noun

(appropriately cased)

part-of-speech

word/verb/number(+

supplement-data argument) action

lowercaseexist word

uppercaseexist word

singular word or a number == 1

plural word or a number > 1

irregular word return value only for irregular nouns

determiner word noun add a determiner "a/an" if it needs one

place integer return place number of integer

capitalize word

uppercase word

lowercase word

allupper word

canonical word see notes

integer floatnumber generate integer if float is exact integer

Example:

# get first name (in a non-English language), and capitalize it

t: what's your first name?

#! giuditta

a: ( _* )

$_name = ^original(_0)

Nice to meet you, ^pos(capitalize $_name)

# if the user enters giuditta, the rejoinder outputs: Nice to meet you, Giuditta

For ˆpos(canonical), there is an optional third argument which is the concept

name of the pos-tag. Foreign words may have multiple lemma forms based on

part of speech. E.g., in the German dictionary you can ﬁnd this entry:

Informationstechnische ( NOUN ADJECTIVE NOUN_SINGULAR NOUN_PLURAL ) lemma=`informationstechnisch`Informationstechnische`ADJA NN

which says there are two forms of canonical, one for ADJA (adjective) and one

for NN (noun). If you don’t specify a 3rd argument, you get the ﬁrst one (ADJA).

If you specify ~ADJA you get the first, and if you specify ~NN you get the second. If your third argument is all, then the list of all canonical forms is returned with | separating the entries.
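The three forms of the optional third argument might be written as in the sketch below. It reuses the German dictionary entry quoted above; the local variable names are hypothetical and the returned values are not shown because they depend on the loaded dictionary.

```chatscript
# Sketch only: $_adj, $_noun and $_every are hypothetical local variables.
$_adj = ^pos(canonical Informationstechnische ~ADJA)    # first lemma (adjective)
$_noun = ^pos(canonical Informationstechnische ~NN)     # second lemma (noun)
$_every = ^pos(canonical Informationstechnische all)    # all lemmas, separated by |
```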

ˆdecodeInputtoken ( number )

Displays the text values of tokenflag bits. You can pass it %token to see the meanings of the current sentence analysis or $cs_token to see what you currently have set as token controls.

ˆdecodepos ( pos location )

Translates into text the 64-bit pos data at the given location. location can be a position in the sentence (1 ... number of words) or a match variable found from some location in the sentence. See dictionary.h for meanings of bits. Type word will classify word as concept, word, number, or unknown.

ˆdecodepos ( role location )

Returns the text of the role data of the given location.

ˆlayer ( word )

Reports when this word was entered into the dictionary. Answers are: wordnet, user.

ˆpartofspeech ( location )

Gets the 64-bit part-of-speech information about a word at location, resulting from parsing. Location can be a position in the sentence (1 ... number of words) or a match variable found from some location in the sentence. See dictionary.h for meanings of bits.

ˆphrase ( type matchvar )

Can be used to retrieve all of a prepositional phrase or a noun phrase. type can be noun, prepositional, verbal, or adjective. An optional 3rd argument canonical will return the canonical phrase rather than the original phrase. E.g., given the rule:

u: (I ~verb _~directobject) $tmp = ^phrase(noun _0)

the input I love red herring sets $tmp to red herring.

ˆrole ( location )

Gets the 32-bit role information about a word at location, resulting from parsing.

Location can be a position in the sentence (1 ... number of words) or a match variable found from some location in the sentence. See dictionary.h for meanings of bits.

ˆtally ( word {value} )

Only valid during current volley. You can associate a 32-bit number with a word

by ˆtally(test 35) and retrieve it via ˆtally(test).

ˆrhyme ( word )

Finds a word in the dictionary which is the same except for the ﬁrst letter (a

cheap rhyme).

ˆsubstitute ( mode find oldtext newtext)

Outputs the result of substitution. Mode can be character or word or insensitive.

In the text given by ﬁnd, the system will search for oldtext and replace it with

newtext, for all occurrences. This is non-recursive, so it does not also substitute

within replaced text. Since find is a single argument, you pass a phrase or sentence by using underscores instead of spaces. ^substitute will convert all underscores to spaces before beginning substitution and will output the spaced results.

In character mode, the system finds oldtext as characters anywhere in find. In word mode it only finds it as whole words in find. Finding is case sensitive, unless you use the argument insensitive, which does a case-insensitive character-mode match. You can select a case-insensitive word match by making the first argument a text string containing the normal first-argument values, e.g. "insensitive word".

^substitute(word "I love lovely flowers" love hate)

outputs I hate lovely ﬂowers

^substitute(character "I love lovely flowers" love hate)

outputs I hate hately ﬂowers
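Two further call shapes described in the text, passing the phrase with underscores and selecting a case-insensitive word match, might look like the sketch below. These calls were not run here, so no outputs are claimed.

```chatscript
# Sketch only; outputs are not shown because they were not run here.
# Phrase passed with underscores instead of a quoted string.
^substitute(word I_love_lovely_flowers love hate)

# Case-insensitive whole-word match: the mode is a quoted string.
^substitute("insensitive word" "I Love lovely flowers" love hate)
```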

ˆspell ( pattern fact-set )

Given a pattern, ﬁnd words from the dictionary that meets it and create facts

for them that get stored in the referenced fact set. The facts are created with

subject 1, verb word, and object the found word. The pattern is a text string

describing possibly the length and letter constraints.

If there is an exact length of word, it must be ﬁrst in the pattern. After which

the system matches the letters you provide against the start of the word up until

your pattern either ends or has an asterisk or a period. A period means match

any letter.

An asterisk matches any number of letters and would normally be followed by more letters. The * will swallow letters in the dictionary word until it can match the rest of your given pattern. It will keep trying as needed. E.g.

^spell(4the @1) will find them but not their

^spell(am*ic @1) will find American

^spell(a*ent @1) will find abasement

^spell(h.l.o @1) will find hello
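Because ^spell stores its results as facts, a rule usually has to walk the fact set afterwards. The sketch below is one way that might look. The rule wording and $_word are hypothetical, and it assumes the usual idiom that ^first fails on an empty set and so ends the loop.

```chatscript
# Sketch only: the rule wording and $_word are hypothetical.
u: (find words like hello)
    ^spell(h.l.o @1)                 # facts of the form (1 word foundword) go into @1
    loop()
    {
        $_word = ^first(@1object)    # takes the next fact; the loop ends when @1 is empty
        $_word
    }
```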

ˆsexed ( word he-choice she-choice it-choice )

Given a word, depending on its sex the system outputs one of the three sex

choices given. An unrecognized word uses it.

^sexed(Georgina he she it)

would return she

ˆuppercase ( word )

Is the given word starting with an uppercase letter? Match variable binds usually

reﬂect how the user entered the word. This allows you to see what case they

entered it in. Returns 1 if yes and 0 otherwise.

ˆformat( integer/float formatstring value)

This is a thin wrapper over sprintf. The ﬁrst argument tells ChatScript what

kind of argument you are passing (since everything is a string to ChatScript).

The second argument is a string which is the format string for sprintf. The third

argument is the number to convert. For ﬂoats, you will always be passing a

double ﬂoat so bear that in mind with your formatting. For integer, if you use

a %d format, you will be using a 32-bit value. For ll formats you will be using

64-bit but it won’t work well on Windows output because Windows uses their

own sprintf notation.
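As a rough illustration of the three arguments, the calls below pass a float and an integer with ordinary sprintf format strings. The variable names are hypothetical. Under standard sprintf rules these formats would render 3.14 and 00042, but this was not run against ChatScript.

```chatscript
# Sketch only: $_price and $_count are hypothetical variables.
$_price = ^format(float "%.2f" 3.14159)    # two digits after the decimal point
$_count = ^format(integer "%05d" 42)       # zero-padded to width 5
```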

ˆaddproperty ( word flag1 ... flagn )

Given the word, the dictionary entry for it is marked with additional properties, the flags given, which must match property flags or system flags in dictionarySystem.h. Typically used to mark up titles of books and things when building world data.

In particular, however, if you are adding phrases or words not in the dictionary which will be used as patterns in match, you should mark them with PATTERN_WORD. To create a dynamic concept, mark the set name as CONCEPT.

You can also add fact properties to all members of a set of facts via

^addproperty(@4 flag1 ... flagn).

These flags are also predefined in dictionarySystem.h and you can use some of the predefined but meaningless ones to do what you want. These are User_flag4, User_flag3, User_flag2, User_flag1.
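The three uses described above might be written as follows. This is a sketch: the word Hobbiton, the set name ~mypets and the particular flags chosen are illustrative assumptions, not taken from the manual.

```chatscript
# Sketch only: Hobbiton, ~mypets and the choice of flags are illustrative.
^addproperty(Hobbiton NOUN NOUN_PROPER_SINGULAR PATTERN_WORD)   # new word usable in patterns
^addproperty(~mypets CONCEPT)                                   # declare a dynamic concept
^addproperty(@4 User_flag1)                                     # tag every fact in set 4
```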

ˆdefine( word )

Output the deﬁnition of the word. An optional second argument is the part of

speech: noun verb adjective adverb, which will limit the deﬁnition to just that

part of speech. Never fails but may return null.

The second argument can also be all, which means list all definitions per part of speech, not just the first. And it can be the third optional argument so you can get all meanings of a word as a noun, for example.

ˆhasanyproperty ( word value )

Does this word have any of these property or systemflag bits? You can have up to 5 values as arguments, e.g.,

^hasanyproperty(dog NOUN VERB ADJECTIVE ADVERB PREPOSITION)

If the word is not in the dictionary, it will infer it, allowing it to handle things like verb tenses. If you want to ensure the word already exists first, you should do ^properties(dog) AND ^hasanyproperty(dog xxx) since ^properties fails if the word is not found.

ˆhasallproperty( word value )

Does this word have all property or systemflag bits mentioned? You can have up to 5 values as arguments, e.g.,

^hasallproperty(dog NOUN VERB ADJECTIVE ADVERB PREPOSITION)

Values should be all upper case. If the word is not in the dictionary, it will infer it, allowing it to handle things like verb tenses. If you want to ensure the word already exists first, you should do

^properties(dog) AND ^hasallproperty(dog xxx)

since ^properties fails if the word is not found.
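The existence guard described above could be placed in an if test, as in this sketch. The rule and its replies are hypothetical.

```chatscript
# Sketch only: the rule and its wording are hypothetical.
u: (is _*1 a noun)
    if (^properties('_0) AND ^hasanyproperty('_0 NOUN)) { Yes, '_0 can be a noun. }
    else { I don't think so. }
```

Testing ^properties first means a word missing from the dictionary takes the else branch instead of having its properties inferred.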

ˆremoveinternalflag ( word value )

Removes the named internal flag from word. Currently the only value is HAS_SUBSTITUTE, which allows you to disable a word/phrase substitution. Use as word the full text of the left entry in a substitutions file. E.g., <constantly> maps to ~yes normally. If you do ^removeinternalflag(<constantly> HAS_SUBSTITUTE) then it will no longer do that. This is a permanent change to the resident dictionary, which remains in effect until the system is reloaded.

ˆremoveproperty ( word value )

Removes this property bit from this word. This effect lasts until the system is reloaded. Value should be all upper case. Value is normally a system flag value or a property value from dictionarysystem.h, which does not need a hash in front of it (the system will look up the name). word can be in double quotes. There are also two internal bits that may be removed: CONCEPT and HAS_SUBSTITUTE.

You can use HAS_SUBSTITUTE to disable some standard substitution in LIVEDATA, but you can't apply this at build time because the system won't remember. Instead call it from ^csboot during startup. For example, in the LIVEDATA interjections file, there is an entry:

<surprise ~emosurprise

But if you didn't want surprise at the start of a sentence declared as the interjection ~emosurprise, you could do

^removeproperty("<surprise" HAS_SUBSTITUTE)

And the dictionary has some words which are composites, like iced_coﬀee and it

will automatically convert the two words into a single token. If you wanted to

stop this behavior, you could disable this composite word via

^removeproperty(iced_coffee NOUN)

since it is declared as a NOUN (you can see this with :word iced_coffee). You can do this to nouns, adjectives, adverbs.
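Since these removals do not survive a reload, the calls would go in the boot macro mentioned above. A minimal sketch, assuming your bot defines ^csboot as an ordinary outputmacro:

```chatscript
# Sketch only: a boot macro that turns off two standard behaviors at startup.
outputmacro: ^csboot()
    ^removeproperty("<surprise" HAS_SUBSTITUTE)   # stop mapping a leading "surprise" to ~emosurprise
    ^removeproperty(iced_coffee NOUN)             # stop merging "iced coffee" into one token
```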

ˆwalkdictionary ( 'function )

calls the named output macro from every word in the dictionary. The function

should have 1 argument, the word.
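As an illustration, the sketch below counts dictionary entries by passing a one-argument macro to ^walkdictionary. The macro name ^countword, the variable $$count and the rule are all hypothetical.

```chatscript
# Sketch only: ^countword and $$count are hypothetical names.
outputmacro: ^countword(^word)
    $$count += 1

u: (how big is your dictionary)
    $$count = 0
    ^walkdictionary('^countword)
    I know $$count words.
```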

ˆIterator ( ? member ~concept )

An iterator is a repeatable fact query that allows you to walk through each member of a concept, either at top level or recursively. It is useful in conjunction with a loop(). The function is defined in the planning manual but can be used outside of planning. You can have one iterator in progress per rule.

loop () # unload every resource on board

{

$$resource = ^iterator(? member ~resources)

...

}

^wordAtIndex ( {original canonical} n )

^wordAtIndex retrieves the word from the current sentence at the index given, as either the original word or as a canonical word (as a match variable sees it).

^wordAtIndex ( canonical n n1 ) gathers a range from n through n1.

^wordAtIndex ( original "_0" ) gathers a range from that which _0 represents (but uses the original data, so it is not like merely saying _0, which may not have real data if you did an arbitrary assignment to it setting its position).

Multipurpose Functions

ˆdisable ( what ? )

What can be topic or rule or inputrejoinder or outputrejoinder or save or write @set.

If topic, the next argument can be a topic name (with or without ~) or just ~, meaning the current topic. It means to disable (BLOCK) that topic.

If a rule, you erase (disable) the labeled rule (or rule tag; ~ means the current rule).

If outputrejoinder, it cancels the current output rejoinder mark, allowing a new rule to set a rejoinder.

If inputrejoinder, it cancels any pending rejoinder on input.

If save, then the user data will not be saved. It's as though this interchange didn't happen.

If ^disable(write @1), then factset @1 will not be written out into user data and is junk at the next volley (normally this is true).

You can also disable the inputrejoinder with ^setrejoinder(input) and the output rejoinder with ^setrejoinder(output).

ˆenable ( what ? )

What can be topic or rule or save or write @set.

If topic, the next argument can be a topic name or the word all for all topics. Designated topics will be enabled (unblocked).

If a rule, the label (or rule tag) will be enabled, allowing the rule to function again. If save, then it reenables saving user context if you had disabled it.

If ^enable(write @1), then factset @1 will be written out into user data and is restored at the next volley (normally this is not true).
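A few of the ^disable and ^enable forms might be used from rules as in this sketch. The topic name ~pets and the rule wording are hypothetical.

```chatscript
# Sketch only: ~pets and the rule wording are hypothetical.
u: (stop talking about pets)
    ^disable(topic ~pets)    # block the topic
    OK, no more pet talk.

u: (you can talk about pets again)
    ^enable(topic ~pets)     # unblock it
    Great.

u: (off the record)
    ^disable(save)           # this volley's user data is not saved
    I will forget this exchange.
```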

ˆlength ( what )

If what is a fact set like @1, length returns how many facts are in the set. If what is a word, length counts its characters. If what is a concept set, length returns a count of the top level (nonrecursive) members. If what is the name of a json array or object, length returns how many top level elements it has.

ˆpick ( what )

Retrieve a random member of the concept if what is a concept. Pick is also used

with factsets to pick a random fact (see FACTS MANUAL).

For a concept, if the member chosen is itself a concept, the system will recurse

to pick randomly from that concept. If the argument to pick is a $var or _var, it

will be evaluated and then pick will be tried on the result (but it won’t recurse

to try that again).

If the argument is a JSON object it randomly picks a fact whose verb/object is a

key-value pair. If the argument is a JSON array, it randomly picks a fact whose

verb is the index and whose object is the value. The fact id returned can be used

with ˆﬁeld or you can use something like $result.object to get the speciﬁc object.

ˆreset ( what ? )

What can be user or topic or factset.

If what is user, the system drops all history and starts the user afresh from ﬁrst

meeting (launching a new conversation), having erased the user topic ﬁle.

If what is a factset, the “next” pointer for walking the set is reset back to the

beginning. If what is a topic, all rules are re-enabled and all last accessed values

are reset to 0.

FACT FUNCTIONS

ˆfindfact ( subject verb object )

The simplest fact ﬁnd involves knowing all the components (meanings) and

asking if the fact already exists. If it does, it returns the index of the fact. If it

doesn’t it returns FAILRULE_BIT.

ˆquery ( kind subject verb object)

The simplest query names the kind of query and gives some or all of the ﬁeld

values that you want to ﬁnd. Any ﬁeld value can be replaced with ? which

means either you don’t care or you don’t know and want to ﬁnd it. The kinds of

queries are programmable and are deﬁned in LIVEDATA/queries.txt (but you

need to be really advanced to add to it). The simplest query kinds are:

query kind description

direct_s ﬁnd all facts with the given subject

direct_v ﬁnd all facts with the given verb

direct_o ﬁnd all facts with the given object

direct_sv ﬁnd all facts with the given subject and verb

direct_so ﬁnd all facts with the given subject and object

direct_vo ﬁnd all facts with the given object and verb

direct_svo ﬁnd all facts given all ﬁelds (prove that this fact exists)

Unipropogate ﬁnd how subject joins into the object set

If no matching facts are found, the query function returns the RULE fail code.

?: (do you have a dog) ^query( direct_svo I own dog) Yes.

If the above query ﬁnds a fact (I own dog) then the rule says yes. If not, the rule

fails during output. This query could have been put inside the pattern instead.

^query ( kind subject verb object count fromset toset propagate match )

query can actually take up to 9 arguments. Default values are ?.

The count argument defaults to -1 and indicates how many answers to limit to. When you just want or expect a single one, use 1 as the value.

fromset specifies that the set of initial values should come from the designated factset. Special values of fromset are user and system, which do not name where the facts come from but specify that matching facts should only come from the named domain of facts.

toset names where to store the answers. Commonly you don’t name it because

you did an assignment like

@3 = ^query(...)

and if you didn’t do that, toset defaults to @0 so

if (^query(direct_s you ? ?))

puts its answers in @0. It is equivalent to:

if (^query(direct_s you ? ? -1 ? @0))

You can also simultaneously query and unpack a single ﬁeld from the ﬁrst

matching fact by naming the ﬁeld on the to argument.

$$tmp = ^query(direct_sv you eat ? 1 ? @5object)

Typically you would do this if you expected only a single fact and were trying to find a specific field value. So you'd set the limit to 1. If a fact is found, it is stored in @5, and the object field is pulled off and stored onto $$tmp. If no fact is found, the query does not fail; $$tmp is merely set to null.

The ﬁnal two arguments only make sense with speciﬁc query types that use

those arguments.

For unipropogate, if you have these concepts:

concept: ~things (~animals ~vegetables ~minerals)

concept: ~animals (~canine ~feline)

concept: ~canine (dog)

then ^query(unipropogate dog ? ~things 1) would return (~animals member ~things).

Note that the set to be found (~things) is not expanded. Normal queries expand any reference to a set into all of its members and expand simple words to the entire wordnet hierarchy above it. You can block this expansion behavior by putting a single quote in front. Note that for the idiom '_0, which means the original form of the match variable, you have to use two quotes: ''_0.

^query(direct_svo 'bomb ''_0 '$$tmp)

unipropogate expects a set as its object argument, so it does not need to be quoted. A query can also be part of an assignment statement, in which case the destination set argument (if supplied) is ignored in favor of the left side of the assignment, and the query doesn't fail even if it finds no values. E.g.,

@2 = ^query(direct_sv I love you)

The above query will store its results (including no facts found) in @2. Queries can also be used as test conditions in patterns and if constructs. A query that finds nothing fails, so you can do:

u: ( dog ^query(direct_sv dog wants ?)) A dog wants @0object.

You can also do !ˆquery. Or

if (^query(direct_vo ? want toy)) {@0subject wants a toy.}
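Putting fact creation and querying together, a pair of rules might look like the sketch below. The rules and the (I own ...) facts are hypothetical. The count argument of 1 limits the query to a single answer, as described earlier.

```chatscript
# Sketch only: the rules and the (I own ...) facts are hypothetical.
u: (I own a _*1)
    ^createfact(I own '_0)
    Noted.

?: (what do I own)
    if (^query(direct_sv I own ? 1)) { You own a @0object. }
    else { I do not know of anything you own. }
```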

ˆfirst ( fact-set )

Retrieve the ﬁrst fact, e.g.

_1 = ^first(@1all)

ˆlast ( fact-set )

Retrieve the last fact.

ˆpick ( fact-set )

Retrieve a random fact. Removing the fact is the default, but you can suppress it with the optional second argument KEEP, e.g., _1 = ^last(@1all KEEP) gets the last value but leaves it in the set.

You can erase the contents of a fact-set merely by assigning null into it.

@1 = null

This does not destroy the facts; merely the collection of them.

ˆsort ( {alpha alphabetic age} @0 ... )

You can sort a fact set which has number values as a ﬁeld.

ˆsort ( fact-set {more fact sets} )

The fact set is sorted from highest first. By default, the subject is treated as a float for sorting. You can say something like @2object to sort on the object field. You can add additional factsets after the first, which will move their contents slaved to how the first one was rearranged. E.g.,

^sort(@1subject @2 @3)

will perform the sort using the subject field of @1, and then rearrange @2 and @3 in the same way (assuming they have the same counts). Instead of sorting by numeric value, you can do an alpha sort or an oldest-fact-first sort, similar to the normal sort.

ˆdelete ( factset or factid or jsonid )

If you actually want to destroy facts, you can query them into a fact-set and then do this:

^delete(@1)

All facts in @1 will be deleted and the set erased. You can also delete an individual fact whose id is sitting on some variable:

^delete($$f)

You can delete a json array or object, including all of its substructure, the same way.

If you pass something that is not deletable, the system will do nothing and does not fail.

^length ( factset or ~set or jsonid or word )

If you want to know how many facts a fact-set has, you can do this:

^length(@1) - outputs the count of facts

Likewise how many top level members of a concept (not recursive). Or how many fields in a json object or elements in a json array. Or how many characters in a word. A null value for the argument is legal, and is of length 0.

Note: if you do length of a name that starts with ~ and is not a defined concept set, the function fails rather than return 0.

ˆnth ( factset count )

If you want to retrieve a particular set fact without erasing it, you can use

^nth(@1 5)

where the first argument is like that of ^first, because you also specify how to interpret the answer, and the second is the index you want to retrieve, e.g.,

^nth(@0object 5)

An index out of bounds will fail. Factsets are always numbered 1 ... n, so the first element, ^nth(@0object 1), corresponds to @0object or ^first(@0object).

Similarly you can do ^nth(~concept 2) to retrieve the third member of a concept (numbering starts at 0).

You can also do ^nth of a JSON object (returns the factid of the nth key/value pair) or JSON array (returns the factid of the nth index/value pair).

ˆunpackfactref

examines facts in a set and generates all fact references from it. That is, it lists

all the ﬁelds that are themselves facts.

@1 = ^unpackfactref( @2)

All facts which are ﬁeld values in @2 go to @1. You can limit this:

@1 = ^unpackfactref(@2object)

only lists object ﬁeld facts, etc

ˆsave ( factset boolean )

Unlike variables, which by default are saved across inputs, fact sets are by default

discarded across inputs. You can force a set to be saved by saying:

^save(@9 true) # force set to save thereafter

^save(@9 false) # turn off saving thereafter

ˆmakereal ( {set , factid} )

If you give this a factset, it will convert any transient facts in that set into permanent. If you give this a factid, it will convert all transient facts created after that id into permanent. This might allow you, for example, to call ^jsonopen and get back a transient JSON structure and after inspection you could convert it to permanent if you wanted to.

ˆaddproperty ( set flag )

Adds this flag onto all facts in the named set. Typically you would be adding private marker flags of yours. If set has a field marker (like @2subject) then the property is added to all values of that field of facts of that set.

^conceptlist ( kind location {filter} )

Generates a list of transient facts for the designated word position in the sentence of the concepts (or topics or both) referenced by that word, based on kind being CONCEPT or TOPIC or BOTH. Facts are (~concept ^conceptlist location) where location is the range location in the sentence, (start << 8) + end.

^conceptlist( CONCEPT 3) # absolute sentence word index

^conceptlist( TOPIC _3) # whereever _3 is bound

Otherwise, if you don't use an assignment, it stores into set 0 and fails if no facts are found. Any set already marked with ^addproperty(~setname NOCONCEPTLIST) will not be returned from ^conceptlist.

Special preexisting lists whose members you might want to exclude include: ~pos (all bits of word properties), ~sys (all bits of system properties) and ~role (all role bits from pos-tagging). Only one instance of a concept or topic will be returned as a fact.

If a concept reference covers multiple words (like an unmerged New York City), the concept is indexed at the first word. The location returned is: (start << 8) | end.

If you omit the 2nd argument (location), then it generates the set of all such in the sentence, iterating over every one but only doing the first found reference of some kind. If you use ^mark to mark a position, both the word and all triggered concepts will be reported via ^conceptlist.

But if the mark is a non-canonical word, mark does not do anything about the

canonical form, and so there may be no triggered concepts as well. (Best to use

a canonical word as mark).

If you add the optional 3rd argument, it will ﬁlter concepts to be only those that

start with the ﬁlter characters.

@0 = ^conceptlist(CONCEPT _0 ^"~bot-")

retrieves only concepts that start with ~bot- .
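The facts returned by ^conceptlist can be walked like any other fact set. The sketch below pulls out the concept name and the packed location of each one. The local variable names are hypothetical, and it assumes the usual idiom that ^first fails on an empty set and so ends the loop.

```chatscript
# Sketch only: $_f, $_concept and $_where are hypothetical local variables.
u: (_~animals)
    @1 = ^conceptlist(CONCEPT _0)
    loop()
    {
        $_f = ^first(@1fact)                # next fact; the loop ends when @1 is empty
        $_concept = ^field($_f subject)     # the concept name
        $_where = ^field($_f object)        # packed location: (start << 8) | end
        $_concept $_where
    }
```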

ˆwordinconcept ( word conceptname )

Takes any casing of a word and ﬁnds which casing in the dictionary is a member

of the concept. When you memorize a word, you will get how the user spelled it.

Eg., Ebay, when the dictionary actually has eBay. The correct spelling of the

word can be found this way as a member of a concept. And composite words

using either spaces or underscores can be found as well and returns the correct

notation.

ˆcreateattribute ( subject verb object flags )

This is just like ^createfact, except that it only allows one fact with this subject and verb to exist.

It will kill oﬀ any other such facts. If, for example, you had a fact (car1 cost

$1500) and executed

^createattribute(car1 cost $1000)

then after this the $1500 fact would no longer exist and only the new price fact

would exist.

Note that if you have facts that reference facts that would be killed off, the createattribute call will decline to create a new fact and fail instead. Also, don't have those old facts as values of variables or factsets, because those values will become erroneous. The system will not stop you, but you cannot guarantee the results after that. BE CAREFUL that you don't create facts where the verb and object are intended to be constant and the subject varies. It won't work correctly.

(car space 10) – ﬁne if 10 can vary

(10 space car) – wrong if 10 can vary

See also ˆrevisefact which is probably easier to use for most cases.

ˆcreatefact ( subject verb object flags )

The arguments are a stream, so flags is optional. Creates a fact of the listed data if it doesn't exist (unless flags allows duplicates). Or ^createfact($$tmp) or some other variable that evaluates to a fact stream will also create/find a fact. $$tmp might have been written previously using ^writefact.

ˆwritefact ( F )

Given a fact index such as might be returned by ^first(@1fact), writes out the fact in standard text notation (such as done by ^export or written into user files). See ^createfact.

ˆrevisefact ( factid subject verb object )

The existing non-dead user fact will have fields replaced when arguments are not null. You cannot change the type of a field, so a fact subject will require a factid as subject, etc.
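For instance, to change only the object of an existing fact you would pass null for the fields to keep. A minimal sketch, with car1, cost and $_f as illustrative names:

```chatscript
# Sketch only: car1, cost and $_f are illustrative.
$_f = ^createfact(car1 cost 1500)
^revisefact($_f null null 1000)    # subject and verb unchanged, object becomes 1000
```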

ˆdelete(set)

Erases all facts in this set. This is the same as ^addfactproperty(set FACTDEAD).

ˆfield ( fact fieldname )

given a reference to a fact, pull out a named ﬁeld. If the ﬁeldname is in lower

case and the ﬁeld is a fact reference, you get that number. If the ﬁeldname starts

uppercase, the system gives you the printout of that fact. Eg for a fact:

$$f = createfact (I eat (he eats beer))

^field($$f object) returns a number (the fact index) and ^field($$f Object) returns (he eats beer) as the translation of the fact into text.

Fields include: subject, verb, object, flags, all (spread onto 3 match variables), raw (spread onto 3 match variables). all just displays a human normal dictionary word, so if the value were actually plants~1 you'd get just plants, whereas raw would return what was actually there: plants~1.

You can also retrieve a ﬁeld via $$f.subject or $$f.verb or $$f.object.

ˆfind ( setname itemname )

Given a concept set, finds the ordered position of the 2nd argument within it. Outputs that index (0-based). Used, for example, to compare two poker hands.

ˆfindmarkedfact ( subject verb mark )

given the arguments, start at subject, follow all facts having the verb, and stop

if you can ﬁnd a fact with the mark given.

ˆfirst ( fact-set-annotated )

Retrieve the first fact. You must qualify with what you want from it. Retrieve means the fact is removed from the set. ^first(@0subject) retrieves the subject field of the first fact. Other obvious qualifications are verb, object, fact (return the index of the fact itself), all (spread all 3 fields onto a match variable triple), and raw (like all, but all displays just a normal human-readable word like plant whereas raw displays what was actually there, which might have been plant~1).

ˆflushfacts ( factid )

Kills all facts created after this one. To use effectively, you need to create an initial dead fact, e.g.,

$$marker = ^createfact(junk marker data FACTDEAD)

and then if you want to cancel sentence processing because, for example, you intend to replace this sentence with a new one (like with pronoun resolution), you can erase any facts you created while doing this sentence by doing ^flushfacts($$marker).

^gambittopics ()

Finds user topics (not system topics) with gambits remaining. If you use it in a fact-set assignment statement, it stores all topics found as facts (topicname ^gambittopics topicname). You can then display them or use them as you wish. E.g.

@1 = ^gambittopics()

^gambit( ^pick(@1)) # randomly issue a gambit

Otherwise, if you don’t use an assignment, it stores into set 0 and fails if no facts

are found.

ˆintersectfacts ( from to )

Sees what facts in the from set are in common with the to set. You specify what

ﬁeld to intersect on by naming a ﬁeld of the to set (or none). Eg.,

^intersectfacts(@0 @1object)

will ﬁnd facts in set 0 whose objects match any in set 1. If you don’t name a

ﬁeld, you have to ﬁnd exact matches on the entire fact. You need to assign the

result to a new fact set, which will contain all matching facts from the from set.

@2 = ^intersectfacts(@0 @1object)

ˆkeywordtopics ()

Lists topics and priority values for matching keywords in input. An optional argument, if gambit, will ignore topics without available gambits. The verb used is: ^keywordtopics.

ˆlast ( fact-set-annotated )

Retrieve the last fact – see ˆfirst for a more complete explanation.

ˆlength ( word )

puts the length of the word into the output stream. If word is actually a fact set

reference (e.g., @2 ), it returns the count of facts in the set.

ˆmakereal ()

Convert all user facts that are transient into non-transient facts. Probably only

useful when using plans, which generate transient facts representing the state of

the world and you want those planned world facts to become the current real

facts.

ˆnext ( FACT fact-set-annotated )

Allows you to walk a set without erasing anything. See ^first for a more complete description of annotation. The distinction between ^next and ^first is that ^next does NOT remove the fact from the set, but moves on to each fact in turn. You can reset a set with

^reset(@1)

then loop thru it looking at the subject ﬁeld with

loop() { _0 = next(FACT @1subject) }

ˆpendingtopics ()

List of currently pending topics (interesting).

ˆpick ( ~concept )

Retrieve a random member of the concept. Pick is also used with factsets to

pick a random fact (analogous to ˆfirst with its more complete description).

ˆquerytopics ( word )

Get topics of which word is a keyword and which are not system topics and

which have gambits (not necessarily unused), returns as fact triples of word, “a”,

topicname. If used in an assignment to a set, it will not fail, but it may return 0

elements. If not used in an assignment, then it will use set @0 and will FAIL if

no topics are found.

ˆremoveproperty ( set flag )

Removes this flag from all facts in the named set. Typically you would be removing private marker flags of yours or making transient facts permanent. If set has a field marker (like @2subject) then the property is removed from all values of that field of facts of that set.

ˆreset ( @1 )

reset a fact set for browsing using ˆnext.

ˆquery ( kind subject verb object )

see writeup earlier.

ˆsave ( set )

mark set to be saved with user data from here on

ˆsort ( set )

sort the set.. doc unﬁnished.

ˆunduplicate ( set )

remove duplicate facts from this set. The destination set will be named in an

assignment statement like:

@1 = ^unduplicate(@0)

Normally this merely removes duplicate facts. If you specify a ﬁeld as well, no

facts having that ﬁeld duplicated will be kept either. Eg

@1 = ^unduplicate(@0object)

ˆuniquefacts ( from to )

Sees what facts in the from set are not in common with the to set. You specify

what ﬁeld to intersect on by naming a ﬁeld of the to set (or none). Eg.,

^uniquefacts(@0 @1object)

will find facts in set 0 whose objects do not match any in set 1. If you don't name a field, you have to find exact matches on the entire fact not in the 2nd set.
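The two set operations are complementary, as the sketch below tries to show. The (I like ...) and (you like ...) facts are hypothetical. Assigning the result of ^uniquefacts to a set is an assumption made by analogy with ^intersectfacts.

```chatscript
# Sketch only: the (I like ...) and (you like ...) facts are hypothetical.
@0 = ^query(direct_sv I like ?)
@1 = ^query(direct_sv you like ?)
@2 = ^intersectfacts(@0 @1object)   # facts in @0 whose object also appears in @1
@3 = ^uniquefacts(@0 @1object)      # facts in @0 whose object appears nowhere in @1
```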

ˆunpackfactref ( set )

Find all facts in set which have facts as ﬁelds and then make THOSE facts be the

facts of the set. The destination set will be named in an assignment statement

like:

@1 = ^unpackfactref(@0)
