
Package ‘textmatch’

March 25, 2019

Title Toolkit for Matching Textual Data and Evaluating Textual Similarity

Version 0.0.0.9000

Description What the package does (one paragraph).

Depends R (>= 3.5.2)

License What license is it under?

Encoding UTF-8

LazyData true

RoxygenNote 6.1.1.9000

Imports dplyr, data.table, quanteda

Suggests knitr, rmarkdown

VignetteBuilder knitr

R topics documented:

get_pair_distances
get_similarity_scores
textmatch
Index

get_pair_distances Similarity and distance computation between documents or features

Description

These functions compute distance matrices from a text representation in which each row is a document and each column is a feature. Distances are computed between treatment and control units, as identified by the treatment indicator Z.

Usage

get_pair_distances(dat, Z, include = c("cosine", "jaccard", "euclidean",
  "mahalanobis", "propensity"), exclude = NULL, docnames = NULL,
  verbose = FALSE)


Arguments

Z A logical or binary vector indicating treatment and control for each unit in the study. TRUE or 1 represents a treatment unit; FALSE or 0 represents a control unit.

docnames A vector of document names, equal in length to the number of documents.

x A valid quanteda dfm object.

Value

A matrix showing pairwise distances for all potential matches of treatment and control units under various distance metrics.
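As a rough illustration of what pairwise distances between treatment and control units look like, the sketch below computes cosine and Euclidean distances in base R on a toy document-feature matrix. It is not the package's implementation: the data, `cosine_dist`, and `pair_dist` are hypothetical names introduced here, and the long (one row per pair) output layout is an assumption.

```r
# Toy document-feature matrix: rows are documents, columns are term counts.
# (Illustrative data only; not taken from the package.)
dfm_small <- matrix(
  c(1, 1, 0, 0,
    1, 0, 1, 0,
    0, 0, 1, 1,
    1, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3", "d4"),
                  c("dog", "cat", "rain", "plain"))
)

# Treatment indicator: TRUE/1 = treatment unit, FALSE/0 = control unit.
Z <- c(1, 0, 0, 1)

# Cosine distance between two count vectors: 1 - cosine similarity.
cosine_dist <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Hypothetical helper: one row per (treated, control) pair.
pair_dist <- function(dat, Z) {
  Z <- as.logical(Z)
  pairs <- expand.grid(treated = rownames(dat)[Z],
                       control = rownames(dat)[!Z],
                       stringsAsFactors = FALSE)
  pairs$cosine <- unname(mapply(
    function(i, j) cosine_dist(dat[i, ], dat[j, ]),
    pairs$treated, pairs$control))
  pairs$euclidean <- unname(mapply(
    function(i, j) sqrt(sum((dat[i, ] - dat[j, ])^2)),
    pairs$treated, pairs$control))
  pairs
}

res <- pair_dist(dfm_small, Z)
print(res)
```

With two treated and two control documents, `res` has four rows, one for each potential match.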

get_similarity_scores This function calculates an input character vector’s similarity matrix according to the measures contained in the predictive model.

Description

This function calculates an input character vector’s similarity matrix according to the measures contained in the predictive model.

Usage

get_similarity_scores(x)

Arguments

x A character vector where each element is a document

Value

A data frame with n * (n - 1) rows and 16 columns, where n is the length of x; each column is one of the constituent similarity measures.
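The row count can be checked with a short base R sketch. It assumes that each row corresponds to an ordered pair of distinct documents, which is one reading of the n * (n - 1) figure; the index layout below is illustrative and not the package's actual output.

```r
docs <- c("I am a dog", "I am a cat",
          "The rain in Spain falls mainly on the plain.")
n <- length(docs)

# All ordered (i, j) index pairs, then drop self-pairs (i == j).
idx <- expand.grid(i = seq_len(n), j = seq_len(n))
idx <- idx[idx$i != idx$j, ]

nrow(idx)      # 6
n * (n - 1)    # 6
```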

textmatch This function runs the main ML model as specified in Mozer et al. (2018)

Description

This function runs the main ML model as specified in Mozer et al. (2018)

Usage

textmatch(x, outcome = "matrix")

Arguments

x A character vector where each element is a document


Value

An n by n matrix where n is the length of parameter x. Each entry is a standardized similarity score.

Examples

textmatch(c("I am a dog", "I am a cat", "The rain in Spain falls mainly on the plain."),
  outcome = "matrix")
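One way to use an n by n score matrix of this shape is to look up each document's closest match. The sketch below does this in base R on a placeholder matrix; the values are made up for illustration and are not real textmatch() output, and it assumes that a higher score means greater similarity and that the diagonal (a document compared with itself) should be ignored.

```r
docs <- c("I am a dog", "I am a cat",
          "The rain in Spain falls mainly on the plain.")

# Placeholder scores, NOT real textmatch() output.
# The diagonal is set to NA so a document is never matched to itself.
scores <- matrix(c(NA,  0.9, 0.1,
                   0.9, NA,  0.2,
                   0.1, 0.2, NA),
                 nrow = 3, byrow = TRUE)

# For each row, the column index of the highest score
# (which.max discards NA values).
best <- apply(scores, 1, which.max)

matches <- data.frame(document = docs,
                      best_match = docs[best],
                      stringsAsFactors = FALSE)
print(matches)
```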

Index

get_pair_distances
get_similarity_scores
textmatch
