Sig Tool Users Guide


RAWS SIG Selector Tool

Purpose: The RAWS SIG Selector Tool is designed to help users determine which stations should be grouped together in a Significant Interest Group (SIG) based on the strength of their statistical correlations.

Data:

• Data for each RAWS has been collected from the DRI CEFA website: https://www.wfas.net/nfdrs2016/maps/

• This data was imported into FFP 5.0

• ERC-Y values were calculated and imported into the online SIG Selector Tool for years 2005-2017

User Interface Screens:

There are three user interface screens: RAWS Map, ERC Climatology Graph, and Correlation Matrix. The correlation matrix is color coded by the R-squared value between each pair of stations. The RAWS markers on the map are color coded according to their best-fit statistical grouping, and the same color groupings are used on the ERC Climatology Graphs. See the Statistical Methods section for a rough outline of the statistical methods used.
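As a rough illustration of what the correlation matrix contains, the sketch below computes a pairwise R-squared value for every pair of stations. It is not the tool's code: the station names and ERC numbers are made up, the helper name r_squared is hypothetical, and the use of 12 monthly mean ERC values as the input series is an assumption, since this guide does not state which series the tool correlates.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical monthly mean ERC values (January to December) per station.
station_a = [10, 12, 18, 30, 45, 60, 72, 70, 55, 35, 18, 11]
erc = {
    "Station A": station_a,
    # Exact linear function of Station A, so R-squared with A is 1.
    "Station B": [0.5 * v + 5 for v in station_a],
    # A station with a different seasonal shape.
    "Station C": [22, 20, 25, 41, 38, 52, 49, 66, 58, 30, 27, 19],
}

# Nested dict: matrix[a][b] is the R-squared between stations a and b.
names = list(erc)
matrix = {a: {b: r_squared(erc[a], erc[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 3) for b in names])
```

Values close to 1 indicate stations whose ERC moves together, which is the kind of pair one would expect to share a grouping.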

Step by step instructions:

• Using the map display, navigate to the desired dispatch boundary or state.

• Click inside the dispatch/state boundary.

• The graph and matrix will populate once the dispatch/state is selected.

• To adjust the number of statistical groupings, choose a value from the drop-down menu and click the Submit button.

[Figure: Map (RAWS and Dispatch Boundaries)]

[Figure: ERC Climatology Graph]

[Figure: Correlation Matrix]

Statistical Methods

Presently, the stations are grouped according to their monthly mean ERC values over the last 10 years using an algorithm known as hierarchical clustering. The minimum number of clusters is currently forced to 3 for each dispatch area. The final default grouping methodology, however, will be as follows:

1. Reduce each station's 12 reported monthly mean ERC values to 2 or 3 dimensions using Principal Components Analysis (PCA), so that potential clusters can be visualized in 2D or 3D (pivot charts are difficult to analyze in many respects). So far, 2 principal components have explained 90 percent of the variance in the underlying data.

2. Compute the gap statistic for each dispatch area using the 2 principal components obtained from the dimensionality reduction in step 1. The number of clusters with the highest gap statistic serves as our naive estimate of the default number of clusters "k" for that dispatch area. Interpretation of this result remains subjective, even though the method used to obtain it is objective and consistent; it is provided merely to set the defaults.

3. Use a k-means clustering algorithm to group similar stations together within each dispatch area, where the default "k" is the number of clusters with the highest gap statistic computed in step 2.

4. For power users only: To inform the user's subsequent decision to reduce or enlarge the number of clusters, the Within-Cluster Sum of Squares (WCSS) will be computed for each possible number of clusters (from a minimum of 1 to a maximum of N, where N is the total number of stations under consideration in the dispatch area). These values could be plotted as an elbow graph to help power users find the trade-off point beyond which adding clusters yields little further reduction in WCSS. Alternatively, and perhaps preferably, plots for silhouette analysis at different values of k could be included.
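As a rough illustration of steps 3 and 4, the sketch below clusters stations with k-means and tabulates the WCSS for every k from 1 to N. It is not the tool's implementation: the station scores are made-up values standing in for the 2 principal components from step 1, the function names kmeans and wcss are hypothetical, and the farthest-first initialization is an assumption chosen only to keep the result deterministic. PCA and the gap statistic are omitted.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    # Seed with the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far (assumed init scheme).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-Cluster Sum of Squares for one clustering."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical 2-component PCA scores for six stations in one dispatch area.
scores = [(-4.0, 0.1), (-3.8, -0.2), (0.1, 2.9), (0.3, 3.2), (3.9, -1.0), (4.2, -0.8)]

# Step 3: cluster with a chosen k (here 3, standing in for the default).
labels, centroids = kmeans(scores, 3)

# Step 4: WCSS for every k from 1 to N, the raw data for an elbow graph.
elbow = {k: wcss(scores, *kmeans(scores, k)) for k in range(1, len(scores) + 1)}
for k, value in elbow.items():
    print(k, round(value, 3))
```

With data like this, the WCSS falls sharply up to k = 3 and barely changes afterward; that bend is the "elbow" a power user would look for.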
