Sig Tool Users Guide
User Manual:
Open the PDF directly: View PDF .
Page Count: 2
Download | |
Open PDF In Browser | View PDF |
RAWS SIG Selector Tool Purpose: The RAWS SIG Selector Tool is designed to help users determine which stations should be grouped together in a Significant Interest Group (SIG) based on the strength of their statistical correlations. Data: Data for each RAWS has been collected from the DRI CEFA website: https://www.wfas.net/nfdrs2016/maps/ This data was imported into FFP 5.0 ERC-Y values were calculated and imported into the online SIG Selector Tool for years 2005-2017 User Interface Screens: There are three user interface screens: RAWS Map, ERC Climatology Graph and Correlation Matrix. The correlation matrix is color coded based on the R-squared values between each set of stations. The RAWS markers on the map are color coded according to their best fit statistical grouping. These color groupings are also used on the ERC Climatology Graphs. See page 2 for rough details on statistical methods used. Map (RAWS and Dispatch Boundaries) ERC Climatology Graph Correlation Matrix Step by step instructions: Using the map display, navigate to the desired dispatch boundary or state Click inside the dispatch/state boundary The graph and matrix will populate once the dispatch/state is selected The number of statistical groupings can be adjusted by using the drop down menu and selecting the submit button. Statistical Methods Presently, the stations are grouped according to their monthly mean ERC values over the last 10 years utilizing an algorithm known as hierarchical clustering. We presently force the minimum number of clusters to be "3" for each dispatch area -- however, the final default grouping methodology will be as follows: 1. Perform dimensionality reduction of each station's 12 reported monthly mean ERC values down to 2 or 3 so that it will be possible to visualize the potential clustering opportunities in 2D or 3D (pivot charts are difficult to analyze in many respects). This will be accomplished with Principal Components Analysis. So far, 2 principal components have managed to explain 90 percent of the variance of the underlying data. 2. Compute the gap-statistic for each dispatch area utilizing the 2 principal components obtained from the dimensionality reduction performed above. The highest gap-statistic will correspond to our naive estimation of the default number of clusters "k" to compute for each dispatch. Interpretation of this result will be completely subjective even though we are utilizing an objectively consistent methodology to obtain it. It's provided merely to set the defaults. 3. Use a k-Means Clustering algorithm to group / cluster similar stations together within each dispatch area, where the default "k" value is the max gap-statistic computed previously. 4. For Power-Users Only: To inform / elucidate the user's subsequent decision to either reduce or enlarge the number of clusters, the Within-Clusters-Sum-of-Squares (WCSS) will be computed for each possible number of a clusters (minimum of 1 to a maximum of N, where N is the total number of stations under consideration in the dispatch area) which could be plotted as an elbow graph to help power users discern the trade-off point where increasing or decreasing the number of clusters occurs. Or, perhaps the more preferable option would be include plots for conducting silhouette analysis parameterized for different values of k.
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.5 Linearized : No Page Count : 2 Language : en-US Tagged PDF : Yes Author : Law, Shelby -FS Creator : Microsoft® Word 2013 Create Date : 2018:09:07 11:41:14-06:00 Modify Date : 2018:09:07 11:41:14-06:00 Producer : Microsoft® Word 2013EXIF Metadata provided by EXIF.tools