A visualization of the effect of individual observations on the cluster analysis is provided after each cluster analysis. When you run the S code, you are prompted for various inputs.
The following session gives examples of the two kinds of input the S code accepts: (1) raw data, and (2) a distance matrix supplied in upper triangular, lower triangular, or full matrix form.
> source("clst_w02.s")
--- Type the Data File Name : village2.dat
---------------------------------------------
(1) Raw Data
(2) Distance Matrix
---------------------------------------------
Type the Number : 2
---------------------------------------------
(1) Upper Triangular Matrix
(2) Lower Triangular Matrix
(3) Full Matrix
---------------------------------------------
Type the Number(Default=1): 1
--- Number of rows : 25
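* The three distance-matrix layouts carry the same information. As a rough illustration (written in Python rather than S, and not taken from clst_w02.s), the sketch below expands an upper triangular listing into a full symmetric matrix. It assumes the values are listed row by row without the diagonal; the file layout actually expected by the S code is not documented here, and the name `upper_to_full` is hypothetical.

```python
def upper_to_full(values, n):
    """Expand strict-upper-triangle distances into a full n x n matrix.

    Assumed layout (an assumption, not the documented file format):
    d(1,2), d(1,3), ..., d(1,n), d(2,3), ..., d(n-1,n).
    """
    expected = n * (n - 1) // 2
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    full = [[0.0] * n for _ in range(n)]   # the diagonal stays at zero
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            full[i][j] = full[j][i] = float(values[k])  # symmetric fill
            k += 1
    return full


# Three observations need 3 * 2 / 2 = 3 distances.
for row in upper_to_full([1.0, 2.0, 3.0], 3):
    print(row)
```

For the 25 rows entered above, such a listing would hold 25 * 24 / 2 = 300 distances.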
* If you had selected the raw data, you would see the following messages:
--- Number of columns : 6
: Standardize the Variables(Y/n)? : Y
-------------------------------------------
< Standard Type >
1. Z-Score 2. 0-1 transform
-------------------------------------------
Select (Default:1) : 1
-------------------------------------------
1. Euclidean
2. Squared Euclidean
3. Maximum : maximum difference
4. Manhattan : sum of absolute difference
5. binary : proportion of non-zeros
-------------------------------------------
Select Distance Measure(Default:1) :
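* To make the standardization and distance choices concrete, here is a minimal Python sketch (not the S code itself). It assumes the Z-score uses the sample standard deviation (divisor n - 1), which the S code may or may not do, and it leaves out the binary measure because its exact definition is not given here. The function names are hypothetical.

```python
import math


def zscore(column):
    """Z-score: subtract the mean, divide by the sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]   # fails on a constant column


def unit_range(column):
    """0-1 transform: map the minimum to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]   # fails on a constant column


def distance(a, b, measure="euclidean"):
    """Distance between two observations for menu items 1 to 4."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if measure == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if measure == "squared":
        return sum(d * d for d in diffs)
    if measure == "maximum":
        return max(diffs)
    if measure == "manhattan":
        return sum(diffs)
    raise ValueError(f"unknown measure: {measure}")


print(zscore([1.0, 2.0, 3.0]))
print(unit_range([2.0, 4.0, 6.0]))
print(distance([0.0, 0.0], [3.0, 4.0], "manhattan"))
```

Standardizing first keeps a variable with a large scale from dominating the distances.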
* Next, regardless of whether you selected raw data or a distance matrix, you
will be prompted about saving results (labelling the observations and
saving the MST):
I.D file(If not want, only RETURN):
=== Save MST Result(.mst) ===
Type File name (if not, only RETURN) :
* Now you will see the 2-dimensional MDS plot overlaid with the MST. If you wish to rotate the coordinates of this plot around the x-axis, the y-axis, or (0,0), answer "y" at the following prompt:
Want to Rotate (N/y) ? : y
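* Rotating or flipping an MDS configuration does not change any distance between points, so the rotated plot is an equally valid picture of the data. The Python sketch below illustrates this under one reading of the prompt: "around the x-axis" and "around the y-axis" are taken to mean mirror flips, and "around (0,0)" a rotation by an angle. That reading is an assumption, and the function names are hypothetical.

```python
import math


def rotate_about_origin(points, degrees):
    """Rotate 2-D points counter-clockwise about (0,0)."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def flip_about_x_axis(points):
    """Mirror the configuration in the x-axis (y changes sign)."""
    return [(x, -y) for x, y in points]


def flip_about_y_axis(points):
    """Mirror the configuration in the y-axis (x changes sign)."""
    return [(-x, y) for x, y in points]


def gap(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


pts = [(1.0, 0.0), (0.0, 2.0)]
turned = rotate_about_origin(pts, 90)
# The two gaps agree up to rounding error.
print(gap(pts[0], pts[1]), gap(turned[0], turned[1]))
```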
* Next is the exploratory stage: select the agglomerative method, the
number of clusters, and whether or not to remove an observation. When you select
a cluster method, the cluster process for that method is displayed first.
-------------------------------
1. Single Cluster Analysis
2. Complete Cluster Analysis
3. Average Cluster Analysis
4. Centroid Cluster Analysis
5. Median Cluster Analysis
6. Ward's Method
7. Compare Two Methods
9. Exit
-------------------------------
Select : The number is ---- : 1
=== Single Cluster Process =====
fobs. sobs. distance
[1,] 2 7 0.7071068
[2,] 21 24 0.7348469
[3,] 13 17 0.7483315
[4,] 21 25 0.7483315
[5,] 2 5 0.7615773
[6,] 1 2 0.7615773
[7,] 1 6 0.7615773
[8,] 1 8 0.7874008
[9,] 15 16 0.8000000
[10,] 20 21 0.8000000
[11,] 13 15 0.8485281
[12,] 9 11 0.8602325
[13,] 13 14 0.8602325
[14,] 19 20 0.8602325
[15,] 1 10 0.8831761
[16,] 1 9 0.8944272
[17,] 1 12 0.8944272
[18,] 1 18 0.8944272
[19,] 1 4 0.9055385
[20,] 1 13 0.9055385
[21,] 1 3 0.9165151
[22,] 1 23 0.9165151
[23,] 19 22 0.9165151
[24,] 1 19 0.9273618
No. of Clst(2-24)(To Stop, only RETURN): 2
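* A merge table of this shape can be sketched in a few lines of Python. The version below is not the S code: it builds single-linkage merges by scanning the pairwise distances in increasing order and joining two clusters whenever a pair spans them (Kruskal's algorithm with a union-find structure), so the accepted pairs are also the edges of a minimum spanning tree. Labelling each cluster by its smallest member appears to match the fobs. and sobs. columns above, but that is an inference from the output. The name `single_link_merges` is hypothetical, and ties may be resolved in a different order than in the S code.

```python
def single_link_merges(dist):
    """Return single-linkage merge steps as (first, second, distance).

    dist is a full symmetric distance matrix (list of lists).
    Observations are numbered from 1; a cluster is labelled by its
    smallest member.
    """
    n = len(dist)
    parent = list(range(n))   # union-find forest; each root is the smallest member

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    merges = []
    for d, i, j in edges:
        a, b = find(i), find(j)
        if a != b:                      # this pair joins two different clusters
            lo, hi = min(a, b), max(a, b)
            parent[hi] = lo             # keep the smaller label as the root
            merges.append((lo + 1, hi + 1, d))
            if len(merges) == n - 1:    # everything is in one cluster
                break
    return merges


# Toy data (not from village2.dat): four points on a line at 0, 1, 3, 7.
positions = [0.0, 1.0, 3.0, 7.0]
toy = [[abs(p - q) for q in positions] for p in positions]
for step in single_link_merges(toy):
    print(step)
```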
* When you type the number of clusters at this prompt, a plot titled
"Single Cluster with MDS and MST" is shown, and the results of the
single cluster analysis are displayed in the Main Window as follows:
Cluster for Single-link
(G= 1 ) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 23
(G= 2 ) 19 20 21 22 24 25
No. of Clst(2-24)(To Stop, only RETURN): 3
Cluster for Single-link
(G= 1 ) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 23
(G= 2 ) 19 20 21 24 25
(G= 3 ) 22
No. of Clst(2-24)(To Stop, only RETURN):
* When you select the number of clusters, "Single-link with MDS and MST" is displayed interactively in graphics window as follows.
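* As a cross-check on the cluster listings above (this is separate from the graphics display): asking for G clusters among n observations amounts to applying only the first n - G steps of the merge table. The Python sketch below, which is not the S code, replays the Single Cluster Process table shown earlier. The name `cut_into_clusters` is hypothetical, and clusters are listed in order of their smallest member, which need not be the order the S code uses.

```python
# Merge steps copied from the Single Cluster Process table (fobs., sobs., distance).
MERGES = [
    (2, 7, 0.7071068), (21, 24, 0.7348469), (13, 17, 0.7483315),
    (21, 25, 0.7483315), (2, 5, 0.7615773), (1, 2, 0.7615773),
    (1, 6, 0.7615773), (1, 8, 0.7874008), (15, 16, 0.8000000),
    (20, 21, 0.8000000), (13, 15, 0.8485281), (9, 11, 0.8602325),
    (13, 14, 0.8602325), (19, 20, 0.8602325), (1, 10, 0.8831761),
    (1, 9, 0.8944272), (1, 12, 0.8944272), (1, 18, 0.8944272),
    (1, 4, 0.9055385), (1, 13, 0.9055385), (1, 3, 0.9165151),
    (1, 23, 0.9165151), (19, 22, 0.9165151), (1, 19, 0.9273618),
]


def cut_into_clusters(merges, n, g):
    """Apply the first n - g merges and return the g clusters as sorted lists."""
    parent = {i: i for i in range(1, n + 1)}

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for first, second, _distance in merges[: n - g]:
        a, b = find(first), find(second)
        parent[max(a, b)] = min(a, b)   # the smaller label stays the root
    groups = {}
    for i in range(1, n + 1):
        groups.setdefault(find(i), []).append(i)
    return [groups[k] for k in sorted(groups)]


for g in (2, 3):
    print("G =", g)
    for members in cut_into_clusters(MERGES, 25, g):
        print("  ", *members)
```

With the table above, the groups for G = 2 and G = 3 agree with the listings printed by the S code.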
* When you type only RETURN, you are prompted about removing observations.
Remove observations (Y/n) : y
Press the obs's with Left Button.
After finishing, press Right Button.
To stop, only press Right Button
* You can select several observations. For example, when you select obs. 11 with the Left Button and exit with the Right Button, you will get the following messages.
Selected obs's: 11
Wait !
* This will produce two plots in the graphics window, displaying the results with and without observation 11. It will then ask you to enter the number of clusters again, and you can see how the deleted observation changes the result.
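* The idea behind this comparison can be sketched in Python on toy data (not village2.dat, and not the S code): cluster once with every observation, drop one row and column from the distance matrix, cluster again, and compare. Whether the S code also recomputes the standardization or the MDS coordinates after a removal is not stated here. The names `single_link_clusters` and `without_observation` are hypothetical.

```python
def single_link_clusters(dist, labels, g):
    """Single-linkage clusters (sorted label lists) once g clusters remain."""
    n = len(labels)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    remaining = n
    for _d, i, j in edges:
        if remaining == g:
            break
        a, b = find(i), find(j)
        if a != b:
            parent[max(a, b)] = min(a, b)
            remaining -= 1
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(labels[i])
    return [groups[k] for k in sorted(groups)]


def without_observation(dist, labels, obs):
    """Drop one observation (its row and column) from the distance matrix."""
    keep = [i for i, lab in enumerate(labels) if lab != obs]
    sub = [[dist[i][j] for j in keep] for i in keep]
    return sub, [labels[i] for i in keep]


# Toy data: six points on a line; obs. 3 bridges the left and right points.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.5]
labels = [1, 2, 3, 4, 5, 6]
dist = [[abs(p - q) for q in positions] for p in positions]

full = single_link_clusters(dist, labels, 2)
sub_dist, sub_labels = without_observation(dist, labels, 3)
reduced = single_link_clusters(sub_dist, sub_labels, 2)
print("all observations:", full)
print("without obs. 3  :", reduced)
```

In this toy example obs. 3 chains the points on either side of it together, so removing it moves the split between the two clusters. Single linkage is known to be sensitive to such bridging observations.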
No. of Clst(2-23)(To stop, only RETURN): 2
Cluster for Single-link
(G= 1 ) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 23
(G= 2 ) 19 20 21 22 24 25
Cluster for Single-link: Removed obs. 11
(G= 1 ) 1 2 3 4 5 6 7 8 9 10
(G= 2 ) 12 13 14 15 16 17 18 19 20 21 22 23 24 25
No. of Clst(2-23)(To stop, only RETURN): 3
Cluster for Single-link
(G= 1 ) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 23
(G= 2 ) 19 20 21 24 25
(G= 3 ) 22
Cluster for Single-link: Removed obs. 11
(G= 1 ) 1 2 3 4 5 6 7 8 9 10
(G= 2 ) 13 14 15 16 17 18 19 20 21 22 23 24 25
(G= 3 ) 12
No. of Clst(2-23)(To stop, only RETURN):
Removing another obs's ? (Y/n) : n
* Two plots in graphics window are displayed as follows when obs. 11 is removed :
* When you type RETURN and exit the process of removing observations, the main menu of cluster analysis is displayed again, so you can run further cluster analyses in sequence or exit.
-------------------------------
1. Single Cluster Analysis
2. Complete Cluster Analysis
3. Average Cluster Analysis
4. Centroid Cluster Analysis
5. Median Cluster Analysis
6. Ward's Method
7. Compare Two Methods
9. Exit
-------------------------------
Select : The number is ---- :