A New Suppression-based Possibilistic Fuzzy c-means Clustering Algorithm

Possibilistic fuzzy c-means (PFCM) is one of the most widely used clustering algorithms; it addresses the noise sensitivity problem of fuzzy c-means (FCM) and the coincident clusters problem of possibilistic c-means (PCM). Although PFCM is a highly reliable clustering algorithm, its efficiency can be further improved by introducing the concept of suppression. Suppression-based algorithms apply winner and non-winner based suppression to the datasets, which helps classify real-world datasets into clusters more accurately. In this paper, we propose a suppression-based possibilistic fuzzy c-means clustering algorithm (SPFCM). The paper evaluates the proposed method by the number of misclassifications on various real and synthetic datasets, and it is found to perform better than the other clustering techniques considered, i.e., both conventional and suppression-based algorithms. SPFCM is also found to converge faster than the other clustering techniques.


Introduction
Data mining is the process of searching through enormous data sets to find patterns and associations that can be used to classify data and solve data analysis problems. It focuses on predicting the future and discovering patterns in the data using specialised machine learning and statistical models [1][2][3]. In the realm of data mining, clustering is one of the most essential strategies for assigning similar objects to the same cluster and dissimilar objects to different clusters based on a user-defined similarity metric. Clustering analysis is employed in various applications such as pattern recognition, image segmentation, sentiment analysis, etc. [4][5][6][7]. Clustering methods can also be used in routing algorithms for wireless sensor networks (WSNs), where they help improve energy efficiency by clustering the WSN into different regions [8]. Two basic clustering approaches have been developed, based on the relationship between a data vector and the distinct clusters: crisp clustering methods [9] and fuzzy clustering methods [10][11][12][13]. Crisp clustering approaches, such as hard c-means (HCM), are based on a hard partition of the data. Following classic set theory, these approaches express the membership degree of a data vector to a class as '1' if it belongs to the set and '0' if it does not. As a result, they assign each data vector to exactly one cluster, which has the advantage of being simple in concept and execution. However, because the membership assignment is based on the two-valued logic of classic set theory, they can produce unsatisfactory clustering results. Fuzzy clustering, on the other hand, allows data points to belong to more than one cluster simultaneously by using multiple membership values. One of the most widely used fuzzy clustering algorithms is fuzzy c-means (FCM) by Dunn [11] and Bezdek [12], where one data point can belong to more than one cluster based on its degree of membership. Fuzzy clustering can reflect the real world more objectively than crisp clustering analysis, and it has been used in a variety of applications. However, FCM constrains the membership degrees of each data point across all clusters to sum to 1. This constraint causes FCM to generate memberships that emphasise the relative rather than the absolute relationship between a data vector and the clusters, which makes FCM sensitive to noise and outliers: noise vectors and outliers are allocated large membership degrees, which has a negative impact on the computation of the cluster centres.
Krishnapuram and Keller [13] proposed possibilistic c-means (PCM) clustering by relaxing the constraint on the membership degrees and introducing a possibilistic membership, known as typicality, that better describes the absolute distances between data points and clusters. As a result, noise vectors and outliers are given very small possibilistic membership degrees, drastically reducing the impact of noise points on the outcome and overcoming FCM's noise sensitivity. However, PCM pays a price for its weaker membership constraint, as it tends to produce coincident clusters. It is also highly sensitive to initialization and requires careful parameter setting. Fan et al. [14] proposed an improved model of FCM, known as suppressed fuzzy c-means (SFCM) clustering, by incorporating a suppressed competitive learning mechanism into FCM. This method modifies the memberships of a data point to all the clusters in such a way that the highest membership (the winner) is rewarded and all others are suppressed, without disturbing the original order among them. SFCM uses a suppression rate α ∈ [0, 1] to control the suppression strength. With a reasonable parameter setting, SFCM converges faster and performs better than FCM. A similar suppression-based approach is applied to the typicality matrix of the PCM algorithm to obtain the SPCM algorithm [15], which likewise achieves better convergence speed and performance than PCM.
To overcome the drawbacks of FCM and PCM, Pal et al. [16][17] proposed the possibilistic fuzzy c-means (PFCM) clustering approach, which generates both membership and typicality values while clustering unlabelled data. PFCM combines the FCM and PCM algorithms: it fixes the noise sensitivity of FCM and the coincident clusters problem of PCM. However, in PFCM noisy data still influences the estimation of the centroids, and the algorithm converges slowly on large datasets. To overcome these problems, this paper proposes a suppression-based possibilistic fuzzy c-means (SPFCM) clustering algorithm that combines the suppression approach with the possibilistic and probabilistic parts of PFCM. SPFCM is obtained by applying the suppressed competitive learning mechanism to PFCM, where the suppression is applied to both the membership matrix (FCM part) and the typicality matrix (PCM part). Further, we compare the performance of the proposed SPFCM algorithm with different clustering approaches on synthetic and several real datasets. The proposed method addresses the weaknesses present in SFCM and SPCM.
The rest of the paper is organized as follows. Section 2 describes preliminary work. Section 3 discusses the proposed SPFCM algorithm, Section 4 presents the experimental results and discussion, and Section 5 concludes the paper.

Preliminary Work
The behaviour of a clustering approach is governed by the mathematical formulation of its objective function. This section discusses the clustering approaches related to the proposed work.

Fuzzy c-means (FCM) algorithm
FCM is one of the most widely used fuzzy clustering algorithms [10]. The objective function of the FCM algorithm is defined as follows:

$$J_{FCM}(U, V) = \sum_{k=1}^{n}\sum_{i=1}^{c} \mu_{ik}^{m}\, d_{ik}^{2}, \qquad d_{ik} = \lVert x_k - v_i \rVert$$

where $n$ denotes the total number of patterns in the dataset, $c$ defines the number of clusters, $\mu_{ik}$ is the membership of data point $x_k$ in cluster $i$, and $d_{ik}$ is the Euclidean distance between $x_k$ and the cluster centre $v_i$. Minimizing the objective function with respect to $\mu_{ik}$ and $v_i$ yields the update equations for the memberships and the cluster centres. The main drawback of FCM is its membership constraint, which makes it sensitive to noise points and outliers. It is also ineffective at detecting clusters of shapes other than spherical [15][16]. Since the degree of belongingness of the data is not always best represented by FCM memberships, a possibilistic approach called PCM was introduced.
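The FCM updates can be illustrated with a short Python sketch. It assumes the standard FCM update equations (memberships from distance ratios, centres as membership-weighted means); the function names `fcm_memberships` and `fcm_centres` are hypothetical and chosen only for this illustration.

```python
def fcm_memberships(dists, m=2.0):
    """Memberships of one data point, given its distances to the c cluster centres."""
    # A point lying exactly on one or more centres shares full membership among them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def fcm_centres(points, U, m=2.0):
    """Cluster centres as weighted means; U[i][k] is the membership of point k in cluster i."""
    dim = len(points[0])
    centres = []
    for u_i in U:
        weights = [u ** m for u in u_i]
        total = sum(weights)
        centres.append([sum(w * x[j] for w, x in zip(weights, points)) / total
                        for j in range(dim)])
    return centres
```

With m = 2, a point at distances 1 and 3 from two centres receives memberships of about 0.9 and 0.1. A distant outlier that is equally far from both centres (for example, distances 10 and 10) still receives 0.5 for each cluster, which illustrates the noise sensitivity caused by the sum-to-one constraint.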

Possibilistic c-means (PCM) algorithm
Krishnapuram and Keller [10] relaxed the column sum constraint of FCM and offered a possibilistic approach to clustering to overcome the noise sensitivity problem of FCM. The objective function of possibilistic c-means (PCM) is defined as follows:

$$J_{PCM}(T, V) = \sum_{i=1}^{c}\sum_{k=1}^{n} t_{ik}^{m}\, d_{ik}^{2} + \sum_{i=1}^{c} \gamma_i \sum_{k=1}^{n} (1 - t_{ik})^{m}$$

where $\gamma_i$ is a positive parameter and the membership of FCM is replaced by the typicality $t_{ik}$. The first term requires the distances between data points and cluster centres to be as small as possible, while the second term requires $t_{ik}$ to be as large as possible, avoiding the trivial solution. The cluster centres of PCM are updated in the same way as in FCM, but the PCM typicality is updated as follows:

$$t_{ik} = \frac{1}{1 + \left(d_{ik}^{2}/\gamma_i\right)^{1/(m-1)}}$$
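The effect of dropping the sum-to-one constraint can be seen in a small Python sketch of the typicality update. It assumes the standard PCM form, in which the typicality depends only on the distance to one cluster and its scale parameter; the function name `pcm_typicality` and the argument name `gamma` are hypothetical.

```python
def pcm_typicality(dist, gamma, m=2.0):
    """Typicality of a point at distance `dist` from a cluster with scale `gamma` > 0."""
    # Depends only on this cluster's distance, not on the distances to other clusters.
    return 1.0 / (1.0 + (dist ** 2 / gamma) ** (1.0 / (m - 1.0)))
```

With `gamma = 1` and m = 2, a point at distance 1 has typicality 0.5, while an outlier at distance 10 has a typicality below 0.01 in every cluster, so it contributes very little to the centre estimates.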

Suppressed Fuzzy c-means (SFCM)
Fan et al. [14] proposed the suppressed fuzzy c-means (SFCM) clustering technique to solve the slow convergence of FCM on large datasets. By modifying the membership matrix in each iteration, the SFCM model brings a suppressed competitive learning mechanism into FCM as follows:

$$\mu_{wk} = 1 - \alpha_{fcm}\sum_{i \neq w}\mu_{ik} = 1 - \alpha_{fcm} + \alpha_{fcm}\,\mu_{wk}, \qquad \mu_{ik} = \alpha_{fcm}\,\mu_{ik}, \quad i \neq w \tag{7}$$

where $\mu_{wk}$ is the winner membership of data point $x_k$, i.e., its largest membership, and $\alpha_{fcm} \in [0, 1]$ is the suppression rate. In each SFCM iteration, the clusters compete for every input vector $x_k$. To overcome the problems associated with SFCM, Yu et al. [15] combined the suppression mechanism with the possibilistic approach to clustering in the suppressed possibilistic c-means (SPCM) algorithm. Here the suppression is applied to the typicality matrix of PCM rather than the membership matrix of FCM. This approach was shown to perform better than SFCM, but it suffers from the problem of coincident clusters.
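The suppression of one data point's memberships can be sketched in Python as follows. This is a minimal illustration that assumes the usual SFCM rule: non-winner memberships are scaled by the suppression rate and the winner receives the remainder, so the memberships still sum to 1. The function name `suppress_memberships` is hypothetical.

```python
def suppress_memberships(memberships, alpha):
    """Apply winner/non-winner suppression to the memberships of one data point.

    `memberships` must sum to 1; `alpha` in [0, 1] is the suppression rate.
    """
    # The winner is the cluster with the largest membership (first one on ties).
    winner = max(range(len(memberships)), key=lambda i: memberships[i])
    suppressed = [alpha * u for u in memberships]
    # The winner absorbs everything taken away from the non-winners.
    suppressed[winner] = 1.0 - alpha + alpha * memberships[winner]
    return suppressed
```

For memberships (0.6, 0.3, 0.1) and a suppression rate of 0.5, the result is approximately (0.8, 0.15, 0.05). A rate of 0 yields the hard assignment (1, 0, 0), and a rate of 1 leaves the memberships unchanged.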

Proposed Suppressed Possibilistic Fuzzy c-means (SPFCM)
In this paper, we apply the suppression concept of Fan et al. [14] and Yu et al. [15] to the PFCM clustering algorithm and build a new suppressed possibilistic fuzzy c-means (SPFCM) clustering algorithm. Pal et al. [16] proposed the PFCM algorithm to obtain a stronger candidate for fuzzy clustering. This method combines the fuzzy and possibilistic approaches, resulting in two types of memberships: (1) a fuzzy membership $\mu_{ik}$ that measures the relative degree of sharing of a point among clusters and (2) a possibilistic membership $t_{ik}$ that measures the absolute degree of typicality of a point in a given cluster. The PFCM algorithm is less sensitive to outliers and at the same time avoids coincident clusters. The objective function of the PFCM algorithm is as follows:

$$J_{PFCM}(U, T, V) = \sum_{k=1}^{n}\sum_{i=1}^{c} \left(a\,\mu_{ik}^{m} + b\,t_{ik}^{\eta}\right) d_{ik}^{2} + \sum_{i=1}^{c}\gamma_i\sum_{k=1}^{n}(1 - t_{ik})^{\eta}$$

subject to the membership constraint $\sum_{i=1}^{c}\mu_{ik} = 1$ for every data point $x_k$, where $a, b > 0$ are weighting constants, $m, \eta > 1$ are exponents, and $\gamma_i > 0$. Further, the suppression mechanism is introduced into the PFCM approach. In the proposed SPFCM algorithm, the clusters compete for the typicality and membership degrees of each data point, and the non-winning degrees are suppressed during the update phase.
The typicality matrix is updated as follows. The typicality $t_{wk}$ with the highest value for the given data point $x_k$ is selected among all $c$ clusters. The cluster $w$ is considered the closest cluster to the point $x_k$ and is declared the winner. The typicality $t_{wk}$ is referred to as the winner typicality, while the other typicalities $t_{ik}$, where $i \neq w$, are referred to as non-winning typicalities, as shown in Eq. (12).
The membership matrix is updated analogously, as given in Eq. (13).

Experimental results and discussion
In this section, comprehensive experiments are conducted to test the performance of the proposed SPFCM. Six datasets are used: two synthetic datasets and four real datasets from the UCI repository. First, the clustering results of SPFCM are compared with several classical and state-of-the-art fuzzy clustering methods. Finally, a discussion summarises the superiority, stability, and reliability of the proposed method. The following typical parameter values were used: m = 2, ε = 0.00001, and a maximum of 100 iterations.

Artificial Datasets
The first simulation experiment involves the DUNN dataset (Fig. 2), which consists of one small and one large square-shaped cluster. The SPFCM algorithm proceeds through the following steps:
S1: Initialize the number of clusters c, the partition matrix U(0), the typicality matrix T(0), the termination tolerance ε > 0 and the user-defined constants.
S2: Calculate the cluster prototypes using Eq. (11).
S3: Update the partition matrix using Eq. (10).
S4: Update the typicality matrix using Eq. (9).
S5: Introduce suppression in the typicality matrix using Eq. (12).
S6: Introduce suppression in the partition matrix using Eq. (13).
S7: Repeat the steps from S2 until the improvement of the objective function between two consecutive iterations is less than the termination tolerance ε.
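Steps S1 to S7 can be sketched as a compact Python loop. This is only an illustration under several assumptions that are not confirmed by the text: the PFCM updates take the standard form of Pal et al.; the typicality suppression keeps the winner's typicality and scales the non-winners by the suppression parameter; a single parameter `alpha` (with 0 < alpha ≤ 1) is used for both matrices; the scale `gamma` is a fixed constant; the loop is initialized from cluster centres instead of U(0) and T(0); and the stopping test uses the shift of the centres rather than the objective function. The function name `spfcm` and all argument names are hypothetical.

```python
import math


def spfcm(points, centres, alpha=0.5, a=1.0, b=1.0, m=2.0, eta=2.0,
          gamma=1.0, tol=1e-5, max_iter=100):
    """Hypothetical SPFCM loop; returns (centres, U, T) with U[i][k], T[i][k]."""
    c, n, dim = len(centres), len(points), len(points[0])
    for _ in range(max_iter):
        U = [[0.0] * n for _ in range(c)]
        T = [[0.0] * n for _ in range(c)]
        for k, x in enumerate(points):
            # A small floor avoids division by zero when a point sits on a centre.
            d = [max(math.dist(x, v), 1e-12) for v in centres]
            # S3: fuzzy memberships from distance ratios.
            u = [1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                 for i in range(c)]
            # S4: typicalities from absolute distances.
            t = [1.0 / (1.0 + (b * d[i] ** 2 / gamma) ** (1.0 / (eta - 1.0)))
                 for i in range(c)]
            w_t = max(range(c), key=lambda i: t[i])
            w_u = max(range(c), key=lambda i: u[i])
            for i in range(c):
                # S5 (assumed form): winner typicality kept, non-winners scaled.
                T[i][k] = t[i] if i == w_t else alpha * t[i]
                # S6: SFCM-style suppression, which keeps each column sum at 1.
                U[i][k] = 1.0 - alpha + alpha * u[i] if i == w_u else alpha * u[i]
        # S2: prototypes as means weighted by a*u^m + b*t^eta.
        new_centres = []
        for i in range(c):
            w = [a * U[i][k] ** m + b * T[i][k] ** eta for k in range(n)]
            total = sum(w)
            new_centres.append([sum(w[k] * points[k][j] for k in range(n)) / total
                                for j in range(dim)])
        # S7 (simplified): stop when the centres no longer move.
        shift = max(math.dist(p, q) for p, q in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, U, T
```

On two well-separated groups of points, for example three points near (0, 0) and three near (5, 5) with initial centres (1, 1) and (4, 4), the sketch moves each centre close to the mean of its group, and every column of U still sums to 1 after suppression.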

Number of clusters: 2 clusters (2-dimensional data)
The second simulation experiment involves the GAUSSIAN dataset, in which a Gaussian random number generator is used to create two clusters with outliers. The effect of noise and outliers on the different clustering algorithms can be seen in Fig. 3. In the case of FCM, SFCM and PFCM, the outlier points shift the partition towards the outliers. PCM gives good results, with one misclassified data point, when it is initialized with the FCM results. The proposed SPFCM performs better than FCM, SFCM, PFCM and SPCM, as it properly partitions the feature space and yields good clusters, just like PCM.
Besides cluster quality, we have also compared the convergence rate (number of iterations, ℓ) of the different clustering algorithms. It can be seen from Table 2 that although FCM takes the fewest iterations, it also has the most misclassified data points. Although PCM and the proposed method have the same number of misclassifications, the proposed algorithm converges in 20 iterations compared with 36 iterations for PCM. Compared with PFCM, the proposed method also produces a good partition into two clusters in fewer iterations.

Real Datasets
Experiments were performed on a number of real datasets taken from the UCI machine learning repository [20]: wine, seed, glass and abalone. The datasets were selected to vary in features, cluster shape, and dataset size. Huang's accuracy metric [21] was used to evaluate the clustering results:

$$r = \frac{1}{n}\sum_{i=1}^{c} a_i$$

where $a_i$ is the number of data points that occur in both the $i$-th cluster and its corresponding true class, and $n$ is the total number of data points.

The wine dataset consists of the chemical analysis of three types of wine from Italy. It contains 178 samples in three categories, each described by 13 attributes obtained from the chemical analysis. Table 3 shows an accuracy above 72% and an error of around 27% for SPFCM, while all the other algorithms have lower accuracy. The number of misclassified data points is 49, the lowest among all the algorithms.

The seed dataset contains measurements of seven geometric parameters of wheat kernels. The kernels are divided into three categories: Kama, Rosa and Canadian. Table 3 shows that the proposed SPFCM algorithm achieves the highest accuracy, 90.95%, clearly surpassing the other algorithms. Its error, close to 9%, is the lowest among all algorithms, with 19 misclassifications.

The glass dataset includes 214 instances, describing 7 categories of glass using 10 features. Table 3 shows that SPFCM achieves an accuracy of over 88% on the glass dataset, the highest among all the clustering algorithms used; both its error percentage and its number of misclassifications are also the lowest.

The abalone dataset is widely used to test clustering performance. It consists of physical measurements of abalones, which are large, edible sea snails. The dataset has three categories, 8 features and a total of 4,177 samples. Table 3 shows that the proposed SPFCM performs well on the large abalone dataset, with the maximum accuracy and minimum misclassification error compared with the other clustering algorithms.
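The accuracy figures reported for the real datasets can be related to misclassification counts with a short Python sketch. It assumes that the accuracy is the fraction of points whose cluster maps to their true class, and that the cluster-to-class correspondence is the one-to-one mapping with the most matches (searched exhaustively, which is only practical for a few clusters, and with no more clusters than classes). The function name `huang_accuracy` is hypothetical.

```python
from itertools import permutations


def huang_accuracy(true_labels, cluster_labels):
    """Fraction of points matched under the best one-to-one cluster-to-class mapping."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    best = 0
    # Try every assignment of distinct classes to the clusters and keep the best.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(1 for t, c in zip(true_labels, cluster_labels)
                      if mapping[c] == t)
        best = max(best, correct)
    return best / len(true_labels)
```

Under this definition, accuracy equals one minus the misclassification rate. For the wine dataset, 49 misclassified points out of 178 give 129/178 ≈ 72.5%, which matches the reported accuracy above 72% and error around 27%.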

Evaluation of convergence rate
The convergence rate of the proposed SPFCM is compared with that of different clustering algorithms on the real datasets to determine the efficiency of the proposed technique. The objective function values of the various clustering algorithms are plotted against the iteration number to show how many iterations each algorithm takes to converge. For the proposed SPFCM, the objective value decreases monotonically and converges in fewer iterations. Fig. 5 shows the convergence of the different clustering algorithms on the wine dataset. It can be observed from Fig. 5 that SPCM and SFCM converge more slowly than their conventional counterparts, whereas the proposed SPFCM converges faster than its conventional counterpart PFCM. Fig. 6 shows the convergence on the glass dataset. As shown in Fig. 6(f), the objective function of SPFCM not only converges faster than those of the other algorithms, but its value is also much lower. This demonstrates the efficiency of the proposed SPFCM over the other clustering algorithms in converging within a minimum number of iterations.

Effect of Suppression Parameter (α) on SPFCM Clustering
This section evaluates the performance of the proposed SPFCM algorithm for different values of the suppression parameter α. Fig. 7 shows the variation in accuracy with the value of α on the benchmark real datasets. The value of α ranges between 0 and 1 and controls the extent of suppression. The y-axis represents the percentage accuracy achieved and the x-axis represents the value of α. For the glass dataset, the average accuracy decreases as the value of α increases.

Conclusions
In this paper, we proposed a suppressed possibilistic fuzzy c-means (SPFCM) clustering approach that incorporates a suppression mechanism into the possibilistic and fuzzy parts of PFCM. The non-winner typicalities and memberships are suppressed to strengthen the effect of the winner memberships and typicalities, yielding a clustering approach with better performance. Experiments were conducted on several synthetic and real datasets to demonstrate the effectiveness of the proposed algorithm. In terms of clustering accuracy, error and misclassifications, the proposed SPFCM shows appreciable improvements over the existing clustering algorithms. The proposed SPFCM clustering proved more comprehensive than PFCM and the other clustering approaches, and it has theoretical and practical value for further research.

The parameter $m$ defines the degree of fuzziness of the resulting partitions. The membership function and cluster centres are obtained by minimizing the objective function of FCM under the probabilistic constraint:

$$\sum_{i=1}^{c}\mu_{ik} = 1, \quad k = 1, \dots, n$$

Differentiating the FCM objective function with respect to $\mu_{ik}$ and setting the derivative to zero gives the equation for the membership value, i.e.,

$$\mu_{ik} = \left[\sum_{j=1}^{c}\left(\frac{d_{ik}}{d_{jk}}\right)^{2/(m-1)}\right]^{-1}$$

Minimizing the objective function with respect to the cluster centre $v_i$ gives the equation for the cluster centres, i.e.,

$$v_i = \frac{\sum_{k=1}^{n}\mu_{ik}^{m}\,x_k}{\sum_{k=1}^{n}\mu_{ik}^{m}}$$

In SFCM, the membership $\mu_{wk}$ with the highest value for the given data point $x_k$ is selected among all $c$ clusters. The cluster $w$ is considered the closest cluster to the point $x_k$ and is declared the winner. The membership $\mu_{wk}$ is referred to as the winner membership, while the other memberships $\mu_{ik}$, where $i \neq w$, are referred to as non-winning memberships, as shown in Eq. (7). The suppression is controlled by the parameter $\alpha_{fcm}$, which ranges from 0 to 1. When $\alpha_{fcm} = 0$, all non-winner memberships are set to 0, and the SFCM algorithm becomes identical to the hard clustering algorithm. When $\alpha_{fcm} = 1$, there is no suppression, and the SFCM algorithm is the same as the FCM algorithm. With a reasonable suppression rate $\alpha_{fcm}$, SFCM can improve the convergence speed of FCM while maintaining good clustering accuracy.

In this competition, the cluster whose prototype is situated at the shortest distance from $x_k$ wins. The fuzzy membership of any data point $x_k$ with respect to any non-winner cluster is suppressed, while all suppressed parts are given to the winner cluster to preserve the probabilistic constraint. Fig. 1 illustrates the effect of suppression on the distance between a data point and the winner cluster centre. While the distances to the non-winner cluster centres remain the same, the distance to the winner is effectively shortened. In Fig. 1, $v_i$ are the cluster centres around a particular data point $x_k$, and $d_{ik}$ is the Euclidean distance between the data point and each cluster centre. Since $v_4$ is the centre closest to $x_k$, i.e., $d_{4k}$ is the smallest distance, it is subjected to suppression: its Euclidean distance is shortened, while the distances of the other centres remain the same. However, both FCM and SFCM suffer from the noise sensitivity problem due to the presence of the membership constraint.

EAI Endorsed Transactions on Scalable Information Systems 01 2023 - 04 2023 | Volume 10 | Issue 3 | e3

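Figure 1: Effect on distance caused by suppression

The constants $a$ and $b$ define the relative significance of the fuzzy and the possibilistic memberships in the PFCM objective function, while $m$ and $\eta$ determine the degree of fuzziness. The minimization of the objective function defines the typicality, membership and cluster centres as follows:

$$t_{ik} = \frac{1}{1 + \left(\dfrac{b\,d_{ik}^{2}}{\gamma_i}\right)^{1/(\eta-1)}} \tag{9}$$

$$\mu_{ik} = \left[\sum_{j=1}^{c}\left(\frac{d_{ik}}{d_{jk}}\right)^{2/(m-1)}\right]^{-1} \tag{10}$$

$$v_i = \frac{\sum_{k=1}^{n}\left(a\,\mu_{ik}^{m} + b\,t_{ik}^{\eta}\right)x_k}{\sum_{k=1}^{n}\left(a\,\mu_{ik}^{m} + b\,t_{ik}^{\eta}\right)} \tag{11}$$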


The membership $\mu_{wk}$ with the highest value for the given data point $x_k$ is selected among all $c$ clusters. The cluster $w$ is considered the closest cluster to the point $x_k$ and is declared the winner. The membership $\mu_{wk}$ is referred to as the winner membership, while the other memberships $\mu_{ik}$, where $i \neq w$, are referred to as non-winning memberships, as shown in Eq. (13). When the suppression parameter α equals 1, there is no suppression and the algorithm behaves like PFCM. The SPFCM algorithm proceeds through the basic steps S1 to S7.

Fig. 4 shows the convergence graphs of the different suppression-based clustering algorithms compared with their conventional counterparts.

Figure 5. Wine Dataset
Figure 6.
Fig. 2 displays the original DUNN data set and Table 1 lists the centroid locations of the different clustering algorithms.
*V is the calculated centre using the above-mentioned algorithms; ℓ represents the iteration number.

Table 3. Comparative analysis of SPFCM algorithm with other fuzzy clustering methods in terms of accuracy, misclassifications, and error.