Search Results

Issue Info: 
  • Year: 2020
  • Volume: 17
  • Issue: 2 (44)
  • Pages: 85-100
Measures: 
  • Citations: 0
  • Views: 500
  • Downloads: 0
Abstract: 

Clustering algorithms are highly dependent on several factors, such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, ensemble clustering is one approach to reducing the effect of these factors on the final clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in this paper we propose a method for using weighting in the ensemble clustering problem. The accuracies of the base clusterings are estimated using an algorithm from the crowdsourcing literature called the agreement/disagreement (AD) method. This method exploits the agreements or disagreements between different labelers to estimate their accuracies. It assumes that different labelers have labeled a set of samples, so every two labelers have an agreement ratio over their labeled samples. Under some independence assumptions, there is a closed-form formula for the agreement ratio between two labelers based on their accuracies. The AD method estimates the labelers' accuracies by minimizing the difference between the parametric agreement ratio given by the closed-form formula and the agreement ratio observed in the labels the labelers provide. To adapt the AD method to the clustering problem, an agreement between two clusterings is defined as having the same opinion about a pair of samples: if two clusterings agree that two samples should be in the same cluster, or agree that they should be in different clusters, this counts as an agreement. Then an optimization problem is solved to obtain the base clusterings' accuracies such that the difference between their observed agreement ratios and the expected agreements based on their accuracies is minimized. 
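
The pairwise-agreement idea can be sketched in a few lines. Under the independence assumptions, two clusterers with accuracies p_i and p_j are expected to agree on a binary pairwise decision (same cluster vs. different clusters) with ratio p_i·p_j + (1 − p_i)(1 − p_j); the sketch below recovers the accuracies by a least-squares fit of this model to the observed agreement ratios. This is a minimal numpy illustration with a plain gradient descent, not the paper's actual optimizer, and the synthetic "decisions" stand in for real clusterings' pairwise opinions.

```python
import numpy as np
from itertools import combinations

def observed_agreements(decisions):
    """Agreement ratio between every pair of labelers: the fraction of
    binary pairwise decisions (same cluster = 1, different = 0) on which
    the two labelers give the same answer."""
    m = len(decisions)
    A = np.eye(m)
    for a, b in combinations(range(m), 2):
        A[a, b] = A[b, a] = np.mean(decisions[a] == decisions[b])
    return A

def estimate_accuracies(A, iters=3000, lr=0.05):
    """Least-squares fit of accuracies p to the closed-form agreement model
    f_ij = p_i p_j + (1 - p_i)(1 - p_j), via plain gradient descent."""
    m = A.shape[0]
    p = np.full(m, 0.7)  # start on the >0.5 branch (the model is symmetric in p, 1-p)
    for _ in range(iters):
        f = np.outer(p, p) + np.outer(1 - p, 1 - p)
        r = A - f
        np.fill_diagonal(r, 0.0)
        grad = -2.0 * (r @ (2.0 * p - 1.0))
        p = np.clip(p - lr * grad, 0.51, 0.999)
    return p

# Synthetic check: three labelers with true accuracies 0.95, 0.85, 0.65.
rng = np.random.default_rng(0)
truth = rng.integers(0, 2, 2000)  # ground-truth pairwise decisions
decisions = [np.where(rng.random(2000) < acc, truth, 1 - truth)
             for acc in (0.95, 0.85, 0.65)]
p_hat = estimate_accuracies(observed_agreements(decisions))
```

With three labelers the model has as many equations as unknowns, so the fit recovers the accuracy ordering from agreements alone, without any ground truth.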
To generate the base clusterings, we use four different settings: different clustering algorithms, different distance measures, distributed features, and different numbers of clusters. The clustering algorithms used are mean shift, k-means, mini-batch k-means, affinity propagation, DBSCAN, spectral clustering, BIRCH, and agglomerative clustering with average and Ward linkages. For distance measures, we use the correlation, city block, cosine, and Euclidean measures. In the distributed-features setting, the k-means algorithm is run on 40%, 50%, … , and 100% of randomly selected features. Finally, for different numbers of clusters, we run the k-means algorithm with k equal to 2 and also to 50%, 75%, 100%, 150%, and 200% of the true number of clusters. We add the weights estimated by the AD algorithm to two well-known ensemble clustering methods, i.e., the Cluster-based Similarity Partitioning Algorithm (CSPA) and the HyperGraph Partitioning Algorithm (HGPA). In CSPA, the similarity matrix is computed by taking a weighted average of the opinions of the different clusterings. In HGPA, we propose to weight the hyperedges by different values, such as the estimated clustering accuracies, the size of clusters, and the silhouette of the clusterings. The experiments are performed on 13 real and artificial datasets. The reported evaluation measures include the adjusted Rand index, Fowlkes-Mallows index, mutual information, adjusted mutual information, normalized mutual information, homogeneity, completeness, V-measure, and purity. The results show that in the majority of cases, the proposed weighting-based method outperforms unweighted ensemble clustering. In addition, the weighting is more effective in improving the HGPA algorithm than CSPA. Among the different weighting methods proposed for the HGPA algorithm, the best average results are obtained when the accuracies estimated by the AD method are used to weight the hyperedges, and the worst results are obtained when the normalized silhouette measure is used for weighting. 
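
The weighted CSPA similarity matrix described above can be illustrated as a weighted co-association matrix: each base clustering contributes its 0/1 "same cluster" opinion for every pair of samples, averaged with its weight. This is a minimal numpy sketch; the hand-made label vectors stand in for the outputs of the algorithms listed above, and the weights are hypothetical AD-estimated accuracies.

```python
import numpy as np

def weighted_coassociation(clusterings, weights):
    """CSPA-style similarity matrix: a weighted average of the 0/1
    'same cluster' opinions of the base clusterings."""
    n = len(clusterings[0])
    S = np.zeros((n, n))
    for labels, w in zip(clusterings, weights):
        labels = np.asarray(labels)
        # (labels[:, None] == labels[None, :]) is 1 where two samples
        # share a cluster in this base clustering, 0 otherwise.
        S += w * (labels[:, None] == labels[None, :])
    return S / np.sum(weights)

# Three hand-made base clusterings of six samples.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
weights = [0.9, 0.6, 0.7]  # hypothetical AD-estimated accuracies
S = weighted_coassociation(base, weights)
```

A final partition would then be obtained by running a similarity-based clusterer (e.g. spectral clustering) on S; an accurate base clustering influences S more than a weak one.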
Finally, among the different methods for generating base clusterings, the best results with weighted HGPA are obtained when different clustering algorithms are used to produce the base clusterings.

Issue Info: 
  • Year: 2011
  • Volume: 3
  • Issue: 1
  • Pages: 34-41
Measures: 
  • Citations: 0
  • Views: 764
  • Downloads: 0
Abstract: 

Introduction: Identifying the factors that influence educational performance is of particular importance, and there are numerous controversies about the role of these factors. This study was therefore carried out to assess the predictors of academic performance in midwifery alumni of Kashan University of Medical Sciences (KAUMS). Methods: In this cross-sectional study, all midwifery alumni of KAUMS from 1995 until the end of 2003 (n=76) were selected via the census method. A questionnaire covering demographic data, admission criteria, educational background, and academic performance status was used. Data were analyzed using the Mann-Whitney U test and stepwise multivariate linear regression analysis. Results: The findings show that being native (p<0.002), lack of a temporary transfer to another university (p<0.038), lower age (p<0.01), and higher high school grade point average (HGPA) (p<0.0001) had a positive effect on academic performance. Regression analysis showed that 43.6% of the observed variance in academic performance was determined by two variables, HGPA and place of residence (F=26.242, p<0.0001), with HGPA alone explaining 35% of the total variance. Conclusion: Given the predictive role of educational background and place of residence in academic performance, it is proposed that HGPA and native selection be considered in the acceptance of university students.

Issue Info: 
  • Year: 2010
  • Volume: 3
  • Issue: 3
  • Pages: 127-131
Measures: 
  • Citations: 0
  • Views: 815
  • Downloads: 0
Abstract: 

Aims: Identifying the factors that influence educational performance is of particular importance. Given the controversies about the role of these factors, this study was carried out to assess the predictors of academic performance in nursing alumni of Kashan University of Medical Sciences. Methods: In this cross-sectional study, all 482 nursing alumni of Kashan University of Medical Sciences from 1986 to the end of 2003 were selected via the census method. A questionnaire covering demographic data, admission criteria, educational background, and academic performance status was used. Data were analyzed using the independent t-test and stepwise multivariate linear regression. Results: Female gender, use of the regional quota, lower age, and higher high school grade point average (HGPA) had a positive effect on academic performance (p<0.0001). According to the regression analysis, 1.27% of the observed variance in academic performance was determined by four variables, HGPA, gender, transfer from another university, and quota type (F=38.72; p<0.0001); HGPA alone explained 22.4% of the total variance. Conclusion: Given the predictive role of educational background, admission criteria, and demographic data in academic performance, it is essential that the authorities plan and proceed to improve the academic performance of nursing alumni.

Issue Info: 
  • Year: 2019
  • Volume: 15
  • Issue: 4 (38)
  • Pages: 17-30
Measures: 
  • Citations: 0
  • Views: 253
  • Downloads: 0
Abstract: 

Clustering is the process of dividing a dataset into subsets, called clusters, such that objects within a cluster are similar to each other and different from objects in the other clusters. Many algorithms from different approaches have been created for clustering, and an effective approach can combine two or more of these algorithms to solve the clustering problem. Ensemble clustering combines the results of existing clusterings to achieve better performance and higher accuracy. Instead of combining all the existing clusterings, research in the last decade shows that if only a set of clusterings is selected based on quality and diversity, the result of the ensemble clustering is more accurate. This paper proposes a new method for ensemble clustering based on quality and diversity. For this purpose, we first need many different base clusterings to combine. The base clusterings are generated by the k-means algorithm with a random k in each execution. After the base clusterings are generated, they are put into different groups according to their similarities using a new grouping method, so that clusterings similar to each other are placed in the same group. In this step, we use normalized mutual information (NMI) or the adjusted Rand index (ARI) to compute the similarities and dissimilarities between the base clusterings. Then, from each group, the best-qualified clustering is selected via a voting-based method in which cluster validity indices measure the quality of a clustering: all members of the group are evaluated by the validity indices, and in each group the clustering that optimizes the largest number of them is selected. Finally, a consensus function combines all the selected clusterings. A consensus function is an algorithm for combining existing clusterings to produce the final clusters. In this paper, three consensus functions, CSPA, MCLA, and HGPA, are used for combining the clusterings. 
To evaluate the proposed method, real datasets from the UCI repository are used. In the experiments section, the proposed method is compared with well-known and powerful existing methods. Experimental results demonstrate that the proposed algorithm has better performance and higher accuracy than previous works.
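
The generate-group-select pipeline can be sketched as follows. This is a minimal scikit-learn illustration under stated assumptions: the greedy NMI grouping and the single silhouette index are simple stand-ins for the paper's grouping method and its vote over several validity indices, and the blob data is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score, silhouette_score

X, _ = make_blobs(n_samples=120, centers=3, random_state=1)

# Step 1: base clusterings from k-means with a random k per run.
rng = np.random.default_rng(1)
base = [KMeans(n_clusters=int(k), n_init=5, random_state=i).fit_predict(X)
        for i, k in enumerate(rng.integers(2, 8, size=10))]

# Step 2: greedy grouping by NMI similarity; a clustering joins the first
# group whose representative is similar enough, else it starts a new group.
groups = []
for labels in base:
    for g in groups:
        if normalized_mutual_info_score(labels, g[0]) >= 0.7:
            g.append(labels)
            break
    else:
        groups.append([labels])

# Step 3: keep the best member of each group by a validity index
# (silhouette here, as a stand-in for the multi-index vote).
selected = [max(g, key=lambda lb: silhouette_score(X, lb)) for g in groups]
```

The `selected` clusterings would then be handed to a consensus function such as CSPA, MCLA, or HGPA; the grouping step supplies diversity while the per-group selection supplies quality.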
