VelvetFlow: An engineering pipeline for robust multi-density clustering | ||
| Journal of Discrete Mathematics and Its Applications | ||
| Article 3, Volume 10, Issue 4, Esfand 2025, Pages 333-358 | ||
| Article type: Full Length Article | ||
| DOI: 10.22061/jdma.2025.12039.1131 | ||
| Authors | ||
| Hossein Eyvazi*; Mohammad Badzohreh; Seyed Ali Shahrokhi | ||
| Department of Computer Science, University of Tarbiat Modares, Tehran, I. R. Iran | ||
| Received: 20 Ordibehesht 1404; Revised: 15 Tir 1404; Accepted: 18 Mordad 1404 (Solar Hijri calendar) | ||
| Abstract | ||
| Problem. Real-world datasets seldom respect a single density scale: tight blobs, elongated ribbons, and isolated points often coexist. Classical algorithms such as DBSCAN or $k$-means require domain-specific parameter tuning and provide only ad hoc support for anomaly detection. Solution. We introduce VelvetFlow, an engineering pipeline that turns a set of well-understood building blocks into a cohesive, end-to-end workflow for multi-density clustering and principled outlier detection. The pipeline is composed of three reusable stages: (i) Contextual-density splitting assigns every point to a high- or low-density partition using a single neighbourhood size $k$. (ii) Density-aware clustering applies a Jaccard-guided FusedNeighbor+DBSCAN routine to the sparse partition and HDBSCAN to the dense partition, without introducing new hyper-parameters. (iii) Scaled-MST verification re-examines the complete $k$-NN graph, flags weakly connected components, and validates them with a $k$-NN gate; this step recovers small remote clusters while filtering genuine anomalies. | ||
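Stage (i) of the pipeline can be illustrated with a minimal sketch. The abstract does not say how the density of a point is scored or where the high/low cut is placed, so the mean $k$-NN distance score and the median threshold used here are assumptions, and the function names `knn_mean_distance` and `split_by_density` are hypothetical. This is an illustration of a single-parameter density split, not the authors' implementation.

```python
import numpy as np


def knn_mean_distance(X, k):
    """Mean Euclidean distance from each point to its k nearest neighbours.

    Brute-force O(n^2) computation, suitable only for small illustrative data.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbour
    nearest = np.sort(dist, axis=1)[:, :k]  # k smallest distances per row
    return nearest.mean(axis=1)


def split_by_density(X, k):
    """Boolean mask over the rows of X: True = dense partition, False = sparse.

    ASSUMPTION: the median of the k-NN scores is used as the cut-off;
    the abstract does not specify the actual rule.
    """
    score = knn_mean_distance(X, k)
    return score <= np.median(score)
```

Calling `split_by_density(X, k=5)` on an `(n, d)` array returns a mask that could then route the dense rows to one clusterer and the sparse rows to another, which mirrors the hand-off to stage (ii) described above. The only user-facing parameter is `k`, in line with the abstract's claim of a single neighbourhood size.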
| Keywords | ||
| multi-density clustering; outlier detection; HDBSCAN; DBSCAN; MST; fused neighbor | ||