Department of Engineering, Bozorgmehr University of Qaenat, Qaen, South Khorasan, Iran.
Received: 30 Dey 1404
Revised: 07 Khordad 1405
Accepted: 21 Khordad 1405
Abstract
Background and Objectives: Modern operating systems struggle to manage threads with dynamic resource demands, because traditional schedulers rely on reactive heuristics that often misclassify thread behavior. This paper introduces a proactive thread classification methodology that predicts a thread's resource-bound category by analyzing kernel event streams in real time.
Methods: The proposed five-step pipeline consists of (1) kernel event collection using LTTng, (2) categorization of 57 system calls into a seven-category taxonomy, (3) PID/TID labeling based on resource usage, (4) feature extraction from the first five events, and (5) predictive modeling with multiple machine learning classifiers.
Results: We evaluate six machine learning models: Random Forest, LightGBM, Stacked Ensemble, MLP, CNN-BiLSTM, and BERT. Random Forest offers the best balance of predictive performance (93.4% precision, 92.5% recall) and inference latency (178 µs), outperforming both the other ensemble methods and the computationally expensive deep learning architectures. When applied to a real-world dataset [30], the methodology achieves 89% precision in thread classification, which translates into system-level improvements: a 41% reduction in tail latency for interactive applications and sustained 93% CPU utilization for CPU-bound tasks.
Conclusion: The proposed proactive methodology accurately predicts a thread's future resource-bound category within a 100 µs window from the start of its execution. Through its five-step pipeline, the approach translates fine-grained system call sequences into predictive signatures of resource constraints, for example identifying I/O-bound threads from their read/write patterns. This early detection gives operating system schedulers a timely and actionable basis for adjusting thread prioritization and resource allocation ahead of time, thereby improving overall system performance and responsiveness.
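As a rough illustration of steps (2) and (4) of the pipeline, the following minimal Python sketch maps system calls to categories and builds a per-thread feature vector from the first five events. It is a sketch under stated assumptions, not the authors' implementation: the category names, the `SYSCALL_CATEGORY` mapping, the `(tid, syscall)` event format, and the use of category counts as features are all hypothetical, since the abstract does not list the actual taxonomy or feature definitions.

```python
from collections import Counter

# Hypothetical mapping: the paper's taxonomy covers 57 system calls in seven
# categories, but the abstract does not enumerate them. These are placeholders.
SYSCALL_CATEGORY = {
    "read": "file_io",
    "write": "file_io",
    "openat": "file_io",
    "recvfrom": "network",
    "sendto": "network",
    "mmap": "memory",
    "brk": "memory",
    "futex": "sync",
    "clone": "process",
    "nanosleep": "time",
}
# Seven illustrative category names (assumed, not taken from the paper).
CATEGORIES = ["file_io", "network", "memory", "sync", "process", "time", "other"]
WINDOW = 5  # the abstract states that features come from the first five events


def first_events_by_thread(events, window=WINDOW):
    """Group (tid, syscall) events by thread, keeping the first `window` per thread."""
    per_thread = {}
    for tid, syscall in events:
        seq = per_thread.setdefault(tid, [])
        if len(seq) < window:
            seq.append(syscall)
    return per_thread


def feature_vector(syscalls):
    """Count how often each category appears among a thread's early syscalls."""
    counts = Counter(SYSCALL_CATEGORY.get(s, "other") for s in syscalls)
    return [counts[c] for c in CATEGORIES]


# Toy interleaved event stream standing in for a decoded kernel trace.
events = [
    (101, "openat"), (101, "read"), (102, "futex"), (101, "read"),
    (101, "write"), (102, "mmap"), (101, "read"), (101, "read"),
]
features = {
    tid: feature_vector(seq)
    for tid, seq in first_events_by_thread(events).items()
}
```

In this toy trace, thread 101 issues only file-related calls in its first five events, so its vector is concentrated in the `file_io` slot, which is the kind of early signature a classifier such as Random Forest could then use in step (5).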