1. Combining KNN and Sentence Transformers for Article Categorization
1.1 Abstract
1.2 Document
This paper presents a novel approach to categorizing news articles that combines K-Nearest Neighbors (KNN) classification with Sentence Transformers. The primary aim of this study is to develop a method that efficiently organizes large volumes of news articles based on their titles, making information easier to access and improving the user experience. We use the SentenceTransformers framework to encode article titles as numerical embeddings, which are then classified by a custom KNN algorithm designed for tensor data. The approach relies on the principle that articles with similar titles often share the same category. Our results show a high accuracy score, indicating that the method is effective for categorizing news articles. Compared with more complex NLP systems, our model is simple and offers a computationally efficient alternative while maintaining a high level of accuracy. The paper also discusses the limitations of our approach, particularly with respect to domain invariance, and suggests potential areas for future improvement. This study contributes to the field of natural language processing by demonstrating how traditional machine learning techniques can be applied to modern challenges in information categorization.
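The pipeline summarized above (embed each title, then label a new title by a vote among its nearest embedded neighbors) can be illustrated with a minimal sketch. It is not the custom tensor-based KNN used in this study, whose details are not given here. Cosine similarity as the distance measure, the simple majority vote, the helper names `cosine_similarity`, `knn_predict`, and `embed_titles`, the default `k=3`, and the model name `all-MiniLM-L6-v2` are all assumptions made for illustration. The toy 2-D vectors stand in for real title embeddings so that the example runs without downloading a model.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_predict(train_vecs, train_labels, query_vec, k=3):
    """Return the majority label among the k training vectors most similar to query_vec."""
    scored = [
        (cosine_similarity(vec, query_vec), label)
        for vec, label in zip(train_vecs, train_labels)
    ]
    # Highest similarity first; keep only the k nearest neighbors.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    nearest_labels = [label for _, label in scored[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def embed_titles(titles, model_name="all-MiniLM-L6-v2"):
    """Encode titles with SentenceTransformers (assumed model name; not called below).

    The import is local so the rest of the sketch runs without the library installed.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(titles).tolist()


# Toy 2-D "embeddings" standing in for real title vectors (illustrative data only).
train_vecs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
train_labels = ["sports", "sports", "politics", "politics"]

prediction = knn_predict(train_vecs, train_labels, [0.8, 0.3], k=3)
print(prediction)
```

In practice the toy vectors would be replaced by the output of `embed_titles` on the labeled training titles, and the query vector by the embedding of the title to be categorized. The choice of `k` and of the similarity measure would need to be tuned on held-out data.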