Evaluating embedding models for text classification in apartment management
Abstract
The recent proliferation of embedding models has made text classification far more accessible. The crucial challenge, however, is evaluating and selecting the most effective embedding model for a specific domain from a vast number of options. In this study, we address this challenge by assessing embedding models according to their effectiveness in downstream tasks. We analyze consultation records maintained by an apartment management body in South Korea and convert this textual data into numerical representations using various embedding models. The vectorized text is then grouped with a k-means clustering algorithm. The downstream task, namely the classification of consultation records, is evaluated using a quantitative metric (the Silhouette score) and qualitative approaches (domain-specific knowledge and visual inspection). The qualitative approaches yield more reliable results than the quantitative metric. These findings are expected to be valuable for the various stakeholders in property management.
Keywords: embedding model, text data, clustering, domain-specific knowledge, apartment management
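
To make the pipeline described in the abstract concrete, the following minimal sketch (not the authors' code) embeds a few example consultation texts, clusters the vectors with k-means, and computes the Silhouette score. The library choices (sentence-transformers, scikit-learn), the model name, the number of clusters, and the sample texts are illustrative assumptions only.

    # Minimal sketch of the embed -> cluster -> evaluate pipeline (illustrative only).
    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    texts = [
        "Noise complaint between floors in building 103",       # sample texts are
        "Question about long-term repair plan contributions",   # invented examples,
        "Request to review last month's management expenses",   # not real records
        "Elevator maintenance schedule inquiry",
    ]

    # 1) Convert text to dense vectors with an embedding model (model name is a
    #    hypothetical choice, not one reported in the paper).
    model = SentenceTransformer("all-mpnet-base-v2")
    embeddings = model.encode(texts)

    # 2) Group the vectors with k-means; k is chosen here for illustration only.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

    # 3) Quantitative check: Silhouette score lies in [-1, 1], higher is better.
    #    The paper argues this metric should be complemented by qualitative review.
    print(silhouette_score(embeddings, labels))
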

This work is licensed under a Creative Commons Attribution 4.0 International License.