The semantic power of text content as a flow of a vector field of embeddings

Сташків, Віктор; Хамарчук, Андрій; Чорнописький, Кирило; Шумейко, Владислав; Чорняк, Максим; Ярош, Каріна; Церковнюк, Валентина; Пастух, Олег; Stashkiv, Viktor; Khamarchuk, Andrii; Chornopyskyi, Kyrylo; Shumeiko, Vladyslav; Chorniak, Maksym; Yarosh, Karina; Tserkovniuk, Valentyna; Pastukh, Oleh

doi:https://doi.org/10.33108/visnyk_tntu2025.04.110

Please use this identifier to cite or link to this item: http://elartu.tntu.edu.ua/handle/lib/51959

Full metadata record

DC Field	Value	Language
dc.contributor.author	Сташків, Віктор
dc.contributor.author	Хамарчук, Андрій
dc.contributor.author	Чорнописький, Кирило
dc.contributor.author	Шумейко, Владислав
dc.contributor.author	Чорняк, Максим
dc.contributor.author	Ярош, Каріна
dc.contributor.author	Церковнюк, Валентина
dc.contributor.author	Пастух, Олег
dc.contributor.author	Stashkiv, Viktor
dc.contributor.author	Khamarchuk, Andrii
dc.contributor.author	Chornopyskyi, Kyrylo
dc.contributor.author	Shumeiko, Vladyslav
dc.contributor.author	Chorniak, Maksym
dc.contributor.author	Yarosh, Karina
dc.contributor.author	Tserkovniuk, Valentyna
dc.contributor.author	Pastukh, Oleh
dc.date.accessioned	2026-03-23T16:17:36Z	-
dc.date.available	2026-03-23T16:17:36Z	-
dc.date.created	2025-12-23
dc.date.issued	2025-12-23
dc.date.submitted	2025-08-19
dc.identifier.citation	The semantic power of text content as a flow of a vector field of embeddings / Viktor Stashkiv, Andrii Khamarchuk, Kyrylo Chornopyskyi, Vladyslav Shumeiko, Maksym Chorniak, Karina Yarosh, Valentyna Tserkovniuk, Oleh Pastukh // Scientific Journal of TNTU. — Tern. : TNTU, 2025. — Vol 120. — No 4. — P. 110–119.
dc.identifier.issn	2522-4433
dc.identifier.uri	http://elartu.tntu.edu.ua/handle/lib/51959	-
dc.description.abstract	Зростаючий обсяг текстової інформації вимагає передових методів оцінювання ефективності контенту та його семантичної структури. Існуючі техніки опрацювання природної мови (NLP) часто не надають метрик для вимірювання внутрішньої «семантичної інтенсивності» або концептуальної узгодженості. Ця стаття представляє «семантичну силу» – нову кількісну характеристику, розроблену для аналізу концептуальної структури та смислової насиченості текстів на основі принципів теорії поля. Методологія базується на теоремі Остроградського-Гауса та операторі дивергенції, встановлюючи звʼязок між локальними семантичними властивостями тексту (на основі векторних ембедингів LaBSE) та їхнім глобальним впливом. Підхід включає обчислення семантичного центроїда як точки найбільшої концентрації смислу та кількісну оцінку семантичної сили за допомогою моделі, що враховує обернено-квадратичний спад впливу векторів. Для подальшого аналізу застосовуються кластеризація методом Gaussian Mixture Models та візуалізація за допомогою методу головних компонент (PCA). Експерименти, проведені на філософських текстах видатних мислителів Нового часу, таких як Готфрід Вільгельм Лейбніц, Рене Декарт та Іммануїл Кант, продемонстрували чіткі та значущі відмінності у значеннях семантичної сили (0.6010, 0.5633 та 0.5787 відповідно) та у сформованих патернах кластеризації (2, 7 та 2 кластери). Результати показують, що ці показники не лише є числовими характеристиками, а й корелюють з відомими особливостями інтелектуального стилю та методології кожного з авторів. Таким чином, «семантична сила» виступає як потужний і об’єктивний інструмент для оцінювання глибинних когнітивних та семантичних характеристик тексту, відкриваючи потенційні можливості для широкого спектру застосувань у філології, когнітивістиці, комп’ютерній лінгвістиці та інших суміжних галузях.
dc.description.abstract	The growing volume of textual data demands advanced methods for evaluating both content effectiveness and semantic structure. While current Natural Language Processing (NLP) techniques offer powerful tools, they often lack metrics for quantifying intrinsic semantic intensity or conceptual coherence. This paper introduces "semantic power", a novel quantitative measure designed to analyze the conceptual structure and semantic richness of texts, grounded in principles of field theory. The proposed methodology draws on the Ostrogradsky–Gauss theorem and the divergence operator, establishing a theoretical link between local semantic properties of a text (derived from LaBSE vector embeddings) and their global influence. The approach involves computing a semantic centroid, representing the point of highest meaning concentration, and measuring semantic power using a model that assumes an inverse-square decay of vector influence. For further analysis, Gaussian Mixture Model (GMM) clustering is applied, and Principal Component Analysis (PCA) is used for dimensionality reduction and visualization. Experiments on philosophical texts by key Early Modern thinkers (G. W. Leibniz, R. Descartes, and I. Kant) reveal distinct and meaningful variations in semantic power (0.6010, 0.5633, and 0.5787, respectively) and in the resulting clustering patterns (2, 7, and 2 clusters). These findings suggest that semantic power is not merely a numerical descriptor but one that correlates with established intellectual styles and methodological orientations of the authors. As such, semantic power emerges as a powerful and objective metric for assessing the deep cognitive and semantic dimensions of textual content, with potential applications in philology, cognitive science, computational linguistics, and related disciplines.
dc.format.extent	110-119
dc.language.iso	en
dc.publisher	ТНТУ
dc.publisher	TNTU
dc.relation.ispartof	Вісник Тернопільського національного технічного університету, 4 (120), 2025
dc.relation.ispartof	Scientific Journal of the Ternopil National Technical University, 4 (120), 2025
dc.relation.uri	https://doi.org/10.1609/aaai.v33i01.33017370
dc.relation.uri	https://doi.org/10.1007/s11192-021-03984-1
dc.relation.uri	https://doi.org/10.1007/s44196-023-00337-z
dc.relation.uri	https://doi.org/10.1145/3308558.3313516
dc.relation.uri	https://doi.org/10.18653/v1/N18-1136
dc.relation.uri	https://doi.org/10.1016/j.knosys.2023.111303
dc.relation.uri	https://doi.org/10.54569/aair.1142568
dc.relation.uri	https://doi.org/10.3390/app122110792
dc.relation.uri	https://doi.org/10.48175/IJARSCT-3029
dc.relation.uri	https://doi.org/10.18653/v1/2024.semeval-1.124
dc.relation.uri	https://doi.org/10.18653/v1/2022.acl-long.62
dc.relation.uri	https://doi.org/10.1109/TASLP.2020.3012062
dc.relation.uri	https://doi.org/10.1016/j.ipm.2023.103529
dc.relation.uri	https://elartu.tntu.edu.ua/handle/lib/22368
dc.relation.uri	https://doi.org/10.1007/978-3-030-27947-9_18
dc.subject	текстовий аналіз
dc.subject	опрацювання природної мови
dc.subject	семантична сила
dc.subject	вектори-ембединги
dc.subject	семантичний простір
dc.subject	дивергенція
dc.subject	кластеризація
dc.subject	теорія поля
dc.subject	великі мовні моделі
dc.subject	трансформери
dc.subject	text analysis
dc.subject	natural language processing
dc.subject	semantic power
dc.subject	vector embeddings
dc.subject	semantic space
dc.subject	divergence
dc.subject	clustering
dc.subject	field theory
dc.subject	large language models
dc.subject	transformers
dc.title	The semantic power of text content as a flow of a vector field of embeddings
dc.title.alternative	Семантична сила текстового контенту як потік поля векторів-ембедингів
dc.type	Article
dc.rights.holder	© Ternopil Ivan Puluj National Technical University, 2025
dc.coverage.placename	Тернопіль
dc.coverage.placename	Ternopil
dc.format.pages	10
dc.subject.udc	004.82
dc.relation.referencesen	1. Yao L., Mao C., & Luo Y. (2019). Graph convolutional networks for text classification. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 7370–7377. https://doi.org/10.1609/aaai.v33i01.33017370
dc.relation.referencesen	2. Kozlowski D., Dusdal J., Pang J., & Zilian A. (2021). Semantic and relational spaces in science of science: Deep learning models for article vectorisation. Scientometrics. https://doi.org/10.1007/s11192-021-03984-1
dc.relation.referencesen	3. Liu B., Guan W., Yang C., Fang Z., & Lu Z. (2023). Transformer and graph convolutional network for text classification. International Journal of Computational Intelligence Systems, 16 (1). https://doi.org/10.1007/s44196-023-00337-z
dc.relation.referencesen	4. Wang B., Li Q., Melucci M., & Song D. (2019). Semantic Hilbert space for text representation learning. In The World Wide Web Conference. ACM Press. https://doi.org/10.1145/3308558.3313516
dc.relation.referencesen	5. Vyas Y., Niu X., & Carpuat M. (2018). Identifying semantic divergences in parallel text without annotations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/N18-1136
dc.relation.referencesen	6. Zeng D., Zha E., Kuang J., & Shen Y. (2024). Multi-label text classification based on semantic-sensitive graph convolutional network. Knowledge-Based Systems, 284, 111303. https://doi.org/10.1016/j.knosys.2023.111303
dc.relation.referencesen	7. Tekgöz H., İlhan Omurca S., Koç K. Y., Topçu U., & Çelik O. (2022). Semantic similarity comparison between production line failures for predictive maintenance. Advances in Artificial Intelligence Research. https://doi.org/10.54569/aair.1142568
dc.relation.referencesen	8. Premalatha M., Viswanathan V., & Čepová L. (2022). Application of semantic analysis and LSTM-GRU in developing a personalized course recommendation system. Applied Sciences, 12 (21), 10792. https://doi.org/10.3390/app122110792
dc.relation.referencesen	9. Narendra G. O., & Hashwanth S. (2022). Named entity recognition based resume parser and summarizer. International Journal of Advanced Research in Science, Communication and Technology, 728–735. https://doi.org/10.48175/IJARSCT-3029
dc.relation.referencesen	10. Venkatesh D., & Raman S. (2024). BITS Pilani at SemEval-2024 Task 1: Using text-embedding-3-large and LaBSE embeddings for semantic textual relatedness. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.semeval-1.124
dc.relation.referencesen	11. Feng F., Yang Y., Cer D., Arivazhagan N., & Wang W. (2022). Language-agnostic BERT sentence embedding. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.62
dc.relation.referencesen	12. Kesiraju S., Plchot O., Burget L., & Gangashetty S. V. (2020). Learning document embeddings along with their uncertainties. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 2319–2332. https://doi.org/10.1109/TASLP.2020.3012062
dc.relation.referencesen	13. Hu C., Wu T., Liu S., Liu C., Ma T., & Yang F. (2024). Joint unsupervised contrastive learning and robust GMM for text clustering. Information Processing & Management, 61 (1), 103529. https://doi.org/10.1016/j.ipm.2023.103529
dc.relation.referencesen	14. Chesanovsky I., & Levhunets D. (2017). Representation of narrow-band radio signals with angular modulation in trunked radio systems using the principal component analysis. Scientific Journal of the Ternopil National Technical University, 86 (2), 117–121. https://elartu.tntu.edu.ua/handle/lib/22368
dc.relation.referencesen	15. Musil T. (2019). Examining structure of word embeddings with PCA. In Text, Speech, and Dialogue. Springer International Publishing. https://doi.org/10.1007/978-3-030-27947-9_18
dc.identifier.doi	https://doi.org/10.33108/visnyk_tntu2025.04.110
dc.contributor.affiliation	Тернопільський національний технічний університет імені Івана Пулюя, Тернопіль, Україна
dc.contributor.affiliation	Ternopil Ivan Puluj National Technical University, Ternopil, Ukraine
dc.citation.journalTitle	Вісник Тернопільського національного технічного університету
dc.citation.volume	120
dc.citation.issue	4
dc.citation.spage	110
dc.citation.epage	119
dc.identifier.citation2015	The semantic power of text content as a flow of a vector field of embeddings / Stashkiv V. et al. // Scientific Journal of TNTU, Ternopil. 2025. Vol 120. No 4. P. 110–119.
dc.identifier.citationenAPA	Stashkiv, V., Khamarchuk, A., Chornopyskyi, K., Shumeiko, V., Chorniak, M., Yarosh, K., Tserkovniuk, V., & Pastukh, O. (2025). The semantic power of text content as a flow of a vector field of embeddings. Scientific Journal of the Ternopil National Technical University, 120(4), 110-119. TNTU.
dc.identifier.citationenCHICAGO	Stashkiv V., Khamarchuk A., Chornopyskyi K., Shumeiko V., Chorniak M., Yarosh K., Tserkovniuk V., Pastukh O. (2025) The semantic power of text content as a flow of a vector field of embeddings. Scientific Journal of the Ternopil National Technical University (Tern.), vol. 120, no 4, pp. 110-119.
Appears in Collections:	Вісник ТНТУ, 2025, № 4 (120)
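
The abstract in this record describes computing a semantic centroid and scoring a text with a model that assumes inverse-square decay of vector influence. The record does not give the paper's exact formula, so the following is only a minimal sketch under stated assumptions: the centroid is taken as the component-wise mean of the embeddings, and the score is a hypothetical aggregate, the mean of 1 / (eps + d^2), where d is each vector's Euclidean distance to the centroid. The names `centroid`, `inverse_square_score`, and `eps` are illustrative and do not come from the article. Toy 2-D vectors stand in for LaBSE sentence embeddings.

```python
def centroid(vectors):
    """Component-wise mean of equally sized vectors (assumed centroid definition)."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def inverse_square_score(vectors, eps=1.0):
    """Hypothetical aggregate, not the paper's formula.

    Each vector contributes 1 / (eps + d^2), where d is its Euclidean
    distance to the centroid; eps keeps the term finite when d is zero.
    """
    c = centroid(vectors)
    total = 0.0
    for v in vectors:
        d2 = sum((a - b) ** 2 for a, b in zip(v, c))
        total += 1.0 / (eps + d2)
    return total / len(vectors)


# Toy 2-D "embeddings"; real LaBSE sentence vectors have 768 dimensions.
tight = [[1.0, 0.0], [1.0, 0.2], [0.9, 0.1]]
spread = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]

# Under this assumed model, vectors packed near their centroid score higher.
print(inverse_square_score(tight) > inverse_square_score(spread))
```

Under this assumed scoring, a text whose sentence vectors sit close to their centroid scores near 1, and a widely scattered one scores lower. The GMM clustering and PCA steps named in the abstract are not sketched here.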

Files in This Item:

File	Description	Size	Format
TNTUSJ_2025v120n4_Stashkiv_V-The_semantic_power_of_110-119.pdf		3.16 MB	Adobe PDF
TNTUSJ_2025v120n4_Stashkiv_V-The_semantic_power_of_110-119__COVER.png		1.3 MB	image/png

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.