Clustering Keywords to Identify Concepts in Texts: An Analysis of Research Articles in Applied Linguistics

Punjaporn Pojanapunya

Abstract

Keyword analysis is one of the most widely used methods in corpus linguistics. It generates keywords that indicate the concepts in a text or corpus. Keyword analysis tools commonly present the resulting keywords as a list, which indicates rather poorly what the corpus is about because interpreting it typically requires the analyst’s knowledge of the conceptual associations between keywords. Common follow-up methods therefore examine concordances, collocational patterns, and other patterns of association between keywords and their contexts. This study focuses on the associations within a group of keywords by representing a keyword list as keyword clusters. The keywords for analysis were generated from two corpora: the target corpus was compiled from research articles in applied linguistics, and the comparative corpus was a collection of research in pure and applied sciences. The relationships among the top 30 keywords were identified using mutual information scores for all possible pairs of keywords within a span of 20, and these scores were used as input for creating keyword clusters. The representations of the 30 keywords as a list and as clusters are presented and discussed.
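
The pairwise scoring and grouping described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation, and it rests on several assumptions the abstract does not confirm: the span is read as 20 tokens on either side of a keyword, mutual information is computed as log2(observed / expected) with expected = f(x) · f(y) · 2 · span / N, and the clustering step is replaced by a simple stand-in that links keywords whose score exceeds a threshold. The function names `pairwise_mi` and `cluster_keywords` and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def pairwise_mi(tokens, keywords, span=20):
    """Return {(kw_a, kw_b): MI} for distinct keywords co-occurring within `span` tokens.

    Assumption: MI = log2(observed / expected), where
    expected = f(x) * f(y) * 2 * span / N (window on both sides of the node).
    """
    kw = set(keywords)
    n = len(tokens)
    freq = Counter(t for t in tokens if t in kw)
    # Keep only keyword positions so the pair scan skips non-keywords.
    positions = [(i, t) for i, t in enumerate(tokens) if t in kw]
    cooc = Counter()
    for a, (i, x) in enumerate(positions):
        for j, y in positions[a + 1:]:
            if j - i > span:
                break  # positions are ordered, so later ones are farther away
            if x != y:
                cooc[tuple(sorted((x, y)))] += 1
    scores = {}
    for (x, y), observed in cooc.items():
        expected = freq[x] * freq[y] * 2 * span / n
        scores[(x, y)] = math.log2(observed / expected)
    return scores


def cluster_keywords(keywords, scores, threshold=0.0):
    """Group keywords linked by any pair with MI above `threshold`.

    Assumption: connected components (union-find) stand in for the
    clustering method, which the abstract does not specify.
    """
    parent = {k: k for k in keywords}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for (x, y), mi in scores.items():
        if mi > threshold:
            parent[find(x)] = find(y)
    groups = {}
    for k in keywords:
        groups.setdefault(find(k), []).append(k)
    return sorted(sorted(g) for g in groups.values())
```

With a real corpus, `tokens` would be the tokenized target corpus and `keywords` the top 30 items from the keyword list. A hierarchical clustering method applied to the full score matrix would be a natural alternative to the threshold-based grouping used here.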

Article Details

How to Cite
Pojanapunya, P. (2018). Clustering Keywords to Identify Concepts in Texts: An Analysis of Research Articles in Applied Linguistics. REFLections, 22, 55–70. https://doi.org/10.61508/refl.v22i0.112328
Section
Research articles