Data Mining Using K-Means Clustering Algorithm for Grouping Countries of Origin of Foreign Tourist


Herliyani Hasanah
Nugroho Arif Sudibyo
Rhezka Mahendra Galih


Indonesia has enormous potential to develop its tourism sector, whose role in the country's economic development is increasingly important. The sector contributes through foreign exchange earnings, regional income, regional development, investment, and job creation, as well as business development across various areas of Indonesia. One of the government's targets for the sector is to increase foreign tourist visits. Grouping, or clustering, the tourists' countries of origin can help the government determine its strategies. This study uses the K-means clustering algorithm to group data on tourists' countries of origin and evaluates the clusters with the silhouette score to determine the appropriate number of clusters. The silhouette score is 0.8 at K = 2, which makes two clusters the best choice for grouping the data. Based on the clustering results, cluster 1 was identified as the low-visitor category with 206 members and cluster 2 as the high-visitor category with 6 members, namely Malaysia, Singapore, China, Other Asia, Timor Leste, and Australia. The clustering results are expected to serve as input for further work, namely mapping the right marketing strategy for the countries whose residents visit Indonesia, so as to increase foreign tourist visits.
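
The procedure described above, running K-means for several values of K and keeping the K with the highest silhouette score, can be illustrated with a minimal sketch. It is not the authors' implementation: the visit counts are made-up illustrative numbers, the data are reduced to a single feature, and the function names `kmeans_1d` and `silhouette_score` as well as the deterministic centroid initialization are assumptions chosen to keep the example reproducible.

```python
from statistics import mean


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's K-means on 1-D data.

    Assumption: initial centroids are taken at evenly spaced positions of the
    sorted data, so the result is deterministic (no random restarts).
    """
    ordered = sorted(values)
    n = len(ordered)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def silhouette_score(values, labels):
    """Mean silhouette coefficient over all points (absolute difference as distance)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        own = [abs(v - w) for j, w in enumerate(values) if labels[j] == lab and j != i]
        if not own:
            scores.append(0.0)  # usual convention for a single-member cluster
            continue
        a = mean(own)  # mean distance to the point's own cluster
        b = min(       # mean distance to the nearest other cluster
            mean(abs(v - w) for w, l in zip(values, labels) if l == other)
            for other in clusters if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


# Hypothetical annual visits (in thousands) for 12 unnamed countries of origin.
visits = [5, 8, 12, 15, 20, 25, 30, 40, 900, 1100, 1300, 1500]

scores = {}
for k in range(2, 5):
    labels, _ = kmeans_1d(visits, k)
    scores[k] = silhouette_score(visits, labels)

best_k = max(scores, key=scores.get)
best_labels, best_centroids = kmeans_1d(visits, best_k)
print("silhouette by K:", {k: round(s, 3) for k, s in scores.items()})
print("best K:", best_k, "labels:", best_labels)
```

On real data with several features per country, a library such as scikit-learn (`KMeans` together with `sklearn.metrics.silhouette_score`) would be the more usual choice; the hand-written version here only serves to make the two steps explicit.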
