Harnessing Machine Learning and Ensemble Models for Tourism Potential Zone Prediction for the Assam State of India

Authors

  • Shrinwantu Raha, Department of Geography, Cooch Behar Panchanan Barma University, Cooch Behar, West Bengal, Pin: 736101
  • Shasanka Kumar Gayen, Department of Geography, Cooch Behar Panchanan Barma University, Cooch Behar, West Bengal, Pin: 736101
  • Sayan Deb, Department of Geography, Bhairab Ganguly College, Belgharia, Kolkata 700056

DOI:

https://doi.org/10.11113/jagst.v4n2.92

Keywords:

Tourism potentiality, ROC-AUC, Conditional Inference Tree, Bagged CART, Random Forest, TPZ Ensemble model, Analytic Hierarchy Process

Abstract

Assam, India, has several popular tourist destinations, yet much of its charm remains enigmatic. This research aimed to predict the tourism potential zone (TPZ) for the state of Assam using five machine learning models (Conditional Inference Tree, Bagged CART, Random Forest, Random Forest with Conditional Inference Tree, and Gradient Boosting) and one ensemble model. A five-step methodology was followed. First, a tourism inventory database was prepared from Google Earth imagery, supported by a rapid field investigation using the global positioning system and the nonparticipant observation technique. The inventory contained a total of 365 tourism points, 70% (224 points) of which were used for the training set and 30% (124 points) for the validation set. Tourism conditioning factors (relief, aspect, viewshed, forest area, wetland, coefficient of variation of rainfall, reserve forest, population density, population growth rate, literacy rate, and road–railway density) were used as independent variables in the modeling process. The TPZ was predicted with each of the above machine learning models, and finally a new TPZ ensemble model was proposed by combining all of them. The results showed that all machine learning models performed well in terms of prediction accuracy, and that the ensemble model outperformed the other models, achieving the highest area under the curve (97.6%), Kappa (0.82), and accuracy (0.93) values. These findings can give decision-makers accurate and useful information for developing tourism in the region.
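
The abstract reports area under the curve, Kappa, and accuracy for an ensemble but does not state how the individual models were combined. As a minimal sketch, assuming the ensemble simply averages each model's predicted probability and classifies with a 0.5 threshold (both are assumptions, not the authors' stated method), the three metrics could be computed as below. The labels and scores are made-up toy values, not the study's data, and every function name is illustrative.

```python
# Toy illustration only: the combination rule (mean of probabilities) and the
# 0.5 threshold are assumptions; the numbers are not taken from the study.

def ensemble_mean(prob_lists):
    """Average the per-point probabilities predicted by several models."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]

def accuracy(y_true, y_pred):
    """Share of points whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    p_chance = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (p_observed - p_chance) / (1 - p_chance)

def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# 1 = tourism point, 0 = non-tourism point (hypothetical validation labels).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted probabilities from two of the base models.
model_a = [0.9, 0.8, 0.4, 0.2, 0.2, 0.7, 0.1, 0.3]
model_b = [0.7, 0.6, 0.8, 0.3, 0.4, 0.6, 0.1, 0.1]

ensemble = ensemble_mean([model_a, model_b])
y_pred = [1 if s >= 0.5 else 0 for s in ensemble]

print("accuracy:", accuracy(y_true, y_pred))
print("kappa:", cohen_kappa(y_true, y_pred))
print("auc:", roc_auc(y_true, ensemble))
```

On this toy input the ensemble misclassifies two of the eight points, giving an accuracy of 0.75, a Kappa of 0.5, and an AUC of 0.8125. Accuracy and Kappa depend on the chosen threshold, while AUC depends only on how the scores rank the points.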

Published

2024-08-31

How to Cite

Raha, S., Gayen, S. K., & Deb, S. (2024). Harnessing Machine Learning and Ensemble Models for Tourism Potential Zone Prediction for the Assam State of India. Journal of Advanced Geospatial Science & Technology, 4(2), 29–78. https://doi.org/10.11113/jagst.v4n2.92