Web data mining for monitoring business export orientation
Abstract
The World Wide Web (WWW) has become the largest repository of information in the world, providing a data stream that grows along with the reach of the Internet in society. As with most Information and Communication Technologies (ICTs), its digital nature makes it easy for computer programs to analyze its content and discover information. For this reason, it is increasingly being explored as a source of new indicators of technology, economics and development. Unlike official data, which are released with a delay, web-based indicators can be made available in real time. In this paper, we examine the viability of monitoring firm export orientation from automatically retrieved web variables. Our focus on exports is consistent with the role of internationalization in economic development. To evaluate our approach, we first check to what extent web variables are capable of predicting firm export orientation. Once these new variables are validated, their automated retrieval is assessed by comparing the predictive performance of two nowcast models: one using the manually retrieved web variables, the other using the automatically retrieved ones. Our results show that i) web-based variables are good predictors of firm export orientation, and ii) the process of extracting and analyzing such variables can be entirely automated with no significant loss of performance. In this way, it is possible to nowcast the export orientation not only of a firm, but also of an economic sector or a region.
First published online: 16 Mar 2017
Keywords: automatic indicators, Big Data, corporate websites, export, monitoring, nowcasting, web data mining
This work is licensed under a Creative Commons Attribution 4.0 International License.