The use of the Python language in web scraping

Silvia Komara, Michal Páleš, Faculty of Economic Informatics, University of Economics in Bratislava, Slovak Republic

Type of article: informative article

Pages: 55 – 68


The paper presents the basic attributes of web scraping in the context of currently used terms such as new sources of statistics, big data, machine learning, artificial intelligence, and Business Intelligence. It describes the options Python offers for downloading data from the Internet and the modules with which this process can be carried out. It also addresses the connection between machine learning and web scraping. In a practical demonstration, we show how Python can be used to scrape data from PDF documents.

