The use of the Python language in web scraping

15th July 2023

3/2023 informative articles, opinions, reviews, interviews, information

Silvia Komara, Michal Páleš, Faculty of Economic Informatics, University of Economics in Bratislava, Slovak Republic

Type of article: informative article

Pages: 55 – 68

Abstract

The paper presents the basic attributes of web scraping in the context of currently used terms such as new sources of statistics, big data, machine learning, artificial intelligence, and Business Intelligence. It describes the options the Python language offers for downloading data from the Internet and the modules with which this process can be carried out. It also devotes specific attention to connecting the field of machine learning with web scraping. In a practical demonstration, we present the functionality of the Python language for scraping data from PDF documents.
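
As a rough illustration of the kind of task the abstract describes, the following minimal sketch locates links to PDF documents in a downloaded web page. It is not taken from the paper: the abstract does not name the modules the authors use, so this sketch assumes only the Python standard library (`html.parser` and `urllib.parse`). The names `PdfLinkParser`, `find_pdf_links`, and `SAMPLE_HTML`, as well as the `example.org` addresses, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> links that point to PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            # Resolve relative links against the page address.
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return the PDF links found in an HTML string, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


# Hypothetical page content standing in for a downloaded document.
SAMPLE_HTML = """
<html><body>
  <a href="/files/report-2023.pdf">Report</a>
  <a href="about.html">About</a>
  <a href="https://example.org/data/Tables.PDF?download=1">Tables</a>
</body></html>
"""

links = find_pdf_links(SAMPLE_HTML, "https://example.org/issue/")
print(links)
```

In practice the HTML string would first be downloaded, for example with `urllib.request.urlopen`, and the text of each PDF would then be extracted with a dedicated PDF library; both steps are left out here to keep the sketch self-contained.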

Keywords: machine learning, PDF document, Python language, web scraping
