{"id":3653,"date":"2023-07-15T09:13:37","date_gmt":"2023-07-15T07:13:37","guid":{"rendered":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/the-use-of-the-python-language-in-web-scraping\/"},"modified":"2023-07-17T14:09:21","modified_gmt":"2023-07-17T12:09:21","slug":"the-use-of-the-python-language-in-web-scraping","status":"publish","type":"post","link":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/the-use-of-the-python-language-in-web-scraping\/?lang=en","title":{"rendered":"The use of the Python language in web scraping"},"content":{"rendered":"<p>Silvia Komara, Michal P\u00e1le\u0161, Faculty of Economic Informatics, University of Economics in Bratislava, Slovak Republic<\/p>\n<p><!--more--><br \/>\nType of article: informative article<\/p>\n<p>Pages: 55 &#8211; 68<\/p>\n<h5>Abstract<\/h5>\n<p>The paper focuses on presenting the basic attributes of web scraping in the context of currently used terms such as new sources of statistics, big data, machine learning, artificial intelligence, Business Intelligence, etc. It describes the Python language&#8217;s options for downloading data from the Internet and modules in which this process can be executed. It is also specifically dedicated to connecting the field of machine learning with web scraping. 
In a practical demonstration, we present the functionality of the Python language for scraping data from PDF documents.<\/p>\n<div>\n<span style=\"line-height: 2.1em; padding-left: 0px; color: #1e4e9d;\">Issue for download<\/span><br \/>\n<a class=\"download-pdf\" onclick=\"window.open('https:\/\/ssad.statistics.sk\/SSaD\/?dl_id=357','new','');return false\"  href=\"https:\/\/ssad.statistics.sk\/SSaD\/?dl_id=357\"><i class=\"far fa-file-pdf\"><\/i> PDF<\/a><span class='download-text'> (3,1 MB, 381 downloads)<\/span>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Silvia Komara, Michal P\u00e1le\u0161, Faculty of Economic Informatics, University of Economics in Bratislava, Slovak Republic<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[1644],"tags":[83,1553,1663,1664,1661,1662],"class_list":["post-3653","post","type-post","status-publish","format-standard","hentry","category-3-2023-informative-articles-opinions-reviews-interviews-information","tag-author-michal-pales","tag-author-silvia-komara","tag-machine-learning","tag-pdf-document","tag-python-language","tag-web-scraping-en"],"_links":{"self":[{"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/posts\/3653","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/comments?post=3653"}],"version-history":[{"count":1,"href":"https
:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/posts\/3653\/revisions"}],"predecessor-version":[{"id":3675,"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/posts\/3653\/revisions\/3675"}],"wp:attachment":[{"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/media?parent=3653"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/categories?post=3653"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ssad.statistics.sk\/SSaD\/index.php\/wp-json\/wp\/v2\/tags?post=3653"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}