Matches in SemOpenAlex for { <https://semopenalex.org/work/W2964752624> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W2964752624 abstract "This paper addresses two problems: the semi-structured information on video website pages is complex and underused, and single-machine crawlers collect data inefficiently. It proposes a Scrapy-based distributed crawler system for crawling semi-structured information at high speed. The paper extends a traditional single-machine crawler into a distributed one: the Scrapy-Redis distributed component and a Redis database are introduced into the Scrapy framework, and a strategy for crawling semi-structured information and storing it in a standardized form is defined. The system was verified by crawling TV drama information from the video sites Youku, SOHU, Tencent, and iQIYI. The experimental results showed that the crawling speed of the distributed crawler increased by 84.53%, 88.95%, 93.05% and 100%, respectively, compared with that of the single-machine crawler." @default.
- W2964752624 created "2019-08-13" @default.
- W2964752624 creator A5006468993 @default.
- W2964752624 creator A5025315990 @default.
- W2964752624 creator A5049231118 @default.
- W2964752624 date "2018-12-01" @default.
- W2964752624 modified "2023-10-16" @default.
- W2964752624 title "Research on Scrapy-Based Distributed Crawler System for Crawling Semi-structure Information at High Speed" @default.
- W2964752624 cites W2073406688 @default.
- W2964752624 cites W2142211218 @default.
- W2964752624 cites W2491319049 @default.
- W2964752624 cites W2793295129 @default.
- W2964752624 doi "https://doi.org/10.1109/compcomm.2018.8781062" @default.
- W2964752624 hasPublicationYear "2018" @default.
- W2964752624 type Work @default.
- W2964752624 sameAs 2964752624 @default.
- W2964752624 citedByCount "6" @default.
- W2964752624 countsByYear W29647526242020 @default.
- W2964752624 countsByYear W29647526242021 @default.
- W2964752624 crossrefType "proceedings-article" @default.
- W2964752624 hasAuthorship W2964752624A5006468993 @default.
- W2964752624 hasAuthorship W2964752624A5025315990 @default.
- W2964752624 hasAuthorship W2964752624A5049231118 @default.
- W2964752624 hasConcept C100368936 @default.
- W2964752624 hasConcept C105702510 @default.
- W2964752624 hasConcept C134306372 @default.
- W2964752624 hasConcept C136764020 @default.
- W2964752624 hasConcept C13743948 @default.
- W2964752624 hasConcept C173576120 @default.
- W2964752624 hasConcept C21959979 @default.
- W2964752624 hasConcept C23123220 @default.
- W2964752624 hasConcept C33923547 @default.
- W2964752624 hasConcept C41008148 @default.
- W2964752624 hasConcept C61096286 @default.
- W2964752624 hasConcept C71924100 @default.
- W2964752624 hasConcept C73340581 @default.
- W2964752624 hasConcept C77088390 @default.
- W2964752624 hasConcept C77618280 @default.
- W2964752624 hasConceptScore W2964752624C100368936 @default.
- W2964752624 hasConceptScore W2964752624C105702510 @default.
- W2964752624 hasConceptScore W2964752624C134306372 @default.
- W2964752624 hasConceptScore W2964752624C136764020 @default.
- W2964752624 hasConceptScore W2964752624C13743948 @default.
- W2964752624 hasConceptScore W2964752624C173576120 @default.
- W2964752624 hasConceptScore W2964752624C21959979 @default.
- W2964752624 hasConceptScore W2964752624C23123220 @default.
- W2964752624 hasConceptScore W2964752624C33923547 @default.
- W2964752624 hasConceptScore W2964752624C41008148 @default.
- W2964752624 hasConceptScore W2964752624C61096286 @default.
- W2964752624 hasConceptScore W2964752624C71924100 @default.
- W2964752624 hasConceptScore W2964752624C73340581 @default.
- W2964752624 hasConceptScore W2964752624C77088390 @default.
- W2964752624 hasConceptScore W2964752624C77618280 @default.
- W2964752624 hasLocation W29647526241 @default.
- W2964752624 hasOpenAccess W2964752624 @default.
- W2964752624 hasPrimaryLocation W29647526241 @default.
- W2964752624 hasRelatedWork W1506122440 @default.
- W2964752624 hasRelatedWork W188882231 @default.
- W2964752624 hasRelatedWork W2019080882 @default.
- W2964752624 hasRelatedWork W2025869112 @default.
- W2964752624 hasRelatedWork W2093310712 @default.
- W2964752624 hasRelatedWork W2114053503 @default.
- W2964752624 hasRelatedWork W2154110426 @default.
- W2964752624 hasRelatedWork W2274831913 @default.
- W2964752624 hasRelatedWork W2783570127 @default.
- W2964752624 hasRelatedWork W3092782196 @default.
- W2964752624 isParatext "false" @default.
- W2964752624 isRetracted "false" @default.
- W2964752624 magId "2964752624" @default.
- W2964752624 workType "article" @default.
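
The architecture described in the abstract (several crawler processes sharing one request queue and one duplicate filter, which Scrapy-Redis keeps in Redis) can be sketched in plain Python. This is only an illustrative sketch, not the paper's code: `SharedFrontier`, `worker`, `crawl`, `fake_fetch`, and the `FAKE_SITE` link graph are hypothetical names, threads stand in for separate crawler machines, and an in-memory queue and set stand in for Redis.

```python
import queue
import threading


class SharedFrontier:
    """In-memory stand-in (assumption) for a Redis-backed queue and duplicate filter."""

    def __init__(self):
        self._queue = queue.Queue()
        self._seen = set()
        self._lock = threading.Lock()

    def push(self, url):
        """Enqueue url unless some worker has already scheduled it."""
        with self._lock:  # check-and-add must be atomic across workers
            if url in self._seen:
                return False
            self._seen.add(url)
        self._queue.put(url)
        return True

    def pop(self, timeout=0.1):
        """Return the next URL, or None if the queue stays empty for `timeout` seconds."""
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return None


# Hypothetical link graph standing in for a video site's pages.
FAKE_SITE = {
    "/list": ["/drama/1", "/drama/2", "/drama/3"],
    "/drama/1": ["/drama/2"],
    "/drama/2": ["/list"],
    "/drama/3": [],
}


def fake_fetch(url):
    """Pretend to download a page: return (extracted item, outgoing links)."""
    return {"url": url}, FAKE_SITE.get(url, [])


def worker(name, frontier, fetch, results):
    """Pull URLs from the shared frontier until it runs dry."""
    while True:
        url = frontier.pop()
        if url is None:
            return  # idle worker exits; a worker that pushes links keeps looping
        item, links = fetch(url)
        results.append((name, item))  # list.append is thread-safe in CPython
        for link in links:
            frontier.push(link)


def crawl(seed_urls, fetch, n_workers=3):
    """Run n_workers threads against one shared frontier and collect their items."""
    frontier = SharedFrontier()
    for url in seed_urls:
        frontier.push(url)
    results = []
    threads = [
        threading.Thread(target=worker, args=(f"w{i}", frontier, fetch, results))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for worker_name, item in crawl(["/list"], fake_fetch):
        print(worker_name, item["url"])
```

Because the duplicate check and the queue are shared, each page is fetched once no matter which worker discovers it. In a real deployment that shared state would live in Redis so that crawlers on different machines can cooperate.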