Matches in SemOpenAlex for { <https://semopenalex.org/work/W2032222016> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
- W2032222016 abstract "It has long been anecdotally known that web archives and search engines favor Western and English-language sites. In this paper we quantitatively explore how well indexed and archived are Arabic language web sites. We began by sampling 15,092 unique URIs from three different website directories: DMOZ (multi-lingual), Raddadi and Star28 (both primarily Arabic language). Using language identification tools we eliminated pages not in the Arabic language (e.g., English language versions of Al-Jazeera sites) and culled the collection to 7,976 definitely Arabic language web pages. We then used these 7,976 pages and crawled the live web and web archives to produce a collection of 300,646 Arabic language pages. We discovered: 1) 46% are not archived and 31% are not indexed by Google (www.google.com), 2) only 14.84% of the URIs had an Arabic country code top-level domain (e.g., .sa) and only 10.53% had a GeoIP in an Arabic country, 3) having either only an Arabic GeoIP or only an Arabic top-level domain appears to negatively impact archiving, 4) most of the archived pages are near the top level of the site and deeper links into the site are not well-archived, 5) the presence in a directory positively impacts indexing and presence in the DMOZ directory, specifically, positively impacts archiving." @default.
- W2032222016 created "2016-06-24" @default.
- W2032222016 creator A5009372635 @default.
- W2032222016 creator A5039613772 @default.
- W2032222016 creator A5085719625 @default.
- W2032222016 date "2015-06-21" @default.
- W2032222016 modified "2023-09-25" @default.
- W2032222016 title "How Well Are Arabic Websites Archived?" @default.
- W2032222016 cites W1999478104 @default.
- W2032222016 cites W2019194162 @default.
- W2032222016 cites W2049961212 @default.
- W2032222016 cites W2057767944 @default.
- W2032222016 cites W2099126271 @default.
- W2032222016 cites W2136963875 @default.
- W2032222016 cites W2137187346 @default.
- W2032222016 cites W2137443370 @default.
- W2032222016 cites W2156893245 @default.
- W2032222016 cites W2157004265 @default.
- W2032222016 cites W2293827470 @default.
- W2032222016 cites W2295141584 @default.
- W2032222016 cites W68332980 @default.
- W2032222016 doi "https://doi.org/10.1145/2756406.2756912" @default.
- W2032222016 hasPublicationYear "2015" @default.
- W2032222016 type Work @default.
- W2032222016 sameAs 2032222016 @default.
- W2032222016 citedByCount "12" @default.
- W2032222016 countsByYear W20322220162016 @default.
- W2032222016 countsByYear W20322220162017 @default.
- W2032222016 countsByYear W20322220162021 @default.
- W2032222016 countsByYear W20322220162022 @default.
- W2032222016 crossrefType "proceedings-article" @default.
- W2032222016 hasAuthorship W2032222016A5009372635 @default.
- W2032222016 hasAuthorship W2032222016A5039613772 @default.
- W2032222016 hasAuthorship W2032222016A5085719625 @default.
- W2032222016 hasBestOaLocation W20322220162 @default.
- W2032222016 hasConcept C136764020 @default.
- W2032222016 hasConcept C138885662 @default.
- W2032222016 hasConcept C204321447 @default.
- W2032222016 hasConcept C41008148 @default.
- W2032222016 hasConcept C41895202 @default.
- W2032222016 hasConcept C96455323 @default.
- W2032222016 hasConceptScore W2032222016C136764020 @default.
- W2032222016 hasConceptScore W2032222016C138885662 @default.
- W2032222016 hasConceptScore W2032222016C204321447 @default.
- W2032222016 hasConceptScore W2032222016C41008148 @default.
- W2032222016 hasConceptScore W2032222016C41895202 @default.
- W2032222016 hasConceptScore W2032222016C96455323 @default.
- W2032222016 hasLocation W20322220161 @default.
- W2032222016 hasLocation W20322220162 @default.
- W2032222016 hasOpenAccess W2032222016 @default.
- W2032222016 hasPrimaryLocation W20322220161 @default.
- W2032222016 hasRelatedWork W1929967858 @default.
- W2032222016 hasRelatedWork W2368651715 @default.
- W2032222016 hasRelatedWork W2611614995 @default.
- W2032222016 hasRelatedWork W2748952813 @default.
- W2032222016 hasRelatedWork W2768035686 @default.
- W2032222016 hasRelatedWork W2789919619 @default.
- W2032222016 hasRelatedWork W2900897701 @default.
- W2032222016 hasRelatedWork W2901613113 @default.
- W2032222016 hasRelatedWork W2915524904 @default.
- W2032222016 hasRelatedWork W3107474891 @default.
- W2032222016 isParatext "false" @default.
- W2032222016 isRetracted "false" @default.
- W2032222016 magId "2032222016" @default.
- W2032222016 workType "article" @default.
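
The entries above all follow the same shape: a subject, a predicate, an object, and a graph label. The following is a minimal sketch of how such lines might be grouped by predicate. It assumes the line shape "- subject predicate object @graph." as it appears in this listing, which is not a documented SemOpenAlex format. The names `LINE_RE` and `parse_listing` are hypothetical and introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from this listing only:
#   "- <subject> <predicate> <object> @<graph>."
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.?$', re.DOTALL)


def parse_listing(lines):
    """Group objects by predicate for every line that matches LINE_RE."""
    by_predicate = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip header lines such as "Showing items ..."
        _subject, predicate, obj, _graph = match.groups()
        # Literals appear in double quotes; strip them, keep identifiers as-is.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        by_predicate[predicate].append(obj)
    return dict(by_predicate)


sample = [
    'Showing items 1 to 65 of 65 with 100 items per page.',
    '- W2032222016 title "How Well Are Arabic Websites Archived?" @default.',
    '- W2032222016 cites W1999478104 @default.',
    '- W2032222016 cites W2019194162 @default.',
    '- W2032222016 citedByCount "12" @default.',
]
result = parse_listing(sample)
print(result["title"])
print(len(result["cites"]))
```

Grouping by predicate keeps repeated properties such as `cites` or `hasRelatedWork` together as lists, while single-valued properties such as `title` end up as one-element lists.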