Matches in SemOpenAlex for { <https://semopenalex.org/work/W2294913508> ?p ?o ?g. }
Showing items 1 to 53 of 53 with 100 items per page.
- W2294913508 abstract "Tracking the evolution of websites has become fundamental to the understanding of today's Internet. Automated reasoning about how and why websites change has become essential to developers and businesses alike, in particular because manual reasoning has become impractical due to the sheer number of modifications that websites undergo during their operational lifetime, including but not limited to rotating advertisements, personalized content, insertion of new content, or removal of old content. Prior work in the area of change detection, such as XyDiff, X-Diff or AT&T's internet difference engine, focused mainly on 'diffing' XML-encoded literary documents or XML-encoded databases. Only some previous work investigated the differences that must be taken into account to accurately extract the difference between HTML documents, for which the markup language does not necessarily describe the content but is used to describe how the content is displayed instead. Additionally, prior work identifies all changes to a website, even those that might not be relevant to the overall analysis goal, which in turn unnecessarily burdens the analysis engine with additional workload. In this paper, we introduce a novel analysis framework, the Delta framework, that works by (i) extracting the modifications between two versions of the same website using a fuzzy tree difference algorithm, and (ii) using a machine-learning algorithm to derive a model of relevant website changes that can be used to cluster similar modifications to reduce the overall workload imposed on an analysis engine. Based on this model, for example, the tracked content changes can be used to identify ongoing or even inactive web-based malware campaigns, or to automatically learn semantic translations of sentences or paragraphs by analyzing websites that are available in multiple languages.
In prior work, we showed the effectiveness of the Delta framework by applying it to the detection and automatic identification of web-based malware campaigns on a data set of over 26 million pairs of websites that were crawled over a time span of four months. During this time, the system based on our framework successfully identified previously unknown web-based malware campaigns, such as a targeted campaign infecting installations of the Discuz!X Internet forum software." @default.
- W2294913508 created "2016-06-24" @default.
- W2294913508 creator A5022177364 @default.
- W2294913508 creator A5075685499 @default.
- W2294913508 creator A5084107603 @default.
- W2294913508 date "2014-04-07" @default.
- W2294913508 modified "2023-09-23" @default.
- W2294913508 title "Relevant change detection" @default.
- W2294913508 cites W1502492243 @default.
- W2294913508 cites W1549289012 @default.
- W2294913508 cites W2048616039 @default.
- W2294913508 cites W2152593687 @default.
- W2294913508 cites W2169557227 @default.
- W2294913508 doi "https://doi.org/10.1145/2567948.2578039" @default.
- W2294913508 hasPublicationYear "2014" @default.
- W2294913508 type Work @default.
- W2294913508 sameAs 2294913508 @default.
- W2294913508 citedByCount "7" @default.
- W2294913508 countsByYear W22949135082015 @default.
- W2294913508 countsByYear W22949135082016 @default.
- W2294913508 countsByYear W22949135082017 @default.
- W2294913508 countsByYear W22949135082021 @default.
- W2294913508 countsByYear W22949135082022 @default.
- W2294913508 crossrefType "proceedings-article" @default.
- W2294913508 hasAuthorship W2294913508A5022177364 @default.
- W2294913508 hasAuthorship W2294913508A5075685499 @default.
- W2294913508 hasAuthorship W2294913508A5084107603 @default.
- W2294913508 hasConcept C154945302 @default.
- W2294913508 hasConcept C203595873 @default.
- W2294913508 hasConcept C41008148 @default.
- W2294913508 hasConceptScore W2294913508C154945302 @default.
- W2294913508 hasConceptScore W2294913508C203595873 @default.
- W2294913508 hasConceptScore W2294913508C41008148 @default.
- W2294913508 hasFunder F4320337345 @default.
- W2294913508 hasFunder F4320337388 @default.
- W2294913508 hasFunder F4320338281 @default.
- W2294913508 hasLocation W22949135081 @default.
- W2294913508 hasOpenAccess W2294913508 @default.
- W2294913508 hasPrimaryLocation W22949135081 @default.
- W2294913508 hasRelatedWork W1515964938 @default.
- W2294913508 hasRelatedWork W1516342924 @default.
- W2294913508 hasRelatedWork W1979721949 @default.
- W2294913508 hasRelatedWork W2323335281 @default.
- W2294913508 hasRelatedWork W2400404242 @default.
- W2294913508 hasRelatedWork W2748952813 @default.
- W2294913508 hasRelatedWork W2808681611 @default.
- W2294913508 hasRelatedWork W2899084033 @default.
- W2294913508 hasRelatedWork W2946378017 @default.
- W2294913508 hasRelatedWork W4290613755 @default.
- W2294913508 isParatext "false" @default.
- W2294913508 isRetracted "false" @default.
- W2294913508 magId "2294913508" @default.
- W2294913508 workType "article" @default.
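
The listing above follows a regular line shape (`- subject predicate object @graph.`), so it can be grouped by predicate with a few lines of Python. The following is only a minimal sketch under that assumption: the function name `parse_listing` and the regular expression are illustrative and not part of SemOpenAlex, and the sample lines are copied from the listing rather than fetched from any endpoint.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing above (not an official format):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$', re.DOTALL)


def parse_listing(lines):
    """Group object values by predicate for every line shaped like a triple."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip page headers and anything else that does not match
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are wrapped in double quotes in the listing; drop them.
        grouped[predicate].append(obj.strip('"'))
    return dict(grouped)


# Sample lines copied from the listing above.
sample = [
    '- W2294913508 title "Relevant change detection" @default.',
    '- W2294913508 cites W1502492243 @default.',
    '- W2294913508 cites W1549289012 @default.',
    '- W2294913508 citedByCount "7" @default.',
]

record = parse_listing(sample)
print(record["title"])
print(record["cites"])
```

Values are kept as lists because several predicates (`cites`, `creator`, `hasRelatedWork`) repeat for the same work; lines that do not match the assumed shape, such as the pagination header, are ignored.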