Matches in SemOpenAlex for { <https://semopenalex.org/work/W2922433192> ?p ?o ?g. }
- W2922433192 abstract "Database users are easily overwhelmed by the sheer size of data found in large-scale scientific and financial databases. Exploring these databases to make sense of the explored data and to discover interesting insights (i.e., data exploration) has been, and still is, a hideous and labour-intensive task, especially for non-expert users with no solid background of the underlying data. Some three decades ago, the database research community noticed the limitation of traditional DBMS in supporting users for data exploration tasks. Since then, the research community has proposed and designed various effective and efficient data exploration techniques to assist users in extracting interesting insights from their data. An instance of these techniques is the Query Refinement technique. In query refinement techniques, users’ queries are assumed to be imprecise, i.e., the returned result does not meet some user-defined constraints. Accordingly, the goal of query refinement techniques is to automatically refine these imprecise queries to maximize users’ satisfaction with the results. In particular, the predicates of the queries are carefully modified so that the returned results satisfy the user-defined constraints. Since users’ constraints on query results are diverse and miscellaneous, this thesis focuses on two specific forms of constraints in exploring relational and sequential data, namely, 1) user-defined aggregate constraints on the result, and 2) user-defined correlation constraints of time series data. These constraints are common in real world applications because they represent an upper level view of the result that is easier to understand and digest than the raw result itself. This thesis addresses the limitations of current query refinement techniques that are oblivious to the similarity of the refined queries to the users’ initial queries. 
Specifically, users’ initial (and imprecise) queries are defined as anchor points against which the similarity of the corresponding refined queries is computed over the whole refinement space. Consequently, the similarity-aware query refinement problem is formulated as a search problem, which aims to balance the trade-off between minimizing the deviation from satisfying a constraint on the query result, and maximizing the similarity of the refined query to the initial one. Searching for a trade-off between satisfying a constraint on the result of a query and maximizing the similarity introduces various challenges. A common challenge shared by many query refinement problems is that finding an optimal trade-off involves inspecting and examining a huge search space of candidate refined queries, possibly exponential. Further, evaluating candidate queries in these possibly exponential spaces to decide whether they are optimal or not incurs expensive computational and I/O costs. Hence, simply applying exhaustive solutions is not adequate since they hinder users’ exploration tasks and worsen the response time. In this thesis, we discuss in detail our three key contributions, which address the challenges above in the context of query refinement for aggregate and correlation constraints. Firstly, we formally define the Similarity-aware, Aggregate-based Query Refinement problem, in which users specify aggregate constraints on the result and prefer refined queries that are similar to their initial ones. Then, we consider the special case of aggregate constraints, in which users specify cardinality constraints on their query results. For that special case, we propose innovative Similarity-aware Query Refinement schemes (SAQR) which employ pruning techniques to avoid unnecessary evaluations of candidate refined queries that are considered unpromising. 
We also show the applicability of SAQR in a web-based application (ORange) which utilizes SAQR schemes for refining selected areas based on cardinality constraints. Secondly, we address the general case of aggregate constraints, in which multiple constraints can be defined using SQL standard aggregate operators sum, avg, min, max. We present EAGER schemes for this general case and propose efficient approximation and optimization techniques to alleviate the shortcomings of the loose aggregate bounds that are used in pruning unpromising candidate queries. Moreover, by comparison with related algorithms using real world datasets, we show the efficiency gains of our schemes under different experimental parameters. Thirdly, we formulate the Similarity-aware, Correlation-based Query Refinement problem, in which users’ queries are refined to satisfy their pairwise correlation constraints of time series data. We show the computational hardness of this problem, and propose the RELATE scheme to address the associated challenges by utilizing the incremental property of correlation. Further, we propose two-level pruning techniques for the RELATE scheme to minimize the associated computational and I/O costs. These two techniques enable RELATE to avoid exhaustively traversing the search space by pruning unqualified candidate queries, and avoid computing pairwise correlation of every time series pair wherever possible. We demonstrate by experiments the performance gains of RELATE against state-of-the-art algorithms with real and synthetic datasets." @default.
- W2922433192 created "2019-03-22" @default.
- W2922433192 creator A5069995995 @default.
- W2922433192 date "2018-06-28" @default.
- W2922433192 modified "2023-09-23" @default.
- W2922433192 title "Similarity-aware query refinement for data exploration" @default.
- W2922433192 doi "https://doi.org/10.14264/uql.2018.416" @default.
- W2922433192 hasPublicationYear "2018" @default.
- W2922433192 type Work @default.
- W2922433192 sameAs 2922433192 @default.
- W2922433192 citedByCount "0" @default.
- W2922433192 crossrefType "dissertation" @default.
- W2922433192 hasAuthorship W2922433192A5069995995 @default.
- W2922433192 hasBestOaLocation W29224331922 @default.
- W2922433192 hasConcept C124101348 @default.
- W2922433192 hasConcept C148840519 @default.
- W2922433192 hasConcept C159985019 @default.
- W2922433192 hasConcept C162324750 @default.
- W2922433192 hasConcept C187736073 @default.
- W2922433192 hasConcept C192028432 @default.
- W2922433192 hasConcept C192562407 @default.
- W2922433192 hasConcept C23123220 @default.
- W2922433192 hasConcept C2780451532 @default.
- W2922433192 hasConcept C41008148 @default.
- W2922433192 hasConcept C4679612 @default.
- W2922433192 hasConcept C54239708 @default.
- W2922433192 hasConcept C5655090 @default.
- W2922433192 hasConcept C77088390 @default.
- W2922433192 hasConceptScore W2922433192C124101348 @default.
- W2922433192 hasConceptScore W2922433192C148840519 @default.
- W2922433192 hasConceptScore W2922433192C159985019 @default.
- W2922433192 hasConceptScore W2922433192C162324750 @default.
- W2922433192 hasConceptScore W2922433192C187736073 @default.
- W2922433192 hasConceptScore W2922433192C192028432 @default.
- W2922433192 hasConceptScore W2922433192C192562407 @default.
- W2922433192 hasConceptScore W2922433192C23123220 @default.
- W2922433192 hasConceptScore W2922433192C2780451532 @default.
- W2922433192 hasConceptScore W2922433192C41008148 @default.
- W2922433192 hasConceptScore W2922433192C4679612 @default.
- W2922433192 hasConceptScore W2922433192C54239708 @default.
- W2922433192 hasConceptScore W2922433192C5655090 @default.
- W2922433192 hasConceptScore W2922433192C77088390 @default.
- W2922433192 hasLocation W29224331921 @default.
- W2922433192 hasLocation W29224331922 @default.
- W2922433192 hasOpenAccess W2922433192 @default.
- W2922433192 hasPrimaryLocation W29224331921 @default.
- W2922433192 hasRelatedWork W121900809 @default.
- W2922433192 hasRelatedWork W1634892471 @default.
- W2922433192 hasRelatedWork W1984538115 @default.
- W2922433192 hasRelatedWork W2032167079 @default.
- W2922433192 hasRelatedWork W2259181357 @default.
- W2922433192 hasRelatedWork W2266730682 @default.
- W2922433192 hasRelatedWork W2276853748 @default.
- W2922433192 hasRelatedWork W2279567417 @default.
- W2922433192 hasRelatedWork W2398141323 @default.
- W2922433192 hasRelatedWork W2399054738 @default.
- W2922433192 hasRelatedWork W2477134663 @default.
- W2922433192 hasRelatedWork W2522191305 @default.
- W2922433192 hasRelatedWork W2527917511 @default.
- W2922433192 hasRelatedWork W29392426 @default.
- W2922433192 hasRelatedWork W2951711691 @default.
- W2922433192 hasRelatedWork W3118647117 @default.
- W2922433192 hasRelatedWork W3213621322 @default.
- W2922433192 hasRelatedWork W581540525 @default.
- W2922433192 hasRelatedWork W72933480 @default.
- W2922433192 hasRelatedWork W2186818439 @default.
- W2922433192 isParatext "false" @default.
- W2922433192 isRetracted "false" @default.
- W2922433192 magId "2922433192" @default.
- W2922433192 workType "dissertation" @default.