Matches in SemOpenAlex for { <https://semopenalex.org/work/W2794582549> ?p ?o ?g. }
- W2794582549 abstract "Shannon's entropy is a clear lower bound for statistical compression. The situation is not so well understood for dictionary-based compression. A plausible lower bound is $b$, the least number of phrases of a general bidirectional parse of a text, where phrases can be copied from anywhere else in the text. Since computing $b$ is NP-complete, a popular gold standard is $z$, the number of phrases in the Lempel-Ziv parse of the text, which is the optimal one when phrases can be copied only from the left. While $z$ can be computed in linear time with a greedy algorithm, almost nothing has been known for decades about its approximation ratio with respect to $b$. In this paper we prove that $z=O(b\log(n/b))$, where $n$ is the text length. We also show that the bound is tight as a function of $n$, by exhibiting a text family where $z = \Omega(b\log n)$. Our upper bound is obtained by building a run-length context-free grammar based on a locally consistent parsing of the text. Our lower bound is obtained by relating $b$ with $r$, the number of equal-letter runs in the Burrows-Wheeler transform of the text. We proceed by observing that Lempel-Ziv is just one particular case of greedy parses, meaning that the optimal value of $z$ is obtained by scanning the text and maximizing the phrase length at each step, and of ordered parses, meaning that there is an increasing order between phrases and their sources. As a new example of ordered greedy parses, we introduce {\em lexicographical} parses, where phrases can only be copied from lexicographically smaller text locations. We prove that the size $v$ of the optimal lexicographical parse is also obtained greedily in $O(n)$ time, that $v=O(b\log(n/b))$, and that there exists a text family where $v = \Omega(b\log n)$." @default.
- W2794582549 created "2018-04-06" @default.
- W2794582549 creator A5007324430 @default.
- W2794582549 creator A5030296647 @default.
- W2794582549 creator A5080743153 @default.
- W2794582549 date "2018-03-26" @default.
- W2794582549 modified "2023-09-28" @default.
- W2794582549 title "On the Approximation Ratio of Ordered Parsings" @default.
- W2794582549 cites W1485941238 @default.
- W2794582549 cites W1489909987 @default.
- W2794582549 cites W1532033082 @default.
- W2794582549 cites W1573092629 @default.
- W2794582549 cites W165790702 @default.
- W2794582549 cites W1878541814 @default.
- W2794582549 cites W1965853364 @default.
- W2794582549 cites W1973608346 @default.
- W2794582549 cites W1976803949 @default.
- W2794582549 cites W1979656308 @default.
- W2794582549 cites W1995875735 @default.
- W2794582549 cites W1997274137 @default.
- W2794582549 cites W2005097301 @default.
- W2794582549 cites W2007791040 @default.
- W2794582549 cites W2017661493 @default.
- W2794582549 cites W2022126655 @default.
- W2794582549 cites W2029948740 @default.
- W2794582549 cites W2043646600 @default.
- W2794582549 cites W2046446159 @default.
- W2794582549 cites W2062386565 @default.
- W2794582549 cites W2064184672 @default.
- W2794582549 cites W2067974452 @default.
- W2794582549 cites W2077770566 @default.
- W2794582549 cites W2099111195 @default.
- W2794582549 cites W2101881908 @default.
- W2794582549 cites W2111487449 @default.
- W2794582549 cites W2113004376 @default.
- W2794582549 cites W2118703123 @default.
- W2794582549 cites W2119924552 @default.
- W2794582549 cites W2130956967 @default.
- W2794582549 cites W2132809979 @default.
- W2794582549 cites W2141931308 @default.
- W2794582549 cites W2150056094 @default.
- W2794582549 cites W2150359208 @default.
- W2794582549 cites W2158874082 @default.
- W2794582549 cites W2159084616 @default.
- W2794582549 cites W2161488606 @default.
- W2794582549 cites W2164107415 @default.
- W2794582549 cites W2171532727 @default.
- W2794582549 cites W226134553 @default.
- W2794582549 cites W2346750882 @default.
- W2794582549 cites W2517241835 @default.
- W2794582549 cites W2522260339 @default.
- W2794582549 cites W2533248932 @default.
- W2794582549 cites W2551896619 @default.
- W2794582549 cites W2554482899 @default.
- W2794582549 cites W2593896499 @default.
- W2794582549 cites W2770606097 @default.
- W2794582549 cites W2790578235 @default.
- W2794582549 cites W2890245531 @default.
- W2794582549 cites W2952152294 @default.
- W2794582549 cites W2952587812 @default.
- W2794582549 cites W2963756574 @default.
- W2794582549 cites W2963784695 @default.
- W2794582549 cites W2737129053 @default.
- W2794582549 hasPublicationYear "2018" @default.
- W2794582549 type Work @default.
- W2794582549 sameAs 2794582549 @default.
- W2794582549 citedByCount "1" @default.
- W2794582549 countsByYear W27945825492018 @default.
- W2794582549 crossrefType "posted-content" @default.
- W2794582549 hasAuthorship W2794582549A5007324430 @default.
- W2794582549 hasAuthorship W2794582549A5030296647 @default.
- W2794582549 hasAuthorship W2794582549A5080743153 @default.
- W2794582549 hasConcept C114614502 @default.
- W2794582549 hasConcept C118615104 @default.
- W2794582549 hasConcept C134306372 @default.
- W2794582549 hasConcept C14036430 @default.
- W2794582549 hasConcept C151730666 @default.
- W2794582549 hasConcept C154945302 @default.
- W2794582549 hasConcept C186644900 @default.
- W2794582549 hasConcept C2779343474 @default.
- W2794582549 hasConcept C33923547 @default.
- W2794582549 hasConcept C41008148 @default.
- W2794582549 hasConcept C77553402 @default.
- W2794582549 hasConcept C78458016 @default.
- W2794582549 hasConcept C86803240 @default.
- W2794582549 hasConceptScore W2794582549C114614502 @default.
- W2794582549 hasConceptScore W2794582549C118615104 @default.
- W2794582549 hasConceptScore W2794582549C134306372 @default.
- W2794582549 hasConceptScore W2794582549C14036430 @default.
- W2794582549 hasConceptScore W2794582549C151730666 @default.
- W2794582549 hasConceptScore W2794582549C154945302 @default.
- W2794582549 hasConceptScore W2794582549C186644900 @default.
- W2794582549 hasConceptScore W2794582549C2779343474 @default.
- W2794582549 hasConceptScore W2794582549C33923547 @default.
- W2794582549 hasConceptScore W2794582549C41008148 @default.
- W2794582549 hasConceptScore W2794582549C77553402 @default.
- W2794582549 hasConceptScore W2794582549C78458016 @default.
- W2794582549 hasConceptScore W2794582549C86803240 @default.
- W2794582549 hasOpenAccess W2794582549 @default.
- W2794582549 hasRelatedWork W1615394536 @default.