Matches in SemOpenAlex for { <https://semopenalex.org/work/W3033351097> ?p ?o ?g. }
- W3033351097 abstract "Datasets that are terabytes in size are increasingly common, but computer bottlenecks often frustrate a complete analysis of the data. While more data are better than less, diminishing returns suggest that we may not need terabytes of data to estimate a parameter or test a hypothesis. But which rows of data should we analyze, and might an arbitrary subset of rows preserve the features of the original data? This paper reviews a line of work that is grounded in theoretical computer science and numerical linear algebra, and which finds that an algorithmically desirable sketch, which is a randomly chosen subset of the data, must preserve the eigenstructure of the data, a property known as a subspace embedding. Building on this work, we study how prediction and inference can be affected by data sketching within a linear regression setup. We show that the sketching error is small compared to the sample size effect which a researcher can control. As a sketch size that is algorithmically optimal may not be suitable for prediction and inference, we use statistical arguments to provide 'inference conscious' guides to the sketch size. When appropriately implemented, an estimator that pools over different sketches can be nearly as efficient as the infeasible one using the full sample." @default.
- W3033351097 created "2020-06-12" @default.
- W3033351097 creator A5028536256 @default.
- W3033351097 creator A5056976753 @default.
- W3033351097 date "2019-07-03" @default.
- W3033351097 modified "2023-09-25" @default.
- W3033351097 title "An Econometric Perspective on Algorithmic Subsampling" @default.
- W3033351097 cites W1485437584 @default.
- W3033351097 cites W1493892051 @default.
- W3033351097 cites W1506895146 @default.
- W3033351097 cites W1595409123 @default.
- W3033351097 cites W1691300750 @default.
- W3033351097 cites W1800271513 @default.
- W3033351097 cites W1945899805 @default.
- W3033351097 cites W1974234363 @default.
- W3033351097 cites W1981313592 @default.
- W3033351097 cites W1981573335 @default.
- W3033351097 cites W1990055659 @default.
- W3033351097 cites W1997200791 @default.
- W3033351097 cites W2007399394 @default.
- W3033351097 cites W2037757210 @default.
- W3033351097 cites W2042465463 @default.
- W3033351097 cites W2045390367 @default.
- W3033351097 cites W2051832567 @default.
- W3033351097 cites W2055892937 @default.
- W3033351097 cites W2075007695 @default.
- W3033351097 cites W2080745194 @default.
- W3033351097 cites W2093813380 @default.
- W3033351097 cites W2101043704 @default.
- W3033351097 cites W2101712072 @default.
- W3033351097 cites W2116780995 @default.
- W3033351097 cites W2121689290 @default.
- W3033351097 cites W2134342155 @default.
- W3033351097 cites W2152356156 @default.
- W3033351097 cites W2157988812 @default.
- W3033351097 cites W2160840682 @default.
- W3033351097 cites W2171810522 @default.
- W3033351097 cites W2342249230 @default.
- W3033351097 cites W2593245224 @default.
- W3033351097 cites W2616345629 @default.
- W3033351097 cites W2625592596 @default.
- W3033351097 cites W2773306503 @default.
- W3033351097 cites W2803952844 @default.
- W3033351097 cites W2808726247 @default.
- W3033351097 cites W2887913724 @default.
- W3033351097 cites W2948501930 @default.
- W3033351097 cites W2962842430 @default.
- W3033351097 cites W2963098024 @default.
- W3033351097 cites W2963323648 @default.
- W3033351097 cites W2963459305 @default.
- W3033351097 cites W3023419127 @default.
- W3033351097 cites W3125714952 @default.
- W3033351097 cites W3139361274 @default.
- W3033351097 cites W826455576 @default.
- W3033351097 hasPublicationYear "2019" @default.
- W3033351097 type Work @default.
- W3033351097 sameAs 3033351097 @default.
- W3033351097 citedByCount "1" @default.
- W3033351097 countsByYear W30333510972019 @default.
- W3033351097 crossrefType "posted-content" @default.
- W3033351097 hasAuthorship W3033351097A5028536256 @default.
- W3033351097 hasAuthorship W3033351097A5056976753 @default.
- W3033351097 hasConcept C105795698 @default.
- W3033351097 hasConcept C111919701 @default.
- W3033351097 hasConcept C11413529 @default.
- W3033351097 hasConcept C124101348 @default.
- W3033351097 hasConcept C12713177 @default.
- W3033351097 hasConcept C129848803 @default.
- W3033351097 hasConcept C134261354 @default.
- W3033351097 hasConcept C135598885 @default.
- W3033351097 hasConcept C154945302 @default.
- W3033351097 hasConcept C185429906 @default.
- W3033351097 hasConcept C185592680 @default.
- W3033351097 hasConcept C198531522 @default.
- W3033351097 hasConcept C199683683 @default.
- W3033351097 hasConcept C2776214188 @default.
- W3033351097 hasConcept C2779231336 @default.
- W3033351097 hasConcept C32834561 @default.
- W3033351097 hasConcept C33923547 @default.
- W3033351097 hasConcept C41008148 @default.
- W3033351097 hasConcept C43617362 @default.
- W3033351097 hasConcept C77088390 @default.
- W3033351097 hasConcept C80444323 @default.
- W3033351097 hasConceptScore W3033351097C105795698 @default.
- W3033351097 hasConceptScore W3033351097C111919701 @default.
- W3033351097 hasConceptScore W3033351097C11413529 @default.
- W3033351097 hasConceptScore W3033351097C124101348 @default.
- W3033351097 hasConceptScore W3033351097C12713177 @default.
- W3033351097 hasConceptScore W3033351097C129848803 @default.
- W3033351097 hasConceptScore W3033351097C134261354 @default.
- W3033351097 hasConceptScore W3033351097C135598885 @default.
- W3033351097 hasConceptScore W3033351097C154945302 @default.
- W3033351097 hasConceptScore W3033351097C185429906 @default.
- W3033351097 hasConceptScore W3033351097C185592680 @default.
- W3033351097 hasConceptScore W3033351097C198531522 @default.
- W3033351097 hasConceptScore W3033351097C199683683 @default.
- W3033351097 hasConceptScore W3033351097C2776214188 @default.
- W3033351097 hasConceptScore W3033351097C2779231336 @default.
- W3033351097 hasConceptScore W3033351097C32834561 @default.
- W3033351097 hasConceptScore W3033351097C33923547 @default.