Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313227190> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4313227190 abstract "Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible abuses of power by untrusted learners. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees. • First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even if the distinguisher can request backdoored inputs of its choice, it cannot backdoor a new input, a property we call non-replicability. • Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor. The backdooring algorithm executes the RFF algorithm faithfully on the given training data, tampering only with its random coins. We prove this strong guarantee under the hardness of the Continuous Learning With Errors problem (Bruna, Regev, Song, Tang; STOC 2021). We show a similar white-box undetectable backdoor for random ReLU networks based on the hardness of Sparse PCA (Berthet, Rigollet; COLT 2013). Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, by constructing an undetectable backdoor for an “adversarially-robust” learning algorithm, we can produce a classifier that is indistinguishable from a robust classifier, but where every input has an adversarial example! In this way, the existence of undetectable backdoors represents a significant theoretical roadblock to certifying adversarial robustness." @default.
- W4313227190 created "2023-01-06" @default.
- W4313227190 creator A5000240220 @default.
- W4313227190 creator A5039980118 @default.
- W4313227190 creator A5048299120 @default.
- W4313227190 creator A5070011898 @default.
- W4313227190 date "2022-10-01" @default.
- W4313227190 modified "2023-10-16" @default.
- W4313227190 title "Planting Undetectable Backdoors in Machine Learning Models: [Extended Abstract]" @default.
- W4313227190 cites W1499766499 @default.
- W4313227190 cites W1972792640 @default.
- W4313227190 cites W1973286131 @default.
- W4313227190 cites W1976697079 @default.
- W4313227190 cites W2038761522 @default.
- W4313227190 cites W2061949491 @default.
- W4313227190 cites W2069588275 @default.
- W4313227190 cites W2071520502 @default.
- W4313227190 cites W2074594718 @default.
- W4313227190 cites W2129719496 @default.
- W4313227190 cites W2147947791 @default.
- W4313227190 cites W2148888468 @default.
- W4313227190 cites W2152144666 @default.
- W4313227190 cites W2532798880 @default.
- W4313227190 cites W2934843808 @default.
- W4313227190 cites W2942091739 @default.
- W4313227190 cites W2964164735 @default.
- W4313227190 cites W3082860856 @default.
- W4313227190 cites W3176628912 @default.
- W4313227190 cites W4205765479 @default.
- W4313227190 doi "https://doi.org/10.1109/focs54457.2022.00092" @default.
- W4313227190 hasPublicationYear "2022" @default.
- W4313227190 type Work @default.
- W4313227190 citedByCount "2" @default.
- W4313227190 countsByYear W43132271902023 @default.
- W4313227190 crossrefType "proceedings-article" @default.
- W4313227190 hasAuthorship W4313227190A5000240220 @default.
- W4313227190 hasAuthorship W4313227190A5039980118 @default.
- W4313227190 hasAuthorship W4313227190A5048299120 @default.
- W4313227190 hasAuthorship W4313227190A5070011898 @default.
- W4313227190 hasConcept C119857082 @default.
- W4313227190 hasConcept C134306372 @default.
- W4313227190 hasConcept C143273055 @default.
- W4313227190 hasConcept C154945302 @default.
- W4313227190 hasConcept C199360897 @default.
- W4313227190 hasConcept C2781045450 @default.
- W4313227190 hasConcept C33923547 @default.
- W4313227190 hasConcept C34388435 @default.
- W4313227190 hasConcept C38652104 @default.
- W4313227190 hasConcept C41008148 @default.
- W4313227190 hasConcept C80444323 @default.
- W4313227190 hasConcept C95623464 @default.
- W4313227190 hasConceptScore W4313227190C119857082 @default.
- W4313227190 hasConceptScore W4313227190C134306372 @default.
- W4313227190 hasConceptScore W4313227190C143273055 @default.
- W4313227190 hasConceptScore W4313227190C154945302 @default.
- W4313227190 hasConceptScore W4313227190C199360897 @default.
- W4313227190 hasConceptScore W4313227190C2781045450 @default.
- W4313227190 hasConceptScore W4313227190C33923547 @default.
- W4313227190 hasConceptScore W4313227190C34388435 @default.
- W4313227190 hasConceptScore W4313227190C38652104 @default.
- W4313227190 hasConceptScore W4313227190C41008148 @default.
- W4313227190 hasConceptScore W4313227190C80444323 @default.
- W4313227190 hasConceptScore W4313227190C95623464 @default.
- W4313227190 hasLocation W43132271901 @default.
- W4313227190 hasOpenAccess W4313227190 @default.
- W4313227190 hasPrimaryLocation W43132271901 @default.
- W4313227190 hasRelatedWork W2512018286 @default.
- W4313227190 hasRelatedWork W2556319748 @default.
- W4313227190 hasRelatedWork W2891961174 @default.
- W4313227190 hasRelatedWork W2961085424 @default.
- W4313227190 hasRelatedWork W3092753701 @default.
- W4313227190 hasRelatedWork W3158264953 @default.
- W4313227190 hasRelatedWork W3200179079 @default.
- W4313227190 hasRelatedWork W4249229055 @default.
- W4313227190 hasRelatedWork W4281570223 @default.
- W4313227190 hasRelatedWork W4310989423 @default.
- W4313227190 isParatext "false" @default.
- W4313227190 isRetracted "false" @default.
- W4313227190 workType "article" @default.