Matches in SemOpenAlex for { <https://semopenalex.org/work/W3118138899> ?p ?o ?g. }
- W3118138899 abstract "As progress in AI continues to advance, it is crucial to know how advanced systems will make choices and in what ways they may fail. Machines can already outsmart humans in some domains, and understanding how to safely build ones which may have capabilities at or above the human level is of particular concern. One might suspect that artificially generally intelligent (AGI) and artificially superintelligent (ASI) systems should be modeled as something which humans, by definition, can't reliably outsmart. As a challenge to this assumption, this paper presents the Achilles Heel hypothesis which states that even a potentially superintelligent system may nonetheless have stable decision-theoretic delusions which cause them to make obviously irrational decisions in adversarial settings. In a survey of relevant dilemmas and paradoxes from the decision theory literature, a number of these potential Achilles Heels are discussed in context of this hypothesis. Several novel contributions are made toward understanding the ways in which these weaknesses might be implanted into a system." @default.
- W3118138899 created "2021-01-05" @default.
- W3118138899 creator A5008237906 @default.
- W3118138899 date "2020-10-12" @default.
- W3118138899 modified "2023-09-27" @default.
- W3118138899 title "Achilles Heels for AGI/ASI via Decision Theoretic Adversaries" @default.
- W3118138899 cites W1509562192 @default.
- W3118138899 cites W1510271585 @default.
- W3118138899 cites W1581742186 @default.
- W3118138899 cites W1651611617 @default.
- W3118138899 cites W1935023620 @default.
- W3118138899 cites W1974491691 @default.
- W3118138899 cites W1974609519 @default.
- W3118138899 cites W1981974807 @default.
- W3118138899 cites W1991449244 @default.
- W3118138899 cites W2013391942 @default.
- W3118138899 cites W2020502430 @default.
- W3118138899 cites W2042146255 @default.
- W3118138899 cites W2063987693 @default.
- W3118138899 cites W2081302688 @default.
- W3118138899 cites W2086115389 @default.
- W3118138899 cites W2094230476 @default.
- W3118138899 cites W2106286564 @default.
- W3118138899 cites W2121863487 @default.
- W3118138899 cites W2133752818 @default.
- W3118138899 cites W2135625884 @default.
- W3118138899 cites W2138152596 @default.
- W3118138899 cites W2157925625 @default.
- W3118138899 cites W2240086230 @default.
- W3118138899 cites W2462906003 @default.
- W3118138899 cites W2497102831 @default.
- W3118138899 cites W2574075983 @default.
- W3118138899 cites W2737838988 @default.
- W3118138899 cites W2766512981 @default.
- W3118138899 cites W2777239763 @default.
- W3118138899 cites W2796284132 @default.
- W3118138899 cites W2799058376 @default.
- W3118138899 cites W2804342109 @default.
- W3118138899 cites W2884501187 @default.
- W3118138899 cites W2900548288 @default.
- W3118138899 cites W2902907165 @default.
- W3118138899 cites W2905472553 @default.
- W3118138899 cites W2948625193 @default.
- W3118138899 cites W2963178695 @default.
- W3118138899 cites W2964281483 @default.
- W3118138899 cites W2964295739 @default.
- W3118138899 cites W2964386990 @default.
- W3118138899 cites W2965797143 @default.
- W3118138899 cites W2996337238 @default.
- W3118138899 cites W3001548133 @default.
- W3118138899 cites W3015001695 @default.
- W3118138899 cites W3034344071 @default.
- W3118138899 cites W3034614314 @default.
- W3118138899 cites W3037626499 @default.
- W3118138899 cites W3081021328 @default.
- W3118138899 cites W3099579437 @default.
- W3118138899 cites W3115754552 @default.
- W3118138899 cites W3118210634 @default.
- W3118138899 cites W3121479744 @default.
- W3118138899 cites W3178799710 @default.
- W3118138899 cites W3200885897 @default.
- W3118138899 cites W3212045844 @default.
- W3118138899 cites W591538471 @default.
- W3118138899 cites W76418698 @default.
- W3118138899 hasPublicationYear "2020" @default.
- W3118138899 type Work @default.
- W3118138899 sameAs 3118138899 @default.
- W3118138899 citedByCount "0" @default.
- W3118138899 crossrefType "posted-content" @default.
- W3118138899 hasAuthorship W3118138899A5008237906 @default.
- W3118138899 hasConcept C111472728 @default.
- W3118138899 hasConcept C112930515 @default.
- W3118138899 hasConcept C118084267 @default.
- W3118138899 hasConcept C138885662 @default.
- W3118138899 hasConcept C144133560 @default.
- W3118138899 hasConcept C154945302 @default.
- W3118138899 hasConcept C15744967 @default.
- W3118138899 hasConcept C162324750 @default.
- W3118138899 hasConcept C166957645 @default.
- W3118138899 hasConcept C2524010 @default.
- W3118138899 hasConcept C2778223634 @default.
- W3118138899 hasConcept C2779343474 @default.
- W3118138899 hasConcept C2986080485 @default.
- W3118138899 hasConcept C33923547 @default.
- W3118138899 hasConcept C37736160 @default.
- W3118138899 hasConcept C41008148 @default.
- W3118138899 hasConcept C539667460 @default.
- W3118138899 hasConcept C73484699 @default.
- W3118138899 hasConcept C94931360 @default.
- W3118138899 hasConcept C95457728 @default.
- W3118138899 hasConceptScore W3118138899C111472728 @default.
- W3118138899 hasConceptScore W3118138899C112930515 @default.
- W3118138899 hasConceptScore W3118138899C118084267 @default.
- W3118138899 hasConceptScore W3118138899C138885662 @default.
- W3118138899 hasConceptScore W3118138899C144133560 @default.
- W3118138899 hasConceptScore W3118138899C154945302 @default.
- W3118138899 hasConceptScore W3118138899C15744967 @default.
- W3118138899 hasConceptScore W3118138899C162324750 @default.
- W3118138899 hasConceptScore W3118138899C166957645 @default.
- W3118138899 hasConceptScore W3118138899C2524010 @default.