Matches in SemOpenAlex for { <https://semopenalex.org/work/W137111882> ?p ?o ?g. }
Showing items 1 to 67 of 67, with 100 items per page.
- W137111882 abstract "Visual forward models predict future visual data from the previous visual sensory state and a motor command. The adaptive acquisition of visual forward models in robotic applications is plagued by the high dimensionality of visual data, which is not handled well by most machine learning and neural network algorithms. Moreover, the forward model has to learn which parts of the visual output are really predictable and which are not. In the present study, a learning algorithm is proposed which solves both problems. It relies on predicting the mapping between the visual input and output instead of directly forecasting visual data. The mapping is learnt by matching corresponding regions in the visual input and output while exploring different visual surroundings. Unpredictable regions are detected by the lack of any clear correspondence. The proposed algorithm is applied successfully to a robot camera head with additional distortion of the camera images by a retinal mapping.
1 Visuomotor Prediction
Sensorimotor control is an important research topic in many disciplines, among them cognitive science and robotics. These fields tackle the questions of how complex motor skills can be acquired by biological organisms or robots, and how sensory and motor processing are interrelated. So-called “internal models” help to clarify ideas of sensorimotor processing on a functional level [8, 13]. “Inverse models” or controllers generate motor commands based on the current sensory state and the desired one; “forward models” (FWM) predict future sensory states as the outcome of motor commands applied in the current sensory state. The present study focuses on the anticipation of visual data by FWMs. The anticipation of sensory consequences in the nervous system of biological organisms is supposed to be involved in several sensorimotor processes: First, many motor actions rely on feedback control, but sensory feedback is generally too slow; here, the output of FWMs can replace sensory feedback [9]. Second, FWMs may be used in the planning process for complex motor actions [12]. Third, FWMs are part of a controller learning scheme called “distal supervised learning” [7]. Fourth, FWMs can help to separate self-induced sensory effects (which are predicted) from externally induced sensory effects (which stand out from the predicted background) [2]. Fifth, it is suggested that perception relies on the anticipation of the consequences of motor actions which could be applied in the current situation; for this anticipation, FWMs are needed [10]. Regarding the fourth function mentioned above, a classical example is the reafference principle suggested by von Holst and Mittelstaedt [6]. It explains why (self-induced) eye movements do not evoke the impression that the world around us is moving: as long as the predicted movement of the retinal image (caused by the eye movement) coincides with the actual movement, the effect of this movement is canceled out in visual perception. In fields like robotics or artificial life, studies using FWMs for motor control focus mainly on navigation or obstacle avoidance tasks with mobile robots. The sensory inputs to the FWMs are rather low-dimensional data from distance sensors or laser range finders (e.g., [12, 14]), optical flow fields [3], or preprocessed visual data with only a few remaining dimensions [5]. We are especially interested in the learning of FWMs in the visual domain and its application to robot models.
In our understanding, visual FWMs predict representations of entire visual scenes. In the nervous system, this could be the relatively unprocessed representation in the primary visual cortex or more complex representations generated in higher visual areas. Regarding robot models, the high-dimensional sensory input and output space of visual FWMs poses a tough challenge to any machine learning or neural network algorithm. Moreover, there might be unpredictable regions in the FWM output (because parts of the visual surroundings only become visible after execution of the motor command). In the present study, we suggest a learning algorithm which solves both problems in the context of robot “eye” movements. In doing so, our main goal is to demonstrate a new, efficient learning algorithm for image prediction.
2 Visual Forward Model for Camera Movements
In our robot model, we attempt to predict the visual consequences of eye movements. In the model, the eye is replaced by a camera which is mounted on a pan-tilt unit. Prediction of visual data is carried out on the level of camera images. In analogy to the sensor distribution on the human retina, a retinal mapping is carried out which decreases the resolution of the camera images from center to border. We use this mapping to make the prediction task more difficult; we do not intend to develop, implement, or test a model of the human visual pathway. The input of the visual FWM is a “retinal image” at time step t (called “input image” in the following) and a motor command m_t. The output is a prediction of the retinal image at the next time step t + 1 (called “output image” in the following; see left part of Fig. 1). The question is how such an adaptive visual FWM can be implemented and trained by exploration of the environment. A straightforward approach is the use of function approximators which predict the intensity of single pixels. For every pixel 〈xOut, yOut〉 of the output image, a specific forward model FWM〈xOut, yOut〉 is acquired which forecasts the intensity of this pixel (see right part of Fig. 1). Together, the predictions of these single FWMs form the output image as in Fig. 1 (left).
Fig. 1. Left: Visual forward model (FWM). Right: Single component of a visual forward model predicting the intensity of a single pixel 〈xOut, yOut〉 of the output image.
Fig. 2. Left: Mapping model (MM). Right: Validator model (VM) (for details see text).
Unfortunately, this simple approach suffers from the high dimensionality of the input space (the retinal image at time step t is part of the input) and does not produce satisfactory learning results [4]. Hence, in this study we pursue a different approach. Instead of forecasting pixel intensities directly, our solution is based on a “back” prediction of where a pixel of the output image was in the input image before the camera’s movement. The necessary mapping model (MM) is depicted in Fig. 2: as input, it receives the motor command m_t and the location of a single pixel 〈xOut, yOut〉 of the output image; as output, it estimates the previous location 〈x̂In, ŷIn〉 of the corresponding pixel (or region) in the input image. The overall output image is constructed by iterating through all of its pixels and computing each pixel intensity as I_Out〈xOut, yOut〉 = I_In〈x̂In, ŷIn〉 (using bilinear interpolation; in this study, pixel intensities of the retinal input and output images are three-dimensional vectors in RGB color space). Moreover, an additional validator model (VM) generates a signal v〈xOut, yOut〉 indicating whether it is possible at all for the MM to generate a valid output for the current input.
This is necessary because, even for small camera movements, parts of the output image are not present in the input image. In this way, the overall FWM (Fig. 1, left) is implemented by the combined application of a mapping and a validator model. The basic idea of the learning algorithm for the MM is outlined in the following for a specific m_t and 〈xOut, yOut〉. During learning, the motor command is carried out in different environmental settings. Each time, both the actual input and output image are known afterwards, thus the intensity I_Out〈xOut, yOut〉 is known as well. It is possible to determine which of the pixels of the input image show a similar intensity. These pixels are candidates for the original position 〈xIn, yIn〉 of the pixel 〈xOut, yOut〉 before the movement. Over many trials, the pixel in the input image which matches most often is the most likely candidate for 〈xIn, yIn〉 and is chosen as the MM output 〈x̂In, ŷIn〉. When none of the pixels matches often enough, the MM output is marked as non-valid (output of the VM)." @default.
- W137111882 created "2016-06-24" @default.
- W137111882 creator A5051039216 @default.
- W137111882 creator A5083490672 @default.
- W137111882 date "2006-01-01" @default.
- W137111882 modified "2023-09-26" @default.
- W137111882 title "Learning a Visual Forward Model for a Robot Camera Head" @default.
- W137111882 cites W1529447734 @default.
- W137111882 cites W1554663460 @default.
- W137111882 cites W162449978 @default.
- W137111882 cites W2022950268 @default.
- W137111882 cites W2043369301 @default.
- W137111882 cites W2043968544 @default.
- W137111882 cites W2097861969 @default.
- W137111882 cites W2105209710 @default.
- W137111882 cites W2114414717 @default.
- W137111882 cites W2147677349 @default.
- W137111882 cites W2160328559 @default.
- W137111882 cites W235553588 @default.
- W137111882 hasPublicationYear "2006" @default.
- W137111882 type Work @default.
- W137111882 sameAs 137111882 @default.
- W137111882 citedByCount "0" @default.
- W137111882 crossrefType "journal-article" @default.
- W137111882 hasAuthorship W137111882A5051039216 @default.
- W137111882 hasAuthorship W137111882A5083490672 @default.
- W137111882 hasConcept C114793014 @default.
- W137111882 hasConcept C121684516 @default.
- W137111882 hasConcept C127313418 @default.
- W137111882 hasConcept C154945302 @default.
- W137111882 hasConcept C2780312720 @default.
- W137111882 hasConcept C31972630 @default.
- W137111882 hasConcept C41008148 @default.
- W137111882 hasConceptScore W137111882C114793014 @default.
- W137111882 hasConceptScore W137111882C121684516 @default.
- W137111882 hasConceptScore W137111882C127313418 @default.
- W137111882 hasConceptScore W137111882C154945302 @default.
- W137111882 hasConceptScore W137111882C2780312720 @default.
- W137111882 hasConceptScore W137111882C31972630 @default.
- W137111882 hasConceptScore W137111882C41008148 @default.
- W137111882 hasLocation W1371118821 @default.
- W137111882 hasOpenAccess W137111882 @default.
- W137111882 hasPrimaryLocation W1371118821 @default.
- W137111882 hasRelatedWork W129352229 @default.
- W137111882 hasRelatedWork W1984935025 @default.
- W137111882 hasRelatedWork W1987294859 @default.
- W137111882 hasRelatedWork W1995136543 @default.
- W137111882 hasRelatedWork W1999452074 @default.
- W137111882 hasRelatedWork W2019919826 @default.
- W137111882 hasRelatedWork W2138665783 @default.
- W137111882 hasRelatedWork W2172977663 @default.
- W137111882 hasRelatedWork W2461011248 @default.
- W137111882 hasRelatedWork W2463892271 @default.
- W137111882 hasRelatedWork W2517806397 @default.
- W137111882 hasRelatedWork W2753877097 @default.
- W137111882 hasRelatedWork W2765873593 @default.
- W137111882 hasRelatedWork W2876218587 @default.
- W137111882 hasRelatedWork W3006052612 @default.
- W137111882 hasRelatedWork W3036496243 @default.
- W137111882 hasRelatedWork W3081008878 @default.
- W137111882 hasRelatedWork W3173338009 @default.
- W137111882 hasRelatedWork W42063686 @default.
- W137111882 hasRelatedWork W63009755 @default.
- W137111882 isParatext "false" @default.
- W137111882 isRetracted "false" @default.
- W137111882 magId "137111882" @default.
- W137111882 workType "article" @default.
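The abstract of W137111882 above describes assembling the predicted output image by querying a mapping model (MM) per output pixel, sampling the input image with bilinear interpolation, and masking pixels the validator model (VM) flags as unpredictable. The following is a minimal illustrative sketch of that scheme; the function names, signatures, and image conventions are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def bilinear_sample(image, x, y):
    """Sample an H x W x 3 image at real-valued coordinates (x, y) with bilinear interpolation."""
    h, w = image.shape[:2]
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    x0, x1 = np.clip(x0, 0, w - 1), np.clip(x0 + 1, 0, w - 1)
    y0, y1 = np.clip(y0, 0, h - 1), np.clip(y0 + 1, 0, h - 1)
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bottom

def predict_output_image(input_image, motor_command, mapping_model, validator_model):
    """Assemble the predicted output image pixel by pixel.

    mapping_model(motor_command, x_out, y_out) -> (x_in, y_in): estimated source location
        of the output pixel in the input image (hypothetical interface).
    validator_model(motor_command, x_out, y_out) -> bool: whether a valid source exists.
    Pixels without a valid source are left black (unpredictable regions).
    """
    h, w = input_image.shape[:2]
    output = np.zeros_like(input_image)
    for y_out in range(h):
        for x_out in range(w):
            if not validator_model(motor_command, x_out, y_out):
                continue  # no correspondence in the input image
            x_in, y_in = mapping_model(motor_command, x_out, y_out)
            output[y_out, x_out] = bilinear_sample(input_image, x_in, y_in)
    return output
```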
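The learning rule outlined in the abstract, for one fixed motor command and one output pixel, can be read as correspondence counting: over many trials in different visual surroundings, count which input pixels have an intensity similar to the observed output pixel, take the most frequently matching location as the mapping-model entry, and mark the entry non-valid (the validator signal) when no pixel matches often enough. A hedged sketch under assumed 8-bit RGB images and hypothetical thresholds:

```python
import numpy as np

def learn_mapping_entry(trials, x_out, y_out, intensity_tol=0.05, min_match_fraction=0.5):
    """Estimate where output pixel (x_out, y_out) comes from in the input image.

    trials: list of (input_image, output_image) pairs recorded while executing the SAME
            motor command in different visual surroundings (assumed uint8 RGB images).
    Returns ((x_in, y_in), valid): the most frequently matching input location and a
    validity flag (False if no input pixel matches consistently enough).
    """
    h, w = trials[0][0].shape[:2]
    match_counts = np.zeros((h, w), dtype=int)
    for input_image, output_image in trials:
        target = output_image[y_out, x_out].astype(float)
        # Input pixels with a similar RGB intensity are candidate source locations.
        distances = np.linalg.norm(input_image.astype(float) - target, axis=2)
        match_counts += distances < intensity_tol * 255.0  # tolerance assumes 8-bit channels
    best = np.unravel_index(np.argmax(match_counts), match_counts.shape)
    y_in, x_in = best
    valid = match_counts[best] >= min_match_fraction * len(trials)
    return (int(x_in), int(y_in)), bool(valid)
```

A lookup-table built this way would need one entry per (motor command, output pixel) pair; the paper's actual MM and VM are adaptive models, so this sketch only illustrates the matching idea, not the authors' method.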