Multiple action switching in embodied agents evolved for referential communication
Jorge Iván Campos Bravo and Tom Froese
Action switching in embodied agents is not a trivial problem, and it becomes harder when agents must switch behaviors during a single trial of a task in which actions have to be performed at the right time to achieve a goal. Izquierdo and Buhrmann (2008) showed that an agent can be evolved to perform two qualitatively different tasks using a single continuous-time recurrent neural network (CTRNN; Beer, 1995). The agent must determine which body it is in and accomplish the corresponding task through interaction with its environment. This interaction is crucial to the action selection process: an agent must gather the available sensory information before acting, just as we look both ways before crossing a street. Agmon and Beer (2014) showed that agents can be evolved to switch between two behaviors during a single trial of a more complex problem, changing behavior according to both their environmental interaction and their internal states. The evolution of referential communication in embodied agents (Williams et al., 2008) showed that agents without dedicated communication channels can evolve to pass crucial information for solving a task; the communication process emerges from coordinated behavior between the agents.

Can we build a model that solves the central problem of communication through coordinated behavior using only a single CTRNN? The T-maze test used in animal cognition experiments presents a binary choice between two paths. Blynel and Floreano (2003) evolved a CTRNN that solves the T-maze test by exhibiting learning-like behavior: the agent must locate and "remember" the location of the goal in a T-shaped environment. The only sensor carrying information about the location of the reward zone is a floor sensor that discriminates between a bright and a dark floor.
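The CTRNN model underlying all of the work cited above follows Beer's (1995) state equation, tau_i dy_i/dt = -y_i + sum_j w_ji sigma(y_j + theta_j) + I_i. A minimal Euler-integration sketch is given below; the network size, parameter ranges, and random weights are illustrative, not the evolved parameters of any of the cited experiments:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class CTRNN:
    """Minimal continuous-time recurrent neural network (Beer, 1995).

    Dynamics: tau_i * dy_i/dt = -y_i + sum_j w[j, i] * sigmoid(y_j + theta_j) + I_i
    """
    def __init__(self, size, seed=None):
        rng = np.random.default_rng(seed)
        self.tau = np.ones(size)                    # time constants (illustrative)
        self.theta = np.zeros(size)                 # biases
        self.w = rng.uniform(-1, 1, (size, size))   # weights, w[j, i]: neuron j -> i
        self.y = np.zeros(size)                     # neuron states

    def step(self, external_input, dt=0.01):
        # One Euler step of the CTRNN state equation
        activation = sigmoid(self.y + self.theta)
        dydt = (-self.y + activation @ self.w + external_input) / self.tau
        self.y = self.y + dt * dydt
        return sigmoid(self.y + self.theta)         # firing-rate outputs in (0, 1)

net = CTRNN(3, seed=0)
out = net.step(np.zeros(3))
```

In evolutionary-robotics practice, the weights, biases, and time constants of such a network are encoded in a genotype and optimized by an evolutionary algorithm against the task's fitness function.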
In that case, however, the signal is fixed: the floor can only be bright or dark. In the model we propose, the sender is constrained to an area of a one-dimensional environment, while the receiver can move freely. There are two targets, and in each trial the sender must interact with the receiver so that the receiver reaches the correct target. Only the sender knows the position of the correct target. Both agents use structural copies of the same CTRNN. The problem becomes harder because the reference signal is not fixed but must co-evolve along with the rest of the behavior: each agent has to switch between the actions of sending and receiving depending on the body it is in. Moreover, once the two roles have switched, the receiver must switch again, from receiving information to moving to the correct target.
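The trial structure described above can be sketched as follows. Everything here is a hypothetical illustration of the setup, not the actual experiment: the zone boundaries, target positions, sensor model (each agent sensing only the other's relative position), and fitness measure are all assumptions; in the actual model both policies would be driven by structural copies of the same evolved CTRNN.

```python
# Hypothetical sketch of one trial in the 1-D sender/receiver task.
# All numeric ranges, the sensor model, and the fitness measure are
# illustrative assumptions, not the experimental parameters.

SENDER_ZONE = (-5.0, 5.0)   # the sender is confined to this interval
TARGETS = (-20.0, 20.0)     # the two possible target positions

def run_trial(sender_policy, receiver_policy, correct_target, steps=500, dt=0.1):
    sender, receiver = 0.0, 0.0
    for _ in range(steps):
        # Each agent senses only the other's relative position; only the
        # sender's policy is told which target is the correct one.
        v_s = sender_policy(receiver - sender, correct_target)
        v_r = receiver_policy(sender - receiver)
        # The sender's motion is clamped to its zone; the receiver moves freely.
        sender = min(max(sender + dt * v_s, SENDER_ZONE[0]), SENDER_ZONE[1])
        receiver += dt * v_r
    # Fitness: closeness of the receiver to the correct target (0 is best)
    return -abs(receiver - correct_target)
```

Because only the sender is given the target location, any receiver policy that reliably scores well across trials must be extracting that information from the interaction itself, which is what makes the communication referential.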
Agmon, E. and Beer, R. D. (2014). The evolution and analysis of action switching in embodied agents. Adaptive Behavior, 22(1):3–20.
Beer, R. D. (1995). A dynamical systems perspective on agent–environment interaction. Artificial Intelligence, 72:173–215.
Blynel, J. and Floreano, D. (2003). Exploring the T-maze: Evolving learning-like robot behaviors using CTRNNs. In Applications of Evolutionary Computing, pp. 593–604. Springer.
Izquierdo, E. J. and Buhrmann, T. (2008). Analysis of a dynamical recurrent neural network evolved for two qualitatively different tasks: Walking and chemotaxis. In S. Bullock, J. Noble, R. A. Watson, and M. A. Bedau (Eds.), Proceedings of the 11th International Conference on Artificial Life. MIT Press, Cambridge, MA.
Williams, P. L., Beer, R. D., and Gasser, M. (2008). Evolving referential communication in embodied dynamical agents. In S. Bullock, J. Noble, R. Watson, and M. A. Bedau (Eds.), Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems, pp. 702–709. MIT Press.