Sunday, 10 February 2019

Part 3: Artificial cognitive systems, Reinforcement Learning

In Part 1 we discussed the logical approach to cognitive systems and how to model an AI agent by giving it logical rules and knowledge from which to deduce its output. Next, to let the agent learn by itself, we presented in Part 2 the machine learning approach, which finds correlations between an input data set and its given outputs.
For an interactive system, it is difficult to provide a data set covering all possible experiences. The agent needs to try things out and collect experiences by itself. Reinforcement learning is the approach that lets the agent learn from its own experiences through trial and error. Deep Reinforcement Learning (DRL) combines the supervised approach of finding correlations between input and output data with this trial-and-error approach. In DRL, the input includes the state the agent has collected through experience. The output can be either the action to take or information that helps select the action.
One method is Deep Q-Learning. It computes a Q-value for each action and selects the action that yields the highest Q-value. The Q-values are updated using the Bellman equation, and a neural network learns the correlation between states and Q-values.
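To make the Bellman update concrete, here is a minimal tabular sketch (a lookup table standing in for the neural network): one backup of the Bellman equation, and greedy action selection over the resulting Q-values.

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One Bellman backup: Q(s,a) <- Q(s,a) + alpha*[r + gamma*max_a' Q(s',a') - Q(s,a)]."""
    target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (target - Q[state, action])
    return Q

def greedy_action(Q, state):
    """Select the action with the highest Q-value for this state."""
    return int(np.argmax(Q[state]))

# Tiny example: 2 states, 2 actions, all Q-values start at zero.
Q = np.zeros((2, 2))
Q = q_update(Q, state=0, action=1, reward=1.0, next_state=1)
```

After this single update, `Q[0, 1]` has moved toward the reward, so the greedy choice in state 0 becomes action 1. In Deep Q-Learning, the table is replaced by a network that maps a state vector to one Q-value per action, trained to minimize the gap between its prediction and the same Bellman target.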
Another approach uses deep learning to generate the policy directly via the policy gradient. This method suits continuous action spaces.
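As an illustration of the policy-gradient idea for continuous actions, here is a sketch of a REINFORCE-style update for a Gaussian policy whose mean is linear in the state. This is a generic textbook form, not the specific method used in this work; the linear parameterization and fixed standard deviation are simplifying assumptions.

```python
import numpy as np

def gaussian_log_prob_grad(theta, state, action, sigma=1.0):
    """Gradient of log pi_theta(a|s) for a Gaussian policy with mean theta.dot(s):
    grad = (a - theta.dot(s)) * s / sigma^2."""
    mean = theta @ state
    return (action - mean) * state / sigma**2

def reinforce_step(theta, trajectory, lr=0.01, sigma=1.0):
    """REINFORCE update: theta <- theta + lr * G * grad log pi(a_t|s_t),
    using the total episode return G for every step (the simplest variant)."""
    episode_return = sum(r for (_, _, r) in trajectory)
    for state, action, _ in trajectory:
        theta = theta + lr * episode_return * gaussian_log_prob_grad(theta, state, action, sigma)
    return theta
```

The update pushes the policy mean toward actions that coincided with high returns. A deep policy-gradient method replaces the linear mean with a neural network and typically adds a baseline or critic to reduce variance.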
We modeled a deep Q-learning agent to drive in the physics-based driving simulator OpenDS. The agent receives a scenario message that describes the state of the car on the road, and it selects an action, sent in a maneuver message, to control the driving. For a lane-keeping task, the state reflects the distance from the side of the road, the lane heading (the angle between the car's heading and the road), the curvature of the road, and the steering angle.
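A rough sketch of how this state could feed a Q-network is shown below. The field names, the discrete steering actions, and the epsilon-greedy exploration are illustrative assumptions, not the actual OpenDS message schema or the agent's exact action set.

```python
import numpy as np

def encode_state(distance_to_side, lane_heading, road_curvature, steering_angle):
    """Pack the four lane-keeping features into the Q-network's input vector.
    Field names are hypothetical; the real scenario message may differ."""
    return np.array([distance_to_side, lane_heading, road_curvature, steering_angle],
                    dtype=np.float32)

# A hypothetical discrete action set: steering corrections in degrees.
ACTIONS = [-5.0, -1.0, 0.0, 1.0, 5.0]

def select_action(q_values, epsilon=0.1, rng=None):
    """Epsilon-greedy selection over the Q-network's outputs: explore with
    probability epsilon, otherwise take the highest-valued action."""
    rng = rng or np.random.default_rng()
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q_values))
```

During training, the agent would encode each scenario message this way, query the network for one Q-value per steering action, pick one epsilon-greedily, and send the chosen correction back in the maneuver message.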
