@cole24777/qira-simulation

Qira Simulation

Welcome to the Qira simulation.

To run the program, open the 3D Viewer tab and either click the +1 button to run step by step or click the runner icon to run continuously.

To view agent Qira's states and watch how they change, click the green cube in the 3D Viewer. This opens the inspector, where the different states are listed.

Important states

reward - records -1 if Qira moves backwards, +1 if Qira moves forwards, and 0 if Qira does not move.

actions - all the possible actions Qira can take

action - the action that Qira took

q_state - the current state Qira is in

q_table - a table of weights with the same indexes as the actions list. This table is used to determine which action Qira should take next; a higher score means a better weight.

epsilon - the hyperparameter that determines whether the agent chooses its next action randomly or uses Q-learning to choose it. The higher the epsilon value, the greater the chance of a random action. Epsilon decreases as episodes increase.

learning rate - how much each update adjusts the Q-value

discount value - The degree to which the agent should discount future rewards in favor of immediate ones.
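To show how these pieces fit together, here is a minimal sketch of epsilon-greedy Q-learning using the states above. The constants, the two-state example, and the action names are illustrative assumptions, not values taken from the project:

```python
import random

# Hypothetical stand-ins for the values in globals.json.
EPSILON = 0.1        # chance of picking a random action
LEARNING_RATE = 0.5  # how much each update adjusts the Q-value
DISCOUNT = 0.9       # weight given to the best next Q-value

ACTIONS = ["forward", "backward", "stay"]  # stand-ins for Qira's actions list

def choose_action(q_table, state, epsilon=EPSILON):
    """Epsilon-greedy: random action with probability epsilon, else the highest-weighted one."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    row = q_table[state]
    return max(range(len(ACTIONS)), key=lambda i: row[i])

def update_q(q_table, state, action, reward, next_state):
    """Standard Q-learning update: nudge the weight toward reward + discounted best next value."""
    best_next = max(q_table[next_state])
    target = reward + DISCOUNT * best_next
    q_table[state][action] += LEARNING_RATE * (target - q_table[state][action])

# Tiny two-state example: moving forward from state 0 earns +1.
q_table = [[0.0] * len(ACTIONS) for _ in range(2)]
update_q(q_table, state=0, action=0, reward=1, next_state=1)
print(q_table[0][0])  # the "forward" weight in state 0 rises toward the reward
```

With a learning rate of 0.5 the first update moves the weight halfway to the target, which is why a higher learning rate makes early experiences count for more.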

How to increase efficiency:

In the file globals.json you can set the epsilon, learning rate, and discount values.
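The file might look something like this (the key names and values here are illustrative assumptions; check globals.json itself for the exact keys):

```json
{
  "epsilon": 0.9,
  "learning_rate": 0.1,
  "discount": 0.9
}
```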

By changing epsilon to a lower value, the agent will start relying on what it has learned sooner instead of making random decisions.

By changing the learning rate to a higher value, the agent will weight decisions in the q_table more heavily, so Q-values change faster near the start of the program.

By changing the discount factor to a higher value, the Q-value will be weighted more heavily toward the best next possible action.

Experiment with different parameters to find the fastest q learning model.