Question 1 | Notion

The algorithm will pass each immediate next state to a function which will return some value. The action/ future state which gets the max value is selected as the next move.

Thus the goal is to maximize some function.

Things that would influence agent's movement

Score
Distance from food
- [x] lesser the distance from food, higher is the will to move in that direction
- [x] if len(food) is getting reduced, more reward
Distance from ghosts (implemented even though not required)
- [x] The farther away from ghosts the better
- [ ] Punish for being too close to ghosts ?
Distance from power pellet (not implemented as not required)
- [ ] Reward for moving close to the power pellet
Scared time of ghosts (not implemented as not required)
- [ ] Don't consider Distance from ghosts if this is not null