Project Bella

Week 11 :)))

As bad as it sounds, I didn't get to do much this past week: I spent all of last Thursday completing a math exam, all of last Friday completing the final stage of the windmill fort for Wolverine Soft Studios, Saturday through most of Tuesday working on project 4 of EECS485 and more studio work, and the remainder of Tuesday grading for 494. So I didn't get to training until after 12:30 AM this morning.

On my first attempt at training the agents to connect ice slabs to their nests, the agents were receiving the direction to their nest, the direction they were facing, the distance to their nest, whether or not they were frozen, whether or not they were aggressive, and information from five raycasts regarding the tags of objects in front of them. No matter how many times I trained these guys, they would always choose to run right into a wall and stay there.
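For the curious, the observation code amounted to something like this. It's a from-memory sketch against a recent ML-Agents release, and the field names (`nest`, `isFrozen`, `isAggressive`) are placeholders rather than exactly what's in the project:

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class IceSlabAgent : Agent
{
    public Transform nest;      // this agent's nest (placeholder name)
    public bool isFrozen;       // state flags maintained by the game logic
    public bool isAggressive;

    public override void CollectObservations(VectorSensor sensor)
    {
        // Direction to the nest (normalized) and the direction the agent is facing.
        Vector3 toNest = nest.position - transform.position;
        sensor.AddObservation(toNest.normalized);
        sensor.AddObservation(transform.forward);

        // Distance to the nest, plus the two boolean state flags.
        sensor.AddObservation(toNest.magnitude);
        sensor.AddObservation(isFrozen);
        sensor.AddObservation(isAggressive);

        // The five raycasts come from a RayPerceptionSensorComponent3D attached
        // to the agent in the Inspector, so they reach the brain without any
        // code in this method.
    }
}
```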

I then experimented with passing the data from an actual camera to the agents' brains. I got this idea from the original FoodCollector ML-Agents example: that project has a second scene called "VisualFoodCollector", in which the agents use cameras instead of raycasts. There was no trainer config file for it, though, so I combined the config from another visual ML-Agents example with the raycast FoodCollector config to make my VisualIceSlabGame config file.
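For reference, a trainer config for a visual behavior in a recent ML-Agents release looks roughly like the sketch below. The numbers are ballpark values in the spirit of the visual examples, not my exact settings; the key pieces are that the behavior name matches the one in the Unity scene and that `vis_encode_type` selects the CNN that processes the camera frames:

```yaml
behaviors:
  VisualIceSlabGame:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64           # visual observations usually want small batches
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      normalize: false
      hidden_units: 256
      num_layers: 1
      vis_encode_type: simple  # the CNN applied to camera input
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 2000000
    time_horizon: 64
    summary_freq: 10000
```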

Not only did these agents train MUCH, MUCH slower (naturally; I went to take a shower and found that they hadn't even hit fifty thousand steps by the time I returned) and set my laptop ablaze (not literally), they also pulled the exact same nonsense as the first set of agents: spawned into the game area, ran to the wall, and just stood there for the remainder of the training episode.

I wondered what could be causing them to do this. Maybe the cameras were handing them a pile of high-dimensional color information without any sense of what they were actually looking at. There's a white thing in front of me: is that an ice slab? My nest? A piece of food? The wall? They didn't know. So I tried combining the two methods: provide both raycast information AND a camera, in the hopes that they would tie the tags they were seeing through the raycasts to the actual visual information they were receiving through the cameras.
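Concretely, the combined setup just means the agent carries both sensor components. They're really configured in the Inspector, but spelled out in code it would look something like this (tag names and ray settings are stand-ins, assuming a recent ML-Agents release):

```csharp
using System.Collections.Generic;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class CombinedSensorSetup : MonoBehaviour
{
    public Camera agentCamera;  // a camera mounted on the agent's head

    void Awake()
    {
        // Camera sensor: raw pixels for the visual encoder.
        var camSensor = gameObject.AddComponent<CameraSensorComponent>();
        camSensor.Camera = agentCamera;
        camSensor.Width = 84;
        camSensor.Height = 84;
        camSensor.Grayscale = false;

        // Ray sensor: tag hits that tell the agent WHAT it is looking at.
        var raySensor = gameObject.AddComponent<RayPerceptionSensorComponent3D>();
        raySensor.RaysPerDirection = 2;   // 2 per side + 1 center ray = 5 rays
        raySensor.MaxRayDegrees = 60f;
        raySensor.RayLength = 20f;
        raySensor.DetectableTags = new List<string>
            { "iceSlab", "nest", "food", "wall", "agent" };
    }
}
```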

Throughout all this, I made sure to ONLY reward them whenever an ice slab connected to their nest, or when they were pushing an ice slab in the direction of their nest. I didn't want the rewards and punishments for attacking other agents or being attacked to muddle the training. To my delight, this was the method that got them to stop running face-first into the wall. Now I'll see them stop and smell the ice slabs, wander a little this way and that, and, best of all, now and then try their best to push an ice slab toward their nest. They're consistently getting better and better at performing this complex task, but the training is very slow (my computer freezes up every so often while the trainer pauses to update the brain before returning to training).
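Boiled down to code, that reward scheme looks something like the sketch below. `OnSlabConnected` and `pushedSlab` are my shorthand for however the game actually detects those two situations (e.g. a trigger on the nest, and a check for which slab the agent is touching):

```csharp
using Unity.MLAgents;
using UnityEngine;

public class IceSlabRewardSketch : Agent
{
    public Transform nest;
    public Rigidbody pushedSlab;  // the slab this agent is currently pushing, if any

    // Called by the nest's trigger when a slab connects to it.
    public void OnSlabConnected()
    {
        AddReward(1.0f);   // the big payoff
        EndEpisode();
    }

    void FixedUpdate()
    {
        if (pushedSlab == null) return;

        // Small shaping reward when the slab is moving toward the nest;
        // no reward (rather than a punishment) when it isn't.
        Vector3 toNest = (nest.position - pushedSlab.position).normalized;
        float progress = Vector3.Dot(pushedSlab.velocity, toNest);
        if (progress > 0f)
        {
            AddReward(0.001f * progress);
        }
    }
}
```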

Since I didn't even get to fully finish training this past week, I'll wrap it up in the coming week and move the brains over to the final build.