Great catch! There are a lot of similarities. DeepMind tends to use frozen pretrained models as foundations (as I covered in my article about Flamingo). With respect to Gato specifically, the authors did mention that it is geared towards robotic control. I don't think the model is trained for MoE-style (Mixture-of-Experts) partial activation, nor does it incorporate multiple senses for the same task. In theory, those two features give Pathways models a much deeper understanding of the task.