I’m curious about your thoughts on having more layers instead of a pure two-level HL->LL framework. It seems like humans do something like this with the cortex -> motor cortex -> brain stem/spinal cord. It’s interesting to see that Figure adopted this kind of hierarchy; any thoughts on the pros/cons of splitting the layered control architecture even more?
Also, for what it’s worth, I would vote for an Isaac Sim implementation, since it might be easier to use an RL pipeline that’s already bundled together with active developer support than to piece together your own RL stack, sim, evals, etc. But idk, it is always satisfying to build something from scratch haha
Agreed - more hierarchical layers mean more potentially performance-limiting interfaces, but also potentially better performance and debuggability. Another way to think about it is: what happens if the higher level is switched off (maybe it needs to rethink, or “reason” in LLM terms)? It is nice if the lower levels can prevent safety issues by at least taking care of balance and other safety concerns independently.
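To make the “lower levels keep you safe” point concrete, here is a toy Python sketch (the class, state, and action names are all hypothetical, not from Figure’s stack or any real controller): the low-level loop falls back to balance/hold behavior whenever the high level produces no command.

```python
# Toy sketch: a low-level controller that stays safe even on ticks
# where the high level is busy "reasoning" and sends no command.
from typing import Optional

class LowLevelController:
    """Runs at a high rate and always returns a stabilizing action."""

    TILT_LIMIT = 0.2  # radians; beyond this, balance overrides everything

    def step(self, state: dict, hl_command: Optional[str]) -> str:
        # Safety first: if we are tipping over, ignore the high level.
        if abs(state["tilt"]) > self.TILT_LIMIT:
            return "recover_balance"
        # No fresh high-level command (planner is still thinking):
        # fall back to a safe default rather than doing nothing.
        if hl_command is None:
            return "hold_pose"
        return hl_command

ll = LowLevelController()
print(ll.step({"tilt": 0.0}, "step_forward"))  # high level in charge
print(ll.step({"tilt": 0.0}, None))            # high level busy: hold pose
print(ll.step({"tilt": 0.3}, "step_forward"))  # safety override wins
```

The key design choice is that the fallback lives in the low level itself, so it keeps working no matter why the layer above went quiet.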
Yeah, I was thinking of that too! Especially after seeing Matt Mason's Inner Robot blog post [https://mtmason.com/the-inner-robot/] and some more neuroscience evidence of discrete functional structures, even at the high level [https://www.cambridge.org/zw/universitypress/subjects/life-sciences/animal-behaviour/divided-brains-biology-and-behaviour-brain-asymmetries?format=PB]. I'm not fully convinced by arguments from pure-learning people claiming that any engineered structure will always become the bottleneck; it seems clear that biological counterparts use this kind of structure for efficiency in turning complex information flows into actions.
I'm looking forward to your third installment!
From what I understand:
Skill Acquisition -> runs offline during training
Motor Adaptation -> runs on the robot's compute
Is there something that runs offline, but after the initial training phase?
For example, a robot sends data about a new environment after a few hours to an offline computer, which then provides some feedback back to the robot as it continues in that environment.
Similar to how auto companies provide "OTA software updates" to your car even after you bought it to fix/improve something.
Great question! Yes, there are approaches in machine learning for adapting models with new data - e.g. fine-tuning, LoRA (low-rank adaptation), and meta-learning. This kind of incremental training is centralized (the model provider would do it), but there’s also the concept of federated learning, where it could be done with private data at customer sites. However, all of these are relatively niche, to my knowledge, in the context of large foundation models.
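On the LoRA point: the core idea is small enough to sketch in a few lines of NumPy (an illustrative toy, not how any particular library implements it). The pretrained weight W stays frozen, and adaptation trains only a low-rank correction B @ A, which is what makes shipping incremental updates cheap.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 8, 8, 2
W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, rank))               # zero init: adapter starts inactive

def forward(x):
    # Adapted layer: frozen path plus the low-rank correction B @ A.
    return x @ W.T + x @ (B @ A).T

x = rng.normal(size=(4, d_in))
# With B all zeros, the output equals the frozen layer exactly.
assert np.allclose(forward(x), x @ W.T)

# "Fine-tuning" would update only A and B, not the d_in * d_out
# entries of W - a much smaller payload to send over the air.
print(A.size + B.size, "trainable params vs", W.size, "frozen")  # 32 vs 64
```

At realistic layer sizes the ratio is far more dramatic than this toy’s 32 vs 64, which is why low-rank adapters are attractive for the OTA-update scenario you describe.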