Re: 3 cameras: You can probably get away with 2 wrist cameras (this is what generalists do), but 1 head camera + 2 wrist cameras is what people often train with. I consider wrist cameras a clever way to replace touch, since robots don’t have it yet. You can also try 1 head camera + torque feedback and see how it goes; our API supports it.
Again, we only support Linux right now, so you can run it from any Linux box. As said, ZED cameras require NVIDIA GPUs, so ZED boxes are the perfect compute module for the application (Linux + GPU).
Again, we would love to! As we train Xol internally on tasks, we will open source the weights.
Re: the stand: It is stable and has locking wheels. Yes, you can place it anywhere you need. The mounting holes are in the datasheet.
GitHub link: agreed, we’ll fix it now!
And yes, properly building a set of such models for an agent to choose from would be amazing. We’ll do that as we get more Xol into people’s hands.