Heterogeneous compute isn’t a new concept. We’ve had it in phones and datacenters for quite a while – CPUs complemented by GPUs, DSPs, and perhaps other specialized processors. But each of these compute engines has a very specific role, each driven by its own software (or by training, in the case of AI accelerators). You write software for the CPU, you write different software for the GPU, and so on. That makes sense, but it isn’t general-purpose acceleration for a unified codebase. Could there be an equivalent in heterogeneous compute to the multi-threading we use every day in multi-core compute?
- Achronix and BittWare Accelerate Your Socks Off! | Published by Electronic Engineering Journal
- Acceleration in a Heterogenous Compute Environment | Published by SemiWiki.com
- CacheQ Platform Revs Up Processor, FPGA Performance | Published by ElectronicDesign
- The Acceleration Situation | Published by Electronic Engineering Journal
- Week In Review: Design, Low Power | Published by Semiconductor Engineering