The widely recognized slowing of Moore’s law is creating significant new compute opportunities. The compute landscape is shifting toward heterogeneous platforms across the data center and the edge.
Problems of the Past
Compute dominated by x86 and ARM is being augmented with GPUs, FPGAs and custom ML accelerators. However, SW developers lack a platform for delivering that acceleration.
The CacheQ Solution of Today
The CacheQ Ultravisor enables SW developers to easily develop, deploy and orchestrate applications across heterogeneous distributed compute architectures, delivering significant increases in application performance while dramatically reducing development time.
Improving performance with FPGAs is all about accelerating loops.
If N is the number of iterations of a loop and C is the number of clock cycles each iteration takes, the unpipelined execution time is roughly (N*C)/(clock rate). In a fully pipelined implementation, where a new iteration starts every cycle, the execution time is roughly (N+C)/(clock rate). QCC automatically pipelines loops. The fully pipelined loop is integrated with an application-specific, many-port cached memory architecture that ensures high-bandwidth data movement. In addition, a command line option lets loops be unrolled to deliver significantly higher performance, bounded by hardware resource constraints.
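The two timing estimates above can be compared with a few lines of arithmetic. The following is a minimal sketch, not CacheQ tooling: the function names and the sample values for N, C and the clock rate are hypothetical and chosen only for illustration.

```python
def sequential_time(n_iters, cycles_per_iter, clock_hz):
    """Estimated time when each iteration finishes before the next starts: (N*C)/clock."""
    return (n_iters * cycles_per_iter) / clock_hz


def pipelined_time(n_iters, cycles_per_iter, clock_hz):
    """Estimated time for a fully pipelined loop: (N+C)/clock."""
    return (n_iters + cycles_per_iter) / clock_hz


# Hypothetical values, for illustration only.
n = 1_000_000      # N: loop iterations
c = 20             # C: cycles per iteration
clock_hz = 250e6   # assumed 250 MHz clock

seq = sequential_time(n, c, clock_hz)
pipe = pipelined_time(n, c, clock_hz)
speedup = seq / pipe  # equals N*C / (N+C)

print(f"sequential: {seq:.6f} s, pipelined: {pipe:.6f} s, speedup: {speedup:.2f}x")
```

Because the ratio is N*C/(N+C), the estimated speedup from pipelining approaches C as N grows, so long loops with deep bodies stand to gain the most.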
Developing solutions for FPGAs has been the domain of HW developers. Software applications must be extensively modified to achieve the desired performance goals. In most cases this is a six-to-nine-month process. CacheQ delivers a complete development and deployment platform for SW developers. Partitioning between host and accelerators is handled by the platform. Performance simulation, profiling, resource estimates and memory configuration are supported prior to implementation. All the capabilities required to support many-ported, dynamically allocated memory are available to the developer. A complete development platform requiring limited code modifications dramatically shortens development time and improves system