Accelerated Computing at Scale

The Past

There was No Acceleration Development Platform

Compute, long dominated by x86 and ARM, is being augmented with GPUs, FPGAs and custom ML accelerators. Until now, however, SW developers have not had a platform for delivering acceleration.

Today

SW Developers Easily Develop & Deploy Applications

The CacheQ Ultravisor enables SW developers to easily develop, deploy and orchestrate applications across heterogeneous distributed compute architectures, delivering significant increases in application performance while dramatically reducing development time.

Proven Results

Performance gains of 100x or more

Greater than 15x reduction in development time

Performance improvement with FPGAs is all about accelerating loops.

If N is the number of iterations of a loop and C is the number of cycles per iteration, the execution time is roughly (N*C)/(clock rate). In a fully pipelined FPGA implementation, the execution time is roughly (N+C)/(clock rate). QCC automatically pipelines loops. The fully pipelined loop is integrated with an application-specific, many-port cached memory architecture that ensures high-bandwidth data movement. In addition, a command-line option unrolls loops to deliver significantly higher performance, bounded by hardware resource constraints.
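To make the two estimates concrete, here is a minimal sketch of the arithmetic in Python. It is not part of QCC. The function names and the sample values for N, C and the clock rate are hypothetical, chosen only for illustration, and both estimates assume the same clock rate.

```python
def sequential_seconds(n_iters, cycles_per_iter, clock_hz):
    """Non-pipelined estimate: every iteration pays the full C cycles."""
    return (n_iters * cycles_per_iter) / clock_hz


def pipelined_seconds(n_iters, cycles_per_iter, clock_hz):
    """Fully pipelined estimate: C cycles of latency, then one iteration completes per cycle."""
    return (n_iters + cycles_per_iter) / clock_hz


def pipelining_speedup(n_iters, cycles_per_iter):
    """Ratio of the two estimates at the same clock rate (the clock cancels out)."""
    return (n_iters * cycles_per_iter) / (n_iters + cycles_per_iter)


# Hypothetical numbers, for illustration only.
N, C, CLOCK_HZ = 1_000_000, 20, 250e6
print(f"sequential: {sequential_seconds(N, C, CLOCK_HZ) * 1e3:.3f} ms")
print(f"pipelined:  {pipelined_seconds(N, C, CLOCK_HZ) * 1e3:.3f} ms")
print(f"speedup:    {pipelining_speedup(N, C):.2f}x")
```

For large N the ratio (N*C)/(N+C) approaches C, so under this simple model the gain from pipelining alone is bounded by the number of cycles per iteration. The model ignores memory stalls and any difference in clock rate between a CPU and an FPGA, so treat it as a rough guide, not a prediction.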

Developing solutions for FPGAs has been the domain of HW developers. Software applications must be extensively modified to achieve the desired performance goals. In most cases this is a nine- to twelve-month process. CacheQ delivers a complete development and deployment platform for SW developers. Partitioning between host and accelerators is handled by the platform. Performance simulation, profiling, resource estimation and memory configuration are supported prior to implementation. All the capabilities required to support many-ported, dynamically allocated memory are available to the developer. A complete development platform that requires only limited code modifications dramatically shortens development time and improves system quality.

The CacheQ Ultravisor Flow

The QCC development platform accepts high-level language (HLL) input, either C source or object files, and through a series of steps generates an optimized, partitioned, accelerated application. The output is an x86 executable and a file that is loaded into an FPGA.

The Technology

This is a new platform, developed and defined with a very specific objective: it should feel like a SW development platform. Companies and pundits suggest that high-level synthesis (HLS) was developed to fill this void. In reality, HLS is a higher level of abstraction for HW engineers. It is not a tool for software developers. QCC and its supporting hardware were specifically developed to streamline acceleration development in a way that matches the needs and knowledge of SW developers. There are a number of key differences between QCC and HLS.

Request a Demo

Request a demo to learn more about our acceleration and distributed computing solution.
