It’s been nearly four years since Imagination Technologies acquired MIPS Technologies, the developer of the MIPS architecture. Over that time, Imagination has made a concerted effort to scale its MIPS CPU designs outward and upward to support a wider array of workloads, use cases, and capabilities. MIPS was once a high-end architecture that powered workstations and high-performance computers, but its stronghold today is the embedded market. The I6500 could change that, and the core has already been tapped to drive the future (pun intended) of self-driving car technology.
First, let’s talk about the core. The I6500 and I6400 share the same basic CPU architecture. Both cores are dual-issue designs with support for simultaneous multi-threading (SMT). Both can handle up to four threads per CPU core, and both offer an optional SIMD/FPU implementation with 128-bit registers (broadly equivalent to ARM’s NEON, though the particular implementation details may vary), hardware virtualization support, and a 16-way set-associative L2 cache that can scale up to 8MB.
What sets the I6500 apart from the earlier I6400 is its support for large-scale cluster deployments and heterogeneous compute integration. Where the I6400 targets multi-core applications, the I6500 is meant to be used as part of a multi-cluster system with potential support for hundreds of CPUs, as well as other heterogeneous compute accelerators.
Imagination is dividing the chip’s heterogeneous capabilities into two segments: “heterogeneous inside” and “heterogeneous outside.” “Heterogeneous inside,” according to Imagination, refers to the ways customers can tailor each I6500 core. They can alter its simultaneous multi-threading configuration, change the L1 cache size, include or exclude the SIMD/FPU unit, add up to 1MB of Data ScratchPad RAM, add up to four AXI ports to connect low-latency peripherals, and tweak dynamic voltage and frequency settings on a per-core basis. This isn’t really heterogeneous computing as AMD has defined it, but it does speak to how flexible the I6500 core design is.
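To make the “heterogeneous inside” idea more concrete, here is a minimal Python sketch of how a per-core configuration might be described and sanity-checked. It is not an Imagination tool or API: the `CoreConfig` class, its field names, and the 32KB L1 value are hypothetical and chosen purely for illustration. Only the limits of four threads per core, 1MB of Data ScratchPad RAM, and four AXI ports come from the description above.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the article text; everything else is a hypothetical model.
MAX_THREADS_PER_CORE = 4
MAX_DSPRAM_KB = 1024  # up to 1MB of Data ScratchPad RAM
MAX_AXI_PORTS = 4


@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical description of one core's build-time options."""
    threads: int = 1
    l1_cache_kb: int = 32   # illustrative value, not a documented default
    simd_fpu: bool = True   # the SIMD/FPU unit is optional
    dspram_kb: int = 0
    axi_ports: int = 0

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means acceptable."""
        issues = []
        if not 1 <= self.threads <= MAX_THREADS_PER_CORE:
            issues.append(f"threads must be 1..{MAX_THREADS_PER_CORE}")
        if self.l1_cache_kb <= 0:
            issues.append("L1 cache size must be positive")
        if not 0 <= self.dspram_kb <= MAX_DSPRAM_KB:
            issues.append(f"DSPRAM must be 0..{MAX_DSPRAM_KB} KB")
        if not 0 <= self.axi_ports <= MAX_AXI_PORTS:
            issues.append(f"AXI ports must be 0..{MAX_AXI_PORTS}")
        return issues


# Two differently tuned cores that could sit side by side in one design.
throughput_core = CoreConfig(threads=4, simd_fpu=True, dspram_kb=1024, axi_ports=4)
control_core = CoreConfig(threads=1, simd_fpu=False)

for core in (throughput_core, control_core):
    print(core, "->", core.problems() or "ok")
```

The point of the sketch is simply that two cores built from the same design can end up with very different thread counts, memory, and I/O, which is what Imagination means by the term.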
“Heterogeneous outside” more closely resembles the Heterogeneous System Architecture (HSA) as we’ve seen it used by AMD and Qualcomm. MIPS writes:
The capabilities of the I6500 family extend even further on the “heterogeneous outside” framework through a unique feature supporting the build of “accelerator only” cluster(s). A single cluster of the I6500 family platform can actually be configured for having up to eight IO coherence units (IOCUs) connected together, with no CPU in the cluster.
Custom-designed or third party functional accelerators can be connected via standard AXI4 interface to these IOCU ports, providing very localized and concentrated compute resources for specific tasks or applications. Such a configuration provides benefits to a cluster of functional accelerators by utilizing a localized, shared low latency L2 cache among the accelerator units. It concentrates the processing and traffic of the accelerators and the CPUs into separate clusters.
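As a rough illustration of the “accelerator only” cluster described in the quote, the Python sketch below models clusters that hold either CPU cores or accelerators attached through IOCU ports. The `Cluster` class, its fields, the accelerator names, and the one-accelerator-per-IOCU mapping are assumptions made for this example, not part of any Imagination toolchain. The only figure taken from the quoted text is the limit of eight IOCUs in a cluster with no CPU.

```python
from dataclasses import dataclass, field
from typing import List

# From the quoted text: up to eight IOCUs in a cluster with no CPU.
MAX_IOCUS_ACCEL_ONLY = 8


@dataclass
class Cluster:
    """Hypothetical model of one cluster in a multi-cluster system."""
    name: str
    cpu_cores: int = 0
    # Assumption for this sketch: each accelerator occupies one IOCU port.
    accelerators: List[str] = field(default_factory=list)

    @property
    def accelerator_only(self) -> bool:
        return self.cpu_cores == 0 and len(self.accelerators) > 0

    def check(self) -> List[str]:
        """Return a list of issues; an empty list means the cluster looks sane."""
        issues = []
        if self.cpu_cores < 0:
            issues.append("cpu_cores cannot be negative")
        if self.cpu_cores == 0 and not self.accelerators:
            issues.append("cluster has neither CPUs nor accelerators")
        if self.accelerator_only and len(self.accelerators) > MAX_IOCUS_ACCEL_ONLY:
            issues.append(f"accelerator-only cluster exceeds {MAX_IOCUS_ACCEL_ONLY} IOCUs")
        return issues


# CPU traffic and accelerator traffic kept in separate clusters.
system = [
    Cluster("cpu0", cpu_cores=4),
    Cluster("vision", accelerators=["isp", "cnn", "radar_dsp"]),
]

for cluster in system:
    kind = "accelerator-only" if cluster.accelerator_only else "CPU"
    print(cluster.name, kind, cluster.check() or "ok")
```

The sketch leaves out any IOCU limit for clusters that also contain CPUs, since the quoted passage does not give one.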