San Jose (CA) - Tilera today announced the next generation of its tile-based processors. Following on from the Tile64 embedded CPU, the two new TilePro models offer 36 and 64 cores with notably greater performance per watt. A revised toolset, Multicore Development Environment (MDE) 2.0, allows full emulation and simulation at clock-cycle granularity.
Tile64 to TilePro36 and TilePro64
Tilera uses the same multi-core design approach for both the older Tile64 and the new TilePro lines. A single core is created, perfected, validated and tested. Once it is working, it is replicated as many times as needed on the silicon die.
The original Tile64 was offered only in a 64-core version. This release adds a 36-core version called TilePro36 alongside the 64-core TilePro64. TilePro36 is a scaled-down implementation of the 64-core product, designed to increase yields and provide a lower-power mid-range part. Tilera is continuing to expand its design, and products with more than 64 cores are planned, TG Daily was told.
Same process technology
Tile64 and TilePro are both manufactured on a 90 nm process. Tilera claims a performance increase of 1.5x to 2.5x for TilePro, driven primarily by a doubling of the L1 cache size per tile, a doubling of the L2 associativity, the addition of a new communication channel and the benefits of new instructions available through recompilation, all for a 5% increase in overall power consumption. Unit pricing will increase from $435 per chip in 10K unit quantities for Tile64 to around $900 per chip for TilePro64 in 200 unit quantities. Development boards and MDE 2.0 software cost $18,000.
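The performance-per-watt gain implied by those claims can be worked out directly. The snippet below is a minimal sketch that assumes the 1.5x to 2.5x speedup and the 5% power increase apply to the same workloads; the function name `perf_per_watt_gain` is illustrative and not taken from any Tilera tool.

```python
def perf_per_watt_gain(perf_gain: float, power_gain: float) -> float:
    """Ratio of new to old performance per watt."""
    return perf_gain / power_gain


# Assumption: "5% increase in overall power consumption" means a factor of 1.05.
POWER_GAIN = 1.05

low = perf_per_watt_gain(1.5, POWER_GAIN)   # bottom of the claimed speedup range
high = perf_per_watt_gain(2.5, POWER_GAIN)  # top of the claimed speedup range

print(f"{low:.2f}x to {high:.2f}x")  # prints "1.43x to 2.38x"
```

Under those assumptions, the claimed gains work out to roughly 1.4x to 2.4x better performance per watt.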
Tilera introduced several new instructions with TilePro, including some for multimedia, unaligned loads, memory and fence hints as well as offset load/store instructions. The company claims the new multimedia instructions double the throughput of audio codecs and echo cancellation processing. The new offset load/store instructions speed up video encoding by 50%, and unaligned loads are now 60% faster.
TilePro is both binary and socket compatible with Tile64. Existing customers can literally pop out their old chips, pop in the new ones and be up and running without any changes to software. Customers will see an immediate increase in performance due to the larger cache, according to the company. However, some features added to the new cores, such as the new instructions and the additional communications lane, require recompilation.
Tilera Comparison Chart

| Description | Tilera Tile64 | Tilera TilePro36 | Tilera TilePro64 |
|---|---|---|---|
| Available? | Yes | Yes | Yes |
| Introduced | Jul 17, 2007 | Sep 22, 2008 | Sep 22, 2008 |
| Cores | 64 | 36 | 64 |
| Core Clock | 500,700,866 MHz | 500 MHz | 700,866 MHz |
| DDR2 Clock | 667,800 MHz | 533 MHz | 800 MHz |
| DDR2 controllers | 4 | 3 | 4 |
| DDR2 efficiency | 55% | 70%+ | 70%+ |
| PCI-e controllers | 2 | 1 | 2 |
| 10 GbE + XAUI | 2 | 1 | 2 |
| Misc I/O | 10 Gbps | 10 Gbps | 10 Gbps |
| Flexible I/O | 20 Gbps | 20 Gbps | 20 Gbps |
| Max realtime I/O | 50 Gbps | 30 Gbps | 50 Gbps |
| Max intra-die I/O | 31 Tbps | 20.9 Tbps | 37.2 Tbps |
| Mesh traffic | 32 bits/clock | | |
| "Direct-to-tile" I/O? | No | Yes | Yes |
| Max watts | 22 | 16 | 23 |
| L1 Cache/core | 8KB Instruction | | |
| L2 Cache/core | 64KB | 64KB | 64KB |
| Cache line | 64 bytes | 64 bytes | 64 bytes |
| Possible L3 Cache | 4MB | 2.3MB | 4MB |
| Dedicated coherency network? | No | Yes | Yes |
| 16-bit flops | 221 Gflops | 144 Gflops | 221 Gflops |
| 32-bit flops | 166 Gflops | 54 Gflops | 166 Gflops |
| 16-bit Flops/watt | 10.05 | 7.13 | 9.61 |
| 32-bit Flops/watt | 7.55 | 3.38 | 7.22 |
| 16-bit Flops/core | 3.45 | 3.17 | 3.45 |
| 32-bit Flops/core | 2.59 | 1.5 | 2.59 |
| Dual endian support? | No | Yes | Yes |
| Memory striping? | No | Yes | Yes |
| Cache distributable to other tiles? | Yes | Yes | Yes |
| ISA | 64-bit VLIW | | |
| Socket | 1517 BGA | 1517 BGA | 1517 BGA |
| Package | 40mm x 40mm | 40mm x 40mm | 40mm x 40mm |
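The per-watt and per-core rows in the chart appear to be derived from the Gflops totals, the maximum wattage and the core count. The sketch below recomputes them on that assumption and lists any entries that differ from the printed figures by more than rounding. The dictionary layout and the `mismatches` helper are illustrative only; the numbers are transcribed from the chart as published.

```python
# Figures transcribed from the comparison chart (Gflops totals, max watts, cores).
CHART = {
    "Tile64": {
        "cores": 64, "watts": 22, "gflops": {16: 221, 32: 166},
        "per_watt": {16: 10.05, 32: 7.55}, "per_core": {16: 3.45, 32: 2.59},
    },
    "TilePro36": {
        "cores": 36, "watts": 16, "gflops": {16: 144, 32: 54},
        "per_watt": {16: 7.13, 32: 3.38}, "per_core": {16: 3.17, 32: 1.5},
    },
    "TilePro64": {
        "cores": 64, "watts": 23, "gflops": {16: 221, 32: 166},
        "per_watt": {16: 9.61, 32: 7.22}, "per_core": {16: 3.45, 32: 2.59},
    },
}


def mismatches(chart, tol=0.01):
    """Return (chip, bits, metric, recomputed, listed) where the chart disagrees with itself."""
    found = []
    for name, chip in chart.items():
        for bits, total in chip["gflops"].items():
            derived = {
                "per_watt": total / chip["watts"],
                "per_core": total / chip["cores"],
            }
            for metric, value in derived.items():
                listed = chip[metric][bits]
                if abs(value - listed) > tol:
                    found.append((name, bits, metric, round(value, 2), listed))
    return found


for row in mismatches(CHART):
    print(row)
```

With the figures as printed, every Tile64 and TilePro64 entry and the 32-bit TilePro36 entries reproduce to within rounding. Only the TilePro36 16-bit rows do not: the listed 7.13 per watt and 3.17 per core both correspond to a total of about 114 Gflops rather than the 144 Gflops shown, so one of those entries may be a transcription error in the chart.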
Tilera is still in startup mode, funded by venture capitalists. Its first commercial product was announced in August 2007, though the company has said samples were shipped as early as June 2007. That announcement took the company out of "stealth mode" even though volume products were not available until April 2008. Tilera now claims to have more than 45 customers, many of them drawn from high-speed technology fields that typically rely on custom ASICs and FPGAs.