RAM missing link
For decades now, commonly available RAM has come in three basic types:
- tiny SPI RAM
- Async SRAM needing many pins (upwards of 30)
- SDRAM / DDR for much more dense storage but with a big hike in interface complexity
It's certainly possible to interface DDR to an FPGA, but it pushes up the cost of the FPGA, which must support the necessary IO standards at speed, and it increases the board real estate needed.
Well now there's a brand new RAM in town using Spansion's hyperbus, called "hyperram", that fills a big gap between SRAM and DDR.
Hyperbus detailed description:
Today, since it's hot off the fab, only the 64Mbit (8MByte, or more accurately 4Mx16) and 128Mbit variants are in production. However, 256Mbit is coming soon in the same footprint.
The footprints are defined in the licensed standard, so chips from different vendors are interchangeable. And because the address is passed in sequentially over the data bus, the footprint is unaffected by changes in storage array size. Unlike async SRAM, there are no extra pins for larger addresses.
This new 12-signal bus has been widely licensed and many manufacturers are coming out with PSRAM ("hyperram") and Flash chips (yes... "hyperflash" amazingly enough) using it.
The trend in digital busses for a while now has been to serialize them with some kind of LVDS: USB, PCI express, and the next gen SD cards all use this technique for example. (As did the now defunct Ara).
However, hyperbus rejects this, and blends DDR (the data rate is twice the clock) with a parallel, 8-bit interface, without differential signalling (except for the clock). In fact the signalling on the bus itself is old-style NRZ signalling at the ancient 3.3V and 1.8V standards; there is separate silicon for each voltage standard.
|Max clock rate
The bus is much simplified compared to LVDS or DDR DRAM thanks to the relatively low max clock rate. There is no training / retraining, and the need to match byte-lane routing lengths is correspondingly relaxed; in fact, if you don't intend to clock it near the max rate, it's very relaxed. The data bus drive strength is also configurable via internal registers, for EMI and signal quality control.
The fact it uses traditional IO standards, if you accept DDR clocking in that category (ICE40 IO cells support it natively), also means very low-cost FPGAs can talk to it, even CPLDs if the clock rate is low. That's a brand new capability introduced with hyperram, low-cost FPGA mated with low cost high density memory.
Underneath the shiny new bus, hyperram is an old memory technology known as "Pseudostatic" RAM (PSRAM). It's actually DRAM, but with an interface that hides the periodic refresh activity that the DRAM array requires internally.
That gets you into DDR DRAM level of density and cost, without the host controller having to take any care about refresh details.
However, the refresh still goes on internally and blocks external access for the duration: the interface adds dynamic wait states when the RAM array needs some "me time". So the host controller doesn't entirely escape having to deal with it adaptively. Hyperram does, though, let you set a config register so that the refresh wait state is always inserted, whether the array is refreshing or not, trading latency for determinism and a simpler host controller. If you take that route and don't care about the wasted extra clocks, you really have escaped all sign that it's DRAM under the covers, and the latencies are completely deterministic, like SRAM.
The underlying DRAM reality and its row / column architecture still need consideration with hyperram. It's reflected in the address map used by the chip, and the row and column sizes for a chip can be read out from its configuration registers to help with that.
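The trade-off between the two latency modes can be put into numbers with a toy model. This is only a sketch: `latency_clocks` and `refresh_clocks` are hypothetical parameters (the real values come from the chip's datasheet and config registers), and the model simply adds the 3-clock command packet, the wait states, and one clock per 16-bit word.

```python
CA_CLOCKS = 3  # the 48-bit command/address packet: 6 bytes at DDR = 3 clocks


def transaction_clocks(n_words, latency_clocks, refresh_clocks,
                       fixed_latency, refresh_pending=False):
    """Rough clock count for one transaction (simplified model).

    latency_clocks and refresh_clocks are illustrative parameters, not
    datasheet values. One 16-bit word is transferred per clock.
    """
    if fixed_latency:
        # Refresh wait state is always inserted: deterministic, but slower.
        wait = latency_clocks + refresh_clocks
    else:
        # Wait state only appears when the array really is refreshing.
        wait = latency_clocks + (refresh_clocks if refresh_pending else 0)
    return CA_CLOCKS + wait + n_words


# Placeholder numbers only: 8-word burst, 6 clocks latency, 6 clocks refresh.
best = transaction_clocks(8, 6, 6, fixed_latency=False)
worst = transaction_clocks(8, 6, 6, fixed_latency=False, refresh_pending=True)
fixed = transaction_clocks(8, 6, 6, fixed_latency=True)
print(best, worst, fixed)
```

With fixed latency every access costs the worst case, which is exactly the price paid for never having to watch the wait-state signal.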
|Signal |Function
|CK / CK# |Bus clock (both edges significant)
|CS# |Active low chip select
|RWDS |1) Wait state signal from RAM to hold off transaction, 2) Bytewise write enable
|DQ[7:0] |Data bus, also used for issuing address and cycle type information
It's important to notice that although the physical external data bus is 8 bits wide...
- with DDR there are naturally two transfers on that 8-bit bus per clock, i.e., 16 bits
- the addressable unit of the device is a 16-bit word, i.e., address 0 points to one 16-bit word and address 1 points to the next 16-bit word
Therefore the natural unit for addressability and for transfer is 16 bits.
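To make the word-versus-byte distinction concrete, here is a small sketch. The helper names are made up for illustration; the only facts used are that one 16-bit word moves per clock (two byte-wide transfers, one per clock edge) and that device addresses count words, not bytes.

```python
BYTES_PER_WORD = 2   # the addressable unit is a 16-bit word
EDGES_PER_CLOCK = 2  # DDR: one byte on each clock edge


def byte_to_word_address(byte_addr):
    """Convert a host byte address to the word address the chip expects."""
    if byte_addr % BYTES_PER_WORD:
        raise ValueError("byte address is not aligned to a 16-bit word")
    return byte_addr // BYTES_PER_WORD


def peak_bytes_per_second(clock_hz):
    """Peak transfer rate of the 8-bit DDR bus, ignoring all overheads."""
    return clock_hz * EDGES_PER_CLOCK  # one byte per edge


# A 4Mx16 part holds 4M words, i.e. 8 MByte.
capacity_bytes = 4 * 1024 * 1024 * BYTES_PER_WORD
print(capacity_bytes // (1024 * 1024), "MByte")
print(byte_to_word_address(6))             # byte 6 lives in word 3
print(peak_bytes_per_second(100_000_000))  # 100 MHz clock
```

The peak figure ignores the command packet and wait states, so sustained throughput on short bursts will be noticeably lower.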
Each transaction begins with the master writing 48 bits (it's DDR, so those 6 bytes transfer in 3 clocks) that define the transaction type and address information.
One of the nice things about hyperbus is that even after the largest device on the initial roadmaps comes out (32MByte), there are still 19 spare bits in this packet, meaning it can address 128TB per chip (!) without needing extra pins or a protocol change. So over the next few years, we can expect to see GB hyperram chips in compatible packages and interfaces.
The addressing scheme is a bit convoluted... the underlying row and column addresses are separated in the map and some top bits are reserved for the type of transaction.
|Bit(s) |Meaning
|47 |0 = Write, 1 = Read
|46 |0 = memory, 1 = configuration registers
|45 |0 = wrapped burst, 1 = linear
|44 - 37 |Reserved
|34 - 22 |Row A21 - A9
|21 - 16 |Upper Col A8 - A3
|15 - 3 |Reserved
|2 - 0 |Lower Col A2 - A0
Again, notice that the address in this packet refers to 16-bit words, not bytes.
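As a sketch of how a host might assemble this packet for the 22-bit (A21 - A0) address map above: the function name is hypothetical, the three type flags are taken to sit in the top three bits (47, 46, 45), and sending the 48 bits most significant byte first is an assumption to check against the hyperbus specification.

```python
def pack_command_address(word_addr, read, register_space=False, linear=True):
    """Build the 6-byte command/address packet for a 22-bit word address.

    Bit positions follow the address map in the text; the big-endian
    byte order on the wire is an assumption, not taken from the article.
    """
    if not 0 <= word_addr < (1 << 22):
        raise ValueError("word address must fit in A21..A0")

    row = (word_addr >> 9) & 0x1FFF       # A21..A9 -> packet bits 34..22
    upper_col = (word_addr >> 3) & 0x3F   # A8..A3  -> packet bits 21..16
    lower_col = word_addr & 0x7           # A2..A0  -> packet bits 2..0

    ca = (int(read) << 47) | (int(register_space) << 46) | (int(linear) << 45)
    ca |= row << 22
    ca |= upper_col << 16
    ca |= lower_col                       # bits 15..3 are left as zero
    return ca.to_bytes(6, "big")


# Linear read from word 0, then a wrapped write to word 9.
print(pack_command_address(0, read=True).hex())
print(pack_command_address(9, read=False, linear=False).hex())
```

Splitting the column bits around the unused middle field is what makes the map look convoluted; keeping the row, upper column and lower column as separate shifts mirrors the table directly.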
What came from where
Hyperbus blends a lot of existing technologies to make something new.
|Original RAM using it
|3.3V / 1.8V simple IO standard
|DDR "both edge" clocking
|Config registers on die
|ASYNC SRAM / DDR DRAM
|High speed differential clock
|Bytewise write masking
|Provide address via data bus
What we learned this time
- Hyperram gives a new way to marry cheap FPGAs with dense memory
- Chips from different vendors should be interchangeable due to the specified footprints
- 64Mbit is available now, 128Mbit and 256Mbit in the same footprint with the same interface soon. And there is plenty of room for expansion.
- It's really pseudostatic RAM (i.e., DRAM) with hidden refresh, which can be completely hidden if you are OK burning some cycles
- 1.8V and 3.3V silicon, with 1.8V having differential clocks and 333MB/sec performance
- 12-pin interface with 8-bit bus, DDR clocking but otherwise supports simple old IO standards
- Addressable unit is a 16-bit word (1 clock / 2 edges to transfer on the 8-bit physical bus)
- 48-bit packet defines the bus transaction and attributes, followed by the transaction data until CS# is deasserted
- Much simpler to interface to than DDR DRAM