Open source FPGA toolchain for AMD/Xilinx 7-Series chips. Supports Kintex-7 (including the 325T/420T/480T), Artix-7, Spartan-7, and Zynq-7000.
https://f4pga.org & https://news.ycombinator.com/item?id=32861075 (umbrella project): an open source toolchain for developing FPGAs from multiple vendors. It currently targets Xilinx 7-Series, Lattice iCE40, Lattice ECP5, and QuickLogic EOS S3 FPGAs, and is gradually being expanded into a comprehensive end-to-end FPGA synthesis flow.
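For concreteness, here is the sort of input these flows consume; a minimal sketch, with the iCE40 commands shown because they are the most settled (the openXC7 7-Series flow is analogous), and the device, clock rate, and pin constraints invented for the example:

    // blink.v - minimal design for the open-source flow.
    module blink (
        input  wire clk,   // e.g. a 12 MHz board oscillator (assumed)
        output wire led
    );
        reg [23:0] counter = 0;
        always @(posedge clk)
            counter <= counter + 1;
        assign led = counter[23];  // ~0.7 Hz blink at 12 MHz
    endmodule

    // Typical invocation (iCE40 shown; openXC7 is analogous for 7-Series):
    //   yosys -p "synth_ice40 -top blink -json blink.json" blink.v
    //   nextpnr-ice40 --hx8k --json blink.json --pcf blink.pcf --asc blink.asc
    //   icepack blink.asc blink.bin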
https://www.bunniestudios.com/blog/2017/litex-vs-vivado-firs...
> There’s already IP cores for DRAM, PCI express, ethernet, video, a softcore CPU (your choice of or1k or lm32) and more.. LiteX produces a design that uses about 20% of an XC7A50 FPGA with a runtime of about 10 minutes, whereas Vivado produces a design that consumes 85% of the same FPGA with a runtime of about 30-45 minutes.
https://news.ycombinator.com/item?id=39836745#39922534
> you can.. get [FPGA] parts for significant discounts in 1-off quantities through legit Chinese distributors like LCSC. For example, a XC7A35T-2FGG484I is $90 on Digikey and $20 at LCSC. I think a personalized deal for that part would be cheaper than $20 though...
But at the same time, those cores are big and powerful, and optimize horribly, because the customers who actually use them need all of those features. Those customers aren't really concerned with area, but rather with meeting performance requirements. Using the Xilinx-provided QDMA core, I've been able to achieve line-rate performance on PCIe 4.0 x16 for large DMA transactions with a setup time of about 3 total days of work. I'd like to see an open source solution that could even do that, even just ACKing raw TLPs, because I haven't found one yet.
As for pricing, AMD/Xilinx and Altera don't want you as a customer. They want to sign $10M+/yr accounts or accounts which push the envelope of what's possible in terms of frequency (HFT). And they price their products accordingly for the public. If you actually end up as a direct customer, the prices are significantly cheaper to the point where those cheaper Chinese vendors don't make sense to use.
> There's been some interesting recent work to get the QMTech Kintex7-325 board (among others) supported under yosys/nextpnr.. It works well enough now to build a RISC-V SoC capable of running Linux
https://riscv.or.jp/wp-content/uploads/RV-Days_Tokyo_2024_Su...
A 32-bit MMU/no-MMU Linux-capable RISC-V softcore with rich peripherals, implemented in pure Verilog and supported by openXC7, the FOSS FPGA toolchain.. These are still current devices: the 7-Series lifetime has been extended to 2035.
https://github.com/regymm/quasiSoC
2023, "FPGA Dev Boards for $150 or Less", 80 comments, https://news.ycombinator.com/item?id=38161215
2021, "FPGA dev board that's cheap, simple and supported by OSS toolchain", 70 comments, https://news.ycombinator.com/item?id=25720531
Not an FPGA, but if you already have a recent Ryzen device, the AMD NPU might be worth a look, given its Xilinx lineage and the current AI/LLM market frenzy, https://news.ycombinator.com/item?id=43671940
> The Versal AI Engine is the NPU. And the Ryzen CPUs' NPU is almost exactly a Versal AI Engine IP block, to the point that in the Linux kernel they share the same driver (amdxdna), and the reference material the kernel docs link to for the Ryzen NPUs is the Versal SoC's AI Engine architecture reference manual.
At one point, cheap ex-miner FPGAs were on eBay, https://hackaday.com/2020/12/10/a-xilinx-zynq-linux-fpga-boa.... The Zynq (Arm + Xilinx FPGA) dev board is around $200, https://www.avnet.com/americas/products/avnet-boards/avnet-b.... There was an M.2 Xilinx FPGA (PicoEVB) that conveniently fit into a laptop for portable development, but it's not sold anymore. PCIe FPGAs are used for DMA security testing, some of those boards are available, https://github.com/ufrisk/pcileech-fpga
FPGAs can be developed using CAE-like schematic capture systems, or with SystemVerilog, VHDL, or something modern like Veryl. Real FPGAs include hard acceleration blocks like DRAM controllers, SRAM, CAM, shifters, ALU elements, and/or ARM cores.
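For flavor, here is the kind of HDL those tools consume: a minimal single-port RAM, written in the idiom most synthesis tools (Yosys included) recognize and map onto a hard block RAM instead of building it from LUTs. The sizes are arbitrary.

    // Single-port synchronous RAM; the registered read is what lets
    // synthesis infer a hard block RAM rather than LUT-based storage.
    module spram #(
        parameter AW = 10,   // 1K words (arbitrary)
        parameter DW = 16
    ) (
        input  wire          clk,
        input  wire          we,
        input  wire [AW-1:0] addr,
        input  wire [DW-1:0] wdata,
        output reg  [DW-1:0] rdata
    );
        reg [DW-1:0] mem [0:(1<<AW)-1];
        always @(posedge clk) begin
            if (we)
                mem[addr] <= wdata;
            rdata <= mem[addr];  // synchronous read
        end
    endmodule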
At the end of the day though, the best teacher is to learn by doing and finding venues to ask questions.
Given how ruinously expensive silicon products are to bring to market, it's amazing that there are multiple companies competing (albeit in distinct segments).
FPGAs also seem like a largely untapped domain in general purpose computing, a bit like GPUs used to be. The ability to reprogram an FPGA to implement a new digital circuit in milliseconds would be a game changer for many workloads, except that current CPUs and GPUs are already very capable.
I would love to see the open source world come to the rescue here. There are some very nice open source tools for Lattice FPGAs and Lattice's lawyers have essentially agreed to let the open source tools continue unimpeded (they're undoubtedly driving sales), but the chips themselves can't compete with the likes of Xilinx.
Both languages (Verilog and VHDL) suck, for different reasons, but no one has figured out how to make a better language and output a netlist from it (yes, there is an open interchange standard that almost every proprietary tool supports).
As much as I love FPGAs, GPUs really ate their lunch in the acceleration sphere (trying to leverage the FPGA's parallelism to overcome a >20x clock speed disadvantage is REALLY hard, especially if power is a concern) and so it seems the only niche left for them is circuit emulation. Of course, circuit emulation is a sizable market (low volume designs which don't make sense as ASICs, verification, research, etc.) and so it's not exactly a death sentence.
And all this is due to the actually very good open source toolchain: synthesis (Yosys), P&R (nextpnr, Trellis, etc.), simulation (Verilator, Icarus), waveform viewing (Surfer), and many more. Lattice, being friendlier than other vendors, has seen an uptick in sales because of this. They make money on the devices, not their tools.
And even if you move to ASICs, open source tools are being used more and more, especially for simulation and front-end design. As an ASIC and FPGA designer of 25-odd years, I spend most of my time in open source tools.
https://github.com/tillitis/tillitis-key1 https://github.com/tillitis/tillitis-key1/pkgs/container/tke...
This is because parts of the synthesis and place-and-route process are effectively NP-hard. Rather than searching for the best possible solution, the compiler uses heuristics and randomness to find a valid solution that meets the timing constraints.
I believe you can control the synthesis seed to make runs repeatable, but the stochastic nature of the process means that any change to the input can substantially change the output.
Also, there is a large gamut of programmable logic, and pretty much always has been, for decades.. some useful parts cost not much more than a mid-range microcontroller. The top end is for DoD work, system emulation, and novel frontier/capture regimes (like "AI" and autonomous vehicles).. few people ever work on those compared to the cheaper parts.
Eventually Nokia ended up buying Alcatel-Lucent, and not too long after, he left; not sure what their current strategy is.
All the modulation, demodulation, framing, scrambling, forward error correction coding/decoding, etc. has to happen continuously, at the same time.
There are some open source software defined radios that can do that for one or two stations on a CPU at low data rates, but it's basically impossible with current CPUs for anything like the number of stations (phones) that are handled in one FPGA with decent data rates, latency etc.
You'd probably need a server rack's worth of servers and hundreds of times the power consumption to do what's happening in the one chip.
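To make "continuously, at the same time" concrete, here is a toy sketch of one such always-on stage: an additive scrambler that consumes and produces one bit every clock, forever. A real baseband chains many stages like this (filters, FEC, framing) all running in parallel; the polynomial (x^7 + x^4 + 1) is just a common example, not any particular standard.

    // Toy additive scrambler: one bit in, one bit out, every cycle.
    module scrambler (
        input  wire clk,
        input  wire rst,
        input  wire bit_in,
        output wire bit_out
    );
        reg [6:0] lfsr;
        wire fb = lfsr[6] ^ lfsr[3];  // taps for x^7 + x^4 + 1
        always @(posedge clk)
            lfsr <= rst ? 7'h7F : {lfsr[5:0], fb};
        assign bit_out = bit_in ^ fb;
    endmodule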
For some workloads, an FPGA is orders of magnitude faster than a CPU.
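One toy illustration of where that can come from: spatial parallelism. The sketch below (names invented) checks one input word against 64 candidate patterns simultaneously, every cycle; a CPU would loop, or burn SIMD lanes, over the same comparisons.

    // 64 parallel 64-bit comparators, one result per clock.
    module multi_match (
        input  wire             clk,
        input  wire [63:0]      word,
        input  wire [64*64-1:0] patterns,  // 64 patterns, flattened
        output wire [63:0]      hit        // hit[i] = (word == pattern i)
    );
        genvar i;
        generate
            for (i = 0; i < 64; i = i + 1) begin : lane
                reg match;
                always @(posedge clk)
                    match <= (word == patterns[64*i +: 64]);
                assign hit[i] = match;
            end
        endgenerate
    endmodule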
That's part of the FPGA business model: they have an automated way to take an FPGA design and turn it into a validated semi-custom ASIC, at low NRE, at silicon nodes (10nm?) you wouldn't otherwise have access to.
And all of that at much lower risk. This is a strong rational appeal, but also an emotional one. And people are highly influenced by that.
For what it's worth, Xilinx EasyPath was never actually an ASIC. The parts delivered were still FPGAs; they were just FPGAs with a reduced testing program focused on the functionality used by the customer's design.
Anyone who claims to turn a modern FPGA design into an ASIC "automatically" is selling snake oil.
No, not always - I use no vendor IP whatsoever for extremely large designs.
For ASICs it is basically required to use fab IP (for physical production/electrical/verification reasons), but that's absolutely not the case for FPGAs.
https://www.intel.com/content/www/us/en/products/details/eas...
Someone has to design each of those reconfigurable digital circuits and take them through an implementation flow.
Only certain problems map well to easy FPGA implementation: anything involving memory access is quite tedious.
I would also question the premise that memory access is less tedious or easy on MCUs/CPUs, especially if you need deterministic performance and response times. Most CPUs have memory hierarchies.
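As an illustration of the tedium: on a CPU, "x = mem[i]" is one line; in RTL you manage the request/response timing yourself. A minimal sketch, assuming a simple valid/ready read port with unknown latency (the interface is invented for illustration; a real AXI master is considerably busier):

    // What a single memory read costs you in RTL.
    module reader (
        input  wire        clk, rst,
        input  wire        start,
        input  wire [31:0] addr,
        output reg         req_valid,   // read request channel
        input  wire        req_ready,
        output reg  [31:0] req_addr,
        input  wire        rsp_valid,   // response, some cycles later
        input  wire [31:0] rsp_data,
        output reg  [31:0] result,
        output reg         done
    );
        localparam IDLE = 0, REQ = 1, WAIT = 2;
        reg [1:0] state;
        always @(posedge clk) begin
            if (rst) begin
                state <= IDLE; req_valid <= 0; done <= 0;
            end else case (state)
                IDLE: if (start) begin
                    req_addr  <= addr;
                    req_valid <= 1;
                    done      <= 0;
                    state     <= REQ;
                end
                REQ: if (req_ready) begin  // hold valid until accepted
                    req_valid <= 0;
                    state     <= WAIT;
                end
                WAIT: if (rsp_valid) begin // stall until data returns
                    result <= rsp_data;
                    done   <= 1;
                    state  <= IDLE;
                end
            endcase
        end
    endmodule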
The more practical attempts at dynamic, partial reconfiguration involve swapping out accelerators for specific functions: encoders and decoders for different wireless standards, or different curves in crypto, for example. And yes, somebody has to implement those.
HLS is not good, so I don't know what you are referring to as "modern." I am primarily experienced with large UltraScale+ and Versal chips - nothing has changed in 15 years here.
> basically the same as for an ASIC
What does this even mean, specifically? Use RTL examples. ASIC memory access isn't "easy," either (though it is basically the "same.")
> partial reconfiguration involves swapping out accelerators for specific functions
Tell me you've never used PR without telling me. Current vendor implementations of this are terrible (with Xilinx leading the pack.)
If you don't need programmability, then all that flexibility represents pure waste. But then again, we can make the same argument with ASIC vs CPUs and GPUs. The ASIC always wins, because CPUs and GPUs come with unnecessary flexibility.
The real problem with FPGAs isn't even that they get beaten by ASICs, because you can always come up with a low volume market for them, especially as modern process nodes get more and more expensive to the point where bleeding edge FPGAs are becoming more and more viable. You can now have FPGAs on 7nm with better performance than ASICs with older but more affordable process nodes that fit in your budget.
The real problem is that the vast majority of FPGA manufacturers don't even play the same game as GPUs and CPUs. You can have fast single and double precision floats on a CPU and really really fast single precision floats on GPUs, but on FPGAs? Those are reserved for the elite Versal series (or Intel's equivalent). Every other FPGA manufacturer? Fixed point arithmetic plus bfloat16 if you are lucky.
Now let me tell you: for AI this doesn't really matter. The FPGAs that do AI focus primarily on supporting a truckload of simultaneous camera inputs. There is no real competition here. No CPU or GPU will let you connect as many cameras as an FPGA, unless it's an SoC specifically built for VR headsets.
Meanwhile for everything else, not having single precision floats is a curse. Porting an algorithm from floating point to fixed point arithmetic is non-trivial and requires extensive engineering effort. You not only need to know how to work with hardware, but also need to understand the algorithm in its entirety and all the numerical consequences that entails. You go from dropping someone's algorithm into your code and having it work from the get go, to needing to understand every single line and having it break anyway.
These problems aren't impossible to fix, but they are guaranteed to go away the very instant you get your hands on floating point arithmetic. This leads to a paradox. FPGAs are extremely flexible, but simultaneously extremely constricting. The appeal is lost.
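To make "understanding every single line" concrete: in fixed point, even one multiply forces explicit choices about width, scaling, rounding, and overflow that floating point quietly makes for you. A minimal sketch, assuming Q1.15 operands:

    // Q1.15 * Q1.15 -> Q1.15 with rounding and saturation. With floats
    // this whole module is just "a * b".
    module qmul (
        input  wire signed [15:0] a, b,  // Q1.15 operands
        output reg  signed [15:0] y     // Q1.15 result
    );
        reg signed [31:0] p;
        always @* begin
            p = a * b;                   // full Q2.30 product
            p = p + 32'sh0000_4000;      // round to nearest (0.5 ulp)
            if (p[31:30] == 2'b01)       // overflow toward +1.0
                y = 16'sh7FFF;           //   (the -1.0 * -1.0 case)
            else if (p[31:30] == 2'b10)  // overflow toward -1.0
                y = 16'sh8000;           //   (defensive; can't occur here)
            else
                y = p[30:15];            // drop 15 fractional bits
        end
    endmodule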
Another aspect where FPGAs are interesting alternatives is security. Open up a fairly competent HSM and you will find FPGAs. FPGAs, especially ones that can be locked to a bitstream (for example anti-fuse or Flash-based FPGAs from Microchip), are used in high-security systems. The machines can be built in a less secure setting, and the injection and provisioning of a machine can be done in a high-security setting.
Dynamically reconfigurable systems were a very interesting idea. Support for partial reconfiguration, which allowed you to change the accelerator cores connected to a CPU platform, seemed to bring a lot of promise. Xilinx was an early provider with the XC6200 family, IIRC through a company they bought. AMD also provided devices with support for partial reconfiguration. There were also some research devices and startups in this area in the early 2000s. I planned to do a PhD around this topic, but tool and language support, and the added cost in the devices, seemed to have killed this. At least for now.
Today, in for example mobile phone systems, FPGAs provide the compute power CPUs can't, with the added ability to add new features as the standards evolve and regional market requirements affect the HW. But this is more like FW upgrades.
6-digit prices at the high end.
https://www.digikey.com/en/products/detail/amd/XCVU29P-3FSGA...
Companies make products based around FPGAs and can sell the whole thing for less than you could buy just the single FPGA part for on a place like Digi-key. It's just part of the FPGA companies' business models. In volume the price will be far smaller.
The $140,000 device doesn’t become a $400 device in any volume; it might become a $90,000 device.
VU13Ps are quoted $300/ea at tray quantities from Xilinx, yet are $89k on DigiKey with no price breaks.
Only 47 milliseconds from power-on to operational.
Lattice Avant™-G FPGA: Boot Up Time Demo (12.12.2023)
Lattice make some really cool devices. Not the fastest fmax speeds, but hell if the time to config and tiny power draw don't half make up for it.
An absolute eternity by modern computer standards. A GPU will be a trillion operations ahead of you before you even start. Or, for another view, that's nearly seven whole frames at 144 Hz (0.047 s x 144 Hz ≈ 6.8).
People say FPGAs will be great for many workloads, but then don't give examples. In my experience the only real ones are those requiring low-latency hardware comms. ADC->FPGA->DAC is a powerful combo. Everything else gets run over by either CPU doing integer work or GPU doing FP.
With the Jetsons (AGX Orin) I have on my desk, it would take a bit of tinkering to even get it under a minute.
ASICs require a certain scale and a very high up-front cost.
Still practically theory, as I have never seen anything come of it. It is going up against ASIC design, which is a great middle ground for those things, even if it means you are not free to do it yourself.
I think it's not so much that it's too expensive; rather, once you've got the resources, it will always be better to switch to an ASIC.
Not a hardware engineer, but it seems obvious to me that any circuitry implemented using an FPGA will be physically bigger with more "wiring" (more resistance, more energy, more heat) than the equivalent ASIC, and accordingly the tolerances will need to be larger so clock speeds will be lower.
Basically, at scale an ASIC will always win out over an FPGA, unless your application is basically "give the user an FPGA" (but this is begging the question—unless your users are hardware engineers this can't be a goal).
Profit is dependent on scale. FPGAs are useful if the scale is so small that an ASIC production line is more expensive than buying a couple of FPGAs.
If the scale is large enough that ASIC production is cheaper, you reap the performance improvements.
Think of it this way: FPGAs are programmed using ASIC circuitry. If you programmed an FPGA using an FPGA (using ASIC circuitry), do you think you'll achieve the same performance as the underlying FPGA? Of course not (assuming you're not cheating with some "identity" compilation). Same thing applies with any other ASIC.
Each layer of FPGA abstraction incurs a cost: more silicon/circuitry/resistance/heat/energy and lower clock speeds.
I'll admit I'm not familiar with the processing requirements of basestations, but the prospect of mass-produced FPGA baseband hardware still seems dubious to me, and I can't find conclusive evidence of it being used, only suggestions that it might be useful (going back at least 20 years). Feel free to share more info.
[0] ASIC vs FPGA comparison of a RISC-V processor, showing an 18x slowdown (a ~94.4% reduction), apparently consistent with the "general design performance gap": https://iugrc.journals.ekb.eg/article_302717_7bac60ca6ef9fb9...
And it did a good job. The code it made probably works fine and will run on most Xilinx FPGAs.
Solve your silicon verification workflow with this one weird trick: "looks good to me"!
Placement and routing is an NP-Complete problem.
For me, the promise of computation is not in the current CPU (or now GPU) world, but one in which the hardware can dynamically run the most optimized logic for a given operation.
Sadly, this has yet to materialize (with some exceptions[0][1][2]).
Hopefully in an LLM-first world, we can start to utilize FPGAs more effectively at (small) scale :)
[0] https://www.microsoft.com/en-us/research/wp-content/uploads/...
[2]https://www.alibabacloud.com/blog/deep-dive-into-alibaba-clo...
The issue with FPGAs is that if you have enough scale, an ASIC starts to make more sense.
Still, I believe they provide a glimpse into the hardware of the future!
GPU/TPU etc are just domain-specific hardware, not much help for exploring new paradigms.
To really solve this, we'll likely need someone who's won the internet lottery and is willing to invest serious capital ($1 billion plus) to solve real problems and get real work done. Until then, it's going to be tech bros playing with VR and LLMs.
I'll see you all in 10 years when the situation still hasn't changed.
> OpenFPGA.. aims to automate the design, verification and layout of highly versatile FPGA architectures. OpenFPGA offers a high-level architecture description language for users to customize their FPGA architectures down to circuit-level details. Based on the architecture modeling, OpenFPGA can auto-generate Verilog netlists, with which users can perform verification as well as generate production-ready layouts using modern EDA tools. OpenFPGA includes a generic Verilog-to-Bitstream generator, as a native EDA toolchain for any FPGAs that are prototyped by OpenFPGA.
2020 DARPA ERI video on open-source accelerated chip design, https://www.youtube.com/watch?v=xKxv7Bdm7Do
2022-2024 presentations on open-source computer architecture, https://oscar-workshop.github.io/Archive.html
> workshop on open-source hardware which addresses the wide variety of challenges encountered by both hardware and software engineers in dealing with the increasing heterogeneity of next-generation computer architectures. By providing a venue which brings together researchers from academia, industry and government labs, OSCAR promotes a collaborative approach to foster the efforts of the open-source hardware community
https://www.righto.com/2020/09/reverse-engineering-first-fpg...
It's really too bad that it was locked to NI products, and they've kind of faded away.
I sometimes like to think of what could have been, if the ability to program FPGAs so easily would have become more popular.
If you want to do general purpose computing, it's my strong (and minority) opinion that routing fabrics are a premature optimization. The trend has been in the wrong direction.
If you were to go the other way, and just build a systolic array of look up tables, as I have hypothesized for years with my BitGrid, you could save 90% of the silicon, and still get almost all of the compute. It gets better when you consider the active logic would only be between neighboring cells, thus capacitive power would be much lower, and speeds could be higher.
[1] https://downloads.reactivemicro.com/Electronics/FPGA/Xilinx%...
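As a rough illustration of what one such cell might look like (a toy sketch; the structure and names are my own illustration, not a definitive BitGrid implementation): each cell is nothing but four latched 4-input LUTs, one per neighbor, with no global routing fabric at all.

    // One cell of a systolic LUT grid: 4 bits in from the neighbors,
    // 4 independently programmed LUT outputs back to them, registered
    // so timing is purely local (neighbor to neighbor).
    module grid_cell (
        input  wire        clk,
        input  wire        n_in, s_in, e_in, w_in,     // from 4 neighbors
        input  wire [63:0] cfg,                        // 4 x 16-bit LUT masks
        output reg         n_out, s_out, e_out, w_out  // to 4 neighbors
    );
        wire [3:0] idx = {n_in, s_in, e_in, w_in};
        always @(posedge clk) begin
            n_out <= cfg[{2'd0, idx}];
            s_out <= cfg[{2'd1, idx}];
            e_out <= cfg[{2'd2, idx}];
            w_out <= cfg[{2'd3, idx}];
        end
    endmodule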
Do you have a link where I could read more about "systolic array of lookup tables" and "BitGrid"?
I've no idea what those two things are but it sure sounds interesting.
Mostly, I've written about it here, multiple times [1]
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
And pour one out for Altera, which was actually founded a year earlier but took 5 years to arrive at something similar. It was eventually acqui-crushed by Intel.