LiteX vs. Vivado: First Impressions

October 30th, 2017

Previously, I had written about developing a reference design for the NeTV2 FPGA using Xilinx’s Vivado toolchain. Last year at 33C3 Tim ‘mithro’ Ansell introduced me to LiteX and at his prompting I decided to give it a chance.

Vivado was empowering because instead of having to code up a complex SoC in Verilog, I could use their pseudo-GUI/TCL interface to create a block diagram that largely automated the task of building the AXI routing fabric. Furthermore, I could access Xilinx’s extensive IP library, which included a very flexible DDR memory controller and a well-vetted PCI-express controller. Because of this level of design automation and available IP, a task that would have taken perhaps months in Verilog alone could be completed in a few days with the help of Vivado.

The downsides of Vivado are that it’s not open source (free to download, but not free to modify), and that it’s not terribly efficient or speedy. Aside from the ideological objections to the closed-source nature of Vivado, there are some real, pragmatic impacts from the lack of source access. At a high level, Xilinx makes money selling FPGAs – silicon chips. However, to attract design wins they must provide design tools and an IP ecosystem. The development of this software is directly subsidized by the sale of chips.

This creates an interesting conflict of interest when it comes to the efficiency of the tools – that is, how good they are at optimizing designs to consume the least amount of silicon possible. Spending money to create area-efficient tools reduces revenue, as it would encourage customers to buy cheaper silicon.

As a result, the Vivado tool is pretty bad at optimizing designs for area. For example, the PCI express core – while extremely configurable and well-vetted – has no way to turn off the AXI slave bridge, even if you’re not using the interface. Even with the inputs unconnected or tied to ground, the logic optimizer won’t remove the unused gates. Unfortunately, this piece of dead logic consumes around 20% of my target FPGA’s capacity. I could only reclaim that space by hand-editing the machine-generated VHDL to comment out the slave bridge. It’s a simple enough thing to do, and it had no negative effects on the core’s functionality. But Xilinx has no incentive to add a GUI switch to disable the logic, because the extra gates encourage you to “upgrade” by one FPGA size if your design uses a PCI express core. Similarly, the DDR3 memory core devotes 70% of its substantial footprint to a “calibration” block. Calibration typically runs just once at boot, so the logic is idle during normal operation. With an FPGA, the smart thing to do would be to run the calibration, store the values, and then jam the pre-measured values into the application design, thus eliminating the overhead of the calibration block. However, I couldn’t implement this optimization since the DDR3 block is provided as an opaque netlist. Finally, the AXI fabric automation – while magical – scales poorly with the number of ports. In my most recent benchmark design done with Vivado, 50% of the chip is devoted to the routing fabric, 25% to the DDR3 block, and the remainder to my actual application logic.

Tim mentioned that he thought the same design when using LiteX would fit in a much smaller FPGA. He has been using LiteX to generate the FPGA “gateware” (bitstreams) to support his HDMI2USB video processing pipelines on various platforms, ranging from the Numato-Opsis to the Atlys, and he even started a port for the NeTV2. Intrigued, I decided to port one of my Vivado designs to LiteX so that I could do an apples-to-apples comparison of the two design flows.

LiteX is a soft-fork of Migen/MiSoC – a Python-based framework for managing hardware IP and auto-generating HDL. The IP blocks within LiteX are completely open source, and so can be targeted across multiple FPGA architectures. However, for low-level synthesis, place & route, and bitstream generation, it still relies upon proprietary chip-specific vendor tools, such as Vivado when targeting Artix FPGAs. It’s a little bit like an open source C compiler that spits out assembly, so it still requires vendor-specific assemblers, linkers, and binutils. While it may seem backward to open the compiler before the assembler, remember that for software, an assembler’s scope of work is simple – primarily within well-defined 32-bit or so opcodes. However, for FPGAs, the “assembler” (place and route tool) has the job of figuring out where to place single-bit primitives within an “opcode” that’s effectively several million bits long, with potential cross-dependencies between every bit. The abstraction layers, while parallel, aren’t directly comparable.

Let me preface my experience with the statement that I have a love-hate relationship with Python. I’ve used Python a few times for “recreational” projects and small tools, and for driving bits of automation frameworks. But I’ve found Python to be terribly frustrating. If you can use their frameworks from the ground-up, it’s intuitive, fun, even empowering. But if your application isn’t naturally “Pythonic”, woe to you. And I have a lot of needs for bit-banging, manipulating binary files, or grappling with low-level hardware registers, activities that are decidedly not Pythonic. I also spend a lot of time fighting with the “cuteness” of the Python type system and syntax: I’m more of a Rust person. I like strictly typed languages. I am not fond of novelties like using “-1” as the last-element array index and overloading the heck out of binary operators using magic methods.



Comics courtesy of xkcd, CC BY-NC-2.5

Surprisingly, I was able to get LiteX up and running within a day. This is thanks in large part to Tim’s effort to create a really comprehensive bootstrapping script that checks out the git repo, all of the submodules (thank you!), and manages your build environment. It just worked; the only bump I encountered was a bit of inconsistent documentation on installing the Xilinx toolchain (for Artix builds you need to grab Vivado, and for Spartan builds you grab ISE). The whole thing ate about 19GiB of hard drive space, of which 18GiB is the Vivado toolchain.

I was rewarded with a surprisingly powerful and mature framework for defining SoCs. Thanks to the extensive work of the MiSoC and LiteX crowd, there are already IP cores for DRAM, PCI express, ethernet, video, a softcore CPU (your choice of or1k or lm32) and more. To be fair, I haven’t been able to load these on real hardware and validate their spec-compliance or functionality, but they seem to compile down to the right primitives so they’ve got the right shape and size. Instead of AXI, they’re using Wishbone for their fabric. It’s not clear to me yet how bandwidth-efficient the MiSoC fabric generator is, but the fact that it’s already in use to route 4x HDMI connections to DRAM on the Numato-Opsis would indicate that it’s got enough horsepower for my application (which only requires 3x HDMI connections).

As a high-level framework, it’s pretty magical. Large IP instances and corresponding bus ports are allocated on-demand, based on a very high level description in Python. I feel a bit like a toddler who has been handed a loaded gun with the safety off. I’m praying the underlying layers are making sane inferences. But, at least in the case of LiteX, if I don’t agree with the decisions, it’s open source enough that I could try to fix things, assuming I have the time and gumption to do so.
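To make the “allocated on-demand” idea concrete, here is a toy sketch of how a high-level description could hand out bus address windows as peripherals are requested. It is not LiteX’s actual allocator: `ToySoC`, `add_peripheral`, the base address, and the window sizes are all made up for illustration.

```python
# Toy illustration only: this is NOT how LiteX allocates addresses.
# Every name and number below is hypothetical.

class ToySoC:
    """Hands out bus address windows as peripherals are requested."""

    def __init__(self, base=0x8000_0000):
        self.next_free = base
        self.regions = {}  # name -> (base address, window size in bytes)

    def add_peripheral(self, name, size):
        if size <= 0:
            raise ValueError("size must be positive")
        if name in self.regions:
            raise ValueError(f"{name} is already mapped")
        # Round the window up to a power of two so that a bus decoder
        # only has to compare the upper address bits.
        window = 1 << (size - 1).bit_length()
        # Align the base address to the window size.
        base = (self.next_free + window - 1) & ~(window - 1)
        self.regions[name] = (base, window)
        self.next_free = base + window
        return base

    def memory_map(self):
        return [f"{name:10s} 0x{base:08x} +0x{size:x}"
                for name, (base, size) in self.regions.items()]


soc = ToySoC()
soc.add_peripheral("uart", 16)
soc.add_peripheral("hdmi_in0", 0x1000)
soc.add_peripheral("hdmi_in1", 0x1000)
soc.add_peripheral("hdmi_out0", 0x800)
print("\n".join(soc.memory_map()))
```

The point is only that the designer names the blocks and their sizes, and the framework decides where they live. Whether the real tool makes sane choices at this step is exactly the kind of inference worth auditing in the source.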

For my tool flow comparison, I implemented a simple 2x HDMI-in to DDR3 to 1x HDMI-out design in both Vivado and in LiteX. Creating the designs is about the same effort on both flows – once you have the basic IP blocks, instantiating bus fabric and allocation of addressing is largely automated in each case. Vivado is superior for pin/package layout thanks to its graphical planning tool (I find an illustration of the package layout to be much more intuitive than a textual list of ball-grid coordinates), and LiteX is a bit faster for design creation despite the usual frustrations I have with Python (up to the reader’s bias to decide whether it’s just that I have a different way of seeing things or if my intellect is insufficient to fully appreciate the goodness that is Python).


Pad layout planning in Vivado is aided by a GUI


Example of LiteX syntax for pin constraints

But from there, the experience between the two diverges rapidly. The main thing that’s got me excited about LiteX is the speed and efficiency of its high-level synthesis. LiteX produces a design that uses about 20% of an XC7A50 FPGA with a runtime of about 10 minutes, whereas Vivado produces a design that consumes 85% of the same FPGA with a runtime of about 30-45 minutes.

Significantly, LiteX tends to “fail fast”, so syntax errors or small problems with configurations become obvious within a few seconds, if not a couple minutes. However, Vivado tends to “fail late” – a small configuration problem may not pop up until about 20 minutes into the run, due to the clumsy way it manages out-of-context block synthesis and build dependencies. This means that despite my frustrations with the Python syntax, the penalty paid for small errors is much less in terms of time – so overall, I’m more productive.

But the really compelling point is the efficiency. The fact that LiteX generates more efficient HDL means I can potentially shave a significant amount of cost out of a design by going to a smaller FPGA. Remember, both LiteX and Vivado use the same back-end for low-level synthesis and place and route. The difference is entirely in the high-level design automation – and this is a level that I can see being a good match for a Python-based framework. You’re not really designing hardware with Python (eventually it all turns into Verilog) so much as managing and configuring libraries of IP, something that Python is quite well suited for. To wit, I dug around in the MiSoC libraries a bit and there seem to be some serious logic designs using this Python syntax. I’m not sure I want to wrap my head around this coding style, but the good news is I can still write my leaf cells in Verilog and call them from the high-level Python integration framework.

So, I’m cautiously proceeding to use LiteX as the main design flow going forward for NeTV2. We’ll see how the bitstream proves out in terms of timing and functionality once my next generation hardware is available, but I’m optimistic. I have a few concerns about how debugging will work – I’ve found the Xilinx ILA cores to be extremely powerful tools and the ability to automatically reverse engineer any complex design into a schematic (a feature built into Vivado) helps immensely with finding timing and logic bugs. But with a built-in soft CPU core, the “LiteScope” logic analyzer (with sigrok support coming soon), and fast build times, I have a feeling there is ample opportunity to develop new, perhaps even more powerful methods within LiteX to track down tricky bugs.

My final thought is that LiteX, in its current state, is probably best suited for people trained to write software who want to design hardware, rather than for people classically trained in circuit design who want a tool upgrade. The design idioms and intuitions built into LiteX pull strongly from the practices of software designers, which means a lot of “obvious” things are left undocumented that will throw outsiders (e.g. hardware designers like me) for a loop. There’s no question about the power and utility of the design flow – so, as the toolchain matures and documentation improves I’m optimistic that this could become a popular design flow for hardware projects of all magnitudes.


Interested? Tim has suggested the following links for further reading:

Name that Ware October 2017

October 26th, 2017

The Ware for October 2017 is shown below.

Sometimes a ware just presents itself to you among your travels; thankfully cell phone cameras have come a long way.

Winner, Name that Ware September 2017

October 26th, 2017

The Ware for September 2017 is a WP 5007 Electrometer. I’ll give this one to Ingo, for the first mention of an electrometer. Congrats, email me for your prize! And @zebonaut, agreed, polystyrene caps FTW :)

Why I’m Using Bitmarks on my Products

October 13th, 2017

One dirty secret of hardware is that a profitable business isn’t just about design innovation, or even product cost reduction: it’s also about how efficiently one can move stuff from point A to B. This explains the insane density of hardware suppliers around Shenzhen; it explains the success of Ikea’s flat-packed furniture model; and it explains the rise of Amazon’s highly centralized, highly automated warehouses.

Unfortunately, reverse logistics – the system for handling returns & exchanges of hardware products – is not something on the forefront of a hardware startup’s agenda. In order to deal with defective products, one has to ship a product first – an all-consuming goal. However, leaving reverse logistics as a “we’ll fix it after we ship” detail could saddle the venture with significant unanticipated customer support costs, potentially putting the entire business model at risk.

This is because logistics are much more efficient in the “forward” direction: the cost for a centralized warehouse to deliver a package to an end consumer’s home address is orders of magnitude less than it is for a residential consumer to mail that same parcel back to the warehouse. This explains the miracle of Amazon Prime, even as overnighting a pair of hand-knit mittens to your mother somehow costs you $20. Now repeat the hand-knit mittens thought experiment, but replace the mittens with a big-screen TV that has to find its way back to a factory in Shenzhen. Because the return shipment can no longer take advantage of bulk shipping discounts, the postage to China is likely more than the cost of the product itself!

Because of the asymmetry in forward versus reverse logistics cost, it’s generally not cost effective to send defective material directly back to the original factory for refurbishing, recycling, or repair. In many cases the cost of the return label plus the customer support agent’s time will exceed the cost of the product. This friction in repatriating defective product creates opportunities for unscrupulous middlemen to commit warranty fraud.

The basic scam works like this: a customer calls in with a defective product and gets sent a replacement. The returned product is sent to a local processing center, where it may be declared unsalvageable and slated for disposal. However, instead of a proper disposal, the defective goods “escape” the processing center and are resold as new to a different customer. The duped customer then calls in to exchange the same defective product and gets sent a replacement. Lather, rinse, repeat, and someone gets rich quick selling scrap at full market value.

Similarly, high-quality counterfeits can sap profits from companies. Clones of products are typically produced using cut-rate or recycled parts but sold at full price. What happens when customers then find quality issues with the clone? That’s right – they call the authentic brand vendor and ask for an exchange. In this case, the brand makes zero money on the customer but incurs the full cost of supporting a defective product. This kind of warranty fraud is pandemic in smart phones and can cost producers many millions of dollars per year in losses.


High-quality clones, like the card on the left, can cost businesses millions of dollars in warranty fraud claims.

Serial numbers help mitigate these problems, but it’s easy to guess a simple serial number. More sophisticated schemes tie serial numbers to silicon IDs, but that necessitates a system which can reliably download the serialization data from the factory. This might seem a trivial task but for a lot of reasons – from failures in storage media to human error to poor Internet connectivity in factories – it’s much harder than it seems to make this happen. And for a startup, losing an entire lot of serialization data due to a botched upload could prove fatal.

As a result, most hardware startups ship products with little to no plan for product serialization, much less a plan for reverse logistics. When the first email arrives from an unhappy customer, panic ensues, and the situation is quickly resolved, but by the time the product arrives back at the factory, the freight charges alone might be in the hundreds of dollars. Repeat this exercise a few dozen times, and any hope for a profitable run is rapidly wiped out.

I’ve wrestled with this problem on and off through several startups of my own and finally landed on a solution that looks promising: it’s reasonably robust, fraud-resistant, and dead simple to implement. The key is the bitmark – a small piece of digital data that links physical products to the blockchain.

Most people are familiar with blockchains through Bitcoin. Bitcoin uses the blockchain as a public ledger to prevent double-spending of the same virtual coin. This same public ledger can be applied to physical hardware products through a bitmark. Products that have been bitmarked can have their provenance tracked back to the factory using the public ledger, thus hampering cloning and warranty fraud – the physical equivalent of double-spending a Bitcoin.

One of my most recent hardware startups, Chibitronics, has teamed up with Bitmark to develop an end-to-end solution for Chibitronics’ newest microcontroller product, the Chibi Chip.

As an open hardware business, we welcome people to make their own versions of our product, but we can’t afford to give free Chibi Chips to customers that bought cut-rate clones and then report them as defective for a free upgrade to an authentic unit. We’re also an extremely lean startup, so we can’t afford the personnel to build a full serialization and reverse logistics system from scratch. This is where Bitmark comes in.

Bitmark has developed a turn-key solution for serialization and reverse logistics triage. They issue us bitmarks as lists of unique, six-word phrases. The six-word phrases are less frustrating for users to type in than strings of random characters. We then print the phrases onto labels that are stuck onto the back of each Chibi Chip.
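As a rough illustration of why word phrases are a reasonable claim code, the sketch below draws six-word phrases at random and computes how much entropy they carry. This is not Bitmark’s actual scheme: the word list, its size, and the function names are all hypothetical, and the 2048-word figure is only an assumed example.

```python
import math
import secrets

# Hypothetical 16-word list, kept tiny so the sketch stays short.
# A real list would have thousands of entries.
WORDS = ["amber", "bolt", "cedar", "delta", "ember", "flint", "grove",
         "harbor", "ivory", "jade", "kettle", "lumen", "maple", "nickel",
         "onyx", "pebble"]


def make_phrase(words=WORDS, count=6):
    """Pick `count` words independently and uniformly at random."""
    return " ".join(secrets.choice(words) for _ in range(count))


def phrase_bits(wordlist_size, count=6):
    """Bits of entropy in `count` independent, uniform picks."""
    return count * math.log2(wordlist_size)


def make_unique_phrases(n, words=WORDS, count=6):
    """Generate n distinct phrases, redrawing on the rare collision."""
    phrases = set()
    while len(phrases) < n:
        phrases.add(make_phrase(words, count))
    return sorted(phrases)


print(make_phrase())
print(phrase_bits(len(WORDS)))  # 16-word list: 6 * 4 = 24.0 bits
print(phrase_bits(2048))        # assumed 2048-word list: 6 * 11 = 66.0 bits
```

With an assumed 2048-word list, six words carry 66 bits, which would take 14 random characters from a 32-symbol alphabet to match. The words are longer to type, but far easier to read off a small label without transcription errors.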


Bitmark claim code on the back of a Chibi Chip

We release just enough of these pre-printed labels to the factory to run our authorized production quantities. This allows us to trace a bitmark back to a given production lot. It also prevents “ghost shifting” – that is, authorized factories producing extra bootleg units on a midnight shift that are sold into the market at deep discounts. Bitmark created a website for us where customers can then claim their bitmarks, thus registering their product and making it eligible for warranty service. In the event of an exchange or return, the product’s bitmark is updated to record this event. Then if a product fails to be returned to the factory, it can’t be re-claimed as defective because the blockchain ledger would evidence that bitmark as being mapped to a previously returned product. This allows us to defer the repatriation of the product to the factory. It also enables us to use unverified third parties to handle returned goods, giving us a large range of options to reduce reverse logistics costs.
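The claim-and-return bookkeeping described above can be sketched as a small state machine. This is a toy in-memory stand-in, not a blockchain and not Bitmark’s API; `ToyLedger` and its method names are hypothetical, and a real ledger would additionally make the history tamper-evident and publicly checkable.

```python
# Toy in-memory stand-in for the ledger. All names are hypothetical.

class ToyLedger:
    def __init__(self):
        self.state = {}    # phrase -> "issued" | "claimed" | "returned"
        self.lot = {}      # phrase -> production lot the label was released to
        self.history = []  # append-only list of (phrase, event)

    def _record(self, phrase, event):
        self.state[phrase] = event
        self.history.append((phrase, event))

    def issue(self, phrase, lot):
        """Register a printed label before it is released to the factory."""
        if phrase in self.state:
            raise ValueError("phrase already issued")
        self.lot[phrase] = lot
        self._record(phrase, "issued")

    def claim(self, phrase):
        """A customer registers a product; only unclaimed labels qualify."""
        if self.state.get(phrase) != "issued":
            return False
        self._record(phrase, "claimed")
        return True

    def record_return(self, phrase):
        """Accept a warranty return only once, and only for a claimed label."""
        if self.state.get(phrase) != "claimed":
            return False
        self._record(phrase, "returned")
        return True


ledger = ToyLedger()
code = "amber bolt cedar delta ember flint"
ledger.issue(code, lot="lot-001")

print(ledger.claim(code))              # True: first registration
print(ledger.record_return(code))      # True: exchange granted
print(ledger.record_return(code))      # False: unit "escaped" and came back again
print(ledger.claim("made up phrase"))  # False: label was never issued
print(ledger.lot[code])                # lot-001: traceable to a production lot
```

Because the second return is rejected by the record alone, it no longer matters whether the defective unit physically made it back to the factory.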

Bitmark also plans to roll out a site where users can verify the provenance of their bitmarks, so buyers can check if a product’s bitmark is authentic and if it has been previously returned for problems before they buy it. This increases the buyer’s confidence, thus potentially boosting the resale value of used Chibi Chips.

For the cost and convenience of a humble printed label, Bitmark enhances control over our factories, enables production lot traceability, deters cloning, prevents warranty fraud, enhances confidence in the secondary market, and gives us ample options to streamline our reverse logistics.

Of course, the solution isn’t perfect. A printed label can be peeled off one product and stuck on another, so people could potentially just peel labels off good products and resell the labels to users with broken clones looking to upgrade by committing warranty fraud. This scenario could be mitigated by using tamper-resistant labels. And for every label that’s copied by a cloner, there’s one victim who will have trouble getting support on an authentic unit. Also, if users are generally lax about claiming their bitmark codes, it creates an opportunity for labels to be sparsely duplicated in an effort to ghost-shift/clone without being detected; but this can be mitigated with a website update that encourages customers to immediately register their bitmarks before using the web-based services tied to the product. We also have to exercise care in handling lists of unclaimed phrases because, until a customer registers their bitmark claim phrase in the blockchain, the phrases have value to would-be fraudsters.

But overall, for the cost and convenience, the solution outperforms all the other alternatives I’ve explored to date. And perhaps most importantly for hardware startups like mine that are short on time and long on tasks, printing bitmarks is simple enough for us to implement that it’s hard to justify doing anything else.

Disclosure: I am a technical advisor and shareholder of Bitmark.

Name that Ware, September 2017

September 30th, 2017

The Ware for September 2017 is shown below.

And here is the underside of the plug-in module from the left hand side of the PCB:

Thanks to Chris for sending in this gorgeous ware. I really appreciate both the aesthetic beauty of this ware, as well as the exotic construction techniques employed.