What's Bad about Bluespec System Verilog (part 2)

Thursday, 4 July 2013

The previous post in this series dealt with the limits of implicit conditions in BSV. Those limitations make it harder to write certain things in BSV, such as the multiplexer example I provided. But, as I also wrote, there are practical difficulties with expanding the capabilities of implicit conditions.

Specifically, the implicit conditions for a method are brought together as a READY signal which runs from the component containing the method to the component using the method. This can't be used to prioritise one method over another, because that would require a signal in the opposite direction. We would need methods with different semantics, or something other than a method (a sub-rule?). There is no easy solution.

The subject of this post is another matter that also lacks an easy solution, but of a rather different nature. It's about "type classes".

Modules in BSV may be polymorphic. A polymorphic module is described in the BSV code in generic form, i.e. it acts on objects of type "X". When the module is instantiated, X is given a real type, such as "Bit#(32)". This is a brilliant idea, and a mark of all high-level languages. It facilitates the creation of reusable generic data structures and algorithms.

But in BSV, generic types such as X cannot always be completely opaque, i.e. with semantics that are unimportant to the polymorphic module. Sometimes it may be necessary to require that "X is a vector of N bits" or similar.

For Blueshell, I wrote various polymorphic modules that made restrictions of this sort. One was a cache module, implementing a direct-mapped cache of configurable size. The size was configured by specifying the type of the cache index (e.g. Bit#(9) for a cache containing 512 lines) and the type of the cache tag (e.g. Bit#(19)). The cache index and tag are part of each address requested from the cache. (The size of a cache line is specified by the bus size as 4 - it's not a generic parameter.)

It is necessary to restrict the size of each of these types. Some sizes don't make sense in isolation (e.g. Bit#(0)). But other sizes don't make sense when used together. The index size plus the tag size plus the cache line size must be no greater than 32 bits - because that is the size of an address from the CPU. This requirement is imposed by statements inside the cache, where the tag and index are derived from the address, e.g.:
Bit#(num_Index_Bits) cache_index = address[4 + valueOf(num_Index_Bits) - 1 : 4 ];
If num_Index_Bits is (say) 50, then this will be equivalent to:
Bit#(50) cache_index = address[53 : 4];
which is incorrect, as only 32 bits are defined for address.

To prevent this type error, BSV has a formal system to impose type restrictions on generic parameters. When defining a polymorphic module, we must state the limits on types, by defining the "type class" that each type can belong to. For instance:
Add#(num_Index_Bits, num_Tag_Bits, 28)
requires that num_Index_Bits + num_Tag_Bits = 28.

The formal type class system is a sub-language of BSV. It has its own expressions - note the use of "Add#" in place of "+". It acts on things that look like constants (e.g. num_Index_Bits) but are actually types, and have to be converted to constants if they are used inside the regular variety of BSV expression.

Type classes are checked at an early stage of compilation, which will fail if any constraint is insufficiently specified. The constraints for a module must be at least as strict as the constraints used within that module. Just including a statement such as
Bit#(num_Index_Bits) cache_index = address[4 + valueOf(num_Index_Bits) - 1 : 4 ];
will force the module definition to include the constraints for the address[] operation, which prevent accessing more than 32 bits.

There is a good reason for a formal type system like this within a software language. That's because run-time errors are bad. Having a formal type system allows the compiler to find type errors while the program is being built, because the capabilities and attributes of each type are precisely specified using rules that the compiler can verify. The extra overhead of using the formal type system is more than compensated by the reduced need for debugging and testing.

But this motivation does not apply equally to a HDL, because there is no run-time. The compiler output is a logic circuit. The cache_index assignment is not resolved dynamically - it is translated into a Verilog statement such as:
assign cache_index = address[12 : 4];
If the size of the cache_index type were incorrect, either this translation or the Verilog synthesis would fail.

Because the BSV compiler generates a circuit, the implications of every type used in every generic module are completely evaluated during compilation. There is simply no need for a formal system, because the purpose of the formal system is to detect errors at compile time, and there is a much easier way to do exactly that. The formal system is just overhead for the designer.

I would advocate a lightweight type system in which it is possible for modules to assert facts about the relationships between types, using the same expressions used in normal BSV code instead of special expressions (e.g. "Add#") used within type class specifications. For example:
assert (num_Index_Bits + num_Tag_Bits) == 28

These assertions would be optional, as well as being easier to write.

We would still see the same type errors at compile time, but they'd be easier to understand (e.g. "assertion on line 100 failed" or "'address' is 32 bits wide, and you're trying to access 'address[53]'").

It would also reduce the effort required when one generic component is used within another. In that case, you have to copy the type class specification verbatim, or make it more strict, so your code ends up peppered with similar specifications. The length of each specification seems to be O(N^2) for N related type classes, so this is pretty messy.

Finally, reducing the complexity of the language and the compiler is generally a good thing, and removing (or simplifying) unnecessary formalisms is a good way to do that. But this would be a major change to the language, so once again, there's no easy solution.

In the last part of this series I will look at the relationship between BSV and Verilog code, and where it is important to understand it.