Smart contracts use the GHCJS cross-compiler to translate off-chain code
4 June 2020
This is the second of the Developer Deep Dive technical posts from our Haskell team. This occasional series offers a candid glimpse into the core elements of the Cardano platform and protocols, and gives insights into the engineering choices that have been made. Here, we outline some of the work that has been going on to improve the libraries and developer tools for Plutus, Cardano’s smart contract platform.
At IOHK we are developing the Plutus smart contract platform for the Cardano blockchain. A Plutus contract is a Haskell program that is partly compiled to on-chain Plutus Core code and partly to off-chain code. On-chain code is run by Cardano network nodes using the interpreter for Plutus Core, the smart contract language embedded in the Cardano ledger. This is how the network verifies transactions. Off-chain code is for tasks such as setting up the contract and for user interaction. It runs inside the wallet of each user of the contract, on a node.js runtime.
In the past year, we haven't made many changes to the GHCJS code generator. Instead, we did some restructuring to make compiling with GHCJS more reliable and predictable, added support for Windows, and started making use of the most recent Cabal features. This post gives an overview of what has happened, and a brief look at what's in store for this year.
When installing a package with GHCJS, you probably use the --ghcjs command line flag or include compiler: ghcjs in your configuration file. This activates the ghcjs compiler flavor in Cabal. The ghcjs flavor is based on the
Cabal has introduced many features in recent years, including support for Backpack, Nix-style local builds, multiple (named) libraries per package, and per-component build plans. Unfortunately, the new features resulted in many changes to the code base, and maintenance for the ghcjs flavor fell behind for some time. We have brought GHCJS support up to date again in version 3.0. If you want to use the new-style build features, make sure that you use cabal-install version 3 or later.
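As a rough illustration of the configuration mentioned above, a project could select the ghcjs flavor in its cabal.project file. The package location (the current directory) is an assumption made for this sketch:

```cabal
-- cabal.project (sketch; assumes a single package in the current directory)
packages: .

-- Select the ghcjs compiler flavor for every package in this project
compiler: ghcjs
```

With cabal-install 3 or later, a new-style build run in this directory should then use GHCJS, much as passing --ghcjs on the command line would.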
The differences between the
ghc compiler flavors are minor, and cross-compilation support in Cabal has been improving. Therefore, we hope that eventually we will be able to drop the ghcjs compiler flavor altogether. The extensions would instead be added as platform-specific behaviour in the
GHC allows the compiler to be extended with plug-ins, which can change aspects of the compilation pipeline. For example, plug-ins can introduce new optimization features or extend the typechecker.
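To make this concrete, here is a minimal sketch of a plug-in. It assumes the GHC 8.x API, where the plug-in interface is exported from the GhcPlugins module (GHC 9 renamed it to GHC.Plugins). The module name HelloPlugin is hypothetical. The plug-in leaves the optimization pipeline unchanged and only reports that it ran:

```haskell
-- Sketch of a do-nothing Core plug-in (assumes GHC 8.x and the ghc package).
module HelloPlugin (plugin) where

import GhcPlugins

-- GHC looks for an exported value named 'plugin' in the plug-in module.
plugin :: Plugin
plugin = defaultPlugin { installCoreToDos = install }

-- Receives the plug-in's command line options and the list of Core passes
-- GHC intends to run; returns the (possibly modified) list of passes.
install :: [CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]
install _opts todos = do
  putMsgS "HelloPlugin: installed, pipeline left unchanged"
  return todos
```

It can be enabled with -fplugin=HelloPlugin, provided the package containing the plug-in is visible to the compiler.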
Unlike Template Haskell, which is separated from the compiler through the Quasi typeclass abstraction, plug-ins can directly use the whole GHC API. This makes the ‘external interpreter’ approach that GHCJS introduced for running Template Haskell in a cross-compiler unsuitable for compiler plug-ins. Instead, plug-ins need to be built for the build platform (that runs GHC).
In 2016, GHCJS introduced experimental support for compiler plug-ins. This relied on looking up the plug-in in the GHCJS package database and then trying to find a close match for the plug-in package and module in the GHC package database. We have now added a new flag to point GHCJS to packages runnable on the build system. This makes plug-ins usable again with new-style builds and other ‘exotic’ package database configurations.
In principle, our new flag can make plug-ins work on any GHC cross-compiler, but the requirement to also build the plug-in with the cross-compiler is quite ugly. We are working on removing this requirement, and then on merging plug-in support for cross-compilers into upstream GHC (see ticket 14335 and ticket 17957).
Long, long ago, GHCJS worked on Windows. One or two brave souls might have actually used it! Its boot packages (the packages built by ghcjs-boot) would include the Win32 package on the Windows build platform. The reason for this was the Cabal configuration with GHCJS. Cabal’s os(win32) flag would be set if the build platform was Windows. At the time it was easiest to just patch the package to build without errors with GHCJS, and include it in the boot packages. However, the Win32 package didn't really work, and keeping it up to date was a maintenance burden. At some point it fell behind and GHCJS didn't work on Windows any more.
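The effect of the os(win32) flag can be seen in a package description that picks platform-specific dependencies. The sketch below is a hypothetical library stanza, not taken from any particular package. With the old GHCJS setup on a Windows build machine, the first branch would be chosen even though the generated code runs on JavaScript:

```cabal
-- Hypothetical library stanza illustrating an OS-dependent dependency
library
  exposed-modules:  Example
  build-depends:    base
  default-language: Haskell2010

  -- True whenever Cabal believes the target OS is Windows
  if os(win32)
    build-depends: Win32
  else
    build-depends: unix
```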
The boot packages having to include Win32 on Windows was indicative of poor separation between the build platform (which runs the compiler) and the host platform (which runs the executable produced by the compiler). This was caused by the lack of a complete C toolchain for GHCJS. Many packages don't just have Haskell code, but also have files produced by a C toolchain, for example via an Autotools configure script or
The GHCJS approach was to include some pre-generated files, and use the build platform C toolchain (the same that GHC would use) for everything else, hoping that it wouldn't break. If it did break, we would patch the package.
In recent years, the web browser as a compilation target has steadily been gaining more traction. Emscripten has been providing a C toolchain for many years, and has recently switched from its own compiler backend to the Clang compiler with the standard LLVM backend.
Clang has been supported by GHC as a C toolchain for a while. It can output asm.js and WebAssembly code that can run directly in the browser. Unfortunately, users of the compiler cannot yet directly interact with compiled C code through the C FFI (foreign import ccall) in GHCJS. But having a C toolchain allows packages that depend on configure scripts or hsc2hs to compile much more reliably. This fixes some long-standing build problems and allows us to support Windows again. We think this alone is worth the additional dependency.
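Until compiled C code can be reached through the C FFI, portable code typically has to choose between a C import and a JavaScript import. The following is a minimal sketch of that pattern, assuming the ghcjs_HOST_OS preprocessor macro and the JavaScriptFFI extension that GHCJS provides. The name sinViaFFI is hypothetical:

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE ForeignFunctionInterface #-}
#ifdef ghcjs_HOST_OS
{-# LANGUAGE JavaScriptFFI #-}
#endif
module Main where

#ifdef ghcjs_HOST_OS
-- GHCJS: evaluate a JavaScript expression; $1 stands for the first argument.
foreign import javascript unsafe "Math.sin($1)"
  sinViaFFI :: Double -> Double
#else
-- Native GHC: call sin from the C math library through the C FFI.
foreign import ccall unsafe "math.h sin"
  sinViaFFI :: Double -> Double
#endif

main :: IO ()
main = print (sinViaFFI 0.5)
```

The rest of the program only sees sinViaFFI, so the choice of backend stays confined to a single declaration.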
A variant of GHCJS 8.6 using the Emscripten toolchain is available in the ghc-8.6-emscripten branch, which can be installed on Windows. This time around, the set of boot packages is the same on every build platform. Emscripten is planned to be the standard toolchain in GHCJS 8.8 onwards.
The downside of this approach was that the build platform very much affected the generated code. If you build on a 64-bit Linux machine, all platform constants would come from the Linux platform. And the code would be built with the assumption that
Later, we switched to using ghc as a library, introducing Hooks to change the compilation pipeline where needed. This made it possible to make the GHCJS platform word size independent of the build platform, introduce the
Unfortunately, it turned out to be hard to keep up with changes in the upstream ghc library. In addition, modifying the existing Hooks encouraged engineers to work around issues instead of directly fixing them upstream.
In early 2018, we decided to build a custom ghc library for GHCJS, installed as ghc-api-ghcjs, allowing us to work around serious issues before fixes for them were merged upstream. Recently, we dropped the separate library, and built both the GHC and GHCJS source code in one library:
Although we cannot build GHCJS with the GHC build system yet, we are using the upstream GHC source tree much more directly again. Are we going back to the past? Perhaps, but this time we have our own platform with a toolchain and build tools, avoiding the pitfalls that made this approach so problematic the first time.