{"id":395,"date":"2017-11-20T06:53:23","date_gmt":"2017-11-19T22:53:23","guid":{"rendered":"https:\/\/colliot.me\/?p=395"},"modified":"2017-11-22T18:11:28","modified_gmt":"2017-11-22T10:11:28","slug":"what-is-gadt-in-haskell","status":"publish","type":"post","link":"https:\/\/colliot.org\/en\/2017\/11\/what-is-gadt-in-haskell\/","title":{"rendered":"What is GADT (Generalized Algebraic Data Types) in Haskell?"},"content":{"rendered":"<p>We assume that you are familiar with what we are doing without GADT. Say, for the simplest case, we are defining a list\uff1a<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">List a = Nil | Cons a (List a)<\/pre>\n<p>What exactly are we doing?<\/p>\n<ul>\n<li>On the LHS, it defines a &#8220;type constructor&#8221; called <code>List<\/code>. It takes a type as parameter, and returns a new type. It&#8217;s essentially a <em>function of types<\/em>.<\/li>\n<li>On the RHS, it tells us that we have two ways to construct a value of type <code>List a<\/code>. One is called <code>Nil<\/code>\u00a0and the other <code>Cons a (List a)<\/code>.\n<ul>\n<li>You can directly write <code>Nil<\/code>\u00a0to get a value of type <code>List a<\/code>, with type <code>a<\/code>\u00a0to be determined, more precisely inferred from the context.<\/li>\n<li>In the case of <code>Cons a (List a)<\/code>, it&#8217;s actually a function named <code>Cons<\/code>, taking two parameters, one with type <code>a<\/code>, the other with type <code>List a<\/code>. Again, the type of <code>a<\/code>\u00a0should be inferred from the context. But this time it&#8217;s often more direct, as <code>a<\/code>\u00a0is the type of the first parameter.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Then we can represent these two constructors as normal functions:<!--more--><\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">Nil :: List a\r\nCons :: a -&gt; List a -&gt; List a<\/pre>\n<p>We can see that they are functions both with the same return type <code>List a<\/code>. <em>The only thing that GADT does<\/em>, is allowing a specific <code>a<\/code>\u00a0here. For example, we can have a constructor with <code>a<\/code>\u00a0specifically set to <code>Int<\/code>, meaning it only constructs values of type <code>List Int<\/code>.<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">IntNil :: List Int<\/pre>\n<p>We can experiment with some code in the GADT flavor:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"9,13\" data-caption=\"\">data List a where\r\n  IntNil :: List Int\r\n  Nil :: List a\r\n  Cons :: a -&gt; List a -&gt; List a\r\n\r\na :: List Int\r\na = IntNil\r\n\r\n-- Would raise an error for type incompatibility.\r\n-- c_ :: List Char\r\n-- c_ = IntNil\r\n\r\n-- This one is compatible.\r\nc :: List Char\r\nc = Nil<\/pre>\n<p>Certainly, <code>a<\/code>\u00a0does not have to be a type as simple as <code>Int<\/code>. It could be an arbitrary type, even <code>List a<\/code>\u00a0or <code>List (List a)<\/code>\u00a0with <code>a<\/code>\u00a0still unspecified:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"8\" data-caption=\"\">data Pair a b where Pair :: a -&gt; b -&gt; Pair a b\r\n\r\ndata List a where IntListNil :: List (List Int)\r\n  ListNil :: List (List a)\r\n  IntNil :: List Int\r\n  Nil :: List a\r\n  Cons :: a -&gt; List a -&gt; List a\r\n  -- This one is invalid, since the top-level `Pair` is not `List`\r\n  -- Bad :: a -&gt; List a -&gt; Pair a (List a)<\/pre>\n<p>So what can GADT be used for? What can be additionally done with it? Here are some scenarios as far as I know:<\/p>\n<p>The most famous use case is to embed (the AST of) a language into Haskell. Suppose we have such a simple language of expressions:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">data Expr = \r\n    ILit Int\r\n  | BLit Bool\r\n  | Add Expr Expr\r\n  | Eq Expr Expr<\/pre>\n<p>Such a data type definition is sufficient to contain the language, but it also contains expressions beyond the language (that is, invalid expressions). For example, <code>Bool<\/code>\u00a0values cannot be <code>Add<\/code>ed. But this definition does not prevents us from writing:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">addBool :: Expr\r\naddBool = Add (BLit True) (BLit False)<\/pre>\n<p>An attempt to fix this is to record the &#8220;return type&#8221; of an <code>Expr<\/code>\u00a0somewhere. Remember we have done that with <code>List a<\/code>, where we record the member type <code>a<\/code>\u00a0in the list type <code>List a<\/code>. We can do the same to <code>Expr<\/code>:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">data Expr a = \r\n    ILit Int\r\n  | BLit Bool\r\n  | Add (Expr Int) (Expr Int)\r\n  | Eq (Expr a) (Expr a)<\/pre>\n<p>This looks plausible. But there are still problems!<\/p>\n<p>The crucial part is, while\u00a0<code>BLit True<\/code>\u00a0should be <code>Expr Bool<\/code>\u00a0so that it cannot be the parameters of <code>Add<\/code>, it isn&#8217;t actually. The <code>BLit True<\/code>\u00a0here is still of type <code>Expr a<\/code>\u00a0with <code>a<\/code>\u00a0undetermined (to be inferred). This is parallel to the case where <code>Nil<\/code>\u00a0is of type <code>List a<\/code>\u00a0with <code>a<\/code>\u00a0undetermined. There is actually no relation between the type of the parameter of the constructor <code>BLit<\/code>\u00a0and the type of the constructed value.<\/p>\n<p>Here GADT comes into the play. We can constraint the return type of the constructor only with GADT:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"5\" data-caption=\"\">data Expr a where\r\n  ILit :: Int -&gt; Expr Int\r\n  BLit :: Bool -&gt; Expr Bool\r\n  Add :: Expr Int -&gt; Expr Int -&gt; Expr Int\r\n  -- In the case of Eq, we can loosen the type specification into a typeclass.\r\n  Eq :: Eq a =&gt; Expr a -&gt; Expr a -&gt; Expr Bool<\/pre>\n<p>This would achieve the goal:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"1\" data-caption=\"\">-- Won't compile, since `BLit True` and `BLit False` are of type `Expr Bool`\r\n-- addBool =  Add (BLit True) (BLit False)\r\naddInt = Add (ILit 1) (ILit 2)\r\neqBool = Eq (BLit True) (BLit False)<\/pre>\n<p>Fantastic!<\/p>\n<p>This is the so-called &#8220;initial embedding&#8221; of a language. The final version we achieved with GADT is a &#8220;tagless\u00a0 initial embedding&#8221;. Why tagless? Because ADT is the so-called &#8220;tagged union&#8221;, and we are abandoning it, thus tagless.<\/p>\n<p>The <code>Expr<\/code>\u00a0represents a very simple language. You can well embed more complex languages such as simply type lambda calculus.<\/p>\n<p>Contrary to &#8220;initial embedding&#8221;, there&#8217;s &#8220;final embedding&#8221; as well. But that&#8217;s another story.<\/p>\n<hr \/>\n<p>We can even impose more restrictions on the embed of languages. Suppose we have a language describing stack operations. <code>Push<\/code>\u00a0means push a number to the top of the stack. <code>Add<\/code>\u00a0means pop the top two numbers and push back their sum:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"6\" data-caption=\"\">data StackLang where\r\n  Begin :: StackLang\r\n  Push :: Int -&gt; StackLang -&gt; StackLang\r\n  Add :: StackLang -&gt; StackLang\r\n\r\nbug = Add (Push 1 Begin)<\/pre>\n<p>The <code>bug<\/code>\u00a0line reflects a bug that would happen. When there&#8217;s only one element in the stack, an <code>Add<\/code>\u00a0could not be performed. If only we could eliminate such silly bugs on the type level!<\/p>\n<p>This could be possible when the number of elements in the stack is encoded in the type <code>StackLang<\/code>. This is exactly where GADT shines:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">data Z\r\ndata S n\r\n\r\ndata SStackLang t where\r\n  SBegin :: SStackLang Z\r\n  SPush :: Int -&gt; SStackLang t -&gt; SStackLang (S t)\r\n  SAdd :: SStackLang (S (S t)) -&gt; SStackLang (S t)<\/pre>\n<p>The <code>Z<\/code>\u00a0(for 0) and <code>S<\/code>\u00a0(for <em>successor<\/em>) here are simulations of the Peano system of natural numbers. Now compiling the following<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">sBug = SAdd (SPush 1 SBegin)<\/pre>\n<p>would raise an error:<\/p>\n<blockquote><p>\u2022 Couldn\u2019t match type \u2018Z\u2019 with \u2018S t\u2019<br \/>\nExpected type: SStackLang (S (S t))<br \/>\nActual type: SStackLang (S Z)<\/p><\/blockquote>\n<p>Safe as expected.<\/p>\n<p>This is actually a simulation of the <em>dependent type<\/em>. You can define types representing &#8220;lists of length n&#8221; and so on in this way.<\/p>\n<p>GADT can also convert types into values. You can do this without GADT:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">data Witness a = Witness a<\/pre>\n<p>But you need to supply a value of type <code>a<\/code>.\u00a0 What if you cannot find such a value? It&#8217;s not reasonable to require a value to encode a type, as such a value may not exist at all! (consider <code>Void<\/code>)<\/p>\n<p>You can drop this requirement with GADT:<\/p>\n<pre class=\"prettyprint lang-haskell\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">data Witness a where\r\n  IntWitness :: Witness Int\r\n  BoollWitness :: Witness Bool<\/pre>\n<p>In this way, you directly get a value in the encoded type <code>Witness a<\/code>, even if there&#8217;s no value of the original type <code>a<\/code>.<\/p>\n<p>This is the so called &#8220;type witness&#8221;. But how can this be used? (to be continued)<\/p>","protected":false},"excerpt":{"rendered":"<p>Sorry, this entry is only available in \u4e2d\u6587. For the sake of viewer convenience, the content is shown below in the alternative language. You may click the link to switch the active language.\u5148\u770b\u770b\u6ca1\u6709 GADT \u7684\u65f6\u5019\u6211\u4eec\u5728\u505a\u5565\u3002\u6700\u7b80\u5355\u7684\uff0c\u6bd4\u5982\u5b9a\u4e49\u4e00\u4e2a\u5217\u8868\uff1a List a = Nil | Cons a (List a) \u5b83\u662f\u5728\u505a\u5565\u5462\uff1f \u5148\u770b\u7b49\u53f7\u5de6\u8fb9\uff0c\u5b83\u9996\u5148\u5b9a\u4e49\u4e86\u4e00\u4e2a\u53eb List \u7684 type constructor\uff0c\u5b83\u63a5\u53d7\u4e00\u4e2a\u7c7b\u578b\uff0c\u5e76\u8fd4\u56de\u4e00\u4e2a\u65b0\u7684\u7c7b\u578b\u3002\u5176\u5b9e\u5c31\u662f\u7c7b\u578b\u7684\u51fd\u6570\u3002 \u518d\u770b\u53f3\u8fb9\uff0c\u5b83\u544a\u8bc9\u6211\u4eec\u6709\u4e24\u79cd\u65b9\u5f0f\u83b7\u5f97\u4e00\u4e2a\u7c7b\u578b\u4e3a List a \u7684\u503c\uff0c\u4e00\u4e2a\u53eb Nil\uff0c\u4e00\u4e2a\u53eb &hellip; <a href=\"https:\/\/colliot.org\/en\/2017\/11\/what-is-gadt-in-haskell\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;What is GADT (Generalized Algebraic Data Types) in Haskell?&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[16],"tags":[34,32],"_links":{"self":[{"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/posts\/395"}],"collection":[{"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/comments?post=395"}],"version-history":[{"count":5,"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/posts\/395\/revisions"}],"predecessor-version":[{"id":435,"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/posts\/395\/revisions\/435"}],"wp:attachment":[{"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/media?parent=395"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/categories?post=395"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/colliot.org\/en\/wp-json\/wp\/v2\/tags?post=395"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}