<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 2.0.26">
<title>PrintfGentle</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700">
<link rel="stylesheet" href="./asciidoctor.css">
<link rel="stylesheet" href="./mlton.css">

</head>
<body class="article">
<div id="mlton-header">
<div id="mlton-header-text">
<h2>
<a href="./Home">
MLton
20241230+git20251029+dfsg-5
</a>
</h2>
</div>
</div>
<div id="header">
<h1>PrintfGentle</h1>
<div id="toc" class="toc">
<div id="toctitle">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_introduction">Introduction</a></li>
<li><a href="#_from_tupling_to_currying">From tupling to currying</a></li>
<li><a href="#_overloading_and_dependent_types">Overloading and dependent types</a></li>
<li><a href="#_idea_express_type_information_in_the_format_string">Idea: express type information in the format string</a></li>
<li><a href="#_the_types_of_format_characters">The types of format characters</a></li>
<li><a href="#_understanding_guess_and_verify">Understanding guess and verify</a></li>
<li><a href="#_type_checking_this_using_a_functor">Type checking this using a functor</a></li>
<li><a href="#_implementing_printf">Implementing <code>Printf</code></a></li>
<li><a href="#_testing_printf">Testing printf</a></li>
<li><a href="#_user_definable_formats">User-definable formats</a></li>
<li><a href="#_a_core_printf">A core <code>Printf</code></a></li>
<li><a href="#_extending_to_fprintf">Extending to fprintf</a></li>
<li><a href="#_notes">Notes</a></li>
<li><a href="#_also_see">Also see</a></li>
</ul>
</div>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>This page provides a gentle introduction to, and derivation of, <a href="Printf">Printf</a>,
with sections arranged in a way more suitable to a talk.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_introduction">Introduction</h2>
<div class="sectionbody">
<div class="paragraph">
<p>SML does not have <code>printf</code>.  Could we define it ourselves?</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">val () = printf ("here's an int %d and a real %f.\n", 13, 17.0)
val () = printf ("here's three values (%d, %f, %f).\n", 13, 17.0, 19.0)</code></pre>
</div>
</div>
<div class="paragraph">
<p>What could the type of <code>printf</code> be?</p>
</div>
<div class="paragraph">
<p>No single type fits both calls, because an SML function takes a fixed
number of arguments.  Strictly speaking, it takes exactly one argument; if
that argument is a tuple, its type fixes the number of components.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_from_tupling_to_currying">From tupling to currying</h2>
<div class="sectionbody">
<div class="paragraph">
<p>What about currying to get around the typing problem?</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">val () = printf "here's an int %d and a real %f.\n" 13 17.0
val () = printf "here's three values (%d, %f, %f).\n" 13 17.0 19.0</code></pre>
</div>
</div>
<div class="paragraph">
<p>That fails for a similar reason.  We need two types for <code>printf</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre>val printf: string -&gt; int -&gt; real -&gt; unit
val printf: string -&gt; int -&gt; real -&gt; real -&gt; unit</pre>
</div>
</div>
<div class="paragraph">
<p>This can&#8217;t work, because <code>printf</code> can only have one type.  SML doesn&#8217;t
support programmer-defined overloading.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_overloading_and_dependent_types">Overloading and dependent types</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Even without worrying about number of arguments, there is another
problem.  The type of <code>printf</code> depends on the format string.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">val () = printf "here's an int %d and a real %f.\n" 13 17.0
val () = printf "here's a real %f and an int %d.\n" 17.0 13</code></pre>
</div>
</div>
<div class="paragraph">
<p>Now we need</p>
</div>
<div class="listingblock">
<div class="content">
<pre>val printf: string -&gt; int -&gt; real -&gt; unit
val printf: string -&gt; real -&gt; int -&gt; unit</pre>
</div>
</div>
<div class="paragraph">
<p>Again, this can&#8217;t possibly work, because SML doesn&#8217;t have
overloading, and types can&#8217;t depend on values.</p>
</div>
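<div class="paragraph">
<p>To make the obstacle concrete, here is a minimal sketch of what plain SML
does allow: one separately named function per argument shape.  The names
<code>printfIntReal</code> and <code>printfRealInt</code> are hypothetical, and passing the
literal pieces of the format as a triple is an assumption made purely for
illustration.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">(* Each function has exactly one type, so each argument order needs
 * its own name. *)
fun printfIntReal (a, b, c) (i: int) (r: real): unit =
   print (concat [a, Int.toString i, b, Real.toString r, c])

fun printfRealInt (a, b, c) (r: real) (i: int): unit =
   print (concat [a, Real.toString r, b, Int.toString i, c])

val () = printfIntReal ("here's an int ", " and a real ", ".\n") 13 17.0
val () = printfRealInt ("here's a real ", " and an int ", ".\n") 17.0 13</code></pre>
</div>
</div>
<div class="paragraph">
<p>This does not scale, since every new combination of argument types needs
another function.</p>
</div>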
</div>
</div>
<div class="sect1">
<h2 id="_idea_express_type_information_in_the_format_string">Idea: express type information in the format string</h2>
<div class="sectionbody">
<div class="paragraph">
<p>If we express type information in the format string, then different
uses of <code>printf</code> can have different types.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">type 'a t  (* the type of format strings *)
val printf: 'a t -&gt; 'a
infix D F
val fs1: (int -&gt; real -&gt; unit) t = "here's an int "D" and a real "F".\n"
val fs2: (int -&gt; real -&gt; real -&gt; unit) t =
   "here's three values ("D", "F", "F").\n"
val () = printf fs1 13 17.0
val () = printf fs2 13 17.0 19.0</code></pre>
</div>
</div>
<div class="paragraph">
<p>Now, our two calls to <code>printf</code> type check, because the format
string specializes <code>printf</code> to the appropriate type.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_the_types_of_format_characters">The types of format characters</h2>
<div class="sectionbody">
<div class="paragraph">
<p>What should the type of format characters <code>D</code> and <code>F</code> be?  Each format
character requires an additional argument of the appropriate type to
be supplied to <code>printf</code>.</p>
</div>
<div class="paragraph">
<p>Idea: guess the final type that <code>printf</code> will need for the format
string, and verify it with each format character.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">type ('a, 'b) t   (* 'a = rest of type to verify, 'b = final type *)
val ` : string -&gt; ('a, 'a) t  (* guess the type, which must be verified *)
val D: (int -&gt; 'a, 'b) t * string -&gt; ('a, 'b) t  (* consume an int *)
val F: (real -&gt; 'a, 'b) t * string -&gt; ('a, 'b) t  (* consume a real *)
val printf: (unit, 'a) t -&gt; 'a</code></pre>
</div>
</div>
<div class="paragraph">
<p>Don&#8217;t worry.  In the end, type inference will guess and verify for us.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_understanding_guess_and_verify">Understanding guess and verify</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Now, let&#8217;s build up a format string and a specialized <code>printf</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">infix D F
val f0 = `"here's an int "
val f1 = f0 D " and a real "
val f2 = f1 F ".\n"
val p = printf f2</code></pre>
</div>
</div>
<div class="paragraph">
<p>These definitions yield the following types.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">val f0: (int -&gt; real -&gt; unit, int -&gt; real -&gt; unit) t
val f1: (real -&gt; unit, int -&gt; real -&gt; unit) t
val f2: (unit, int -&gt; real -&gt; unit) t
val p: int -&gt; real -&gt; unit</code></pre>
</div>
</div>
<div class="paragraph">
<p>So, <code>p</code> is a specialized <code>printf</code> function.  We could use it as
follows:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">val () = p 13 17.0
val () = p 14 19.0</code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_type_checking_this_using_a_functor">Type checking this using a functor</h2>
<div class="sectionbody">
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val D: (int -&gt; 'a, 'b) t * string -&gt; ('a, 'b) t
      val F: (real -&gt; 'a, 'b) t * string -&gt; ('a, 'b) t
      val printf: (unit, 'a) t -&gt; 'a
   end

functor Test (P: PRINTF) =
   struct
      open P
      infix D F

      val () = printf (`"here's an int "D" and a real "F".\n") 13 17.0
      val () = printf (`"here's three values ("D", "F", "F").\n") 13 17.0 19.0
   end</code></pre>
</div>
</div>
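<div class="paragraph">
<p>The same functor technique can check the intermediate types of the
guess-and-verify steps without any implementation.  The following is a
sketch under that assumption; the functor name <code>Check</code> and the function
<code>p</code> are introduced here for illustration, and the signature is repeated
so that the block compiles on its own.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val D: (int -&gt; 'a, 'b) t * string -&gt; ('a, 'b) t
      val F: (real -&gt; 'a, 'b) t * string -&gt; ('a, 'b) t
      val printf: (unit, 'a) t -&gt; 'a
   end

functor Check (P: PRINTF) =
   struct
      open P
      infix D F

      (* Every annotation below is verified by the type checker. *)
      fun p (): int -&gt; real -&gt; unit =
         let
            (* guess: both components start out as the final type *)
            val f0: (int -&gt; real -&gt; unit, int -&gt; real -&gt; unit) t =
               `"here's an int "
            (* D strips the leading int from the first component *)
            val f1: (real -&gt; unit, int -&gt; real -&gt; unit) t =
               f0 D " and a real "
            (* F strips the leading real, leaving unit *)
            val f2: (unit, int -&gt; real -&gt; unit) t =
               f1 F ".\n"
         in
            printf f2
         end
   end</code></pre>
</div>
</div>
<div class="paragraph">
<p>Changing any one of the annotations, for example swapping <code>int</code> and
<code>real</code> in the type of <code>f1</code>, should make the functor fail to type check.</p>
</div>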
</div>
</div>
<div class="sect1">
<h2 id="_implementing_printf">Implementing <code>Printf</code></h2>
<div class="sectionbody">
<div class="paragraph">
<p>Think of a format character as a formatter transformer.  It takes the
formatter for the part of the format string before it and transforms
it into a new formatter that first runs that left-hand part, then prints
its own argument followed by the literal string to its right, and then
continues with the rest of the format string.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">structure Printf: PRINTF =
   struct
      datatype ('a, 'b) t = T of (unit -&gt; 'a) -&gt; 'b

      fun printf (T f) = f (fn () =&gt; ())

      fun ` s = T (fn a =&gt; (print s; a ()))

      fun D (T f, s) =
         T (fn g =&gt; f (fn () =&gt; fn i =&gt;
                       (print (Int.toString i); print s; g ())))

      fun F (T f, s) =
         T (fn g =&gt; f (fn () =&gt; fn i =&gt;
                       (print (Real.toString i); print s; g ())))
   end</code></pre>
</div>
</div>
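<div class="paragraph">
<p>To see how the transformers compose, it can help to unfold a small format
by hand.  The sketch below inlines the definitions of <code>`</code>, <code>D</code>, and
<code>printf</code> from the structure above for the format <code>`"x = " D "\n"</code>; the
name <code>expanded</code> is introduced here for illustration only.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">(* Hand-expanded form of:  printf (`"x = " D "\n")  *)
val expanded: int -&gt; unit =
   (* the formatter built by ` "x = " *)
   (fn a =&gt; (print "x = "; a ()))
   (* the continuation that D passes to that formatter *)
   (fn () =&gt; fn i =&gt;
      (print (Int.toString i)
       ; print "\n"
       ; (fn () =&gt; ()) ()))   (* g (), where printf supplied g = fn () =&gt; () *)

(* "x = " was already printed while evaluating the binding above;
 * this call prints "13" and a newline. *)
val () = expanded 13</code></pre>
</div>
</div>
<div class="paragraph">
<p>Note that the literal text is printed as soon as the binding is evaluated,
before any argument has been supplied.</p>
</div>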
</div>
</div>
<div class="sect1">
<h2 id="_testing_printf">Testing printf</h2>
<div class="sectionbody">
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">structure Z = Test (Printf)</code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_user_definable_formats">User-definable formats</h2>
<div class="sectionbody">
<div class="paragraph">
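<p>The definitions of the format characters <code>D</code> and <code>F</code> are nearly
identical: they differ only in the function that converts the argument to
a string.  Within the <code>Printf</code> structure we can abstract over that
function to define a format character generator.</p>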
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">val newFormat: ('a -&gt; string) -&gt; ('a -&gt; 'b, 'c) t * string -&gt; ('b, 'c) t =
   fn toString =&gt; fn (T f, s) =&gt;
   T (fn th =&gt; f (fn () =&gt; fn a =&gt; (print (toString a); print s ; th ())))
val D = fn z =&gt; newFormat Int.toString z
val F = fn z =&gt; newFormat Real.toString z</code></pre>
</div>
</div>
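<div class="paragraph">
<p>The eta-expansion <code>fn z =&gt; ...</code> in the definitions of <code>D</code> and <code>F</code>
matters.  SML&#8217;s value restriction only generalizes the type of a binding
whose right-hand side is a syntactic value, and a partial application such
as <code>newFormat Int.toString</code> is not one.  The following self-contained
sketch repeats the definitions from this page, outside any structure, so
that it compiles on its own.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">datatype ('a, 'b) t = T of (unit -&gt; 'a) -&gt; 'b

fun printf (T f) = f (fn () =&gt; ())

fun ` s = T (fn a =&gt; (print s; a ()))

fun newFormat toString (T f, s) =
   T (fn th =&gt; f (fn () =&gt; fn a =&gt; (print (toString a); print s; th ())))

(* A fn expression is a syntactic value, so D is fully polymorphic in
 * both type parameters of the formatter. *)
val D = fn z =&gt; newFormat Int.toString z

(* By contrast, the right-hand side below is an application, so the value
 * restriction would prevent its type variables from being generalized:
 *
 *    val D = newFormat Int.toString
 *)

infix D

(* D is used at two different instances in one format:
 * first with 'a = int -&gt; unit, then with 'a = unit. *)
val () = printf (`"two ints: " D ", " D "\n") 2 3</code></pre>
</div>
</div>
<div class="paragraph">
<p>With the definitions above, the last line prints <code>two ints: 2, 3</code>
followed by a newline.</p>
</div>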
</div>
</div>
<div class="sect1">
<h2 id="_a_core_printf">A core <code>Printf</code></h2>
<div class="sectionbody">
<div class="paragraph">
<p>We can now have a very small <code>PRINTF</code> signature, and define all
the format characters externally to the core module.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val newFormat: ('a -&gt; string) -&gt; ('a -&gt; 'b, 'c) t * string -&gt; ('b, 'c) t
      val printf: (unit, 'a) t -&gt; 'a
   end

structure Printf: PRINTF =
   struct
      datatype ('a, 'b) t = T of (unit -&gt; 'a) -&gt; 'b

      fun printf (T f) = f (fn () =&gt; ())

      fun ` s = T (fn a =&gt; (print s; a ()))

      fun newFormat toString (T f, s) =
         T (fn th =&gt;
            f (fn () =&gt; fn a =&gt;
               (print (toString a)
                ; print s
                ; th ())))
   end</code></pre>
</div>
</div>
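<div class="paragraph">
<p>As a sketch of what defining format characters externally might look
like, the following repeats the core module and then builds a few format
characters using only <code>newFormat</code>.  The structure name <code>Formats</code> and the
format characters <code>B</code> and <code>S</code> are assumptions introduced for this
example; they are not part of the core module.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val newFormat: ('a -&gt; string) -&gt; ('a -&gt; 'b, 'c) t * string -&gt; ('b, 'c) t
      val printf: (unit, 'a) t -&gt; 'a
   end

structure Printf: PRINTF =
   struct
      datatype ('a, 'b) t = T of (unit -&gt; 'a) -&gt; 'b
      fun printf (T f) = f (fn () =&gt; ())
      fun ` s = T (fn a =&gt; (print s; a ()))
      fun newFormat toString (T f, s) =
         T (fn th =&gt;
            f (fn () =&gt; fn a =&gt; (print (toString a); print s; th ())))
   end

(* Format characters defined outside the core, using only newFormat. *)
structure Formats =
   struct
      val B = fn z =&gt; Printf.newFormat Bool.toString z   (* bool *)
      val D = fn z =&gt; Printf.newFormat Int.toString z    (* int *)
      val S = fn z =&gt; Printf.newFormat (fn s =&gt; s) z     (* string *)
   end

local
   open Printf Formats
   infix B D S
in
   (* prints: name=ada, age=36, admin=true *)
   val () =
      printf (`"name=" S ", age=" D ", admin=" B "\n") "ada" 36 true
end</code></pre>
</div>
</div>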
</div>
</div>
<div class="sect1">
<h2 id="_extending_to_fprintf">Extending to fprintf</h2>
<div class="sectionbody">
<div class="paragraph">
<p>One can implement <code>fprintf</code> by threading the output stream through all
the transformers.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val fprintf: (unit, 'a) t * TextIO.outstream -&gt; 'a
      val newFormat: ('a -&gt; string) -&gt; ('a -&gt; 'b, 'c) t * string -&gt; ('b, 'c) t
      val printf: (unit, 'a) t -&gt; 'a
   end

structure Printf: PRINTF =
   struct
      type out = TextIO.outstream
      val output = TextIO.output

      datatype ('a, 'b) t = T of (out -&gt; 'a) -&gt; out -&gt; 'b

      fun fprintf (T f, out) = f (fn _ =&gt; ()) out

      fun printf t = fprintf (t, TextIO.stdOut)

      fun ` s = T (fn a =&gt; fn out =&gt; (output (out, s); a out))

      fun newFormat toString (T f, s) =
         T (fn g =&gt;
            f (fn out =&gt; fn a =&gt;
               (output (out, toString a)
                ; output (out, s)
                ; g out)))
   end</code></pre>
</div>
</div>
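<div class="paragraph">
<p>A short usage sketch, repeating the module so that it compiles on its own.
The format characters <code>D</code> and <code>S</code> are defined locally with <code>newFormat</code>,
and the file name in the message is a made-up string that is only printed,
never opened.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val fprintf: (unit, 'a) t * TextIO.outstream -&gt; 'a
      val newFormat: ('a -&gt; string) -&gt; ('a -&gt; 'b, 'c) t * string -&gt; ('b, 'c) t
      val printf: (unit, 'a) t -&gt; 'a
   end

structure Printf: PRINTF =
   struct
      type out = TextIO.outstream
      val output = TextIO.output

      datatype ('a, 'b) t = T of (out -&gt; 'a) -&gt; out -&gt; 'b

      fun fprintf (T f, out) = f (fn _ =&gt; ()) out
      fun printf t = fprintf (t, TextIO.stdOut)
      fun ` s = T (fn a =&gt; fn out =&gt; (output (out, s); a out))
      fun newFormat toString (T f, s) =
         T (fn g =&gt;
            f (fn out =&gt; fn a =&gt;
               (output (out, toString a); output (out, s); g out)))
   end

local
   open Printf
   val D = fn z =&gt; newFormat Int.toString z
   val S = fn z =&gt; newFormat (fn s =&gt; s) z
   infix D S
in
   (* The same kind of format expression, directed at standard error. *)
   val () =
      fprintf (`"warning: " D " items skipped in " S "\n", TextIO.stdErr)
         3 "input.txt"

   (* printf is fprintf specialized to standard output. *)
   val () = printf (`"done after " D " steps\n") 10
end</code></pre>
</div>
</div>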
</div>
</div>
<div class="sect1">
<h2 id="_notes">Notes</h2>
<div class="sectionbody">
<div class="ulist">
<ul>
<li>
<p>Lesson: instead of using dependent types for a function, express the
dependency in the type of the argument.</p>
</li>
<li>
<p>If <code>printf</code> is partially applied, it will do the printing then and
there, up to the first missing argument.  Perhaps this could be fixed with some kind of terminator.</p>
<div class="paragraph">
<p>A syntactic or argument terminator is not necessary.  A formatter can
either be eager (as above) or lazy (as below).  A lazy formatter
accumulates enough state to print the entire string.  The simplest
lazy formatter concatenates the strings as they become available:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">structure PrintfLazyConcat: PRINTF =
   struct
      datatype ('a, 'b) t = T of (string -&gt; 'a) -&gt; string -&gt; 'b

      fun printf (T f) = f print ""

      fun ` s = T (fn th =&gt; fn s' =&gt; th (s' ^ s))

      fun newFormat toString (T f, s) =
         T (fn th =&gt;
            f (fn s' =&gt; fn a =&gt;
               th (s' ^ toString a ^ s)))
   end</code></pre>
</div>
</div>
<div class="paragraph">
<p>It is somewhat more efficient to accumulate the strings as a list:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">structure PrintfLazyList: PRINTF =
   struct
      datatype ('a, 'b) t = T of (string list -&gt; 'a) -&gt; string list -&gt; 'b

      fun printf (T f) = f (List.app print o List.rev) []

      fun ` s = T (fn th =&gt; fn ss =&gt; th (s::ss))

      fun newFormat toString (T f, s) =
         T (fn th =&gt;
            f (fn ss =&gt; fn a =&gt;
               th (s::toString a::ss)))
   end</code></pre>
</div>
</div>
</li>
</ul>
</div>
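<div class="paragraph">
<p>The difference between eager and lazy formatters shows up under partial
application.  The following sketch runs the same test against both kinds
of implementation; the names <code>PrintfEager</code>, <code>Demo</code>, and <code>run</code> are
introduced here for illustration, and the two structures repeat
definitions from this page so that the block compiles on its own.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="rouge highlight"><code data-lang="sml">signature PRINTF =
   sig
      type ('a, 'b) t
      val ` : string -&gt; ('a, 'a) t
      val newFormat: ('a -&gt; string) -&gt; ('a -&gt; 'b, 'c) t * string -&gt; ('b, 'c) t
      val printf: (unit, 'a) t -&gt; 'a
   end

(* Eager: each piece is printed as soon as it is reached. *)
structure PrintfEager: PRINTF =
   struct
      datatype ('a, 'b) t = T of (unit -&gt; 'a) -&gt; 'b
      fun printf (T f) = f (fn () =&gt; ())
      fun ` s = T (fn a =&gt; (print s; a ()))
      fun newFormat toString (T f, s) =
         T (fn th =&gt;
            f (fn () =&gt; fn a =&gt; (print (toString a); print s; th ())))
   end

(* Lazy: pieces are collected in reverse and printed at the very end. *)
structure PrintfLazyList: PRINTF =
   struct
      datatype ('a, 'b) t = T of (string list -&gt; 'a) -&gt; string list -&gt; 'b
      fun printf (T f) = f (List.app print o List.rev) []
      fun ` s = T (fn th =&gt; fn ss =&gt; th (s::ss))
      fun newFormat toString (T f, s) =
         T (fn th =&gt; f (fn ss =&gt; fn a =&gt; th (s::toString a::ss)))
   end

functor Demo (P: PRINTF) =
   struct
      open P
      val D = fn z =&gt; newFormat Int.toString z
      infix D

      fun run label =
         let
            val p = printf (`"value: " D "\n")   (* partial application *)
            val () = print ("[" ^ label ^ "] format applied, no argument yet\n")
         in
            p 42
         end
   end

structure E = Demo (PrintfEager)
structure L = Demo (PrintfLazyList)

(* Eager output:  value: [eager] format applied, no argument yet
 *                42                                              *)
val () = E.run "eager"

(* Lazy output:   [lazy] format applied, no argument yet
 *                value: 42                                       *)
val () = L.run "lazy"</code></pre>
</div>
</div>
<div class="paragraph">
<p>With the eager formatter the literal <code>value: </code> escapes before the marker
line; with the lazy formatter nothing is printed until the last argument
arrives.</p>
</div>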
</div>
</div>
<div class="sect1">
<h2 id="_also_see">Also see</h2>
<div class="sectionbody">
<div class="ulist">
<ul>
<li>
<p><a href="Printf">Printf</a></p>
</li>
<li>
<p><a href="References#Danvy98">Functional Unparsing</a></p>
</li>
</ul>
</div>
</div>
</div>
</div>
</body>
</html>