
Hugs debugging primitives
---------------------------

This release of Hugs contains support for debugging by observations
inspired by the Andy Gill's Hood library. [1,2]

Hood is a portable Haskell library that implements the combinator

    Observable a => observe :: String -> a -> a

The partial application

    observe tag

behaves exactly like the identity function, but also records the value
of data to which it is applied. Any observations made are reported at
the end of the computation. The `tag' argument is used to label the
observed value when it is reported. Non-strict semantics is preserved
--- observe does not evaluate its second argument.

HugsHood uses the same observation model but differs in a number of ways.

* It is much faster. This is because HugsHood is implemented within the 
  Hugs evaluator and uses primitive builtin functions.
  Performance depends upon the volume of observations. More frequent
  observations incur a higher overhead. As a simple comparison, a
  test program which executed 1 million reductions and made 250
  observations incurred a 625 percent overhead when observations were made
  with the Hood library but just 10 percent when using HugsHood.

  Caveat: When not using observations, the modifications to the evaluator
  to support HugsHood imposes an overhead of about 6 percent.

* It is possible to easily observe arbitrary data structures. HugsHood 
  implements the primitive 

  observe :: String -> a -> a

  which is unconstrained by the need to build instances of the
  Observable class for each user defined data type whose values are
  being observed. HugsHood uses an internal primitive function to
  display observed values.  This may be considered both an advantage
  and a disadvantage: one does not need to define how to observe
  values, but one cannot define special user views of data.

* No modification to the program (apart from instrumentation with
  observe) is required. The Hood library must be invoked using a 
  special IO monadic combinator to ensure that observations are collected
  and displayed.

* There are a number of minor differences in the display format which are
  a consequence of the Hugs implementation. These are described below.

Using HugsHood
----------------
Modules that use HugsHood combinators must import the module Observe.lhs
from the lib/exts directory. Its only role is to provide the necessary 
primitive definitions, namely:

primitive observe :: String -> a -> a
primitive bkpt    :: String -> a -> a
primitive setBkpt :: String -> Bool -> IO ()

Breakpoints
------------
HugsHood implements breakpoints. A program can be instrumented with the
`bkpt' function. The partial application 
    
    bkpt bkpt_name

behaves exactly like the identity function, except that before it
returns its argument it checks if bkpt_name is enabled, and if it is
the user is presented with the opportunity to view observed data.  A
small set of commands is available when Hugs halts due to a
breakpoint:

  p [tag_name]  Print observations made since the computation began.
                If an observation tag is suppled then only the
                associated observations will be displayed. Otherwise
                all observations will be displayed.

  c [n]         Continue with program evaluation. With no arguments,
                evaluation will continue until another active breakpoint
                is encountered. The optional numeric argument will skip 
                n active breakpoints before stopping.
                
  s bkpt_name   Set a breakpoint.
  
  r [bkpt_name] Reset a named breakpoint or, if no breakpoint name is
                supplied, reset all breakpoints. 

A breakpoint is by default disabled. It can be enabled by using the 
`s' command in the debug breakpoint dialogue, or by using the 
setBkpt combinator. Clearly at least one breakpoint must be enabled
using setBkpt before a breakpoint dialogue can be triggered.

Breakpoint Example
------------------
Here is a very simple program using the three combinators.

import Observe

prog n = do { setBkpt "fib" True; putStr $ show (observe "fun" f n) }
f 0 = 1
f n = n * (bkpt "fib" $ observe "fun" f (n-1))

The following sample session shows how the `p' add `c' commands
can be used.

Main> prog 4
Break @ fib> p

>>>>>>> Observations <<<<<<

fun
  { \ 4  -> _
  }

Break @ fib> c
Break @ fib> p

>>>>>>> Observations <<<<<<

fun
  { \ 4  -> _
  , \ 3  -> _
  }

Break @ fib> c 2
Break @ fib> p

>>>>>>> Observations <<<<<<

fun
  { \ 4  -> _
  , \ 3  -> _
  , \ 2  -> _
  , \ 1  -> _
  }

Break @ fib> c
24
(98 reductions, 299 cells)

>>>>>>> Observations <<<<<<

fun
  { \ 4  -> 24
  , \ 3  -> 6
  , \ 2  -> 2
  , \ 1  -> 1
  , \ 0  -> 1
  }

10 observations recorded

   
Differences from Hood
---------------------

HugsHood uses a similar style of display to Hood, though there are
differences. One trivial difference is that Hood reports tags with a
leading `--' while HugsHood does not.  

Consider now more significant differences. 

Observing character strings
...........................

HugsHood (and Hood) reports lists using the cons operator.

Observe> observe "list" [1..3]
[1,2,3]

>>>>>>> Observations <<<<<<

list
  (1 : 2 : 3 : [])

This is too verbose for lists of characters, so HugsHood reports
strings in the usual format:

Observe> observe "string" ['a'..'d']
"abcd"

>>>>>>> Observations <<<<<<

string
  "abcd"

If only the initial part of the string is evaluated, a trailing "..." is
reported. 

Observe> take 2  $ observe "string" ['a'..'d']
"ab"

>>>>>>> Observations <<<<<<

string
  "ab..."

This is clearly ambiguous, because evaluating the expression

observe "string" "ab..." 

will give the same result, but in practice the ambiguity 
should be easy to resolve.

Unevaluated expressions
.......................

The `_' symbol is used to indicate an unevaluated expression.
In Hood all unevaluated expressions will be displayed using _.
In HugsHood, _ denotes an unevaluated expression, but not all unevaluated
expression are denoted by _.

For example the expression fst $ observe "pair" (1,2) yields

-- pair
  (1, _)

in both Hugs and HugsHood. However, fst $ observe "pair" ('a','b') yields

pair
  ('a','b')

in HugsHood, and ('a', _) in Hood. This is because HugsHood (unlike
Hood) does not actually record evaluation steps. It merely maintains an
internal pointer to that part of the heap representing the tagged
expression. If the expression in not in weak head normal form, then it
obviously has not been evaluated and so it is reported as just `_';
otherwise it displayed.  Integer constants like 1 and 2 are not in
WHNF, as they must be coerced to the correct type when evaluated.
Characters though are in WHNF so it is not possible to discern whether
a character was evaluated.

Another consequence of the HugsHood implementation by pointers rather
than Hood's implementation by tracing evaluation is that the strictness
behaviour of a function can be masked. Consider the example:

lazy pair = let x = observe "fst" fst pair
                y = snd pair
            in  (y,x)

For the expression lazy (1,2) Hood reports 

-- fst
  { \ (1, _)  -> 1
  }
while HugsHood reports

fst
  { \ (1,2)  -> 1
  }

HugsHood should not be used to deduce the strictness behaviour of a
function, or it should be done only with caution 

Interaction with the root optimisation
......................................

The hugs compiler uses an optimisation when generating code that builds
expressions on the heap. If a function definition has the form

f arg1 .. argN = ..... f arg1 .. argM .....   where 1<=M<=N 

then the expression graph for f arg1 .. argM is copied rather than
rebuilt from individual application nodes. This interacts with the
observation algorithm so that observing functions of the above form give
unexpected results.

For instance consider the expression

observe "fold" foldl (+) 0 [1..3]

When the root optimisation is applied to the compilation of foldl, we see

fold
  { \ primPlusInteger 6 []  -> 6
  , \ { \ 3 3  -> 6
      } 3 (3 : [])  -> 6
  , \ { \ 1 2  -> 3
      } 1 (2 : 3 : [])  -> 6
  , \ { \ 0 1  -> 1
      } 0 (1 : 2 : 3 : [])  -> 6

instead of the expected

fold
  { \ { \ 0 1  -> 1
      , \ 1 2  -> 3
      , \ 3 3  -> 6
      } 0 (1 : 2 : 3 : [])  -> 6
  }

The first form reports the arguments at each application of foldl,
while the second reports the arguments for just the initial application 
(the one marked by observe).

The root optimisation can be disabled using the -R option. This can be
done from the command line or by using :s -R at the Hugs prompt. If you
want to compile the prelude definitions without the root optimisation
you must invoke Hugs with the -R option. 

Testing of execution time with and without the root optimisation for a
selection of 23 benchmarks from the nofib suite has been carried out.
All but 5 tests resulted in an execution time penalty of less than 3%
when running without root optimisation (some even showed a very minor
speedup).

Known problems
--------------

Hugs can produce infinite (cyclic) dictionaries when implementing overloading.
The observation reporting mechanism does not detect these at present, which
leads to a non-terminating report. We plan to address this in a future
release. 

Bug Reports
------------
Please report bugs to Richard Watson, rwatson@usq.edu.au

In particular, if the message
    Warning: observation sanity counter > 0
appears, and your program has not terminated abnormally, please report
the error situation.

References
----------

[1] Andy Gill. Debugging Haskell by Observing Intermediate Data
    Structures. In Draft Proceedings of the 2000 Haskell Workshop.
[2] The Haskell Object Observation Debugger
    http://www.haskell.org/hood/
