                      The Collection-Extensions library
                      =================================

  Copyright (c) 1994  Carnegie Mellon University
  All rights reserved.

  The various modules in this library contain a few new types and operations
  which are compatible with the collection types specified in the Dylan
  Interim Reference Manual, but which are not part of that specification.  

  Users of Mindy 1.1 should be aware that the string-searching functions of
  the string-search module have been moved to the String-extensions library.
  The collection-extensions module "string-search" has for that reason been
  renamed "vector-search".  The semantics of the moved functions have been
  changed, so string-search users are advised to read the String-extensions
  documentation.  

  It is to be expected that more collection types will appear in time, and
  they will likely be added to this library.  This may also result in
  reorganizations which could force incompatible changes to the existing
  modules.  We hope to minimize such imcompatibilities and, when forced to
  them, will include sufficient information to facilitate conversion of
  existing code.  

  The particular modules provided at present are:  

  heap 
      Provides <heap>, a popular data structure for priority queues.  The
      semantics are basically those of a sorted sequence, with particularly
      efficient implementations of add!, first, and pop (i.e.
      "remove-first").  
  self-organizing-list 
      Provides "self-organizing lists".  These explicit key collections
      provide roughly the semantics of hash tables, but use a probabilistic
      implementation which provides O(n) worst case performance but can
      provide very fast constant time access in the best case.  
  subseq 
      Provides "subsequences", which represent an aliased reference to some
      part of an existing sequence.  These are analogous to slices (in Ada or
      Perl) or displaced arrays (in Common Lisp).  Subsequences are
      themselves subclasses of <sequence>, and can therefore be passed any
      <collection> or <sequence> operation.  
  vector-search 
      Provides a small assortment of specialized operations for searching and
      modifying <vector>s.  These operations are analogous to existing
      collection operations but provide keywords and efficiency improvements
      which are meaningful only within the more limited domain. 

Heap
----
  A heap is an implementation of the abstract data type "sorted list". A heap
  is a sorted sequence of items.  Most likely the user will end up writing
  something like 

  define class <heap-item> (<object>);
    slot priority;
    slot data;
  end class <heap-item>;

  with appropriate methods defined for < and =. The user could, however, have
  simply a sorted list of integers, or have some item where the priority is
  an integral part of the item itself.  

  make on heaps supports the less-than: keyword, which supply the heap's
  comparison and defaults to <.  

  Heaps support all the usual sequence operations. The most useful ones:  

       push(heap, item) => updated-heap
       pop(heap)        => smallest-element
       first(heap)      => smallest-element
       second(heap)     => second-smallest-element
       add!(heap, item) => updated-heap
       sort, sort!      => sorted-sequence

  These are all "efficient" operations (defined below).  As with <deque>,
  push is another name for add!, and does exactly the same thing except that
  push doesn't accept any keywords.  sort and sort! return a sequence that's
  not a heap. Not necessarily efficient but useful anyhow:  

       add-new!(heap, item, #key test:, efficient:) => updated-heap
       remove!(heap, item, #key test:, efficient:) => updated-heap
       member?(heap, item, #key test:, efficient:) => <boolean>

  The efficient: keyword defaults to #f. If #t, it uses the
  random-iteration-protocol (which is considerably more efficient, but isn't
  really standard behavior, so it had to be optional).  Conceivably most
  sequence methods could support such a keyword, but they don't yet.  

  The user can use element-setter or the iteration protocol to change an item
  in the heap, but changing the priority of an item is an error and Bad
  Things(tm) will happen. No error will be signaled.  Both of these
  operations are very inefficient.  

  Heaps are NOT <stretchy-collection>s, although add! and pop can magically
  change the size of the heap.  

  Efficiency: Approximate running times of different operations are given
  below: (N is the size of the heap) 

      first, first-setter                             O(1)
      second (but not second-setter)                  O(1)
      size                                            O(1)
      add!                                            O(lg N)
      push                                            O(lg N)
      pop(heap)                                       O(lg N)
      sort, sort!                                     O(N * lg N)
      forward-iteration-protocol          
                              setup:                  O(N)
                              next-state:             O(lg N)
                              current-element:        O(1)
                              current-element-setter: O(N)
      backwards-iteration-protocol
                              setup:                  O(N * lg N)
                              next-state:             O(1)
                              current-element:        O(1)
                              current-element-setter: O(N)
      random-iteration-protocol           
                              setup:                  O(1)
                              next-state:             O(1)
                              current-element:        O(1)
                              current-element-setter: O(1)
      element(heap, M)                                O(M*lg N + N)
      element-setter(value, heap, M)                  O(N + M*lg N + M)

  element, element-setter on arbitrary keys use the
  forward-iteration-protocol (via the inherited methods), and have
  accordingly bad performance.  

Self-Organizing-List
--------------------
  The "Self Organizing List" is a "poor man's hash table".  More precisely,
  <self-organizing-list> is a subclass of <dictionary> for which addition and
  retrieval are both linear in the worst case, but which use a probabilistic
  strategy which yields nearly constant time in the best case.  (A
  <dictionary> is a subclass of <mutable-explicit-key-collection> and
  <stretchy-collection&gt, and is described in mindy.doc) 

  Because they have a very low overhead, self-organizing lists may provide
  better peformance than hash tables in cases where references have a high
  degree of temporal locality.  They may also be useful in situations where
  it is difficult to create a proper hash function.  

  Define new self-organizing-lists with 

    make(<self-organizing-list>, test: test)

  Test is expected to be an equality function.  In particular, it is expected
  to satisfy the identity and transitivity requirements described in chapter
  5.  If not specified, test defaults to \==.  

Subseq
------
  <Subsequence> is a new subclass of <sequence>.  A subsequence represents an
  aliased reference to some part of an existing sequence.  Although they may
  be created by make (with required keywords source:, start: and end:) on one
  of the instantiable subclasses, they are more often created by calls of the
  form 

    subsequence(sequence, start: 0, end: 3)

  where start: and end: are optional keywords which default to the beginning
  and end, respectively, of the source sequence.  No other new operations are
  defined for subsequences, since all necessary operations are inherited from
  <sequence>.  

  Because subsequences are aliased references into other sequences, several
  properties must be remembered:  

   1. The contents of a subsequence are undefined after any destructive
      operation upon the source sequence.  
   2. Destructive operations upon subsequences may be reflected in the
      source.  The results of reverse! and sort! should be expected to affect
      the source sequence for vector subsequences.  

  If the source sequences are instances of <vector> or <string>, then the
  implementation will use subclasses of <subsequence> which are also
  subclasses of <vector> or <string>.  

  Efficiency notes:  

   1. The implementation tries to insure that subsequences of subsequences
      can be accessed as efficiently as the original subsequence.  (For
      example, the result of 

        subsequence(subsequence(source, start: 1), start: 2)

      would produce a subsequence identical to the one produced by 

        subsequence(source, start: 3)

   2. Vector subsequences, like all other vectors, implement constant time
      element access.  
   3. Non-vector subsequences may take non-constant time to create, but will
      provide constant-time access to the first element.  This should produce
      the best performance provided some element of the subsequence is
      accessed at least once.  

Vector-Search
-------------
  The "vector-search" module provides basic search and replace capabilities
  upon restricted subsets of <sequence> -- primarily <vector>.  Exploiting
  the known properties of these types yields substantially better performance
  than can be achieved for sequences in  general.  

  The following functions are supplied:  

  find-first-key vector predicate? #key start end failure => key 
      Find the index of first element (after start but before end) of a
      vector which satisfies the given predicate.  If no matching element is
      found, return failure.  The defaults for start, end and failure are,
      respectively,  0, size(vector), and #f.  This function is like
      find-key, but accepts start: and end: rather than skip:.) 

  find-last-key vector predicate? #key start end failure => key 
      This is like find-first-key, but goes backward from end.  

