Received: from ics.uci.edu by ICS.UCI.EDU id aa19129; 24 May 90 21:53 PDT
To: ML-LIST: ;
Subject: Machine Learning List: Vol. 2 No. 9
Reply-to: ml@ICS.UCI.EDU
Date: Thu, 24 May 90 21:46:07 -0700
From: Michael Pazzani <pazzani@ICS.UCI.EDU>
Message-ID:  <9005242153.aa19129@ICS.UCI.EDU>


		 Machine Learning List: Vol. 2 No. 9
			 Thursday,  May 24, 1990

Contents:
	Machine Learning Journal special issue on discovery
	Report on Change of Representation Workshop
	Get Real!  Comments on ML 2(8)

The Machine Learning List is moderated.  Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added or deleted to ml-request@ics.uci.edu.  Back issues 
may be FTP'd from ics.uci.edu in /usr2/spool/ftp/pub/ml-list/V<X>/<N> or N.Z 
where X and N are the volume and number of the issue; ID & password: anonymous

------------------------------
From: ZYTKOW@wsuiar.wsu.ukans.edu
Date: Thu, 24 May 90 19:25 CST
Subject: Machine Learning Journal special issue on discovery

A special issue of the Machine Learning Journal devoted to machine
discovery is planned for the near future. Jan Zytkow will be a guest
editor for that issue.

The papers for the special issue should be submitted by October 30,
1990. All papers will be subject to the standard evaluation procedures.
An ideal paper should satisfy all the ML Journal criteria, as discussed
in various editorial statements. An ideal paper should, of course, make
a substantial contribution to machine discovery.

The preponderance of the work on machine discovery has been done in the
context of discovery in science. There are many good reasons to continue
that direction of research, but other applications are also welcome.
Papers are encouraged on discoveries in databases and virtually any
other application of discovery techniques, provided that the results are
substantial. Because sciences offer us a collection of relatively clear
discovery standards, it is a good idea to compare your method and the
application domain you use with an appropriate scientific counterpart,
and to relate your work to the existing research on scientific
discovery.

An issue of common interest to both the learning and the discovery
researchers is the borderline between discovery and learning, not merely
for the sake of making conceptual distinctions, but because important
research issues can be identified this way for both machine learning and
discovery. The boundary is vague, and although it may never be sharp, we
may try to go beyond the standard supervised vs. unsupervised
distinction and to identify the mechanisms missing in the learning
systems that allow an autonomous discovery system to collect data,
decide on important research problems, and to evaluate its own results.
By opposing the discoverer to the learner we may stimulate the interest
of the broader machine learning community in research on discovery, and
we may discover new problems for our future research.

Both comparisons (non-scientific to scientific discovery; discovery
to learning) should consume up to 10% of a paper, ideally in the
discussion following the main results, unless either of these comparisons
is a major motivation for the paper.

If you haven't been contacted yet, and you are interested in submitting
a paper, please send a message to Jan Zytkow. You will be included at
the mailing list and you may participate in the discussion on the
borderline between discovery and learning.

-Jan Zytkow

my email address:   zytkow@wsuiar.wsu.ukans.edu  or  zytkow@twsuvax.bitnet
my USMail address:  Computer Science Department
                    Wichita State University
                    Wichita, KS  67208


------------------------------
Subject: Report on Change of Representation Workshop
Date: Wed, 16 May 90 10:33:53 +0100
From: wrobel@gmdzi.uucp

   
                    Report about the 2nd Workshop on
          Change of Representation and Problem Reformulation
   
                            Stefan Wrobel
         GMD (German Natl. Research Center for Computer Science)
                   Research Group Expert Systems
                          wrobel @ gmdzi.uucp
   
                            10. 5. 1990
   
   
   ### 1. Introduction ###
   
   The *2nd Workshop on Change of Representation and Problem  Refor-
   mulation*  was  competently  organized  by  Jeffrey van Baalen of
   Price Waterhouse, an international business consulting firm,  and
   took  place  at their research center in Menlo Park, CA, on March
   24 and 25, 1990. Even though a public call for  papers  had  been
   posted,  attendance  was  by  invitation  only based on submitted
   research summaries and/or extended paper abstracts. The resulting
   audience  of 34 researchers was very qualified and focused on the
   topic of the workshop, even though  not  all  groups  working  on
   representation  adjustment  were  represented.  European partici-
   pance was very low as well.
   
   The sessions themselves were divided into 2 or 3 plenary talks of
   35  minutes  including discussion, augmented by special 30 minute
   discussion periods that concluded a group of related talks.  This
   organization provided sufficient discussion time, even though the
   summary discussions were often used for questions  to  individual
   presenters. This suggests that longer question periods might have
   been beneficial. Since meals were  served  ``on  location'',  the
   small  size  of  the  workshop  allowed for good interaction both
   within and between sessions.
   
   The research presented at the workshop reflects a subdivision  of
   the  Change of Representation field into three lines of work, two
   of which are oriented towards problem solving, and one  which  is
   oriented  towards Machine Learning. A fourth group of researchers
   was presenting their theoretically oriented work  on  representa-
   tion  change  based on category theory.  In the following we sum-
   marize the talks in each of those groups. Notably absent from the
   workshop  was  any  work  on the emergence of representation from
   perceptual information.  There are no  official  proceedings,  so
   papers should be obtained by contacting the authors directly.
   
   ### 2. Reformulation for problem solving ###
   
   The first group, which we might call ``reformulation for  problem
   solving'',  is  concerned  with  transformations  of  a problem's
   search space in order to make it simpler, smaller, or more  effi-
   cient  to  work with.  Those reformulations are usually predeter-
   mined and carried out automatically, i.e., here, the focus is not
   (yet)  on  automating  the  process  of designing reformulations.
   *Saul Amarel's  (Rutgers)* classical work on the Missionaries and
   Cannibals  problem  is  a good  example.  At the workshop, Amarel
   discussed this work and its continuations  in an invited talk.

   *Jeffrey van Baalen  (Price Waterhouse)*  described  a  technique 
   that  automatically  designs  representations  by  moving general 
   logical axioms expressing  a  problem's  constraints  into  those 
   representations.   Such  representations reflect constraints syn-
   tactically, eg.  by changing the representation  of  a  predicate  
   with functional properties into  a function, or  creating special 
   representational objects that locally enforce eg.  symmetry  (cf. 
   hybrid reasoners).   He  also  showed how a theorem prover solves 
   problems much more efficiently  in  the  representations that the 
   technique designs.
   
   *Craig Knoblock (CMU)*  presented  a  way  of  learning  problem-
   specific  abstraction hierarchies for hierarchical planning.  His
   ALPINE  system   determines  classes  of  literals  that  can  be
   achieved  without interfering with  classes higher in the hierar-
   chy based on the operators' preand postconditions;  since  con-
   dition  arguments/instantiations  are ignored, the granularity of
   the  classes is relatively coarse.
   
   The work of *David Zhu  Jean-Claude  Latombe  (Stanford)*   dealt
   with  reformulation in robot motion path planning.  From the ini-
   tial representation of the robot as a rectangle and the shapes of
   the  obstacles,   their  system  computes ``grown'' shapes of the
   obstacles that allow the  representation of the robot as a point,
   making path computations a lot easier.
   
   Finally, *Daniel Weld (Univ. of Washington)*  examined how a sys-
   tem  can select  an appropriate model for a prediction task based
   on a given set of models, i.e.,  the focus of the work is on  how
   to  compare  two models, and estimate how  their differences will
   affect the predictions.
   
   ### 3. Compilation for problem solving ###
   
   The second group, which we might call ``compilation  for  problem
   solving'',  is  concerned with the generation of efficiently exe-
   cutable forms of an available general model.  In contrast to  the
   first  group,  this  usually does not involve the introduction of
   new representation elements (as in eg. van  Baalen's  work),  but
   consists  instead of determining which parts of the general model
   are relevant for a particular solution, and  expressing  them  in
   operational  terms to yield a problem solving rule.  Explanation-
   based learning is of this general type.
   
   At the workshop, the work of *Giuseppe Cerbone and  Tom  Dietter-
   ich* dealt with the problem of how to deal with numeric models in
   compilation, presenting  a  compilation-based  solution  for  the
   non-linear  optimization  task  of  2-d structural design.  Their
   system analyzes numerically-derived  designs  to  detect  general
   geometric  patterns  that can act as constraints in future design
   task; new geometric features are created when necessary.
   
   *Nick Flann (also Oregon State  Univ.)*  presented a  method  for
   dealing  with  intractable proofs in EBL as they occur eg. in op-
   ponent planning, where one needs to  operationalize  proofs  that
   include  universal quantification over operators.  By employing a
   more abstract representation of the  influence  of  operators  on
   goals,  the  proofs  can be restricted to the relevant operators,
   allowing an operationalization that  is  incomplete  (applies  to
   only part of the problem domain), but guaranteed correct.
   
   In short presentations during a special session  on  ``Reformula-
   tion  for  engineering  problem  solving'',  *Brian  Falkenhainer
   (Xerox PARC)*, *Mark  Shirley (Xerox PARC)*, and *Richard  Keller
   (NASA  Ames)* discussed the general  issues in (qualitative) rea-
   soning from general models. With respect to  representation, they
   emphasized  the  necessity of using multiple models at  different
   levels  of  granularity  and  with  different  viewpoints  (Falk-
   enhainer),   and  being  able  to  combine  them into specialized
   models with less coverage but  better efficiency (Keller).
   
   ### 4. Inductive change of representation ###
   
   The third line of research,  ``inductive  change  of  representa-
   tion'', seeks to automate the process of Change of Representation
   by introducing new terms or selecting a better  formalism.   Most
   of  the  presented  work  was within the ``traditional'' symbolic
   learning framework.
   
   *Ruediger Wirth (UC Irvine)*  presented  the  inverse  resolution
   ``G''  operators   that are used in his LFP2 system.  They intro-
   duce new terms based on the  inversion  of  resolution  steps  in
   several  (failed)  proofs.  This restricts the search  space com-
   pared  to  inverse  resolution  based  on  one  example  (as   in
   Muggleton's   CIGOL) and allows the relaxation of the unit clause
   restriction, but requires that  all relevant facts are  known  to
   prevent overgeneralization.
   
   *Kevin Thompson  and  Pat  Langley  (NASA  Ames)*  described  LA-
   BYRINTH, an extension of existing  conceptual clustering programs
   (particularly COBWEB) to inputs  using   structured  (relational)
   information.  The  key  idea in the approach is to regard  struc-
   tured objects as trees based on the  ``part-of''  relation.   The
   system  then  first  clusters the leaves of this tree (the atomic
   parts of the structured object). It then  replaces  those  leaves
   by their now-known class names, which means the next  higher lev-
   el in the tree is now described by non-structured attributes, and
   thus   be  clustered by repeated application of the same process.
   This reuse of  discovered terms is new in  search-based  cluster-
   ing.
   
   *Stefan Wrobel (GMD)*  presented  the  representation  adjustment
   method used in MOBAL, which also uses structured information, but
   introduces terms in a demand-driven fashion,  thus  avoiding  the
   resource  allocation problem between rule  discovery and new term
   introduction. The approach exploits rule exception  sets  as  the
   basis  for  forming  new concepts, checks the quality of new con-
   cepts, and reuses them for restructuring the rule base and future
   learning.
   
   *Raj Seshu (Univ. of Illinois)* showed  how the  introduction  of
   new  features can overcome the ``parity problem'' that  occurs in
   decision-tree induction (TDIDT)  systems  (ID3  and  successors):
   they fail  to learn concepts that are defined by the n-ary parity
   operation.  Seshu's system  introduces a small number  of  parity
   features,  resulting in improved learning  on the parity problem,
   but only at the price  of  required  ``bias''  training,  and  of
   slightly  worse  performance on some other problems.  Also, there
   was some  discussion in the audience as to how important the par-
   ity problem is in  practice.
   
   *Sharad Saxena (Univ.  of  Massachusetts  Amherst)*  addressed  a
   presently  completely  open problem in representation change: how
   to evaluate  a given representation.  To tackle the  problem,  he
   restricted  it  to the case of  decision tree learning from exam-
   ples, and proposed to  measure  representation   quality  by  the
   length  of  an  approximate DNF description of the target concept
   that is generated from the examples before  the  actual  learning
   algorithm  is  run.   At present, computing this approximate con-
   cept description is  more  expensive   than  running  the  actual
   learning  algorithm  ($n^2$ vs. $n  be able to reduce the cost to
   linear in the future.
   
   *Craig Shaefer  (Rowland Inst. for Science)*  was  the  sole  ex-
   ponent  of  another  learning  paradigm, genetic algorithms.  His
   ARGOT system improves the quality of  genetic algorithm  learning
   by  changing the representation employed for the  chromosomes, so
   that the GA sampling is restricted to parts of the  total   solu-
   tion  space.   To that end, ARGOT employs several heuristics that
   either  narrow or widen the space based on the characteristics of
   payoff evolution.
   
   ### 5. Theoretical Foundations ###
   
   The field of representation change does not have a solid theoret-
   ical  foundation  yet.   One  approach that is being favored by a
   number of people is category theoretic models.  To introduce  the
   rest of the community to this mathematical theory, one session of
   the workshop was devoted to a category theory tutorial  given  by
   *Paul  Benjamin  (Philips)*,  *Mike  Lowry  (Kestrel  Inst.)* and
   *Robert  Zimmer (Brunel Univ.)*, who also presented a theoretical
   investigation  of   Rubik's  cube  representations (Benjamin) and
   some  theoretical  results  about  the   nature  of   homomorphic
   representation change as defined by Korf (Lowry).


Stefan Wrobel
Gesellschaft fuer Mathematik und Datenverarbeitung (GMD)
F3/XPS
Pf. 1240
D-5205 St. Augustin 1
Tel.: 02241/14-2093          E-mail: wrobel @ gmdzi.uucp
------------------------------
Date: Thu, 17 May 90 19:16:25 PDT
From: Jeff Shrager <shrager@parc.xerox.COM>
Subject: Get Real!  Comments on ML 2(8) &c

I must say that machine learning has gotten pretty boring, what with
trivial (or worse, randomly-generated) test data, a notion of "explanation"
that amounts to logical proof+chunking, and a notion of "concept" that's in
the dark ages of objectivist psycho-philosophy.... Oh, yes, either these or
connectionism and it's probable-equivalents, which have an infinite
algorithmic search space that no one understands, and so which are grounded
neither on an architectural, goal-directed, or psychological basis.

If those are the answers, I'm no longer interested in the questions!  

What ever happened to the goals of understanding human learning,
explanation, development, concept formation?  Instead of random test data
or soy bean data, why aren't you feeding your systems protocols of parent
or teacher input gathered from real homes or classrooms?  Yeah, I know,
you're trying to get at the deep computational properties and scope of
these algorithms...blah blah blah.  There's something to this, but it's a
shame to see a whole community drop into this black hole.  And anyway, how
do you know that the algorithms you're spending so much time studying the
details of are even vaguely plausible either as engineering tools or as
psychological models?  

Try this.  Tape record "All Things Considered" or "McNeil-Lehrer" one day
and transcribe the explanations that take place there.  (Note that the
whole show is really one big explanation, in some very interesting sense of
explanation!)  Alternatively, go sit on a park bench on a Saturday, or in a
museum and write down the explanations you hear people giving their
children or between peers...or watch them trying to figure out the exhibits
and try to gently engage these folks in explaining them to you...  or even
just write down what the exhibit placards say!  If you have children of 2
to 6 years old, or so, apy attention to what you tell them or what your
spouse tells them or what their freinds tell them.  I challenge you to fit
these phenomena into something even vaguely related to the ML notions of
data, explanation, activity, or concept.

If real world contact scares you, you can just pick up Holland, Holyoak,
Nisbett, & Thagard's "Induction" (try really really hard to ignore the
computational mumbo-jumbo) and study the excellent examples!  Here's the
beginning of a bibliography that every Machine Learning person that cares
about "real" learning and intelligence should read or at least know about.
There are certainly other works that don't happen to be in the tips of my
fingers, and a million papers in, around, and similar to these major works.
I've focussed on books simply to make my typing life easier.  I encourage
folks to add to this list their favorite contributions to the
exemplification of real intelligence and learning.

-Jeff

-> On scientific reasoning:

Feyerabend, P. (1975). Against method.  New York: Verso. **

Gholson, B., Shadish, W.R., Neimeyer, R.A., & Houts, A.C. (1989) Psychology
of science: Contributions to metascience. Cambridge University Press. %

Hacking, I. (1983). Representing and intervening.  Cambridge: Cambridge
University Press. **

Lakatos, I. (1978). The methodology of scientific research programmes.
Cambridge University Press. **

(anything by I. Lakatos)

Mayr, E. (1982). The growth of biological thought.  Cambridge, MA: Harvard
University Press. **

Suppe, F. (1977). The structure of scientific theories. Urbana, IL:
University of Illinois Press.  (esp. Suppe's critical introduction and
conclusions) *

-> On culture and development:

Bruner, J. (1983). Child's talk. New York: Norton. **

Burner, J., & Haste, H. (1987) Making sense: The child's construction of
the world.  New York: Methuen. *

(anything by J. Bruner)

Geertz, C. (1983). Local knowledge.  New York: Basic Books.**

Geertz, C. (1973). The interpretation of cultures.  New York: Basic Books.**

Gruber, H.E., & Voneche, J.J. (1977). The essential Piaget.  New York:
Basic Books. *

Heritage, J. (1984). Garfinkel and ethnomethodology.  Cambridge, UK: Polity
Press. *

Landau, B., & Gleitman, L. R. (1985). Language and experience: Evidence
from the blind child.  Harvard University Press. **

Vygotsky, L.S. (1934/1962).  Thought and language.  Cambridge, MA: MIT
Press and Wiley. **

Wertsch, J.V. (1985). Vygotsky and the social construction of mind.
Harvard University Press. *

-> On not-so-cognitive processes:

Arkes, H.R., & Hammond, K.R. (1986). Judgment and decision making: An
interdisciplinary reader.  Cambridge University Press.  (esp. those
chapters that give examples of real decisions, e.g., in politics) %

Luhrman, T.M. (1989).  Persuasions of the witch's craft.  Cambridge, MA:
Harvard University Press.  **

Neisser, U. (1982). Memory observed. New York: Freeman. *

Rogoff, B. & Lave, J. (1984). Everyday cognition: Its development in a
social context.  Harvard University Press. %

(anything by B. Rogoff)

Sudnow, D. (?) Ways of the hand. 
   (I don't have a complete reference for this.)

---

Key:

* Collections or reviews of important work, so providing reviews
    of the field (or of some line of research) at some interesting
    historical point as well as a place to begin study.

% Collections or anthologies, some contents of which make good top-nodes
    for entry an important literature.

** Major contributions, giving insightful analyses of
     activity or historical phenomena.


------------------------------
Date: Mon, 21 May 90 18:11:12 EDT
From: mostow@cs.rutgers.edu
Subject: Re: Get Real!  Comments on ML 2(8) &c

Interesting statement.  I hope it provokes informative discussion.  Naturally I
don't think it applies to my own work, largely because I'm not trying to
model human learning.

On explanation-as-proof:  Just because the initial version is simplistic is not
a reason to replace the powerful language and tools of logic and proof with
some new representation whose semantics and properties are less understood.  It
is a good reason to extend the notion of explanation to overcome recognized
limitations.  Examples of such extensions include work by William Cohen and
others on abductive explanation-based learning, and work by Tom Ellman, Prasad
Tadepalli, Scott Bennett, Steve Chien, Neeraj Bhatnagar, and others on using
simplifying assumptions to derive approximate explanations.  Moreover, using
proofs as explanations may be quite reasonable for learning control knowledge;
the problems seem more to do with intractability than with conceptual
impoverishment.  There is sometimes an unfortunate tendency to condemn a line
of research prematurely just because it doesn't achieve total immediate success
or suffers from limitations.  Unless those limitations are truly intrinsic to
the entire approach, it can pay to figure out how to overcome them.

On plausibility:  An emerging and informative (albeit imperfect) plausibility
criterion for learning is based on the PAC paradigm introduced by Valiant.
William Cohen's extension of this paradigm to performance improvement learning
shows that it is not limited to simple pure induction.

On connectionism:  If neural networks really have an "algorithmic search space
that no one understands," but manage to do some interesting things anyway, I'd
think that would be a pretty good reason to study them in order to understand
them better.

In general, such broadsides are harder to address than specific criticisms of
specific research.  However, they can be healthy insofar as they challenge us
to justify our work and reexamine our goals.  Jack Mostow

------------------------------
Date: Mon, 21 May 1990 18:38:57 EDT
From: Jaime Carbonell <jgc@nl.cs.cmu.edu>
Subject: Re: Get Real!  Comments on ML 2(8) &c

  I do not find Shrager's unfocused attack on Machine Leaning
particularly interesting or worthy of much reflection.  "Broadside"
was Mostow's appropriate characterization thereof.  A much more useful
criticism would focus upon one particular area of ML at a time,
analyze it in depth, and suggest well-thought-out reasons why the
research was not being well conducted, rather than Shrager's blanket
opinion that the field is boring.  Experimental Psychology appreciates
the value of thorough empirical testing, of balanced data sets, of
replicable experiments, etc.  Mathematics necessarily simplifies the
world in order to model formally some aspect of interest.  Computer
Science addresses tractable problems and attempts to push the
boundaries of what can be computed tractably.  Machine learning relies
on these disciplines, rather than some bull-in-a-china-shop approach.
  
  --Jaime Carbonell

------------------------------
END of ML-LIST 2.9


