Return-Path:
Received: from super-pan.ics.uci.edu by paris.ics.uci.edu id aa15365; 14 Nov 97 17:01 PST
To: smyth@super-pan.ics.uci.edu
Subject: Tutorial Proposal.
Date: Fri, 14 Nov 1997 17:04:51 -0800
From: Michael Pazzani
Message-ID: <9711141701.aa15365@paris.ics.uci.edu>

I hope e-mail is acceptable. In your mailbox, there's also
1. Previous tutorial notes (Me)
2. CV (Hearst)

November 13, 1997

Professor Padhraic Smyth, Tutorials Chair

RE: AAAI tutorial

We propose a tutorial to be presented at AAAI-97 on Advanced Techniques for Information Access. We are including a brief description of the tutorial, a statement of the background required and target audience, a discussion of why the tutorial is relevant, and our resumes.

Prof. Pazzani has given a similar tutorial at IJCAI-97 and at the Information, Statistics and Induction in Science Conference held at Monash University in Melbourne, Australia. He is including the complete set of overhead transparencies from this tutorial. Prof. Hearst is currently co-teaching a course in Information Organization and Retrieval; notes from this course can be reached from her home page, http://www.sims.berkeley.edu/~hearst

Due to the rapidly changing nature of this field, we will update these notes with recent developments before presentation at AAAI. The tutorial outline is included as the first few pages of the notes. The tutorial was attended by approximately 10% of the Monash conference attendees and by 70 people at IJCAI.

Marti A. Hearst & Michael J. Pazzani

Address:

Michael Pazzani, Chair
Department of Information and Computer Science
University of California
Irvine, CA 92717-3425
phone (714) 824-5888
fax (714) 824-4056
e-mail pazzani@ics.uci.edu
http://www.ics.uci.edu/~pazzani

Marti Hearst
School of Information Management and Systems
University of California, Berkeley
102 South Hall
Berkeley, CA 94720-4600
(510) 642-8016
(510) 642-5814 fax
hearst@sims.berkeley.edu
http://www.sims.berkeley.edu/~hearst

Tutorial Description

Advanced Techniques for Information Access
Marti A. Hearst and Michael J. Pazzani
UC Berkeley and UC Irvine

The vast amount of information available on the Internet underscores the importance of techniques for locating relevant, useful or interesting information. These techniques range from filtering news groups for articles of interest, to determining which web sites are good sources of information, to helping users understand their retrieval results and reformulate their queries.

This tutorial will review a variety of findings from several decades of research on information retrieval, focusing on approaches to information filtering, classification and clustering. Next, machine learning approaches to text classification will be described, and the relationship between machine learning and classic approaches from information retrieval will be discussed. Recent developments such as collaborative filtering, efficient rule learners, and weighted majority algorithms will also be described.

The tutorial will then describe how the results of these kinds of content analysis should be *used* as part of an information access system. We will describe the state of the art in user interfaces for information access, and how they can make use of the results of machine learning on Internet data.

Tutorial Audience

The intended audience of this tutorial is practitioners and researchers interested in applying machine learning and information retrieval algorithms to the classification and ranking of information on the Internet. There are no special prerequisites for this tutorial, although familiarity with introductory AI concepts such as classification and search, and basic knowledge of mathematics and probability, will be assumed. Prior exposure to basic machine learning algorithms will be beneficial since these topics will be covered briefly.

Interest in Tutorial Topic

With the increased usage and visibility of the Internet, there has been increased interest in artificial intelligence applications and research that provide automated means of assisting a user in locating relevant information. For example, at AAAI-96, the session on Internet Agents was overflowing, while sessions on many traditional AI topics were sparsely attended.

Background of Tutorial Presenters

Michael Pazzani is a professor and department chair in Information and Computer Science at the University of California, Irvine. He has been active in Machine Learning research for the past decade with numerous publications in IJCAI, AAAI, and the International Machine Learning Conference. He has taught a variety of courses including Introduction to Artificial Intelligence at the undergraduate level (8 times), Natural Language Processing at the graduate level, and graduate seminars in Machine Learning and Information Retrieval. Student teaching evaluations from the Introduction to Artificial Intelligence class are included. Students frequently report that the instructor is well prepared but that too much work is required. The numerical evaluations indicate that the instructor is above average on every quality measured by the students and outstanding at preparation and organization of material.

Marti Hearst joined the faculty of the School of Information Management and Systems at the University of California, Berkeley in Fall 1997. Prior to this she was a Member of the Research Staff at Xerox PARC working on information access. She received her BA, MS, and PhD degrees in computer science from the University of California, Berkeley. While still a graduate student she organized a AAAI spring symposium on Improving Instruction of Introductory AI, and she is currently co-teaching a course on information organization and retrieval. She has been active in information retrieval and computational linguistics research for the last seven years and in computer-human interaction for the last three years. Prof. Hearst's current research interests focus on user interfaces and robust language analysis to build information access systems, and on furthering our understanding of how people use and understand such systems.

Prof. Hearst will provide the background and core information about information retrieval, and will discuss the incorporation of the results of machine learning for content analysis into user interfaces for information retrieval. Prof. Pazzani will provide the background and core information about machine learning, and will discuss the use of machine learning techniques for automated content analysis.