
Computer Science and Data Analysis Series
Clustering for
Data Mining
A Data Recovery Approach
Boris Mirkin
Boca Raton London New York Singapore
© 2005
by Taylor & Francis Group, LLC
Published in 2005 by
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2005 by Taylor & Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number-10: 1-58488-534-3 (Hardcover)
International Standard Book Number-13: 978-1-58488-534-4 (Hardcover)
Library of Congress Card Number 2005041421
This book contains information obtained from authentic and highly regarded sources. Reprinted material is
quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts
have been made to publish reliable data and information, but the author and the publisher cannot assume
responsibility for the validity of all materials or for the consequences of their use.
No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic,
mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and
recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration
for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate
system of payment has been arranged.
Trademark Notice:
Product or corporate names may be trademarks or registered trademarks, and are used only
for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Mirkin, B. G. (Boris Grigorévich)
Clustering for data mining : a data recovery approach / Boris Mirkin.
p. cm. -- (Computer science and data analysis series ; 3)
Includes bibliographical references and index.
ISBN 1-58488-534-3
1. Data mining. 2. Cluster analysis. I. Title. II. Series.
QA76.9.D343M57 2005
006.3'12--dc22
2005041421
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Taylor & Francis Group
is the Academic Division of T&F Informa plc.
Chapman & Hall/CRC
Computer Science and Data Analysis Series
The interface between the computer and statistical sciences is increasing,
as each discipline seeks to harness the power and resources of the other.
This series aims to foster the integration between the computer sciences
and statistical, numerical, and probabilistic methods by publishing a broad
range of reference works, textbooks, and handbooks.
SERIES EDITORS
John Lafferty, Carnegie Mellon University
David Madigan, Rutgers University
Fionn Murtagh, Royal Holloway, University of London
Padhraic Smyth, University of California, Irvine
Proposals for the series should be sent directly to one of the series editors
above, or submitted to:
Chapman & Hall/CRC
23-25 Blades Court
London SW15 2NU
UK
Published Titles
Bayesian Artificial Intelligence
Kevin B. Korb and Ann E. Nicholson
Pattern Recognition Algorithms for Data Mining
Sankar K. Pal and Pabitra Mitra
Exploratory Data Analysis with MATLAB®
Wendy L. Martinez and Angel R. Martinez
Clustering for Data Mining: A Data Recovery Approach
Boris Mirkin
Correspondence Analysis and Data Coding with JAVA and R
Fionn Murtagh
R Graphics
Paul Murrell
Contents
Preface
List of Denotations
Introduction: Historical Remarks
1 What Is Clustering
Base words
1.1 Exemplary problems
1.1.1 Structuring
1.1.2 Description
1.1.3 Association
1.1.4 Generalization
1.1.5 Visualization of data structure
1.2 Bird's-eye view
1.2.1 Definition: data and cluster structure
1.2.2 Criteria for revealing a cluster structure
1.2.3 Three types of cluster description
1.2.4 Stages of a clustering application
1.2.5 Clustering and other disciplines
1.2.6 Different perspectives of clustering
2 What Is Data
Base words
2.1 Feature characteristics
2.1.1 Feature scale types
2.1.2 Quantitative case
2.1.3 Categorical case
2.2 Bivariate analysis
2.2.1 Two quantitative variables
2.2.2 Nominal and quantitative variables