Introduction to Regular Expressions in SAS®
K. Matthew Windham
support.sas.com/bookstore
The correct bibliographic citation for this manual is as follows: Windham, K. Matthew. 2014. Introduction to Regular Expressions in SAS®. Cary, NC: SAS Institute Inc.
Introduction to Regular Expressions in SAS®
Copyright © 2014, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-61290-904-2 (Hardcopy)
ISBN 978-1-62959-498-9 (EPUB)
ISBN 978-1-62959-499-6 (MOBI)
ISBN 978-1-62959-500-9 (PDF)
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written
permission of the publisher, SAS Institute Inc.
For a web download or e-book: Your use of this publication shall be governed by the terms established by the
vendor at the time you acquire this publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission
of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not
participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is
appreciated.
U.S. Government License Rights; Restricted Rights: The Software and its documentation is commercial
computer software developed at private expense and is provided with RESTRICTED RIGHTS to the United
States Government. Use, duplication or disclosure of the Software by the United States Government is subject
to the license terms of this Agreement pursuant to, as applicable, FAR 12.212, DFAR 227.7202-1(a), DFAR
227.7202-3(a) and DFAR 227.7202-4 and, to the extent required under U.S. federal law, the minimum restricted
rights as set out in FAR 52.227-19 (DEC 2007). If FAR 52.227-19 is applicable, this provision serves as notice
under clause (c) thereof and no other notice is required to be affixed to the Software or documentation. The
Government's rights in Software and documentation shall be only those set forth in this Agreement.
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513-2414.
December 2014
SAS provides a complete selection of books and electronic products to help customers use SAS® software
to its fullest potential. For more information about our offerings, visit support.sas.com/bookstore or call 1-800-727-0025.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
Contents
About This Book
About The Author
Acknowledgments
Chapter 1: Introduction
1.1 Purpose of This Book
1.2 Layout of This Book
1.3 Defining Regular Expressions
1.4 Motivational Examples
1.4.1 Extract, Transform, and Load (ETL)
1.4.2 Data Manipulation
1.4.3 Data Enrichment
Chapter 2: Getting Started with Regular Expressions
2.1 Introduction
2.1.1 RegEx Test Code
2.2 Special Characters
2.3 Basic Metacharacters
2.3.1 Wildcard
2.3.2 Word
2.3.3 Non-word
2.3.4 Tab
2.3.5 Whitespace
2.3.6 Non-whitespace
2.3.7 Digit
2.3.8 Non-digit
2.3.9 Newline
2.3.10 Bell
2.3.11 Control Character
2.3.12 Octal
2.3.13 Hexadecimal
2.4 Character Classes
2.4.1 List
2.4.2 Not List
2.4.3 Range
2.5 Modifiers
2.5.1 Case Modifiers
2.5.2 Repetition Modifiers
2.6 Options
2.6.1 Ignore Case
2.6.2 Single Line
2.6.3 Multiline
2.6.4 Compile Once
2.6.5 Substitution Operator
2.7 Zero-width Metacharacters
2.7.1 Start of Line
2.7.2 End of Line
2.7.3 Word Boundary
2.7.4 Non-word Boundary
2.7.5 String Start
2.8 Summary
Chapter 3: Using Regular Expressions in SAS
3.1 Introduction
3.1.1 Capture Buffer
3.2 Built-in SAS Functions
3.2.1 PRXPARSE
3.2.2 PRXMATCH
3.2.3 PRXCHANGE
3.2.4 PRXPOSN
3.2.5 PRXPAREN
3.3 Built-in SAS Call Routines
3.3.1 CALL PRXCHANGE
3.3.2 CALL PRXPOSN
3.3.3 CALL PRXSUBSTR
3.3.4 CALL PRXNEXT
3.3.5 CALL PRXDEBUG
3.3.6 CALL PRXFREE
3.4 Summary
Chapter 4: Applications of Regular Expressions in SAS
4.1 Introduction
4.1.1 Random PII Generator
4.2 Data Cleansing and Standardization
4.3 Information Extraction
4.4 Search and Replacement
4.5 Summary
4.5.1 Start Small
4.5.2 Think Big
Appendix A: Perl Version Notes
Appendix B: ASCII Code Lookup Tables
Non-Printing Characters
Printing Characters
Appendix C: POSIX Metacharacters
Index