softsearch: 2.4/man/man3/String_

author	plus91-technologies-pvt-ltd
date	Sat, 31 May 2014 11:23:36 -0400

.\" Automatically generated by Pod::Man 2.25 (Pod::Simple 3.16)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. \*(C+ will
.\" give a nicer C++. Capital omega is used to do unbreakable dashes and
.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
. ds -- \(*W-
. ds PI pi
. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
. ds L" ""
. ds R" ""
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds -- \|\(em\|
. ds PI \(*p
. ds L" ``
. ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. nr % 0
. rr F
.\}
.el \{\
. de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear. Run. Save yourself. No user-serviceable parts.
. \" fudge factors for nroff and troff
.if n \{\
. ds #H 0
. ds #V .8m
. ds #F .3m
. ds #[ \f1
. ds #] \fP
.\}
.if t \{\
. ds #H ((1u-(\\\\n(.fu%2u))*.13m)
. ds #V .6m
. ds #F 0
. ds #[ \&
. ds #] \&
.\}
. \" simple accents for nroff and troff
.if n \{\
. ds ' \&
. ds ` \&
. ds ^ \&
. ds , \&
. ds ~ ~
. ds /
.\}
.if t \{\
. ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
. ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
. ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
. ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
. ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
. ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
. \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
. \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
. \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
. ds : e
. ds 8 ss
. ds o a
. ds d- d\h'-1'\(ga
. ds D- D\h'-1'\(hy
. ds th \o'bp'
. ds Th \o'LP'
. ds ae ae
. ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Approx 3"
.TH Approx 3 "2013-01-22" "perl v5.14.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
String::Approx \- Perl extension for approximate matching (fuzzy matching)
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\& use String::Approx \*(Aqamatch\*(Aq;
\&
\& print if amatch("foobar");
\&
\& my @matches = amatch("xyzzy", @inputs);
\&
\& my @catches = amatch("plugh", [\*(Aq2\*(Aq], @inputs);
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
String::Approx lets you match and substitute strings approximately.
With this you can emulate errors: typing errorrs, speling errors,
closely related vocabularies (colour color), genetic mutations (\s-1GAG\s0
\&\s-1ACT\s0), abbreviations (McScot, MacScot).
.PP
\&\s-1NOTE:\s0 String::Approx suits the task of \fBstring matching\fR, not
\&\fBstring comparison\fR, and it works for \fBstrings\fR, not for \fBtext\fR.
.PP
If you want to compare strings for similarity, you probably just want
the Levenshtein edit distance (explained below), as provided by the
Text::Levenshtein and Text::LevenshteinXS modules on \s-1CPAN\s0. See also
Text::WagnerFischer and Text::PhraseDistance. (There are functions for this
in String::Approx, e.g. \fIadist()\fR, but their results sometimes differ from
the bare Levenshtein et al.)
.PP
If you want to compare things like text or source code, consisting of
\&\fBwords\fR or \fBtokens\fR and \fBphrases\fR and \fBsentences\fR, or
\&\fBexpressions\fR and \fBstatements\fR, you should probably use a tool other
than String::Approx, for example the standard \s-1UNIX\s0 \fIdiff\fR\|(1)
tool, or the Algorithm::Diff module from \s-1CPAN\s0.
.PP
The measure of \fBapproximateness\fR is the \fILevenshtein edit distance\fR.
It is the total number of \*(L"edits\*(R": insertions,
.PP
.Vb 1
\& word world
.Ve
.PP
deletions,
.PP
.Vb 1
\& monkey money
.Ve
.PP
and substitutions
.PP
.Vb 1
\& sun fun
.Ve
.PP
required to transform a string to another string. For example, to
transform \fI\*(L"lead\*(R"\fR into \fI\*(L"gold\*(R"\fR, you need three edits:
.PP
.Vb 1
\& lead gead goad gold
.Ve
.PP
The edit distance of \*(L"lead\*(R" and \*(L"gold\*(R" is therefore three, or 75%.
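.PP
As an illustration of the edit counting described above, here is a minimal
pure-Perl sketch of the classic dynamic-programming Levenshtein distance. The
function name edit_distance is hypothetical and is not part of String::Approx;
it only reproduces the arithmetic of the examples, where 3 edits on a
4-character string give 75%.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper, NOT part of String::Approx: textbook Levenshtein
# distance, counting insertions, deletions, and substitutions.
sub edit_distance {
    my ($from, $to) = @_;
    my @s = split //, $from;
    my @t = split //, $to;

    # @prev holds the distances from the previous prefix of $from
    # to every prefix of $to (initially the empty prefix).
    my @prev = (0 .. scalar @t);
    for my $i (1 .. scalar @s) {
        my @curr = ($i);
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            push @curr, min(
                $prev[$j] + 1,          # deletion
                $curr[$j - 1] + 1,      # insertion
                $prev[$j - 1] + $cost,  # substitution, or a free match
            );
        }
        @prev = @curr;
    }
    return $prev[-1];
}

my $d = edit_distance("lead", "gold");
printf "%d edits, %.0f%% of the pattern length\n", $d, 100 * $d / length("lead");
```

The results of String::Approx's own functions may differ from this bare
Levenshtein computation, as noted earlier.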
.PP
\&\fBString::Approx\fR uses the Levenshtein edit distance as its measure, but
String::Approx is not well suited to comparing strings of different
length; in other words, if you want a \*(L"fuzzy eq\*(R", see above.
String::Approx is more like regular expressions or \fIindex()\fR: it finds
substrings that are close matches.
.SH "MATCH"
.IX Header "MATCH"
.Vb 1
\& use String::Approx \*(Aqamatch\*(Aq;
\&
\& $matched = amatch("pattern")
\& $matched = amatch("pattern", [ modifiers ])
\&
\& $any_matched = amatch("pattern", @inputs)
\& $any_matched = amatch("pattern", [ modifiers ], @inputs)
\&
\& @match = amatch("pattern")
\& @match = amatch("pattern", [ modifiers ])
\&
\& @matches = amatch("pattern", @inputs)
\& @matches = amatch("pattern", [ modifiers ], @inputs)
.Ve
.PP
Match \fBpattern\fR approximately. In list context return the matched
\&\fB\f(CB@inputs\fB\fR. If no inputs are given, match against \fB\f(CB$_\fB\fR. In scalar
context return true if \fIany\fR of the inputs match, false if none match.
.PP
Notice that the pattern is a string, not a regular expression. None
of the regular expression notations (^, ., *, and so on) work. They
are characters just like the others. Note-on-note: some limited form
of \fI\*(L"regular expressionism\*(R"\fR is planned for the future: for example
character classes ([abc]) and \fIany-chars\fR (.). But that feature will
be turned on by a special \fImodifier\fR (just a guess: \*(L"r\*(R"), so there
should be no backward compatibility problem.
.PP
Notice also that matching is not symmetric. The inputs are matched
against the pattern, not the other way round. In other words: the
pattern can be a substring, a submatch, of an input element. An input
element is always a superstring of the pattern.
.SS "\s-1MODIFIERS\s0"
.IX Subsection "MODIFIERS"
With the modifiers you can control the amount of approximateness and
certain other control variables. The modifiers are one or more
strings, for example \fB\*(L"i\*(R"\fR; within a string, modifiers are optionally
separated by whitespace. The modifiers are inside an anonymous array: the \fB[ ]\fR
in the syntax are not notational, they really do mean \fB[ ]\fR, for
example \fB[ \*(L"i\*(R", \*(L"2\*(R" ]\fR. \fB[\*(L"2 i\*(R"]\fR would be identical.
.PP
The implicit default approximateness is 10%, rounded up. In other
words: every tenth character in the pattern may be an error, an edit.
You can explicitly set the maximum approximateness by supplying a
modifier like
.PP
.Vb 2
\& number
\& number%
.Ve
.PP
Examples: \fB\*(L"3\*(R"\fR, \fB\*(L"15%\*(R"\fR.
.PP
Note that \f(CW\*(C`0%\*(C'\fR is not rounded up, it is equal to \f(CW0\fR.
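.PP
The rounding rule can be made concrete with a small sketch. The helper
max_edits below is hypothetical and is not String::Approx's own code; it
assumes that a percentage is taken of the pattern length and rounded up, and
that the implicit default is 10%, as stated above.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper, NOT String::Approx's implementation: turn a
# "number" or "number%" modifier into a maximum edit count for a pattern.
sub max_edits {
    my ($pattern, $modifier) = @_;
    $modifier = "10%" unless defined $modifier;   # documented implicit default

    if ($modifier =~ /^(\d+)%$/) {
        # Assumption: the percentage is relative to the pattern length
        # and rounded up, so "0%" still yields 0.
        return ceil(length($pattern) * $1 / 100);
    }
    return $1 + 0 if $modifier =~ /^(\d+)$/;      # absolute edit count
    return;                                       # not a number modifier
}

for my $m (undef, "0%", "15%", "3") {
    printf "%-7s => %s\n", defined $m ? $m : "default", max_edits("foobar", $m);
}
```

Under these assumptions a six-character pattern such as "foobar" is allowed
one edit by default, because 10% of 6 is 0.6, which rounds up to 1.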
.PP
Using a similar syntax you can separately control the maximum number
of insertions, deletions, and substitutions by prefixing the numbers
with I, D, or S, like this:
.PP
.Vb 6
\& Inumber
\& Inumber%
\& Dnumber
\& Dnumber%
\& Snumber
\& Snumber%
.Ve
.PP
Examples: \fB\*(L"I2\*(R"\fR, \fB\*(L"D20%\*(R"\fR, \fB\*(L"S0\*(R"\fR.
.PP
You can ignore case (\fB\*(L"A\*(R"\fR becomes equal to \fB\*(L"a\*(R"\fR and vice versa)
by adding the \fB\*(L"i\*(R"\fR modifier.
.PP
For example
.PP
.Vb 1
\& [ "i 25%", "S0" ]
.Ve
.PP
means \fIignore case\fR, \fIallow every fourth character to be \*(L"an edit\*(R"\fR,
but allow \fIno substitutions\fR. (See \s-1NOTES\s0 about disallowing
substitutions or insertions.)
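.PP
To show how such a modifier array breaks down into individual settings, here
is a minimal sketch. The parse_modifiers function and the hash keys it
produces are hypothetical illustrations, not String::Approx internals; it
covers only the modifiers described so far.

```perl
use strict;
use warnings;

# Hypothetical parser, NOT String::Approx's internals: split each modifier
# string on whitespace and classify the resulting tokens.
sub parse_modifiers {
    my ($modifiers) = @_;
    my %opt;
    for my $token (map { split ' ', $_ } @$modifiers) {
        if ($token eq 'i') {
            $opt{ignore_case} = 1;          # "i": ignore case
        }
        elsif ($token =~ /^(\d+%?)$/) {
            $opt{max_edits} = $1;           # "number" or "number%"
        }
        elsif ($token =~ /^([IDS])(\d+%?)$/) {
            $opt{$1} = $2;                  # per-kind limit: I, D, or S
        }
        else {
            die "unrecognised modifier '$token'\n";
        }
    }
    return \%opt;
}

my $opt = parse_modifiers([ "i 25%", "S0" ]);
print join(", ", map { "$_=$opt->{$_}" } sort keys %$opt), "\n";
```

Because tokens are split on whitespace first, [ "i", "2" ] and [ "2 i" ]
produce the same settings in this sketch, matching the equivalence stated
above.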
.PP
\&\s-1NOTE:\s0 setting \f(CW\*(C`I0 D0 S0\*(C'\fR is not equivalent to using \fIindex()\fR.
If you want to use \fIindex()\fR, use \fIindex()\fR.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	288 .SH "SUBSTITUTE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	289 .IX Header "SUBSTITUTE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	290 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	291 \& use String::Approx \(Aqasubstitute\(Aq;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	292 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	293 \& @substituted = asubstitute("pattern", "replacement")
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	294 \& @substituted = asubstitute("pattern", "replacement", @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	295 \& @substituted = asubstitute("pattern", "replacement", [ modifiers ])
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	296 \& @substituted = asubstitute("pattern", "replacement",
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	297 \& [ modifiers ], @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	298 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	299 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	300 Substitute approximate \fBpattern\fR with \fBreplacement\fR and return as a
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	301 list \fIcopies\fR of \fB\f(CB@inputs\fB\fR, the substitutions having been made on the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	302 elements that did match the pattern. If no inputs are given,
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	303 substitute in \fB\f(CB$_\fB\fR. The replacement can contain magic strings
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	304 \&\fB$&\fR, \fB$`\fR, \fB$'\fR that stand for the matched string, the string
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	305 before it, and the string after it, respectively. All the other
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	306 arguments are as in \f(CW\(C`amatch()\(C'\fR, plus one additional modifier, \fB\(L"g\(R"\fR
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	307 which means substitute globally (all the matches in an element and not
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	308 just the first one, as is the default).
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	309 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	310 See \(L"\s-1BAD\s0 \s-1NEWS\s0\(R" about the unfortunate stinginess of \f(CW\(C`asubstitute()\(C'\fR.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	311 .SH "INDEX"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	312 .IX Header "INDEX"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	313 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	314 \& use String::Approx \(Aqaindex\(Aq;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	315 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	316 \& $index = aindex("pattern")
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	317 \& @indices = aindex("pattern", @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	318 \& $index = aindex("pattern", [ modifiers ])
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	319 \& @indices = aindex("pattern", [ modifiers ], @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	320 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	321 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	322 Like \f(CW\(C`amatch()\(C'\fR but returns the index/indices at which the pattern
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	323 matches approximately. In list context and if \f(CW@inputs\fR are used,
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	324 returns a list of indices, one index for each input element.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	325 If there's no approximate match, \f(CW\(C`\-1\(C'\fR is returned as the index.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	326 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	327 \&\s-1NOTE:\s0 if there is character repetition (e.g. \(L"aa\(R") either in
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	328 the pattern or in the text, the returned index might start
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	329 \&\(L"too early\(R". This is consistent with the goal of the module
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	330 of matching \(L"as early as possible\(R", just like regular expressions
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	331 (that there might be a \(L"less approximate\(R" match starting later is
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	332 somewhat irrelevant).
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	333 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	334 There's also a backwards-scanning \f(CW\(C`arindex()\(C'\fR.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	335 .SH "SLICE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	336 .IX Header "SLICE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	337 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	338 \& use String::Approx \(Aqaslice\(Aq;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	339 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	340 \& ($index, $size) = aslice("pattern")
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	341 \& ([$i0, $s0], ...) = aslice("pattern", @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	342 \& ($index, $size) = aslice("pattern", [ modifiers ])
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	343 \& ([$i0, $s0], ...) = aslice("pattern", [ modifiers ], @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	344 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	345 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	346 Like \f(CW\(C`aindex()\(C'\fR but also returns the size (length) of the match.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	347 If the match fails, returns an empty list (when matching against \f(CW$_\fR)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	348 or an empty anonymous list corresponding to the particular input.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	349 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	350 \&\s-1NOTE:\s0 the size of the match will very probably be something you did not
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	351 expect (such as longer than the pattern, or a negative number). This
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	352 may or may not be fixed in future releases. Also the beginning of the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	353 match may vary from the expected as with \fIaindex()\fR, see above.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	354 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	355 If the modifier
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	356 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	357 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	358 \& "minimal_distance"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	359 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	360 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	361 is used, the minimal possible edit distance is returned as the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	362 third element:
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	363 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	364 .Vb 2
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	365 \& ($index, $size, $distance) = aslice("pattern", [ modifiers ])
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	366 \& ([$i0, $s0, $d0], ...) = aslice("pattern", [ modifiers ], @inputs)
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	367 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	368 .SH "DISTANCE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	369 .IX Header "DISTANCE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	370 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	371 \& use String::Approx \(Aqadist\(Aq;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	372 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	373 \& $dist = adist("pattern", $input);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	374 \& @dist = adist("pattern", @inputs);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	375 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	376 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	377 Return the \fIedit distance\fR or distances between the pattern and the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	378 input or inputs. Zero edit distance means exact match. (Remember
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	379 that the match can 'float' in the inputs: the match is a substring
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	380 match.) If the pattern is longer than an input, the distance
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	381 returned for that input is negative.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	382 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	383 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	384 \& use String::Approx \(Aqadistr\(Aq;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	385 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	386 \& $dist = adistr("pattern", $input);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	387 \& @dist = adistr("pattern", @inputs);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	388 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	389 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	390 Return the \fBrelative\fR \fIedit distance\fR or distances between the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	391 pattern and the input or inputs. Zero relative edit distance means
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	392 exact match, one means completely different. (Remember that the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	393 match can 'float' in the inputs: the match is a substring match.) If
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	394 the pattern is longer than an input, the distance returned for that
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	395 input is negative.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	396 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	397 You can use \fIadist()\fR or \fIadistr()\fR to sort the inputs according to their
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	398 approximateness:
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	399 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	400 .Vb 3
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	401 \& my %d;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	402 \& @d{@inputs} = map { abs } adistr("pattern", @inputs);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	403 \& my @d = sort { $d{$a} <=> $d{$b} } @inputs;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	404 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	405 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	406 Now \f(CW@d\fR contains the inputs, the most like \f(CW"pattern"\fR first.
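The 'floating' substring distance described in this section can be illustrated outside Perl. The following is a minimal Python sketch of one standard way to compute the smallest edit distance between a pattern and any substring of a text. It is not how String::Approx is implemented, the function names substring_distance and rank_by_distance are made up for this illustration, and it does not reproduce the module's negative-distance convention for patterns longer than the input.

```python
def substring_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between pattern and any substring of text."""
    # Row 0 is all zeros: an empty pattern prefix matches for free at every
    # text offset, which is what lets the match start anywhere ("float").
    prev = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, start=1):
        cur = [i]  # pattern[:i] against an empty substring costs i edits
        for j, tc in enumerate(text, start=1):
            cur.append(min(
                prev[j] + 1,               # pattern character missing from the text
                cur[j - 1] + 1,            # extra character in the text
                prev[j - 1] + (pc != tc),  # exact match or substitution
            ))
        prev = cur
    # The match may also end anywhere, so take the best cell of the last row.
    return min(prev)


def rank_by_distance(pattern, inputs):
    """Return inputs ordered from most to least similar to pattern."""
    return sorted(inputs, key=lambda s: substring_distance(pattern, s))


if __name__ == "__main__":
    print(rank_by_distance("pattern", ["pottery", "a pattern here", "patern"]))
```

An input that contains the pattern verbatim gets distance zero regardless of what surrounds it, which is why the longest string can still rank first.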
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	407 .SH "CONTROLLING THE CACHE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	408 .IX Header "CONTROLLING THE CACHE"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	409 \&\f(CW\(C`String::Approx\(C'\fR maintains an \s-1LU\s0 (least-used) cache that holds the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	410 \&'matching engines' for each instance of a \fIpattern+modifiers\fR. The
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	411 cache is intended to help the case where you match a small set of
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	412 patterns against a large set of strings. However, the more engines you
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	413 cache, the more memory you use. If you have a lot of different
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	414 patterns or if you have a lot of memory to burn, you may want to
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	415 control the cache yourself. For example, allowing a larger cache
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	416 consumes more memory but probably runs a little bit faster since the
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	417 cache fills (and needs flushing) less often.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	418 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	419 The cache has two parameters: \fImax\fR and \fIpurge\fR. The first one
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	420 is the maximum size of the cache and the second one is the cache
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	421 flushing ratio: when the number of cache entries exceeds \fImax\fR,
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	422 \&\fImax\fR times \fIpurge\fR cache entries are flushed. The default
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	423 values are 1000 and 0.75, respectively, which means that when
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	424 the 1001st entry would be cached, 750 least used entries will
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	425 be removed from the cache. To access the parameters you can
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	426 use the calls
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	427 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	428 .Vb 2
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	429 \& $now_max = String::Approx::cache_max();
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	430 \& String::Approx::cache_max($new_max);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	431 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	432 \& $now_purge = String::Approx::cache_purge();
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	433 \& String::Approx::cache_purge($new_purge);
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	434 \&
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	435 \& $limit = String::Approx::cache_n_purge();
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	436 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	437 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	438 To be honest, there are actually \fBtwo\fR caches: the first one is used
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	439 for the patterns with no modifiers, the second one for the patterns
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	440 with pattern modifiers. Using the standard parameters you will
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	441 therefore actually cache up to 2000 entries. The above calls control
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	442 both caches for the same price.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	443 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	444 To disable caching completely use
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	445 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	446 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	447 \& String::Approx::cache_disable();
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	448 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	449 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	450 Note that this doesn't flush any existing cache entries;
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	451 to do that use
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	452 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	453 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	454 \& String::Approx::cache_flush_all();
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	455 .Ve
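The max/purge arithmetic described in this section can be sketched in a few lines of Python. This is only an illustration of the documented behaviour (with the defaults, storing the 1001st entry first removes the 750 least used ones). The class LeastUsedCache and its methods are hypothetical names, and the use-counting policy is an assumption, not the module's actual bookkeeping.

```python
class LeastUsedCache:
    """Toy cache that purges a fixed fraction of its least used entries when full."""

    def __init__(self, max_entries=1000, purge=0.75):
        self.max_entries = max_entries
        self.purge = purge
        self.entries = {}  # key -> cached value
        self.uses = {}     # key -> how many times the key was stored or fetched

    def purge_count(self):
        # Number of entries dropped in one purge: max times purge.
        return int(self.max_entries * self.purge)

    def get(self, key):
        if key in self.entries:
            self.uses[key] += 1
            return self.entries[key]
        return None

    def put(self, key, value):
        # Purge only when a new key would push the cache past its maximum size.
        if key not in self.entries and len(self.entries) >= self.max_entries:
            victims = sorted(self.uses, key=self.uses.get)[: self.purge_count()]
            for victim in victims:
                del self.entries[victim]
                del self.uses[victim]
        self.entries[key] = value
        self.uses[key] = self.uses.get(key, 0) + 1


if __name__ == "__main__":
    cache = LeastUsedCache()
    for n in range(1001):
        cache.put(n, "engine %d" % n)
    print(len(cache.entries))
```

With the default parameters the cache holds 250 survivors plus the new entry after the purge, so a larger purge ratio trades more recompilation of engines for fewer purges.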
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	456 .SH "NOTES"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	457 .IX Header "NOTES"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	458 Because matching is by \fIsubstrings\fR, not by whole strings, insertions
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	459 and substitutions often produce very similar results: \(L"abcde\(R" matches
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	460 \&\(L"axbcde\(R" either by insertion \fBor\fR substitution of \(L"x\(R".
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	461 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	462 The maximum edit distance is also the maximum number of edits.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	463 That is, the \fB\(L"I2\(R"\fR in
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	464 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	465 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	466 \& amatch("abcd", ["I2"])
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	467 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	468 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	469 is useless because the maximum edit distance is (implicitly) 1.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	470 You may have meant to say
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	471 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	472 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	473 \& amatch("abcd", ["2D1S1"])
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	474 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	475 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	476 or something like that.
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	477 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	478 If you want to simulate transposes
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	479 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	480 .Vb 1
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	481 \& feet fete
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	482 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	483 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	484 you need to allow an edit distance of at least two because in terms of
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	485 our edit primitives a transpose is first one deletion and then one
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	486 insertion.
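The cost of a transpose can be checked with a plain whole-string edit distance that uses only insertions, deletions, and substitutions. The Python sketch below is a textbook dynamic-programming version written for this illustration. The function name edit_distance is hypothetical, and the code compares whole strings, unlike the module's substring matching.

```python
def edit_distance(a: str, b: str) -> int:
    """Whole-string edit distance with insert, delete, and substitute only."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # a[:i] against the empty string: i deletions
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # keep or substitute
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("feet", "fete"))  # prints 2
```

Because no single primitive swaps two adjacent characters, the transposed pair always costs two edits under this model.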
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	487 .SS "\s-1TEXT\s0 \s-1POSITION\s0"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	488 .IX Subsection "TEXT POSITION"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	489 The starting and ending positions of matching, substituting, indexing, or
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	490 slicing can be changed from the beginning and end of the input(s) to
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	491 some other positions by using either or both of the modifiers
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	492 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	493 .Vb 2
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	494 \& "initial_position=24"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	495 \& "final_position=42"
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	496 .Ve
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	497 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	498 or both the modifiers
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	499 .PP
8eb7d93f7e58 Uploaded plus91-technologies-pvt-ltd parents: diff changeset	500 .Vb 2


8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

98 .ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

99 .ds 8 \h'\*(#H'\(*b\h'-\*(#H'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

100 .ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

101 .ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

102 .ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

103 .ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

104 .ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

105 .ds ae a\h'-(\w'a'u*4/10)'e

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

106 .ds Ae A\h'-(\w'A'u*4/10)'E

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

107 . \" corrections for vroff

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

108 .if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

109 .if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

110 . \" for low resolution devices (crt and lpr)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

111 .if \n(.H>23 .if \n(.V>19 \

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

112 \{\

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

113 . ds : e

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

114 . ds 8 ss

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

115 . ds o a

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

116 . ds d- d\h'-1'\(ga

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

117 . ds D- D\h'-1'\(hy

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

118 . ds th \o'bp'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

119 . ds Th \o'LP'

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

120 . ds ae ae

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

121 . ds Ae AE

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

122 .\}

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

123 .rm #[ #] #H #V #F C

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

124 .\" ========================================================================

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

125 .\"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

126 .IX Title "Approx 3"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

127 .TH Approx 3 "2013-01-22" "perl v5.14.2" "User Contributed Perl Documentation"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

128 .\" For nroff, turn off justification. Always turn off hyphenation; it makes

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

129 .\" way too many mistakes in technical documents.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

130 .if n .ad l

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

131 .nh

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

132 .SH "NAME"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

133 String::Approx \- Perl extension for approximate matching (fuzzy matching)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

134 .SH "SYNOPSIS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

135 .IX Header "SYNOPSIS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

136 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

137 \& use String::Approx \*(Aqamatch\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

138 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

139 \& print if amatch("foobar");

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

140 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

141 \& my @matches = amatch("xyzzy", @inputs);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

142 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

143 \& my @catches = amatch("plugh", [\*(Aq2\*(Aq], @inputs);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

144 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

145 .SH "DESCRIPTION"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

146 .IX Header "DESCRIPTION"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

147 String::Approx lets you match and substitute strings approximately.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

148 With this you can emulate errors: typing errorrs, speling errors,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

149 closely related vocabularies (colour color), genetic mutations (\s-1GAG\s0

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

150 \&\s-1ACT\s0), abbreviations (McScot, MacScot).

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

151 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

152 \&\s-1NOTE:\s0 String::Approx suits the task of \fBstring matching\fR, not

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

153 \&\fBstring comparison\fR, and it works for \fBstrings\fR, not for \fBtext\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

154 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

155 If you want to compare strings for similarity, you probably just want

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

156 the Levenshtein edit distance (explained below), the Text::Levenshtein

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

157 and Text::LevenshteinXS modules in \s-1CPAN\s0. See also Text::WagnerFischer

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

158 and Text::PhraseDistance. (There are functions for this in String::Approx,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

159 e.g. \fIadist()\fR, but their results sometimes differ from the bare Levenshtein

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

160 et al.)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

161 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

162 If you want to compare things like text or source code, consisting of

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

163 \&\fBwords\fR or \fBtokens\fR and \fBphrases\fR and \fBsentences\fR, or

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

164 \&\fBexpressions\fR and \fBstatements\fR, you should probably use some other

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

165 tool than String::Approx, like for example the standard \s-1UNIX\s0 \fIdiff\fR\|(1)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

166 tool, or the Algorithm::Diff module from \s-1CPAN\s0.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

167 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

168 The measure of \fBapproximateness\fR is the \fILevenshtein edit distance\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

169 It is the total number of \*(L"edits\*(R": insertions,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

170 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

171 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

172 \& word world

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

173 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

174 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

175 deletions,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

176 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

177 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

178 \& monkey money

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

179 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

180 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

181 and substitutions

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

182 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

183 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

184 \& sun fun

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

185 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

186 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

187 required to transform a string to another string. For example, to

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

188 transform \fI\*(L"lead\*(R"\fR into \fI\*(L"gold\*(R"\fR, you need three edits:

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

189 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

190 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

191 \& lead gead goad gold

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

192 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

193 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

194 The edit distance of \*(L"lead\*(R" and \*(L"gold\*(R" is therefore three, or 75%.
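The "lead" to "gold" figure can be checked with a small, self-contained sketch. This is the textbook dynamic-programming Levenshtein distance, not String::Approx's own implementation; the function name levenshtein is hypothetical, and taking the percentage relative to the length of the first string is an assumption chosen to agree with the 75% quoted above.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Textbook Levenshtein distance, kept one row at a time.
# Illustrative only: this is NOT how String::Approx works internally.
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));      # distance from "" to each prefix of $t
    for my $i (1 .. length($s)) {
        my @cur = ($i);                # distance from a prefix of $s to ""
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,             # deletion
                $cur[$j - 1] + 1,          # insertion
                $prev[$j - 1] + $cost,     # substitution (or match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

my $d = levenshtein("lead", "gold");
# Percentage relative to the first string's length (an assumption).
printf "%d edits, %d%%\n", $d, 100 * $d / length("lead");
```

The same function gives a distance of one for each of the single-edit pairs shown earlier (word/world, monkey/money, sun/fun).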

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

195 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

196 \&\fBString::Approx\fR uses the Levenshtein edit distance as its measure, but

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

197 String::Approx is not well-suited for comparing strings of different

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

198 length; in other words, if you want a \*(L"fuzzy eq\*(R", see above.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

199 String::Approx is more like regular expressions or \fIindex()\fR: it finds

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

200 substrings that are close matches.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

201 .SH "MATCH"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

202 .IX Header "MATCH"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

203 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

204 \& use String::Approx \*(Aqamatch\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

205 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

206 \& $matched = amatch("pattern")

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

207 \& $matched = amatch("pattern", [ modifiers ])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

208 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

209 \& $any_matched = amatch("pattern", @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

210 \& $any_matched = amatch("pattern", [ modifiers ], @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

211 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

212 \& @match = amatch("pattern")

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

213 \& @match = amatch("pattern", [ modifiers ])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

214 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

215 \& @matches = amatch("pattern", @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

216 \& @matches = amatch("pattern", [ modifiers ], @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

217 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

218 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

219 Match \fBpattern\fR approximately. In list context return the matched

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

220 \&\fB\f(CB@inputs\fB\fR. If no inputs are given, match against the \fB\f(CB$_\fB\fR. In scalar

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

221 context return true if \fIany\fR of the inputs match, false if none match.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

222 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

223 Notice that the pattern is a string. Not a regular expression. None

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

224 of the regular expression notations (^, ., *, and so on) work. They

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

225 are characters just like the others. Note-on-note: some limited form

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

226 of \fI\*(L"regular expressionism\*(R"\fR is planned for the future: for example

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

227 character classes ([abc]) and \fIany-chars\fR (.). But that feature will

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

228 be turned on by a special \fImodifier\fR (just a guess: \*(L"r\*(R"), so there

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

229 should be no backward compatibility problem.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

230 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

231 Notice also that matching is not symmetric. The inputs are matched

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

232 against the pattern, not the other way round. In other words: the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

233 pattern can be a substring, a submatch, of an input element. An input

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

234 element is always a superstring of the pattern.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

235 .SS "\s-1MODIFIERS\s0"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

236 .IX Subsection "MODIFIERS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

237 With the modifiers you can control the amount of approximateness and

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

238 certain other control variables. The modifiers are one or more

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

239 strings, for example \fB\*(L"i\*(R"\fR, within a string optionally separated by

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

240 whitespace. The modifiers are inside an anonymous array: the \fB[ ]\fR

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

241 in the syntax are not notational, they really do mean \fB[ ]\fR, for

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

242 example \fB[ \*(L"i\*(R", \*(L"2\*(R" ]\fR. \fB[\*(L"2 i\*(R"]\fR would be identical.
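The claim that the two spellings are identical can be pictured with a short sketch. The helper split_modifiers is hypothetical and only mimics the whitespace rule described above; sorting the tokens is an assumption made here so that the two forms compare equal, and says nothing about whether the module cares about order.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a modifier list into individual tokens,
# splitting each string on whitespace as the text above describes.
# Not part of String::Approx.
sub split_modifiers {
    my ($mods) = @_;                          # array ref, e.g. [ "i", "2" ]
    my @tokens = map { split ' ', $_ } @$mods;
    my @sorted = sort @tokens;                # order-insensitive comparison (assumption)
    return @sorted;
}

my $a = join ",", split_modifiers([ "i", "2" ]);
my $b = join ",", split_modifiers([ "2 i" ]);
print $a eq $b ? "same modifiers\n" : "different modifiers\n";
```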

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

243 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

244 The implicit default approximateness is 10%, rounded up. In other

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

245 words: every tenth character in the pattern may be an error, an edit.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

246 You can explicitly set the maximum approximateness by supplying a

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

247 modifier like

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

248 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

249 .Vb 2

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

250 \& number

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

251 \& number%

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

252 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

253 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

254 Examples: \fB\*(L"3\*(R"\fR, \fB\*(L"15%\*(R"\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

255 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

256 Note that \f(CW\*(C`0%\*(C'\fR is not rounded up, it is equal to \f(CW0\fR.
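The relation between a percentage modifier and the number of edits it allows can be sketched as follows. The helper max_edits is hypothetical; it assumes the percentage is taken of the pattern length and rounded up, as the text above states, and the module's internal arithmetic may differ.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: turn a "number" or "number%" modifier into a
# maximum edit count for a pattern. Not part of String::Approx.
sub max_edits {
    my ($pattern, $modifier) = @_;
    $modifier = '10%' unless defined $modifier;    # documented implicit default
    if ($modifier =~ /^(\d+)%$/) {
        # Percentage of the pattern length, rounded up; 0% stays 0.
        return ceil(length($pattern) * $1 / 100);
    }
    return $modifier + 0 if $modifier =~ /^\d+$/;  # absolute edit count
    die "unrecognised modifier: $modifier\n";
}

print max_edits("foobar"), "\n";         # default: 10% of 6 characters, rounded up
print max_edits("foobar", "3"), "\n";    # absolute count
print max_edits("foobar", "0%"), "\n";   # 0% is not rounded up
```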

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

257 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

258 Using a similar syntax you can separately control the maximum number

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

259 of insertions, deletions, and substitutions by prefixing the numbers

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

260 with I, D, or S, like this:

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

261 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

262 .Vb 6

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

263 \& Inumber

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

264 \& Inumber%

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

265 \& Dnumber

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

266 \& Dnumber%

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

267 \& Snumber

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

268 \& Snumber%

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

269 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

270 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

271 Examples: \fB\*(L"I2\*(R"\fR, \fB\*(L"D20%\*(R"\fR, \fB\*(L"S0\*(R"\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

272 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

273 You can ignore case (\fB\*(L"A\*(R"\fR becomes equal to \fB\*(L"a\*(R"\fR and vice versa)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

274 by adding the \fB\*(L"i\*(R"\fR modifier.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

275 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

276 For example

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

277 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

278 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

279 \& [ "i 25%", "S0" ]

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

280 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

281 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

282 means \fIignore case\fR, \fIallow every fourth character to be \*(L"an edit\*(R"\fR,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

283 but allow \fIno substitutions\fR. (See \s-1NOTES\s0 about disallowing

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

284 substitutions or insertions.)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

285 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

286 \&\s-1NOTE:\s0 setting \f(CW\*(C`I0 D0 S0\*(C'\fR is not equivalent to using \fIindex()\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

287 If you want to use \fIindex()\fR, use \fIindex()\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

288 .SH "SUBSTITUTE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

289 .IX Header "SUBSTITUTE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

290 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

291 \& use String::Approx \*(Aqasubstitute\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

292 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

293 \& @substituted = asubstitute("pattern", "replacement")

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

294 \& @substituted = asubstitute("pattern", "replacement", @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

295 \& @substituted = asubstitute("pattern", "replacement", [ modifiers ])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

296 \& @substituted = asubstitute("pattern", "replacement",

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

297 \& [ modifiers ], @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

298 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

299 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

300 Substitute approximate \fBpattern\fR with \fBreplacement\fR and return as a

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

301 list \fIcopies\fR of \fB\f(CB@inputs\fB\fR, the substitutions having been made on the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

302 elements that did match the pattern. If no inputs are given,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

303 substitute in the \fB\f(CB$_\fB\fR. The replacement can contain magic strings

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

304 \&\fB$&\fR, \fB$`\fR, \fB$'\fR that stand for the matched string, the string

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

305 before it, and the string after it, respectively. All the other

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

306 arguments are as in \f(CW\*(C`amatch()\*(C'\fR, plus one additional modifier, \fB\*(L"g\*(R"\fR

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

307 which means substitute globally (all the matches in an element and not

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

308 just the first one, as is the default).

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

309 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

310 See \*(L"\s-1BAD\s0 \s-1NEWS\s0\*(R" about the unfortunate stinginess of \f(CW\*(C`asubstitute()\*(C'\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

311 .SH "INDEX"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

312 .IX Header "INDEX"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

313 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

314 \& use String::Approx \*(Aqaindex\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

315 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

316 \& $index = aindex("pattern")

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

317 \& @indices = aindex("pattern", @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

318 \& $index = aindex("pattern", [ modifiers ])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

319 \& @indices = aindex("pattern", [ modifiers ], @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

320 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

321 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

322 Like \f(CW\*(C`amatch()\*(C'\fR but returns the index/indices at which the pattern

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

323 matches approximately. In list context and if \f(CW@inputs\fR are used,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

324 returns a list of indices, one index for each input element.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

325 If there's no approximate match, \f(CW\*(C`\-1\*(C'\fR is returned as the index.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

326 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

327 \&\s-1NOTE:\s0 if there is character repetition (e.g. \*(L"aa\*(R") either in

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

328 the pattern or in the text, the returned index might start

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

329 \&\*(L"too early\*(R". This is consistent with the goal of the module

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

330 of matching \*(L"as early as possible\*(R", just like regular expressions

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

331 (that there might be a \*(L"less approximate\*(R" match starting later is

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

332 somewhat irrelevant).

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

333 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

334 There's also backwards-scanning \f(CW\*(C`arindex()\*(C'\fR.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

335 .SH "SLICE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

336 .IX Header "SLICE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

337 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

338 \& use String::Approx \*(Aqaslice\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

339 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

340 \& ($index, $size) = aslice("pattern")

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

341 \& ([$i0, $s0], ...) = aslice("pattern", @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

342 \& ($index, $size) = aslice("pattern", [ modifiers ])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

343 \& ([$i0, $s0], ...) = aslice("pattern", [ modifiers ], @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

344 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

345 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

346 Like \f(CW\*(C`aindex()\*(C'\fR but also returns the size (length) of the match.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

347 If the match fails, returns an empty list (when matching against \f(CW$_\fR)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

348 or an empty anonymous list corresponding to the particular input.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

349 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

350 \&\s-1NOTE:\s0 the size of the match will very probably be something you did not

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

351 expect (such as longer than the pattern, or a negative number). This

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

352 may or may not be fixed in future releases. Also, the beginning of the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

353 match may differ from what you expect, as with \fIaindex()\fR; see above.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

354 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

355 If the modifier

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

356 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

357 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

358 \& "minimal_distance"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

359 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

360 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

361 is used, the minimal possible edit distance is returned as the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

362 third element:

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

363 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

364 .Vb 2

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

365 \& ($index, $size, $distance) = aslice("pattern", [ modifiers ])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

366 \& ([$i0, $s0, $d0], ...) = aslice("pattern", [ modifiers ], @inputs)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

367 .Ve
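As a rough illustration of the three values described above (index, size, and distance), the following Python sketch finds by brute force the substring of an input that is closest to a pattern. It is not the algorithm String::Approx uses, and the names levenshtein and best_slice are hypothetical; as the NOTE above warns, the module itself may report a different start or size.

```python
def levenshtein(a, b):
    """Plain edit distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def best_slice(pattern, text):
    """Return (index, size, distance) of the substring of text closest to pattern.

    Ties are resolved in favour of the earliest start, then the shortest size.
    """
    best = None
    for start in range(len(text) + 1):
        for size in range(len(text) - start + 1):
            d = levenshtein(pattern, text[start:start + size])
            if best is None or d < best[2]:
                best = (start, size, d)
    return best


print(best_slice("abcd", "xxabcdxx"))  # (2, 4, 0)
print(best_slice("abcd", "xxabxdxx"))  # (2, 4, 1)
```

The brute-force search is quadratic in the number of substrings and is only meant to make the meaning of the returned triple concrete.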

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

368 .SH "DISTANCE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

369 .IX Header "DISTANCE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

370 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

371 \& use String::Approx \*(Aqadist\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

372 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

373 \& $dist = adist("pattern", $input);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

374 \& @dist = adist("pattern", @input);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

375 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

376 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

377 Return the \fIedit distance\fR or distances between the pattern and the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

378 input or inputs. Zero edit distance means exact match. (Remember

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

379 that the match can 'float' in the inputs: the match is a substring

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

380 match.) If the pattern is longer than the input or inputs, the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

381 returned distance (or each of the distances) is negative.
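The idea of a match that can 'float' in the input can be sketched in Python with a standard dynamic-programming recurrence in which a match may begin at any input position at no cost. This is an illustrative sketch, not the module's implementation; the function names are hypothetical, and the sign rule in signed_distance simply mirrors the convention stated above, so the magnitudes the module reports may differ.

```python
def substring_distance(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    # Against an empty stretch of text, i pattern characters cost i deletions.
    col = list(range(len(pattern) + 1))
    best = col[-1]
    for ch in text:
        new = [0]  # a match may start at any text position for free
        for i, pc in enumerate(pattern, start=1):
            new.append(min(col[i] + 1,                # extra text character
                           new[i - 1] + 1,            # missing pattern character
                           col[i - 1] + (pc != ch)))  # substitute or keep
        col = new
        best = min(best, col[-1])  # a match may end at any text position
    return best


def signed_distance(pattern, text):
    """Negate the distance when the pattern is longer than the input."""
    d = substring_distance(pattern, text)
    return -d if len(pattern) > len(text) else d


print(substring_distance("abcd", "xxabcdxx"))  # 0: exact substring match
print(substring_distance("abcd", "xxabxdxx"))  # 1: one substitution
print(signed_distance("abcd", "abc"))          # -1: pattern longer than input
```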

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

382 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

383 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

384 \& use String::Approx \*(Aqadistr\*(Aq;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

385 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

386 \& $dist = adistr("pattern", $input);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

387 \& @dist = adistr("pattern", @inputs);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

388 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

389 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

390 Return the \fBrelative\fR \fIedit distance\fR or distances between the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

391 pattern and the input or inputs. Zero relative edit distance means

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

392 exact match, one means completely different. (Remember that the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

393 match can 'float' in the inputs: the match is a substring match.) If

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

394 the pattern is longer than the input or inputs, the returned distance

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

395 (or each of the distances) is negative.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

396 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

397 You can use \fIadist()\fR or \fIadistr()\fR to sort the inputs according to their

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

398 approximateness:

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

399 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

400 .Vb 3

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

401 \& my %d;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

402 \& @d{@inputs} = map { abs } adistr("pattern", @inputs);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

403 \& my @d = sort { $d{$a} <=> $d{$b} } @inputs;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

404 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

405 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

406 Now \f(CW@d\fR contains the inputs, with those most like \f(CW"pattern"\fR first.
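The same ranking idea can be sketched in Python without the module. The sketch below assumes that a relative distance is the substring edit distance divided by the pattern length; that normalisation is an assumption made for illustration, and the function names are hypothetical.

```python
def substring_distance(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    col = list(range(len(pattern) + 1))
    best = col[-1]
    for ch in text:
        new = [0]  # the match may start anywhere in the text
        for i, pc in enumerate(pattern, start=1):
            new.append(min(col[i] + 1,
                           new[i - 1] + 1,
                           col[i - 1] + (pc != ch)))
        col = new
        best = min(best, col[-1])
    return best


def relative_distance(pattern, text):
    """Assumed normalisation: distance divided by the pattern length."""
    return substring_distance(pattern, text) / len(pattern)


def rank(pattern, inputs):
    """Sort inputs so that those most like the pattern come first."""
    return sorted(inputs, key=lambda s: abs(relative_distance(pattern, s)))


inputs = ["unrelated", "pxttxrn", "patter", "pattern"]
print(rank("pattern", inputs))
# ['pattern', 'patter', 'pxttxrn', 'unrelated']
```

The abs() call plays the same role as the map { abs } step above: it keeps any negative (pattern longer than input) distances from sorting ahead of exact matches.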

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

407 .SH "CONTROLLING THE CACHE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

408 .IX Header "CONTROLLING THE CACHE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

409 \&\f(CW\*(C`String::Approx\*(C'\fR maintains a \s-1LU\s0 (least-used) cache that holds the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

410 \&'matching engines' for each instance of a \fIpattern+modifiers\fR. The

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

411 cache is intended to help the case where you match a small set of

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

412 patterns against a large set of strings. However, the more engines you

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

413 cache, the more memory you use. If you have a lot of different

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

414 patterns or if you have a lot of memory to burn, you may want to

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

415 control the cache yourself. For example, allowing a larger cache

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

416 consumes more memory but probably runs a little bit faster since the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

417 cache fills (and needs flushing) less often.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

418 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

419 The cache has two parameters: \fImax\fR and \fIpurge\fR. The first one

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

420 is the maximum size of the cache and the second one is the cache

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

421 flushing ratio: when the number of cache entries exceeds \fImax\fR,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

422 \&\fImax\fR times \fIpurge\fR cache entries are flushed. The default

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

423 values are 1000 and 0.75, respectively, which means that when

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

424 the 1001st entry would be cached, the 750 least-used entries will

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

425 be removed from the cache. To access the parameters you can

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

426 use the calls

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

427 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

428 .Vb 2

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

429 \& $now_max = String::Approx::cache_max();

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

430 \& String::Approx::cache_max($new_max);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

431 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

432 \& $now_purge = String::Approx::cache_purge();

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

433 \& String::Approx::cache_purge($new_purge);

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

434 \&

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

435 \& $limit = String::Approx::cache_n_purge();

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

436 .Ve
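The arithmetic behind the two parameters can be sketched as a small Python class. This is a hypothetical model of a least-used cache with a maximum size and a purge ratio, written only to show the 1000 / 0.75 / 750 relationship described above; it is not the module's code, and the class and method names are invented for the example.

```python
class LeastUsedCache:
    """Toy cache: when full, drop int(max * purge) least-used entries."""

    def __init__(self, max_entries=1000, purge=0.75):
        self.max_entries = max_entries
        self.purge = purge
        self._values = {}
        self._uses = {}

    def n_purge(self):
        """Number of entries dropped by one purge."""
        return int(self.max_entries * self.purge)

    def get(self, key, build):
        """Return the cached value for key, building and caching it if needed."""
        if key in self._values:
            self._uses[key] += 1
            return self._values[key]
        if len(self._values) >= self.max_entries:
            # Caching one more entry would exceed the maximum: purge first.
            victims = sorted(self._uses, key=self._uses.get)[:self.n_purge()]
            for victim in victims:
                del self._values[victim]
                del self._uses[victim]
        self._values[key] = build(key)
        self._uses[key] = 1
        return self._values[key]

    def __len__(self):
        return len(self._values)


cache = LeastUsedCache()
for i in range(1000):
    cache.get(i, str)
print(len(cache), cache.n_purge())  # 1000 750
cache.get(1000, str)                # the 1001st entry triggers a purge
print(len(cache))                   # 251
```

A larger maximum means purges happen less often, at the cost of holding more entries in memory, which is the trade-off described above.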

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

437 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

438 To be honest, there are actually \fBtwo\fR caches: the first one is used

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

439 for the patterns with no modifiers, the second one for the patterns

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

440 with pattern modifiers. Using the standard parameters you will

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

441 therefore actually cache up to 2000 entries. The above calls control

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

442 both caches for the same price.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

443 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

444 To disable caching completely, use

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

445 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

446 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

447 \& String::Approx::cache_disable();

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

448 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

449 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

450 Note that this doesn't flush any existing cache entries;

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

451 to do that, use

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

452 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

453 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

454 \& String::Approx::cache_flush_all();

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

455 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

456 .SH "NOTES"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

457 .IX Header "NOTES"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

458 Because matching is by \fIsubstrings\fR, not by whole strings, insertions

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

459 and substitutions often produce very similar results: \*(L"abcde\*(R" matches

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

460 \&\*(L"axbcde\*(R" either by insertion \fBor\fR substitution of \*(L"x\*(R".

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

461 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

462 The maximum edit distance is also the maximum number of edits.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

463 That is, the \fB\*(L"I2\*(R"\fR in

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

464 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

465 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

466 \& amatch("abcd", ["I2"])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

467 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

468 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

469 is useless because the maximum edit distance is (implicitly) 1.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

470 You may have meant to say

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

471 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

472 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

473 \& amatch("abcd", ["2D1S1"])

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

474 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

475 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

476 or something like that.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

477 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

478 If you want to simulate transposes

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

479 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

480 .Vb 1

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

481 \& feet fete

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

482 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

483 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

484 you need to allow an edit distance of at least two because in terms of

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

485 our edit primitives a transpose is first one deletion and then one

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

486 insertion.
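The count of two can be checked with a plain edit-distance calculation that, like the edit primitives described here, has no transpose operation. The Python function below is a generic textbook sketch with a hypothetical name, not the module's code.

```python
def levenshtein(a, b):
    """Edit distance using only insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


# Swapping the adjacent "e" and "t" costs two primitive edits.
print(levenshtein("feet", "fete"))  # 2
```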

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

487 .SS "\s-1TEXT\s0 \s-1POSITION\s0"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

488 .IX Subsection "TEXT POSITION"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

489 The starting and ending positions of matching, substituting, indexing, or

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

490 slicing can be changed from the beginning and end of the input(s) to

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

491 some other positions by using either or both of the modifiers

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

492 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

493 .Vb 2

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

494 \& "initial_position=24"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

495 \& "final_position=42"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

496 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

497 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

498 or both of the modifiers

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

499 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

500 .Vb 2

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

501 \& "initial_position=24"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

502 \& "position_range=10"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

503 .Ve

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

504 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

505 By setting \fB\*(L"position_range\*(R"\fR to zero you can limit

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

506 (anchor) the operation to happen only once (if a match is possible)

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

507 at the position.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

508 .SH "VERSION"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

509 .IX Header "VERSION"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

510 Major release 3.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

511 .SH "CHANGES FROM VERSION 2"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

512 .IX Header "CHANGES FROM VERSION 2"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

513 .SS "\s-1GOOD\s0 \s-1NEWS\s0"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

514 .IX Subsection "GOOD NEWS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

515 .IP "Version 3 is 2\-3 times faster than version 2" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

516 .IX Item "Version 3 is 2-3 times faster than version 2"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

517 .PD 0

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

518 .IP "No pattern length limitation" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

519 .IX Item "No pattern length limitation"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

520 .PD

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

521 The algorithm is independent of the pattern length: its time

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

522 complexity is \fIO(kn)\fR, where \fIk\fR is the number of edits and \fIn\fR the

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

523 length of the text (input). The preprocessing of the pattern will of

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

524 course take some \fIO(m)\fR (\fIm\fR being the pattern length) time, but

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

525 \&\f(CW\*(C`amatch()\*(C'\fR and \f(CW\*(C`asubstitute()\*(C'\fR cache the result of this

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

526 preprocessing so that it is done only once per pattern.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

527 .SS "\s-1BAD\s0 \s-1NEWS\s0"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

528 .IX Subsection "BAD NEWS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

529 .IP "You do need a C compiler to install the module" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

530 .IX Item "You do need a C compiler to install the module"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

531 Perl's regular expressions are no longer used; instead, a faster and more

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

532 scalable algorithm written in C is used.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

533 .ie n .IP """asubstitute()"" is now always stingy" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

534 .el .IP "\f(CWasubstitute()\fR is now always stingy" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

535 .IX Item "asubstitute() is now always stingy"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

536 The string matched and substituted is now always stingy, as short

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

537 as possible. It used to be as long as possible. This is an unfortunate

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

538 change stemming from switching the matching algorithm. Example: with

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

539 an edit distance of two, substituting for \fB\*(L"word\*(R"\fR in \fB\*(L"cork\*(R"\fR and

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

540 \&\fB\*(L"wool\*(R"\fR previously matched \fB\*(L"cork\*(R"\fR and \fB\*(L"wool\*(R"\fR. Now it does

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

541 match \fB\*(L"or\*(R"\fR and \fB\*(L"wo\*(R"\fR. As little as possible, or, in other words,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

542 with as much approximateness, as many edits, as possible. Because

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

543 there is no \fIneed\fR to match the \fB\*(L"c\*(R"\fR of \fB\*(L"cork\*(R"\fR, it is not matched.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

544 .ie n .IP "no more ""aregex()"" because regular expressions are no longer used" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

545 .el .IP "no more \f(CWaregex()\fR because regular expressions are no longer used" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

546 .IX Item "no more aregex() because regular expressions are no longer used"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

547 .PD 0

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

548 .ie n .IP "no more ""compat1"" for String::Approx version 1 compatibility" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

549 .el .IP "no more \f(CWcompat1\fR for String::Approx version 1 compatibility" 4

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

550 .IX Item "no more compat1 for String::Approx version 1 compatibility"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

551 .PD
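The stingy behaviour described for asubstitute() in the list above can be reproduced with a brute-force Python sketch: among all substrings of the input that lie within the allowed edit distance of the pattern, take the shortest. The function names are hypothetical and the exhaustive search is only an illustration; the module's C algorithm works differently.

```python
def levenshtein(a, b):
    """Edit distance using insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,
                           cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def shortest_match(pattern, text, max_edits):
    """Return the shortest substring of text within max_edits of pattern."""
    for size in range(len(text) + 1):          # try the shortest sizes first
        for start in range(len(text) - size + 1):
            piece = text[start:start + size]
            if levenshtein(pattern, piece) <= max_edits:
                return piece
    return None                                # nothing is close enough


print(shortest_match("word", "cork", 2))  # or
print(shortest_match("word", "wool", 2))  # wo
```

Both results use the full budget of two edits, which is the sense in which the shortest match is also the most approximate one.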

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

552 .SH "ACKNOWLEDGEMENTS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

553 .IX Header "ACKNOWLEDGEMENTS"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

554 The following people have provided valuable test cases, documentation

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

555 clarifications, and other feedback:

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

556 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

557 Jared August, Arthur Bergman, Anirvan Chatterjee, Steve A. Chervitz,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

558 Aldo Calpini, David Curiel, Teun van den Dool, Alberto Fontaneda,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

559 Rob Fugina, Dmitrij Frishman, Lars Gregersen, Kevin Greiner,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

560 B. Elijah Griffin, Mike Hanafey, Mitch Helle, Ricky Houghton,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

561 \&'idallen', Helmut Jarausch, Damian Keefe, Ben Kennedy, Craig Kelley,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

562 Franz Kirsch, Dag Kristian, Mark Land, J. D. Laub, John P. Linderman,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

563 Tim Maher, Juha Muilu, Sergey Novoselov, Andy Oram, Ji Y Park,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

564 Eric Promislow, Nikolaus Rath, Stefan Ram, Slaven Rezic,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

565 Dag Kristian Rognlien, Stewart Russell, Slaven Rezic, Chris Rosin,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

566 Pasha Sadri, Ilya Sandler, Bob J.A. Schijvenaars, Ross Smith,

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

567 Frank Tobin, Greg Ward, Rich Williams, Rick Wise.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

568 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

569 The matching algorithm was developed by Udi Manber, Sun Wu, and Burra

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

570 Gopal in the Department of Computer Science, University of Arizona.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

571 .SH "AUTHOR"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

572 .IX Header "AUTHOR"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

573 Jarkko Hietaniemi <jhi@iki.fi>

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

574 .SH "COPYRIGHT AND LICENSE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

575 .IX Header "COPYRIGHT AND LICENSE"

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

577 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

578 This library is free software; you can redistribute it and/or modify

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

579 under either the terms of the Artistic License 2.0, or the \s-1GNU\s0 Library

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

580 General Public License, Version 2. See the files Artistic and \s-1LGPL\s0

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

581 for more details.

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

582 .PP

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

583 Furthermore: no warranties or obligations of any kind are given, and

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

584 the separate file \fI\s-1COPYRIGHT\s0\fR must be included intact in all copies

8eb7d93f7e58 Uploaded

plus91-technologies-pvt-ltd

parents:

diff changeset

585 and derived materials.


annotate 2.4/man/man3/String__Approx.3pm @ 16:8eb7d93f7e58 draft