I've been trying to learn UltraEdit, but I can't get this one feature to
work. I'm trying to tag the existing text on a line in a replace operation
so that I can duplicate it and put other stuff around the copies.
According to the documentation on regular expressions, stuff between ^( and
^) on the FIND WHAT line of the REPLACE dialog is supposed to get tagged so
that ^n on the REPLACE WITH line expands to the tagged text. N is a number
from 1-9 indicating the specific tagged expression. The first one is 1, the
second 2, etc. The documentation also says that * will match any sequence
of characters except newline. So I tried
FIND WHAT: ^(*^)
REPLACE WITH: if exist ^1 del ^1
then selected a bunch of lines that each contain a single string. I
expected a line containing "xxx", for example, to be converted to "if exist
xxx del xxx". However, it complains that the search expression couldn't be
found. This doesn't make any sense, since * is supposed to match anything.
Yes, I have regular expressions enabled and Unix style disabled. There is a
picture of the replace dialog at http://www.embedinc.com/temp/a.gif.
What am I missing?
*****************************************************************
Embed Inc, embedded system specialists in Littleton Massachusetts
(978) 742-9014, http://www.embedinc.com
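For comparison, here is a minimal sketch of the edit being asked for, written
with Python's re module. It uses the conventional ("Unix style") syntax, not
UltraEdit's native one, so it only illustrates the intended result and says
nothing about how UltraEdit behaves. The function name wrap_lines and the
sample text are made up for illustration.

```python
import re

def wrap_lines(text):
    """Turn each non-empty line X into 'if exist X del X'."""
    # (.+) tags one or more characters; '.' does not match newline by default.
    # re.MULTILINE makes ^ and $ anchor at every line, not just the whole text.
    return re.sub(r"^(.+)$", r"if exist \1 del \1", text, flags=re.MULTILINE)

sample = "xxx\nyyy\n"
result = wrap_lines(sample)
print(result)
```

Because the pattern requires at least one character, empty lines are left
untouched.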
> I've been trying to learn UltraEdit, but I can't get this one feature to
> work. I'm trying to tag the existing text on a line in a replace operation
> so that I can duplicate it and put other stuff around the copies.
>
> According to the documentation on regular expressions, stuff between ^( and
> ^) on the FIND WHAT line of the REPLACE dialog is supposed to get tagged so
> that ^n on the REPLACE WITH line expands to the tagged text. N is a number
> from 1-9 indicating the specific tagged expression. The first one is 1, the
> second 2, etc. The documentation also says that * will match any sequence
> of characters except newline. So I tried
>
> FIND WHAT: ^(*^)
> REPLACE WITH: if exist ^1 del ^1
>
> then selected a bunch of lines that each contain a single string. I
> expected a line containing "xxx", for example to be converted to "if exist
> xxx del xxx". However it complains that the search expression couldn't be
> found. This doesn't make any sense since * is supposed to match anything.
> Yes I have regular expressions enabled and Unix style disabled. There is a
> picture of the replace dialog at http://www.embedinc.com/temp/a.gif.
>
> What am I missing?
>
I gave it a try and couldn't get the pattern ^(*^) to work at all.
Then I tried ^(*.*$^), which seemed to match OK, but the replacement
would come out
I believe the <*> expression obeys the following rule (citing from the
EditPlus help):
"Character to the left of asterisk in the expression should match 0 or more
times"
i.e. <*> cannot be used by itself.
If I need to match anything, I usually use the <.*> combination (without the
brackets).
WBR Dmitry.
PS. Brief help page from Editplus editor.
<help-begin>
Expression  Description
\t          Tab character.
\n          New line.
.           Matches any character.
|           Either expression on its left and right side matches the target
            string. For example, "a|b" matches "a" and "b".
[]          Any of the enclosed characters may match the target character. For
            example, "[ab]" matches "a" and "b". "[0-9]" matches any digit.
[^]         None of the enclosed characters may match the target character. For
            example, "[^ab]" matches all character EXCEPT "a" and "b". "[^0-9]"
            matches any non-digit character.
*           Character to the left of asterisk in the expression should match 0
            or more times. For example "be*" matches "b", "be" and "bee".
+           Character to the left of plus sign in the expression should match 1
            or more times. For example "be+" matches "be" and "bee" but not "b".
?           Character to the left of question mark in the expression should
            match 0 or 1 time. For example "be?" matches "b" and "be" but not
            "bee".
^           Expression to the right of ^ matches only when it is at the
            beginning of line. For example "^A" matches an "A" that is only at
            the beginning of line.
$           Expression to the left of $ matches only when it is at the end of
            line. For example "e$" matches an "e" that is only at the end of
            line.
()          Affects evaluation order of expression and also used for tagged
            expression.
\           Escape character. If you want to use character "\" itself, you
            should use "\\".
The tagged expression is enclosed by (). Tagged expressions can be referenced by
\0, \1, \2, \3, etc. \0 indicates a tagged expression representing the entire
substring that was matched. \1 indicates the first tagged expression, \2 is the
second, etc. See following examples.
Original  Search   Replace   Result
abc       (ab)(c)  \0-\1-\2  abc-ab-c
abc       a(b)(c)  \0-\1-\2  abc-b-c
abc       (a)b(c)  \0-\1-\2  abc-a-c
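The three tagged-expression examples from the help page can be reproduced in
Python's re module as a rough cross-check. This is only a sketch in a
different tool, not EditPlus itself: Python writes the whole-match reference
as \g<0> rather than \0, and the helper name show is made up for illustration.

```python
import re

def show(original, search, replace):
    """Apply one search/replace pair to the original string."""
    return re.sub(search, replace, original)

# Python spells the whole-match reference \g<0> instead of \0.
template = r"\g<0>-\1-\2"

# (original, search) pairs taken from the help page's example table.
rows = [("abc", "(ab)(c)"), ("abc", "a(b)(c)"), ("abc", "(a)b(c)")]
results = [show(original, search, template) for original, search in rows]
print(results)
```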
>
> Olin,
>
> It looks like you can't just do a match on just * , you'll need at least one
> fixed character to search on.
>
> However since you are only using one string per line, you can include the
> newline character. So..
>
> FIND WHAT: ^(*^p^)
> REPLACE WITH: if exist ^1 del ^1^p
>
> I've tested this on UltraEdit v7.20a
>
> Mark.
>
Mark Thatcher wrote:
> It looks like you can't just do a match on just * , you'll need at
> least one fixed character to search on.
Ah, that's it. I just tried doing a replace on * to xxx and that wouldn't
work either. I think this is a bug in UltraEdit though. The documentation
about * as a search pattern says "Matches any number of occurrences of any
character except newline.". That sounds to me like it should match the
whole line, whether empty or not.
> However since you are only using one string per line, you can include
> the newline character. So..
>
> FIND WHAT: ^(*^p^)
> REPLACE WITH: if exist ^1 del ^1^p
I also tried ^(?+^) as the search pattern and that worked without having to
specify end of lines. I still don't understand why * didn't work, but I
guess ?+ is the workaround for matching an entire non-empty line. I guess
?++ would match an entire line whether empty or not, but I haven't tried
that.
Thanks for your help.
Olin Lathrop wrote:
> I also tried ^(?+^) as the search pattern and that worked without having to
> specify end of lines. I still don't understand why * didn't work, but I
> guess ?+ is the workaround for matching an entire non-empty line. I guess
> ?++ would match an entire line whether empty or not, but I haven't tried
> that.
? - allow one (optional).
+ - one required, allow additional (optional)
?+ seems nonsensical. allow one (optional), one required, allow
additional (optional)? Require one (optional) allow additional? One
what?
?,+, * and {x,y} (range) are usually grouped together as quantifiers.
They need something before them to use in the match.
I guess it wouldn't surprise me if some implementations use ? to mean
.? or * for .* but that would be rather annoying. Would 5* mean "any
number of 5's (including zero)" or would it mean "5 followed by any
characters at all?" Guess it would depend on whether the developer
decided ?, + and * on their own are a special case or not.
I wouldn't count on the behavior you are seeing being portable.
Have you tried .+ yet?
In a great many (most?) of the regex libraries, "." will match any
(single) character, "*" will match zero or more instances of the
_previous_ character (or grouped expression) and "+" will match one or
more instances of the _previous_ character (or grouped expression).
One problem you may run into is if "." happens to match newline. On
some implementations it does.
. - Match any (one) character. Possibly newline, but probably not.
.? - Match zero or one instance of any character.
.+ - Match one or more instances of any character.
.* - Match almost anything. (zero or more instances of any char)
.{2,5} - Match a minimum of 2 and a maximum of 5 instances of any char.
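As one concrete data point for the list above, this is how Python's re module
treats those quantifiers. It is a sketch of one conventional implementation
only, and other libraries may differ (notably on whether "." matches a
newline). The helper name whole is made up for illustration.

```python
import re

def whole(pattern, text, flags=0):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text, flags) is not None

checks = {
    # '.' is any single character, but not newline unless DOTALL is set.
    "dot": (whole(r".", "a"), whole(r".", "\n"), whole(r".", "\n", re.DOTALL)),
    # '.?' is zero or one character.
    "dot_opt": (whole(r".?", ""), whole(r".?", "a"), whole(r".?", "ab")),
    # '.+' is one or more characters.
    "dot_plus": (whole(r".+", ""), whole(r".+", "abc")),
    # '.*' is zero or more characters.
    "dot_star": (whole(r".*", ""), whole(r".*", "abc")),
    # '.{2,5}' is between 2 and 5 characters.
    "dot_range": (whole(r".{2,5}", "a"), whole(r".{2,5}", "abcde"),
                  whole(r".{2,5}", "abcdef")),
    # '5*' is any number of 5s; '5.*' is a 5 followed by anything.
    "five": (whole(r"5*", ""), whole(r"5*", "555"), whole(r"5*", "5abc"),
             whole(r"5.*", "5abc")),
}

for name, outcome in checks.items():
    print(name, outcome)
```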
Tad Anhalt wrote:
> ? - allow one (optional).
> + - one required, allow additional (optional)
>
> ?+ seems nonsensical. allow one (optional), one required, allow
> additional (optional)? Require one (optional) allow additional? One
> what?
I don't know where you're getting this from. The description seems
reasonably clear to me (although * apparently doesn't work the way I
interpreted it). ? matches any one character except newline, and + matches
one or more occurrences of the previous character. Therefore ?+ matches any
string of one or more characters up to the next newline. I've tried it and
that does in fact work.
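To make the two syntaxes easier to compare, here is a rough Python sketch
that rewrites the handful of native UltraEdit tokens mentioned in this thread
into conventional regex syntax. The token table is an assumption pieced
together from the posts here, not from UltraEdit's documentation: the ++
mapping is only a guess that nobody in the thread tested, mapping ^p to an
optional CR plus LF is a guess too, and the function names are made up. It
shows what the documentation appears to promise, not what UltraEdit does.

```python
import re

# Native token -> conventional equivalent (assumed, see the note above).
# Longer tokens come first so "++" is not read as two "+" tokens.
_FIND_TOKENS = [
    ("^(", "("),        # start of tagged expression
    ("^)", ")"),        # end of tagged expression
    ("^p", r"\r?\n"),   # line end (assumption)
    ("++", "*"),        # zero or more of previous (untested guess)
    ("+", "+"),         # one or more of previous
    ("?", "."),         # any one character except newline
    ("*", ".*"),        # any run of characters except newline
]

def native_find_to_python(pattern):
    """Translate a native-style FIND WHAT string into a Python regex."""
    out = []
    i = 0
    while i < len(pattern):
        for native, conventional in _FIND_TOKENS:
            if pattern.startswith(native, i):
                out.append(conventional)
                i += len(native)
                break
        else:
            # Anything else is treated as a literal character.
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)

def native_replace_to_python(replacement):
    """Translate a native-style REPLACE WITH string into a re.sub template."""
    out = []
    i = 0
    while i < len(replacement):
        pair = replacement[i:i + 2]
        if len(pair) == 2 and pair[0] == "^" and pair[1] in "123456789":
            out.append("\\" + pair[1])   # ^1..^9 become \1..\9
            i += 2
        elif pair == "^p":
            out.append("\n")
            i += 2
        elif replacement[i] == "\\":
            out.append("\\\\")           # keep literal backslashes literal
            i += 1
        else:
            out.append(replacement[i])
            i += 1
    return "".join(out)

find = native_find_to_python("^(?+^)")
repl = native_replace_to_python("if exist ^1 del ^1")
converted = re.sub(find, repl, "xxx\nyyy")
print(find, repl, converted, sep="\n")
```

Under these assumptions ^(?+^) becomes (.+) and ^(*^) becomes (.*), which is
why the original pattern would have been expected to match.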
> ?,+, * and {x,y} (range) are usually grouped together as quantifiers.
> They need something before them to use in the match.
Again, this is contrary to the documentation and how it appears to work.
Where are you getting this from?
> I guess it wouldn't surprise me if some implementations ...
I've only ever used the one version (10.20c), so all I can go by is the help
for that version. So you are saying this was significantly different in
previous versions?
> Would 5* mean "any
> number of 5's (including zero)" or would it mean "5 followed by any
> characters at all?"
The latter, at least that's how I interpret the documentation.
> I wouldn't count on the behavior you are seeing being portable.
Of course not. Why would I? I've used enough editors to know that each has
its own details in this area. For example, the Apollo editor used * to
indicate 0 or more repetitions of the previous character whereas on many
other systems it is a wildcard string match.
> Have you tried .+ yet?
No, why? That would match one or more consecutive periods. How is this
remotely relevant to the problem?
> In a great many (most?) of the regex libraries, "." will match any
> (single) character, "*" will match zero or more instances of the
> _previous_ character (or grouped expression) and "+" will match one or
> more instances of the _previous_ character (or grouped expression).
>
> One problem you may run into is if "." happens to match newline. On
> some implementations it does.
>
> . - Match any (one) character. Possibly newline, but probably not.
> .? - Match zero or one instance of any character.
> .+ - Match one or more instances of any character.
> .* - Match almost anything. (zero or more instances of any char)
> .{2,5} - Match a minimum of 2 and a maximum of 5 instances of any char.
Speculating how other systems might implement this is pointless. We're not
talking about other systems. We are talking about UltraEdit specifically
with "Unix style" regular expressions turned off.
Olin Lathrop wrote:
>> Have you tried .+ yet?
> No, why? That would match one or more consecutive periods. How is this
> remotely relevant to the problem?
Because, as you've found out sometimes the documentation is either
misleading or wrong.
> Speculating how other systems might implement this is pointless. We're not
> talking about other systems. We are talking about UltraEdit specifically
> with "Unix style" regular expressions turned off.
Tad Anhalt wrote:
>> We are talking about UltraEdit
>> specifically with "Unix style" regular expressions turned off.
>
> Maybe you should try turning them on.
I'm really trying to learn UltraEdit in its native form.
Olin Lathrop wrote:
> I'm really trying to learn UltraEdit in its native form.
Sorry if my reply sounded snippy. It wasn't intended that way, but
re-reading it... maybe it could have come across that way.
Personally, if a tool supports the form of regex that is used across
most of the other tools in my inventory, I'd prefer to use it. At
least until I run into something that can't be done without diving into
the guts of the proprietary form.
You should, of course, do whatever makes sense to you. IMHO,
learning and using the "unix style" (if you haven't already) is well
worth it in the long run because it works across a huge range of tools.
Now that there are good libraries available for everything from C++ to
Java and beyond they'll only gain ground going forward.