Group string
Latest revision as of 17:14, 14 October 2014
Function Syntax
string [int, int] group_string( string group_me, string group_by )
- group_me is the string to split into groups
- group_by is the regular expression to group by
This function finds all (non-overlapping) matches of the regular expression group_by within the string group_me. A 2-dimensional array of matching strings is returned. The first index is the number of the match, starting with 0 for the first match. The second index is the number of the group within each match, with exactly the same meaning as the second parameter to group(): element 0 is the entire text matched by the regular expression, element 1 is the text matched by the first set of grouping parentheses within the expression, element 2 is the text matched by the second set of grouping parentheses, and so on.
Code Sample
In this example, the regex tries to match a number followed by a space followed by a word. The number and the word are placed into groups.
string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] test = group_string( input, "(\\d+) (\\w+)" );
foreach i,j in test
    print("test[" + i + "][" + j + "] = " + test[i][j]);
The first match is "134 Muscle"; within it, the first group is "134" and the second group is "Muscle". The resulting output is:
> call test.ash
test[0][0] = 134 Muscle
test[0][1] = 134
test[0][2] = Muscle
test[1][0] = 97 Mysticality
test[1][1] = 97
test[1][2] = Mysticality
test[2][0] = 102 Moxie
test[2][1] = 102
test[2][2] = Moxie
group_string exists for convenience; it does not add anything to ASH beyond what is already possible with create_matcher. The following is functionally equivalent, albeit more verbose:
string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] test;
// populate the long way
matcher m = create_matcher( "(\\d+) (\\w+)", input );
int i = 0;
while ( m.find() )
{
    // group 0 is the whole match; groups 1..group_count() are the parenthesized groups
    for x from 0 to m.group_count()
    {
        test[i][x] = m.group( x );
    }
    i += 1;
}
foreach i,j in test
    print("test[" + i + "][" + j + "] = " + test[i][j]);
The output is the same as before. To better understand what's going on, try removing the parentheses from the regular expression:
string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] test = group_string( input, "\\d+ \\w+" );
foreach i,j in test
    print("test[" + i + "][" + j + "] = " + test[i][j]);
The resulting output is:
> call test.ash
test[0][0] = 134 Muscle
test[1][0] = 97 Mysticality
test[2][0] = 102 Moxie
As you can see, group_string simply applies the regular expression repeatedly, populating a map with the group results along the way. The first int is the match (loop) index, while the second int is the group index, with index 0 representing the full match, exactly as in group.
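As a practical illustration of the two indices, the following sketch turns the captured groups into a map from word to number. It assumes that iterating a 2-dimensional map with a single foreach variable walks the first (match) index; the names matches, stat_values, label and amount are invented for this example and are not part of the function.

```ash
string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] matches = group_string( input, "(\\d+) (\\w+)" );

// stat_values is a hypothetical name used only for this illustration
int [string] stat_values;
foreach i in matches
{
    // within each match, group 1 is the number and group 2 is the word
    stat_values[ matches[i][2] ] = to_int( matches[i][1] );
}

foreach label, amount in stat_values
    print( label + " = " + amount );
```

Each word should end up as a key whose value is the number that preceded it in the input. Group 0 (the full match) is simply not used here.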
See Also
create_matcher
More Information
Please refer to Regular Expressions for general tips. See this thread (http://kolmafia.us/showthread.php?t=318) for details on this function.