Difference between revisions of "Group string"

imported>StDoodle
imported>Ulti
(navigation)
 
(9 intermediate revisions by 6 users not shown)

Latest revision as of 17:14, 14 October 2014

Function Syntax

string [int, int] group_string( string group_me, string group_by )

  • group_me is the string to split into groups
  • group_by is the regular expression to group by

This function finds all non-overlapping matches of the regular expression group_by within the string group_me and returns them as a two-dimensional map of strings. The first index is the number of the match, starting with 0 for the first match. The second index is the number of the group within that match, with exactly the same meaning as the second parameter to group(): element 0 is the entire text matched by the regular expression, element 1 is the text matched by the first set of grouping parentheses within the expression, element 2 is the text matched by the second set, and so on.

Code Sample

In this example, the regex tries to match a number followed by a space followed by a word. The number and the word are placed into groups.

string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] test = group_string( input, "(\\d+) (\\w+)" );
foreach i,j in test
	print("test[" + i + "][" + j + "] = " + test[i][j]);

The first match is "134 Muscle"; within it, the first group is "134" and the second group is "Muscle". The resulting output is:

> call test.ash

test[0][0] = 134 Muscle
test[0][1] = 134
test[0][2] = Muscle
test[1][0] = 97 Mysticality
test[1][1] = 97
test[1][2] = Mysticality
test[2][0] = 102 Moxie
test[2][1] = 102
test[2][2] = Moxie

group_string exists purely for convenience; it does not extend ASH's capabilities beyond what is already possible with create_matcher. The following is functionally equivalent, albeit uglier:

string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] test;
// populate the long way
matcher m = create_matcher( "(\\d+) (\\w+)", input );
int i = 0;
while ( m.find() )
{
	// group 0 is the whole match; groups 1 to group_count() are the parenthesized groups
	for x from 0 to m.group_count()
	{
		test[i][x] = m.group( x );
	}
	i += 1;
}
foreach i,j in test
	print("test[" + i + "][" + j + "] = " + test[i][j]);

The output is the same as before. To better understand what's going on, try removing the parentheses from the regular expression:

string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] test = group_string( input, "\\d+ \\w+" );
foreach i,j in test
	print("test[" + i + "][" + j + "] = " + test[i][j]);

The resulting output is:

> call test.ash

test[0][0] = 134 Muscle
test[1][0] = 97 Mysticality
test[2][0] = 102 Moxie

As you can see, all group_string does is apply the regular expression repeatedly, populating a map with the group results along the way. The first int is the index of the match, while the second int is the group index, with index 0 representing the full match, exactly as in group.
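
As a further illustration of the two indexes, the sketch below turns the captured groups into a map keyed by stat name. It is only a minimal example, not part of the original samples: the variable names matches and stats are made up for illustration, and it assumes that an input with no matches produces an empty map, which follows from the looping behaviour described above.

```ash
string input = "134 Muscle 97 Mysticality 102 Moxie";
string [int,int] matches = group_string( input, "(\\d+) (\\w+)" );

// Assumption: no matches means the map is simply empty.
if ( count( matches ) == 0 )
	print( "No matches found." );

// stats is a hypothetical name; it maps each word (group 2) to its number (group 1).
int [string] stats;
foreach i in matches
	stats[ matches[i][2] ] = matches[i][1].to_int();

foreach name, value in stats
	print( name + " = " + value );
```

Iterating with a single key (foreach i in matches) walks the matches only, so each pass can pick out the groups it needs by their fixed group index.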

See Also

create_matcher()

More Information

Please refer to Regular Expressions for general tips. See this thread (http://kolmafia.us/showthread.php?t=318) for details on this function.