Talk:String Handling Routines
So wtf does group_string actually do? The linked "descriptive" post has an utterly unhelpful example. Has anyone ever used it for anything?
Groups a string into a map using a regular expression. To understand the function you must know. 1. What maps are and how they are used. 2. Understand what regular expressions are and how to create them.
Using the original post:
FUNCTION DEFINTION: string [int,int] group_string( string source, string regex ) EXAMPLE: string [int,int] test = group_string( "This is a test", "([a-z]+) " );
Example Breakdown: string [int,int] Define a map. Two dimensional. The indices are integers. The data is stored as a string. test Define the map with name test. group_string Call the function. "This is a Test" Feeding the function a sample string. "([a-z]+) " Your regular expression.
Regular expressions deal with pattern matching. You want the function to find a particular pattern. The function then returns that pattern, or the stuff before it, or the stuff after it, or splits them appart, or squeezes them together. So what does this regular expression look for? The Parenthesis (): Tell the function this is a group of characters. [a-z]: Tell us they will be lower case letters. +: Tell us to look for one or more characters. That space between the ) and " Tells us the pattern ends in a space.
Thus reading down the string. T = Does not match [a-z] is a capital letter. h = Matches [a-z]. Starting Group i = Matches [a-z] s = Matches [a-z]
= Matches space. First group found and is "his "
i = Matches [a-z]. Starting Group s = Matches [a-z]
= Matches space. Second group found, and is "is "
a = Matches [a-z]. Starting Group
= Matches space. Third group found, and is "a "
t = Matches [a-z]. Starting Group e = Matches [a-z] s = Matches [a-z] t = Matches [a-z] End of line. No more matches. Stop.
Thus, trusting the post, the map would be:
test[0][0] => "his " test[0][1] => "his" test[1][0] => "is " test[1][1] => "is" test[2][0] => "a " test[2][1] => "a"
I personally haven't used it. Would be used in parsing a page by hand.