For example, take the following snippet of a CSV file.
username, password,home folder,default editor, favorite color,web address
Notice that the delimiter is inconsistent in this example. Sometimes I have tokens separated with a single comma (",") and sometimes with a comma followed by a space (", "). If the delimiter were consistent it would be a simple matter to get all tokens with the following code:
String[] tokens = line.split(",");
First of all, let's note how much simpler this is than the old method of using a StringTokenizer to loop through the text scanning for more tokens. String's split method, added in the Java 1.4 release, is a great improvement. The second thing you should note is that the split method accepts one argument, and that argument is a regular expression. This should be a clue to solving our CSV formatting problem. How do we specify that the delimiter in our file is a comma that might sometimes be followed by a space? By harnessing the power of regular expressions. (OK, that's overstating the case by quite a bit. We're really only harnessing a tiny fraction of the power of regular expressions.)
String[] tokens = line.split(",\\s*");
That says to split the line on any comma followed by zero or more whitespace characters (check out Mastering Regular Expressions for an in-depth guide to regular expression syntax). This single line of code should have the desired effect of splitting the line into an array of six tokens: username, password, home folder, default editor, favorite color, and web address.
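To see the difference between the two delimiters side by side, here is a minimal sketch that splits the sample line both ways and prints each resulting array. The class name SplitDemo and the helper tokenize are made up for this illustration; only String.split and Arrays.toString come from the standard library.

```java
import java.util.Arrays;

class SplitDemo {
    // Hypothetical helper: splits on a comma followed by zero or more whitespace characters.
    static String[] tokenize(String line) {
        return line.split(",\\s*");
    }

    public static void main(String[] args) {
        String line = "username, password,home folder,default editor, favorite color,web address";

        // Plain comma: tokens that followed ", " keep their leading space.
        System.out.println(Arrays.toString(line.split(",")));

        // Comma plus optional whitespace: the delimiter swallows the leading spaces.
        System.out.println(Arrays.toString(tokenize(line)));
    }
}
```

With the plain comma, tokens such as " password" come back with their leading space still attached. With the regular expression, that space is consumed as part of the delimiter.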
Try it out by creating a simple class that reads a single line of comma-delimited text and prints out the tokens. If you really want to see what an improvement String's split method brings, try writing the same function using a StringTokenizer instead.
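For comparison, here is one way the StringTokenizer version might look. This is a sketch, not a drop-in equivalent: the class name TokenizerDemo and the helper tokenize are invented for the example, and it assumes a Java version with generics and the diamond operator (Java 7 or later). StringTokenizer treats its second argument as a set of delimiter characters rather than a regular expression, so the surrounding whitespace has to be trimmed by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Hypothetical helper: collects comma-separated tokens using StringTokenizer.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            // The delimiter is a plain character here, so strip whitespace manually.
            tokens.add(st.nextToken().trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "username, password,home folder,default editor, favorite color,web address";
        for (String token : tokenize(line)) {
            System.out.println(token);
        }
    }
}
```

Two behaviors differ from split(",\\s*") and are worth keeping in mind: trim() also removes trailing whitespace from each token, and StringTokenizer silently skips empty fields, so a line like "a,,b" yields two tokens here but three with split.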