Counting the words in a string can be tricky because the answer depends on the requirements: should you split on whitespace, split at a specified length, or count only the distinct set of words? This example shows how to count the number of words in a string by breaking it apart on whitespace. A related example shows how to count distinct words in a file.
Straight up Java
StringTokenizer uses the space, tab, newline, carriage-return, and form-feed characters (" \t\n\r\f") as its default set of delimiters when splitting a string. countTokens() returns the number of tokens remaining in the string using the current delimiter set, which on a freshly created tokenizer is the number of words in the string. Note that the delimiters themselves are not treated as tokens.
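A minimal sketch of the StringTokenizer approach described above. The class name, method name, and sample sentence are hypothetical and chosen only for illustration.

```java
import java.util.StringTokenizer;

class TokenizerWordCount {

    // Counts words by letting StringTokenizer split on its default
    // delimiters: space, tab, newline, carriage return and form feed.
    static int countWords(String text) {
        return new StringTokenizer(text).countTokens();
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence mixing spaces, a tab and a newline.
        String phrase = "The quick brown fox\tjumps over\nthe lazy dog";
        System.out.println(countWords(phrase)); // prints 9
    }
}
```

Because runs of delimiters never produce empty tokens, leading, trailing, or repeated whitespace does not inflate the count.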
Java 8
Calling String.split() with a space as the delimiter breaks the string into an array of words. Passing that array to Arrays.stream() and calling count() performs a reduction that counts the elements; alternatively, we could read the array's length field.
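Both variants could be sketched as follows. The class and method names are hypothetical, and the input is assumed to contain single spaces between words.

```java
import java.util.Arrays;

class SplitWordCount {

    // Splits on a single space and counts the elements with a stream reduction.
    static long countWithStream(String text) {
        return Arrays.stream(text.split(" ")).count();
    }

    // Same split, but reads the array's length field directly.
    static int countWithLength(String text) {
        return text.split(" ").length;
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence with exactly one space between words.
        String phrase = "The quick brown fox jumps over the lazy dog";
        System.out.println(countWithStream(phrase)); // prints 9
        System.out.println(countWithLength(phrase)); // prints 9
    }
}
```

One caveat with a single-space delimiter: consecutive spaces produce empty strings in the array, so "a  b" is counted as three elements. If the input may contain runs of whitespace, trimming the string and splitting on the regular expression "\\s+" is a common alternative.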
Google Guava
Guava's Splitter can split a string on whitespace, and calling size() on the resulting list gives us the number of words in the string.
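A minimal sketch of the Guava approach, assuming a Guava release recent enough to provide CharMatcher.whitespace() and Splitter.splitToList() on the classpath. The class and method names are hypothetical.

```java
import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import java.util.List;

class GuavaWordCount {

    // Splits on any whitespace character and drops the empty strings that
    // consecutive delimiters would otherwise produce.
    static int countWords(String text) {
        List<String> words = Splitter.on(CharMatcher.whitespace())
                .omitEmptyStrings()
                .splitToList(text);
        return words.size();
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence.
        String phrase = "The quick brown fox jumps over the lazy dog";
        System.out.println(countWords(phrase)); // prints 9
    }
}
```

The omitEmptyStrings() call is a design choice in this sketch: without it, repeated whitespace would add empty entries to the list and inflate the count.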