com.reuters.msgtest.load.output.csv
Class StringHelper

java.lang.Object
  extended by com.reuters.msgtest.load.output.csv.StringHelper

public class StringHelper
extends java.lang.Object

Utilities for String formatting, manipulation, and queries. More information about this class is available from ostermiller.org.

Since:
ostermillerutils 1.00.00
Author:
Stephen Ostermiller http://ostermiller.org/contact.pl?regarding=Java+Utilities

Constructor Summary
StringHelper()
           
 
Method Summary
static boolean containsAny(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string contains any of the given terms.
static boolean containsAnyIgnoreCase(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string contains any of the given terms, ignoring case.
static boolean endsWithAny(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string ends with any of the given terms.
static boolean endsWithAnyIgnoreCase(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string ends with any of the given terms, ignoring case.
static boolean equalsAny(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string equals any of the given terms.
static boolean equalsAnyIgnoreCase(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string equals any of the given terms, ignoring case.
static java.lang.String escapeHTML(java.lang.String s)
          Replaces characters that may be confused by an HTML parser with their equivalent character entity references.
static java.lang.String escapeJavaLiteral(java.lang.String s)
          Replaces characters that are not allowed in a Java style string literal with their escape characters.
static java.lang.String escapeRegularExpressionLiteral(java.lang.String s)
          Escapes characters that have special meaning to regular expressions.
static java.lang.String escapeSQL(java.lang.String s)
          Replaces characters that may be confused by an SQL parser with their equivalent escape characters.
static java.util.regex.Pattern getContainsAnyIgnoreCasePattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string contains any of the given terms, ignoring case.
static java.util.regex.Pattern getContainsAnyPattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string contains any of the given terms.
static java.util.regex.Pattern getEndsWithAnyIgnoreCasePattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string ends with any of the given terms, ignoring case.
static java.util.regex.Pattern getEndsWithAnyPattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string ends with any of the given terms.
static java.util.regex.Pattern getEqualsAnyIgnoreCasePattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string equals any of the given terms, ignoring case.
static java.util.regex.Pattern getEqualsAnyPattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string equals any of the given terms.
static java.util.regex.Pattern getStartsWithAnyIgnoreCasePattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string starts with any of the given terms, ignoring case.
static java.util.regex.Pattern getStartsWithAnyPattern(java.lang.String[] terms)
          Compile a pattern that will match a string if the string starts with any of the given terms.
static java.lang.String midpad(java.lang.String s, int length)
          Pad the beginning and end of the given String with spaces until the String is of the given length.
static java.lang.String midpad(java.lang.String s, int length, char c)
          Pad the beginning and end of the given String with the given character until the result is the desired length.
static java.lang.String postpad(java.lang.String s, int length)
          Pad the end of the given String with spaces until the String is of the given length.
static java.lang.String postpad(java.lang.String s, int length, char c)
          Append the given character to the String until the result is the desired length.
static java.lang.String prepad(java.lang.String s, int length)
          Pad the beginning of the given String with spaces until the String is of the given length.
static java.lang.String prepad(java.lang.String s, int length, char c)
          Pre-pend the given character to the String until the result is the desired length.
static java.lang.String replace(java.lang.String s, java.lang.String find, java.lang.String replace)
          Replace occurrences of a substring.
static java.lang.String[] split(java.lang.String s, java.lang.String delimiter)
          Split the given String into tokens.
static boolean startsWithAny(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string starts with any of the given terms.
static boolean startsWithAnyIgnoreCase(java.lang.String s, java.lang.String[] terms)
          Tests to see if the given string starts with any of the given terms, ignoring case.
static java.lang.String trim(java.lang.String s, java.lang.String c)
          Trim any of the characters contained in the second string from the beginning and end of the first.
static java.lang.String unescapeHTML(java.lang.String s)
          Turn any HTML escape entities in the string into characters and return the resulting string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StringHelper

public StringHelper()
Method Detail

prepad

public static java.lang.String prepad(java.lang.String s,
                                      int length)
Pad the beginning of the given String with spaces until the String is of the given length.

If the String is already longer than the desired length, it is returned unchanged: it is not truncated and no padding is added.

Parameters:
s - String to be padded.
length - desired length of result.
Returns:
padded String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00

prepad

public static java.lang.String prepad(java.lang.String s,
                                      int length,
                                      char c)
Pre-pend the given character to the String until the result is the desired length.

If the String is already longer than the desired length, it is returned unchanged: it is not truncated and no padding is added.

Parameters:
s - String to be padded.
length - desired length of result.
c - padding character.
Returns:
padded String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00

postpad

public static java.lang.String postpad(java.lang.String s,
                                       int length)
Pad the end of the given String with spaces until the String is of the given length.

If the String is already longer than the desired length, it is returned unchanged: it is not truncated and no padding is added.

Parameters:
s - String to be padded.
length - desired length of result.
Returns:
padded String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00

postpad

public static java.lang.String postpad(java.lang.String s,
                                       int length,
                                       char c)
Append the given character to the String until the result is the desired length.

If the String is already longer than the desired length, it is returned unchanged: it is not truncated and no padding is added.

Parameters:
s - String to be padded.
length - desired length of result.
c - padding character.
Returns:
padded String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
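
The padding rules described for prepad and postpad can be sketched as follows. This is a minimal illustration written for this page, not the library's source; the class name PadSketch is hypothetical, and the only behavior assumed is what the descriptions above state (pad up to the desired length, never truncate, fail on a null String).

```java
final class PadSketch {
    // Hypothetical re-implementation for illustration only.
    static String prepad(String s, int length, char c) {
        int needed = length - s.length(); // NullPointerException if s is null
        if (needed <= 0) {
            return s; // already long enough: no truncation, no padding
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < needed; i++) {
            sb.append(c);
        }
        return sb.append(s).toString();
    }

    static String postpad(String s, int length, char c) {
        int needed = length - s.length(); // NullPointerException if s is null
        if (needed <= 0) {
            return s;
        }
        StringBuilder sb = new StringBuilder(length).append(s);
        for (int i = 0; i < needed; i++) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this sketch, prepad("7", 3, '0') yields "007" and postpad("ab", 4, '.') yields "ab..", while a String longer than the requested length comes back unchanged.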

midpad

public static java.lang.String midpad(java.lang.String s,
                                      int length)
Pad the beginning and end of the given String with spaces until the String is of the given length. As a result, the original String is centered in the new String.

If the number of characters to pad is even, the padding is split evenly between the beginning and the end; otherwise, the extra character is added to the end.

If the String is already longer than the desired length, it is returned unchanged: it is not truncated and no padding is added.

Parameters:
s - String to be padded.
length - desired length of result.
Returns:
padded String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00

midpad

public static java.lang.String midpad(java.lang.String s,
                                      int length,
                                      char c)
Pad the beginning and end of the given String with the given character until the result is the desired length. As a result, the original String is centered in the new String.

If the number of characters to pad is even, the padding is split evenly between the beginning and the end; otherwise, the extra character is added to the end.

If the String is already longer than the desired length, it is returned unchanged: it is not truncated and no padding is added.

Parameters:
s - String to be padded.
length - desired length of result.
c - padding character.
Returns:
padded String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
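
The centering rule for midpad (an odd amount of padding puts the extra character at the end) can be sketched as follows. The class name MidpadSketch is hypothetical and the code is an illustration of the documented rule, not the library's implementation.

```java
final class MidpadSketch {
    // Hypothetical re-implementation for illustration only.
    static String midpad(String s, int length, char c) {
        int needed = length - s.length(); // NullPointerException if s is null
        if (needed <= 0) {
            return s; // no truncation, no padding
        }
        int before = needed / 2;     // integer division rounds down
        int after = needed - before; // so any odd extra character lands at the end
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < before; i++) {
            sb.append(c);
        }
        sb.append(s);
        for (int i = 0; i < after; i++) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, midpad("ab", 5, '*') needs three padding characters, so one goes in front and two go behind, giving "*ab**".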

split

public static java.lang.String[] split(java.lang.String s,
                                       java.lang.String delimiter)
Split the given String into tokens.

This method is meant to be similar to the split function in other programming languages, but it does not use regular expressions. Instead, the String is split on a single String literal.

Unlike java.util.StringTokenizer, which treats each character of its delimiter argument as a separate delimiter, this method treats the whole delimiter as a single String literal.

Each empty token (for example, the one between two adjacent delimiters) is returned as an empty String. Delimiters are never returned as tokens.

If there is no delimiter because it is either empty or null, the only element in the result is the original String.

StringHelper.split("1-2-3", "-");
result: {"1", "2", "3"}
StringHelper.split("-1--2-", "-");
result: {"", "1", "", "2", ""}
StringHelper.split("123", "");
result: {"123"}
StringHelper.split("1-2---3----4", "--");
result: {"1-2", "-3", "", "4"}

Parameters:
s - String to be split.
delimiter - String literal on which to split.
Returns:
an array of tokens.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
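
The literal-delimiter splitting described above can be sketched with String.indexOf. The class name SplitSketch is hypothetical; the code is an illustration that reproduces the four documented results, not the library's source.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {
    // Hypothetical re-implementation for illustration only.
    static String[] split(String s, String delimiter) {
        if (s == null) {
            throw new NullPointerException("s");
        }
        if (delimiter == null || delimiter.isEmpty()) {
            return new String[] { s }; // no delimiter: the original String is the only token
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int hit;
        // Scan left to right; each match ends one token and starts the next.
        while ((hit = s.indexOf(delimiter, start)) != -1) {
            tokens.add(s.substring(start, hit)); // may be empty
            start = hit + delimiter.length();
        }
        tokens.add(s.substring(start)); // trailing token, possibly empty
        return tokens.toArray(new String[0]);
    }
}
```

Because matching proceeds left to right without overlap, "1-2---3----4" split on "--" consumes the first two dashes of "---" and leaves the third as part of the next token, which is why the second token is "-3".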

replace

public static java.lang.String replace(java.lang.String s,
                                       java.lang.String find,
                                       java.lang.String replace)
Replace occurrences of a substring.

StringHelper.replace("1-2-3", "-", "|");
result: "1|2|3"
StringHelper.replace("-1--2-", "-", "|");
result: "|1||2|"
StringHelper.replace("123", "", "|");
result: "123"
StringHelper.replace("1-2---3----4", "--", "|");
result: "1-2|-3||4"
StringHelper.replace("1-2---3----4", "--", "---");
result: "1-2----3------4"

Parameters:
s - String to be modified.
find - String to find.
replace - String to substitute for each occurrence of find.
Returns:
a string with all the occurrences of the string to find replaced.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
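
The documented results for replace can be reproduced with a left-to-right scan. The class name ReplaceSketch is hypothetical, and treating a null find like an empty one and a null replacement like an empty String are assumptions made for this sketch; the page above only documents the empty find case.

```java
final class ReplaceSketch {
    // Hypothetical re-implementation for illustration only.
    static String replace(String s, String find, String replace) {
        if (s == null) {
            throw new NullPointerException("s");
        }
        if (find == null || find.isEmpty()) {
            return s; // nothing to search for (null handling is an assumption)
        }
        String substitute = (replace == null) ? "" : replace; // assumption
        StringBuilder sb = new StringBuilder(s.length());
        int start = 0;
        int hit;
        while ((hit = s.indexOf(find, start)) != -1) {
            sb.append(s, start, hit).append(substitute);
            start = hit + find.length(); // replaced text is never rescanned
        }
        return sb.append(s, start, s.length()).toString();
    }
}
```

Since the scan resumes after each match rather than inside the substituted text, replacing "--" with "---" terminates and yields "1-2----3------4" for the last example. Note that the JDK method String.replace(CharSequence, CharSequence) differs for an empty search string: it inserts the replacement between every character instead of returning the input unchanged.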

escapeHTML

public static java.lang.String escapeHTML(java.lang.String s)
Replaces characters that may be confused by an HTML parser with their equivalent character entity references.

Any data that will appear as text on a web page should be escaped. This is especially important for data that comes from untrusted sources such as Internet users. A common mistake in CGI programming is to ask a user for data and then put that data on a web page. For example:

 Server: What is your name?
 User: <b>Joe<b>
 Server: Hello Joe, Welcome
If the name is put on the page without checking that it doesn't contain HTML code or without sanitizing that HTML code, the user could reformat the page, insert scripts, and control the content on your web server.

This method will replace HTML characters such as > with their HTML entity reference (&gt;) so that the HTML parser will be sure to interpret them as plain text rather than HTML or script.

This method should be used both for data displayed as text in the HTML document and for data placed in form elements. For example:
<html><body>This is not a &lt;tag&gt; in HTML</body></html>
and
<form><input type="hidden" name="date" value="This data could be &quot;malicious&quot;"></form>
In the second example, the form data would be properly resubmitted to your CGI script in URL-encoded form:
This data could be %22malicious%22

Parameters:
s - String to be escaped
Returns:
escaped String
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
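
The kind of substitution escapeHTML performs can be sketched as follows. The exact set of characters the library escapes is not listed on this page, so the four characters handled here (<, >, & and the double quote) are an assumption based on the examples above; the class and method names are hypothetical.

```java
final class HtmlEscapeSketch {
    // Hypothetical illustration; the real method may escape more characters.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16); // NullPointerException if s is null
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(ch);
            }
        }
        return sb.toString();
    }
}
```

Escaping the ampersand matters as much as escaping the angle brackets: without it, input that already contains text such as &lt; would be displayed as < instead of literally.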

escapeSQL

public static java.lang.String escapeSQL(java.lang.String s)
Replaces characters that may be confused by an SQL parser with their equivalent escape characters.

Any data that will be put in an SQL query should be escaped. This is especially important for data that comes from untrusted sources such as Internet users.

For example if you had the following SQL query:
"SELECT * FROM addresses WHERE name='" + name + "' AND private='N'"
Without this function a user could give "' OR 1=1 OR ''='" as their name, causing the query to be:
"SELECT * FROM addresses WHERE name='' OR 1=1 OR ''='' AND private='N'"
which will give all addresses, including private ones.
Correct usage would be:
"SELECT * FROM addresses WHERE name='" + StringHelper.escapeSQL(name) + "' AND private='N'"

Another way to avoid this problem is to use a PreparedStatement with appropriate placeholders.

Parameters:
s - String to be escaped
Returns:
escaped String
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
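
The injection described above, and the PreparedStatement alternative, can be sketched as follows. The class and method names are hypothetical, and the addresses table with its name and private columns is taken from the example query; the Connection is assumed to be supplied by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class AddressQuerySketch {
    // Unsafe: the user-supplied name becomes part of the SQL text.
    static String unsafeQuery(String name) {
        return "SELECT * FROM addresses WHERE name='" + name + "' AND private='N'";
    }

    // Safer: the name is bound as a parameter and never parsed as SQL.
    static List<String> findPublicNames(Connection conn, String name) throws SQLException {
        String sql = "SELECT name FROM addresses WHERE name = ? AND private = 'N'";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, name);
            try (ResultSet rs = ps.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
                return names;
            }
        }
    }
}
```

Calling unsafeQuery with the malicious name from the example produces exactly the widened query shown above, whereas the parameterized version would simply search for a person whose name is that odd string.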

escapeJavaLiteral

public static java.lang.String escapeJavaLiteral(java.lang.String s)
Replaces characters that are not allowed in a Java style string literal with their escape characters. Specifically, quote ("), single quote ('), new line (\n), carriage return (\r), backslash (\), and tab (\t) are escaped.

Parameters:
s - String to be escaped
Returns:
escaped String
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
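
The six substitutions listed for escapeJavaLiteral can be sketched as a single pass over the input. The class and method names are hypothetical, and the sketch covers only the characters named in the description above.

```java
final class JavaLiteralSketch {
    // Hypothetical illustration covering only the six documented characters.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8); // NullPointerException if s is null
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '"':  sb.append("\\\""); break; // backslash + quote
                case '\'': sb.append("\\'");  break; // backslash + single quote
                case '\n': sb.append("\\n");  break; // backslash + n
                case '\r': sb.append("\\r");  break; // backslash + r
                case '\\': sb.append("\\\\"); break; // two backslashes
                case '\t': sb.append("\\t");  break; // backslash + t
                default:   sb.append(ch);
            }
        }
        return sb.toString();
    }
}
```

The result is text that can be pasted between double quotes in Java source: a real line break in the input becomes the two characters backslash and n in the output.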

trim

public static java.lang.String trim(java.lang.String s,
                                    java.lang.String c)
Trim any of the characters contained in the second string from the beginning and end of the first.

Parameters:
s - String to be trimmed.
c - list of characters to trim from s.
Returns:
trimmed String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00
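
Trimming by character set, as described for trim, can be sketched by moving two indexes inward while the characters they point at belong to the set. The class name TrimSketch is hypothetical; this is an illustration, not the library's source.

```java
final class TrimSketch {
    // Hypothetical re-implementation for illustration only.
    static String trim(String s, String c) {
        int start = 0;
        int end = s.length(); // NullPointerException if s is null
        // Skip leading characters that appear anywhere in c.
        while (start < end && c.indexOf(s.charAt(start)) >= 0) {
            start++;
        }
        // Skip trailing characters that appear anywhere in c.
        while (end > start && c.indexOf(s.charAt(end - 1)) >= 0) {
            end--;
        }
        return s.substring(start, end);
    }
}
```

The second argument is a set of individual characters rather than a prefix or suffix: trimming "--abc-=" with "-=" gives "abc", and characters from the set that occur in the interior of the String are left alone.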

unescapeHTML

public static java.lang.String unescapeHTML(java.lang.String s)
Turn any HTML escape entities in the string into characters and return the resulting string.

Parameters:
s - String to be unescaped.
Returns:
unescaped String.
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.00.00

escapeRegularExpressionLiteral

public static java.lang.String escapeRegularExpressionLiteral(java.lang.String s)
Escapes characters that have special meaning to regular expressions.

Parameters:
s - String to be escaped
Returns:
escaped String
Throws:
java.lang.NullPointerException - if s is null.
Since:
ostermillerutils 1.02.25
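
To show why escaping a literal matters before it is embedded in a regular expression, the sketch below uses the JDK's own java.util.regex.Pattern.quote (available since Java 5) rather than this class. It is a related standard-library technique, not a description of how escapeRegularExpressionLiteral is implemented; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class RegexLiteralSketch {
    // True only if input is exactly the literal text, with no regex interpretation.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    // Same test without quoting, so metacharacters such as '.' keep their meaning.
    static boolean matchesAsRegex(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }
}
```

Unquoted, the pattern a.b also matches "axb" because the dot matches any character; quoted, it matches only the three characters a, dot, b.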

getContainsAnyPattern

public static java.util.regex.Pattern getContainsAnyPattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string contains any of the given terms.

Usage:
boolean b = getContainsAnyPattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it contains any of the terms.
Since:
ostermillerutils 1.02.25
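
One way such a pattern could be built is sketched below. Because the documented usage calls matches(), which requires the whole input to match, the sketch wraps an alternation of quoted terms in leading and trailing wildcards. The construction, the use of Pattern.quote and the DOTALL flag are assumptions made for illustration, and the class name is hypothetical.

```java
import java.util.regex.Pattern;

final class ContainsAnySketch {
    // Hypothetical construction; the library's actual regular expression may differ.
    static Pattern containsAnyPattern(String[] terms) {
        // (?s) lets '.' match line terminators so multi-line input is handled.
        StringBuilder sb = new StringBuilder("(?s).*(?:");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(terms[i])); // treat each term as literal text
        }
        sb.append(").*");
        return Pattern.compile(sb.toString());
    }
}
```

A caller that checks many strings against the same terms would compile once, keep the Pattern in a field or local variable, and call matcher(s).matches() for each string.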

getEqualsAnyPattern

public static java.util.regex.Pattern getEqualsAnyPattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string equals any of the given terms.

Usage:
boolean b = getEqualsAnyPattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it equals any of the terms.
Since:
ostermillerutils 1.02.25

getStartsWithAnyPattern

public static java.util.regex.Pattern getStartsWithAnyPattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string starts with any of the given terms.

Usage:
boolean b = getStartsWithAnyPattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it starts with any of the terms.
Since:
ostermillerutils 1.02.25

getEndsWithAnyPattern

public static java.util.regex.Pattern getEndsWithAnyPattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string ends with any of the given terms.

Usage:
boolean b = getEndsWithAnyPattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it ends with any of the terms.
Since:
ostermillerutils 1.02.25
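
The equals, starts-with and ends-with patterns above plausibly differ only in where a wildcard is placed around the alternation of terms. The sketch below illustrates that idea under the assumption that each pattern is used with matches(), as in the documented usage lines; the class and method names are hypothetical and the library's actual expressions may differ.

```java
import java.util.regex.Pattern;

final class AnchoredAnySketch {
    // Builds (?:term1|term2|...) with every term quoted as literal text.
    private static String alternation(String[] terms) {
        StringBuilder sb = new StringBuilder("(?:");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(terms[i]));
        }
        return sb.append(')').toString();
    }

    static Pattern equalsAny(String[] terms) {
        return Pattern.compile(alternation(terms)); // no wildcard: whole input must be a term
    }

    static Pattern startsWithAny(String[] terms) {
        return Pattern.compile("(?s)" + alternation(terms) + ".*"); // wildcard after
    }

    static Pattern endsWithAny(String[] terms) {
        return Pattern.compile("(?s).*" + alternation(terms)); // wildcard before
    }
}
```

For instance, endsWithAny with the terms ".csv" and ".txt" matches "data.csv" but not "dataxcsv", because quoting keeps the dot literal.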

getContainsAnyIgnoreCasePattern

public static java.util.regex.Pattern getContainsAnyIgnoreCasePattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string contains any of the given terms.

Case is ignored when matching using Unicode case rules.

Usage:
boolean b = getContainsAnyIgnoreCasePattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it contains any of the terms.
Since:
ostermillerutils 1.02.25
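
In java.util.regex, case-insensitive matching under Unicode case rules is requested by combining the CASE_INSENSITIVE and UNICODE_CASE flags; CASE_INSENSITIVE alone only folds US-ASCII letters. The sketch below assumes the ignore-case patterns are compiled this way, which this page does not confirm; the class name and the pattern construction are hypothetical.

```java
import java.util.regex.Pattern;

final class IgnoreCaseSketch {
    // Hypothetical construction; flags chosen to match "Unicode case rules".
    static Pattern containsAnyIgnoreCase(String[] terms) {
        StringBuilder sb = new StringBuilder("(?s).*(?:");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(terms[i]));
        }
        sb.append(").*");
        return Pattern.compile(sb.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }
}
```

With both flags set, a term written in upper case with an accented capital E (U+00C9) matches input containing the lower-case accented e (U+00E9), in addition to plain ASCII case differences.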

getEqualsAnyIgnoreCasePattern

public static java.util.regex.Pattern getEqualsAnyIgnoreCasePattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string equals any of the given terms.

Case is ignored when matching using Unicode case rules.

Usage:
boolean b = getEqualsAnyIgnoreCasePattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it equals any of the terms.
Since:
ostermillerutils 1.02.25

getStartsWithAnyIgnoreCasePattern

public static java.util.regex.Pattern getStartsWithAnyIgnoreCasePattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string starts with any of the given terms.

Case is ignored when matching using Unicode case rules.

Usage:
boolean b = getStartsWithAnyIgnoreCasePattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it starts with any of the terms.
Since:
ostermillerutils 1.02.25

getEndsWithAnyIgnoreCasePattern

public static java.util.regex.Pattern getEndsWithAnyIgnoreCasePattern(java.lang.String[] terms)
Compile a pattern that will match a string if the string ends with any of the given terms.

Case is ignored when matching using Unicode case rules.

Usage:
boolean b = getEndsWithAnyIgnoreCasePattern(terms).matcher(s).matches();

If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

Parameters:
terms - Array of search strings.
Returns:
Compiled pattern that can be used to match a string to see if it ends with any of the terms.
Since:
ostermillerutils 1.02.25

containsAny

public static boolean containsAny(java.lang.String s,
                                  java.lang.String[] terms)
Tests to see if the given string contains any of the given terms.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may contain any of the given terms.
terms - list of substrings that may be contained in the given string.
Returns:
true iff one of the terms is a substring of the given string.
Since:
ostermillerutils 1.02.25
See Also:
getContainsAnyPattern(String[])
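
The difference between the brute-force approach and a single reusable expression can be sketched as follows. Both methods are hypothetical illustrations: the pattern here is a bare alternation used with find() rather than the library's own pattern, and terms is assumed to be non-empty.

```java
import java.util.regex.Pattern;

final class ReuseSketch {
    // Brute force: one substring search per term.
    static boolean bruteForceContainsAny(String s, String[] terms) {
        for (String term : terms) {
            if (s.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Compile the alternation once, then reuse it for every input.
    static int countContaining(String[] inputs, String[] terms) {
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(term)); // literal match for each term
        }
        Pattern pattern = Pattern.compile(sb.toString());
        int count = 0;
        for (String input : inputs) {
            if (pattern.matcher(input).find()) {
                count++;
            }
        }
        return count;
    }
}
```

Compiling inside the loop would repeat the most expensive step for every input, which is the cost the convenience methods pay when they are called repeatedly with the same terms.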

equalsAny

public static boolean equalsAny(java.lang.String s,
                                java.lang.String[] terms)
Tests to see if the given string equals any of the given terms.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may equal any of the given terms.
terms - list of strings that may equal the given string.
Returns:
true iff one of the terms is equal to the given string.
Since:
ostermillerutils 1.02.25
See Also:
getEqualsAnyPattern(String[])

startsWithAny

public static boolean startsWithAny(java.lang.String s,
                                    java.lang.String[] terms)
Tests to see if the given string starts with any of the given terms.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may start with any of the given terms.
terms - list of strings, any of which the given string may start with.
Returns:
true iff the given string starts with one of the given terms.
Since:
ostermillerutils 1.02.25
See Also:
getStartsWithAnyPattern(String[])

endsWithAny

public static boolean endsWithAny(java.lang.String s,
                                  java.lang.String[] terms)
Tests to see if the given string ends with any of the given terms.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may end with any of the given terms.
terms - list of strings, any of which the given string may end with.
Returns:
true iff the given string ends with one of the given terms.
Since:
ostermillerutils 1.02.25
See Also:
getEndsWithAnyPattern(String[])

containsAnyIgnoreCase

public static boolean containsAnyIgnoreCase(java.lang.String s,
                                            java.lang.String[] terms)
Tests to see if the given string contains any of the given terms.

Case is ignored when matching using Unicode case rules.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may contain any of the given terms.
terms - list of substrings that may be contained in the given string.
Returns:
true iff one of the terms is a substring of the given string.
Since:
ostermillerutils 1.02.25
See Also:
getContainsAnyIgnoreCasePattern(String[])

equalsAnyIgnoreCase

public static boolean equalsAnyIgnoreCase(java.lang.String s,
                                          java.lang.String[] terms)
Tests to see if the given string equals any of the given terms.

Case is ignored when matching using Unicode case rules.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may equal any of the given terms.
terms - list of strings that may equal the given string.
Returns:
true iff one of the terms is equal to the given string.
Since:
ostermillerutils 1.02.25
See Also:
getEqualsAnyIgnoreCasePattern(String[])

startsWithAnyIgnoreCase

public static boolean startsWithAnyIgnoreCase(java.lang.String s,
                                              java.lang.String[] terms)
Tests to see if the given string starts with any of the given terms.

Case is ignored when matching using Unicode case rules.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may start with any of the given terms.
terms - list of strings, any of which the given string may start with.
Returns:
true iff the given string starts with one of the given terms.
Since:
ostermillerutils 1.02.25
See Also:
getStartsWithAnyIgnoreCasePattern(String[])

endsWithAnyIgnoreCase

public static boolean endsWithAnyIgnoreCase(java.lang.String s,
                                            java.lang.String[] terms)
Tests to see if the given string ends with any of the given terms.

Case is ignored when matching using Unicode case rules.

This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

Parameters:
s - String that may end with any of the given terms.
terms - list of strings, any of which the given string may end with.
Returns:
true iff the given string ends with one of the given terms.
Since:
ostermillerutils 1.02.25
See Also:
getEndsWithAnyIgnoreCasePattern(String[])


Copyright © 2002-2004 The RVTest Team. All Rights Reserved.