Class BidiFormatter

java.lang.Object
com.google.gwt.i18n.shared.BidiFormatterBase
com.google.gwt.i18n.shared.BidiFormatter

public class BidiFormatter extends BidiFormatterBase
Utility class for formatting text for display in a potentially opposite-direction context without garbling. The direction of the context is set at formatter creation and the direction of the text can be either estimated or passed in when known. Provides the following functionality:

1. BiDi Wrapping: When text in one language is mixed into a document in another, opposite-direction language, e.g. when an English business name is embedded in a Hebrew web page, both the inserted string and the text following it may be displayed incorrectly unless the inserted string is explicitly separated from the surrounding text in a "wrapper" that declares its direction at the start and then resets it back at the end. This wrapping can be done in HTML mark-up (e.g. a 'span dir=rtl' tag) or - only in contexts where mark-up cannot be used - in Unicode BiDi formatting codes (LRE|RLE and PDF). Optionally, the mark-up can be inserted even when the direction is the same, in order to keep the DOM structure more stable. Providing such wrapping services is the basic purpose of the BiDi formatter.

2. Direction estimation: How does one know whether a string about to be inserted into surrounding text has the same direction? In many cases, one knows that this must be the case when writing the code doing the insertion, e.g. when a localized message is inserted into a localized page. In such cases there is no need to involve the BiDi formatter at all. In some other cases, the direction need not be the same as the context, but is either constant (e.g. URLs are always LTR) or otherwise known. In the remaining cases, e.g. when the string is user-entered or comes from a database, the language of the string (and thus its direction) is not known a priori, and must be estimated at run-time. The BiDi formatter can do this automatically.

3. Escaping: When wrapping plain text - i.e. text that is not already HTML or HTML-escaped - in HTML mark-up, the text must first be HTML-escaped to prevent XSS attacks and other injection problems. This of course is always true, but the escaping cannot be done after the string has already been wrapped in mark-up, so the BiDi formatter also serves as a last chance and includes escaping services.

Thus, in a single call, the formatter will escape the input string as specified, determine its direction, and wrap it as necessary. It is then up to the caller to insert the return value in the output.
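The escape, estimate, wrap sequence described above can be sketched in plain Java. This is a minimal illustration, not the formatter's implementation: the class BidiPipelineSketch, its Dir enum, and the "first strong character" estimator are all assumptions made for the example (the real estimator may use a different heuristic), and the direction reset step is left out.

```java
// Hypothetical sketch of the escape -> estimate -> wrap pipeline.
// None of these names belong to the GWT API.
final class BidiPipelineSketch {
    enum Dir { LTR, RTL, NEUTRAL }

    // Stand-in estimator (an assumption): the direction of the first
    // strongly directional character decides; no strong character means NEUTRAL.
    static Dir estimate(String s) {
        for (int cp : s.codePoints().toArray()) {
            byte d = Character.getDirectionality(cp);
            if (d == Character.DIRECTIONALITY_LEFT_TO_RIGHT) {
                return Dir.LTR;
            }
            if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return Dir.RTL;
            }
        }
        return Dir.NEUTRAL;
    }

    // Minimal HTML escaping; '&' must be replaced first.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Escapes plain text, then wraps it in a span only when its estimated
    // direction is opposite to the context direction.
    static String format(String plainText, Dir context) {
        String escaped = escapeHtml(plainText);
        Dir dir = estimate(plainText);
        if (dir == Dir.NEUTRAL || dir == context) {
            return escaped;
        }
        String attr = (dir == Dir.RTL) ? "dir=rtl" : "dir=ltr";
        return "<span " + attr + ">" + escaped + "</span>";
    }

    public static void main(String[] args) {
        System.out.println(format("Tom & Jerry", Dir.LTR));
        System.out.println(format("\u05E9\u05DC\u05D5\u05DD", Dir.LTR));
    }
}
```

Escaping happens before wrapping, which mirrors the point made in item 3: once the span mark-up has been added, the string can no longer be escaped safely.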

  • Method Details

    • getInstance

      public static BidiFormatter getInstance(boolean rtlContext)
      Factory for creating an instance of BidiFormatter given the context direction. The default behavior of spanWrap(java.lang.String) and its variations is set to avoid span wrapping unless it's necessary ('dir' attribute needs to be set).
      Parameters:
      rtlContext - Whether the context direction is RTL. In one simple use case, the context direction would simply be the locale direction, which can be retrieved using LocaleInfo.getCurrentLocale().isRTL()
    • getInstance

      public static BidiFormatter getInstance(boolean rtlContext, boolean alwaysSpan)
      Factory for creating an instance of BidiFormatter given the context direction and the desired span wrapping behavior (see below).
      Parameters:
      rtlContext - Whether the context direction is RTL. See an example of a simple use case at getInstance(boolean)
      alwaysSpan - Whether spanWrap(java.lang.String) (and its variations) should always use a 'span' tag, even when the input direction is neutral or matches the context, so that the DOM structure of the output does not depend on the combination of directions
    • getInstance

      public static BidiFormatter getInstance(HasDirection.Direction contextDir)
      Factory for creating an instance of BidiFormatter given the context direction. The default behavior of spanWrap(java.lang.String) and its variations is set to avoid span wrapping unless it's necessary ('dir' attribute needs to be set).
      Parameters:
      contextDir - The context direction. See an example of a simple use case at getInstance(boolean). Note: Direction.DEFAULT indicates unknown context direction. Try not to use it, since it is impossible to reset the direction back to the context when it is unknown
    • getInstance

      public static BidiFormatter getInstance(HasDirection.Direction contextDir, boolean alwaysSpan)
      Factory for creating an instance of BidiFormatter given the context direction and the desired span wrapping behavior (see below).
      Parameters:
      contextDir - The context direction. See an example of a simple use case at getInstance(boolean). Note: Direction.DEFAULT indicates unknown context direction. Try not to use it, since it is impossible to reset the direction back to the context when it is unknown
      alwaysSpan - Whether spanWrap(java.lang.String) (and its variations) should always use a 'span' tag, even when the input direction is neutral or matches the context, so that the DOM structure of the output does not depend on the combination of directions
    • getInstanceForCurrentLocale

      public static BidiFormatter getInstanceForCurrentLocale()
      Factory for creating an instance of BidiFormatter whose context direction matches the current locale's direction. The default behavior of spanWrap(java.lang.String) and its variations is set to avoid span wrapping unless it's necessary ('dir' attribute needs to be set).
    • getInstanceForCurrentLocale

      public static BidiFormatter getInstanceForCurrentLocale(boolean alwaysSpan)
      Factory for creating an instance of BidiFormatter whose context direction matches the current locale's direction, and given the desired span wrapping behavior (see below).
      Parameters:
      alwaysSpan - Whether spanWrap(java.lang.String) (and its variations) should always use a 'span' tag, even when the input direction is neutral or matches the context, so that the DOM structure of the output does not depend on the combination of directions
    • dirAttr

      public String dirAttr(String str)
      Like dirAttr(String, boolean), but assumes isHtml is false.
      Parameters:
      str - String whose direction is to be estimated
      Returns:
      "dir=rtl" for RTL text in non-RTL context; "dir=ltr" for LTR text in non-LTR context; else, the empty string.
    • dirAttr

      public String dirAttr(String str, boolean isHtml)
      Returns "dir=ltr" or "dir=rtl", depending on str's estimated direction, if it is not the same as the context direction. Otherwise, returns the empty string.
      Parameters:
      str - String whose direction is to be estimated
      isHtml - Whether str is HTML / HTML-escaped
      Returns:
      "dir=rtl" for RTL text in non-RTL context; "dir=ltr" for LTR text in non-LTR context; else, the empty string.
    • endEdge

      public String endEdge()
      Returns "left" for RTL context direction. Otherwise (LTR or default / unknown context direction) returns "right".
    • knownDirAttr

      public String knownDirAttr(HasDirection.Direction dir)
      Returns "dir=ltr" or "dir=rtl", depending on the given direction, if it is not the same as the context direction. Otherwise, returns the empty string.
      Parameters:
      dir - Given direction
      Returns:
      "dir=rtl" for RTL text in non-RTL context; "dir=ltr" for LTR text in non-LTR context; else, the empty string.
    • mark

      public String mark()
      Returns the Unicode BiDi mark matching the context direction (LRM for LTR context direction, RLM for RTL context direction), or the empty string for default / unknown context direction.
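As an illustration of the mapping described for mark(), the following standalone sketch returns U+200E (LRM) or U+200F (RLM) for a given context direction. The class ContextMarkSketch and its Dir enum are hypothetical stand-ins, not part of the library.

```java
// Hypothetical sketch of the context-direction-to-mark mapping.
final class ContextMarkSketch {
    enum Dir { LTR, RTL, DEFAULT }

    static final String LRM = "\u200E"; // LEFT-TO-RIGHT MARK
    static final String RLM = "\u200F"; // RIGHT-TO-LEFT MARK

    // LRM for an LTR context, RLM for an RTL context,
    // and the empty string when the context direction is unknown.
    static String mark(Dir context) {
        switch (context) {
            case LTR:
                return LRM;
            case RTL:
                return RLM;
            default:
                return "";
        }
    }

    public static void main(String[] args) {
        // Print the code point of the mark for an RTL context in hex.
        System.out.println(Integer.toHexString(mark(Dir.RTL).codePointAt(0)));
    }
}
```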
    • markAfter

      public String markAfter(String str)
      Like markAfter(String, boolean), but assumes isHtml is false.
      Parameters:
      str - String after which the mark may need to appear
      Returns:
      LRM for RTL text in LTR context; RLM for LTR text in RTL context; else, the empty string.
    • markAfter

      public String markAfter(String str, boolean isHtml)
      Returns a Unicode BiDi mark matching the context direction (LRM or RLM) if either the direction or the exit direction of str is opposite to the context direction. Otherwise returns the empty string.
      Parameters:
      str - String after which the mark may need to appear
      isHtml - Whether str is HTML / HTML-escaped
      Returns:
      LRM for RTL text in LTR context; RLM for LTR text in RTL context; else, the empty string.
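The rule above checks two things: the overall direction of str and its exit direction (the direction of its last strongly directional character). A rough sketch of that decision follows. The class MarkAfterSketch, the Dir enum, and the majority-count estimator are assumptions for illustration only, and the isHtml case (skipping tags and entities while scanning) is not modeled.

```java
// Hypothetical sketch of the markAfter decision; not the library's code.
final class MarkAfterSketch {
    enum Dir { LTR, RTL, DEFAULT }

    static final String LRM = "\u200E";
    static final String RLM = "\u200F";

    // Classifies one code point as strongly LTR, strongly RTL, or neither.
    static Dir strongDir(int cp) {
        byte d = Character.getDirectionality(cp);
        if (d == Character.DIRECTIONALITY_LEFT_TO_RIGHT) {
            return Dir.LTR;
        }
        if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
            return Dir.RTL;
        }
        return Dir.DEFAULT;
    }

    // Stand-in estimator (an assumption): majority of strong characters.
    static Dir overallDir(String s) {
        int ltr = 0;
        int rtl = 0;
        for (int cp : s.codePoints().toArray()) {
            Dir d = strongDir(cp);
            if (d == Dir.LTR) {
                ltr++;
            } else if (d == Dir.RTL) {
                rtl++;
            }
        }
        if (ltr == rtl) {
            return Dir.DEFAULT;
        }
        return ltr > rtl ? Dir.LTR : Dir.RTL;
    }

    // Exit direction: direction of the last strongly directional character.
    static Dir exitDir(String s) {
        int[] cps = s.codePoints().toArray();
        for (int i = cps.length - 1; i >= 0; i--) {
            Dir d = strongDir(cps[i]);
            if (d != Dir.DEFAULT) {
                return d;
            }
        }
        return Dir.DEFAULT;
    }

    // Returns the mark matching the context direction when either the overall
    // or the exit direction of s is opposite to the context; else "".
    static String markAfter(Dir context, String s) {
        Dir overall = overallDir(s);
        Dir exit = exitDir(s);
        if (context == Dir.LTR && (overall == Dir.RTL || exit == Dir.RTL)) {
            return LRM;
        }
        if (context == Dir.RTL && (overall == Dir.LTR || exit == Dir.LTR)) {
            return RLM;
        }
        return "";
    }

    public static void main(String[] args) {
        // Mostly LTR text that ends in a Hebrew letter still needs a mark.
        System.out.println(markAfter(Dir.LTR, "abc \u05D0").equals(LRM));
    }
}
```

The exit-direction check matters for strings such as "abc" followed by a Hebrew letter: the overall direction is LTR, but the trailing RTL character could still attach itself to whatever follows in an LTR context.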
    • spanWrap

      public String spanWrap(String str)
      Like spanWrap(String, boolean, boolean), but assumes isHtml is false and dirReset is true.
      Parameters:
      str - The input string
      Returns:
      Input string after applying the above processing.
    • spanWrap

      public String spanWrap(String str, boolean isHtml)
      Like spanWrap(String, boolean, boolean), but assumes dirReset is true.
      Parameters:
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      Returns:
      Input string after applying the above processing.
    • spanWrap

      public String spanWrap(String str, boolean isHtml, boolean dirReset)
      Formats a string of unknown direction for use in HTML output of the context direction, so an opposite-direction string is neither garbled nor garbles what follows it.

      The algorithm: estimates the direction of input argument str. In case its direction doesn't match the context direction, wraps it with a 'span' tag and adds a "dir" attribute (either 'dir=rtl' or 'dir=ltr').

      If the formatter was created with alwaysSpan set to true, the input is always wrapped with 'span', skipping just the dir attribute when it's not needed.

      If dirReset, and if the overall direction or the exit direction of str is opposite to the context direction, a trailing Unicode BiDi mark matching the context direction is appended (LRM or RLM).

      If !isHtml, HTML-escapes str regardless of wrapping.

      Parameters:
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      dirReset - Whether to append a trailing Unicode BiDi mark matching the context direction, when needed, to prevent the possible garbling of whatever may follow str
      Returns:
      Input string after applying the above processing.
    • spanWrapWithKnownDir

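      public String spanWrapWithKnownDir(HasDirection.Direction dir, String str)
      Like spanWrapWithKnownDir(HasDirection.Direction, String, boolean, boolean), but assumes isHtml is false and dirReset is true.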
      Parameters:
      dir - str's direction
      str - The input string
      Returns:
      Input string after applying the above processing.
    • spanWrapWithKnownDir

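      public String spanWrapWithKnownDir(HasDirection.Direction dir, String str, boolean isHtml)
      Like spanWrapWithKnownDir(HasDirection.Direction, String, boolean, boolean), but assumes dirReset is true.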
      Parameters:
      dir - str's direction
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      Returns:
      Input string after applying the above processing.
    • spanWrapWithKnownDir

      public String spanWrapWithKnownDir(HasDirection.Direction dir, String str, boolean isHtml, boolean dirReset)
      Formats a string of given direction for use in HTML output of the context direction, so an opposite-direction string is neither garbled nor garbles what follows it.

      The algorithm: uses the given direction dir rather than estimating it. In case dir doesn't match the context direction, wraps str with a 'span' tag and adds a "dir" attribute (either 'dir=rtl' or 'dir=ltr').

      If the formatter was created with alwaysSpan set to true, the input is always wrapped with 'span', skipping just the dir attribute when it's not needed.

      If dirReset, and if the overall direction or the exit direction of str is opposite to the context direction, a trailing Unicode BiDi mark matching the context direction is appended (LRM or RLM).

      If !isHtml, HTML-escapes str regardless of wrapping.

      Parameters:
      dir - str's direction
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      dirReset - Whether to append a trailing Unicode BiDi mark matching the context direction, when needed, to prevent the possible garbling of whatever may follow str
      Returns:
      Input string after applying the above processing.
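The interaction of alwaysSpan, isHtml and dirReset can be sketched as follows. SpanWrapSketch and its Dir enum are hypothetical names, the exact output shape is an assumption based on the description above, and the exit-direction part of the dirReset check is omitted for brevity (only the given direction is compared with the context).

```java
// Hypothetical sketch of span wrapping with a known direction.
final class SpanWrapSketch {
    enum Dir { LTR, RTL, DEFAULT }

    static final String LRM = "\u200E";
    static final String RLM = "\u200F";

    private final Dir context;
    private final boolean alwaysSpan;

    SpanWrapSketch(Dir context, boolean alwaysSpan) {
        this.context = context;
        this.alwaysSpan = alwaysSpan;
    }

    // Minimal HTML escaping; '&' must be replaced first.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    String spanWrapWithKnownDir(Dir dir, String str, boolean isHtml, boolean dirReset) {
        // Plain text is escaped regardless of whether it ends up wrapped.
        String body = isHtml ? str : escapeHtml(str);
        boolean needsDirAttr = dir != Dir.DEFAULT && dir != context;

        StringBuilder out = new StringBuilder();
        if (alwaysSpan || needsDirAttr) {
            out.append("<span");
            if (needsDirAttr) {
                out.append(dir == Dir.RTL ? " dir=rtl" : " dir=ltr");
            }
            out.append('>').append(body).append("</span>");
        } else {
            out.append(body);
        }

        // Simplified reset: only the given direction is checked here.
        if (dirReset) {
            if (context == Dir.LTR && dir == Dir.RTL) {
                out.append(LRM);
            } else if (context == Dir.RTL && dir == Dir.LTR) {
                out.append(RLM);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SpanWrapSketch f = new SpanWrapSketch(Dir.LTR, true);
        System.out.println(f.spanWrapWithKnownDir(Dir.LTR, "a < b", false, true));
    }
}
```

With alwaysSpan enabled, same-direction input still gets a bare span, so the DOM structure of the output does not vary with the combination of directions.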
    • startEdge

      public String startEdge()
      Returns "right" for RTL context direction. Otherwise (LTR or default / unknown context direction) returns "left".
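A typical use of startEdge() and endEdge() is building direction-aware CSS property names. The sketch below reproduces the documented mapping in standalone form; EdgeSketch, its Dir enum, and the startPadding helper are hypothetical and only illustrate the idea.

```java
// Hypothetical sketch of the start/end edge mapping and one use for it.
final class EdgeSketch {
    enum Dir { LTR, RTL, DEFAULT }

    // "right" for an RTL context; "left" for LTR or unknown context.
    static String startEdge(Dir context) {
        return context == Dir.RTL ? "right" : "left";
    }

    // "left" for an RTL context; "right" for LTR or unknown context.
    static String endEdge(Dir context) {
        return context == Dir.RTL ? "left" : "right";
    }

    // Builds a CSS declaration that pads the edge where text starts.
    static String startPadding(Dir context, int px) {
        return "padding-" + startEdge(context) + ": " + px + "px;";
    }

    public static void main(String[] args) {
        System.out.println(startPadding(Dir.RTL, 8));
    }
}
```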
    • unicodeWrap

      public String unicodeWrap(String str)
      Like unicodeWrap(String, boolean, boolean), but assumes isHtml is false and dirReset is true.
      Parameters:
      str - The input string
      Returns:
      Input string after applying the above processing.
    • unicodeWrap

      public String unicodeWrap(String str, boolean isHtml)
      Like unicodeWrap(String, boolean, boolean), but assumes dirReset is true.
      Parameters:
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      Returns:
      Input string after applying the above processing.
    • unicodeWrap

      public String unicodeWrap(String str, boolean isHtml, boolean dirReset)
      Formats a string of unknown direction for use in plain-text output of the context direction, so an opposite-direction string is neither garbled nor garbles what follows it. As opposed to spanWrap(java.lang.String), this makes use of Unicode BiDi formatting characters. In HTML, its *only* valid use is inside of elements that do not allow mark-up, e.g. an 'option' tag.

      The algorithm: estimates the direction of input argument str. In case it doesn't match the context direction, wraps it with Unicode BiDi formatting characters: RLE+str+PDF for RTL text, or LRE+str+PDF for LTR text.

      If dirReset, and if the overall direction or the exit direction of str is opposite to the context direction, a trailing Unicode BiDi mark matching the context direction is appended (LRM or RLM).

      Does *not* do HTML-escaping regardless of the value of isHtml.

      Parameters:
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      dirReset - Whether to append a trailing Unicode BiDi mark matching the context direction, when needed, to prevent the possible garbling of whatever may follow str
      Returns:
      Input string after applying the above processing.
    • unicodeWrapWithKnownDir

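      public String unicodeWrapWithKnownDir(HasDirection.Direction dir, String str)
      Like unicodeWrapWithKnownDir(HasDirection.Direction, String, boolean, boolean), but assumes isHtml is false and dirReset is true.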
      Parameters:
      dir - str's direction
      str - The input string
      Returns:
      Input string after applying the above processing.
    • unicodeWrapWithKnownDir

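      public String unicodeWrapWithKnownDir(HasDirection.Direction dir, String str, boolean isHtml)
      Like unicodeWrapWithKnownDir(HasDirection.Direction, String, boolean, boolean), but assumes dirReset is true.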
      Parameters:
      dir - str's direction
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      Returns:
      Input string after applying the above processing.
    • unicodeWrapWithKnownDir

      public String unicodeWrapWithKnownDir(HasDirection.Direction dir, String str, boolean isHtml, boolean dirReset)
      Formats a string of given direction for use in plain-text output of the context direction, so an opposite-direction string is neither garbled nor garbles what follows it. As opposed to spanWrapWithKnownDir(com.google.gwt.i18n.client.HasDirection.Direction, java.lang.String), this makes use of Unicode BiDi formatting characters. In HTML, its *only* valid use is inside of elements that do not allow mark-up, e.g. an 'option' tag.

      The algorithm: uses the given direction dir rather than estimating it. In case dir doesn't match the context direction, wraps str with Unicode BiDi formatting characters: RLE+str+PDF for RTL text, or LRE+str+PDF for LTR text.

      If dirReset, and if the overall direction or the exit direction of str is opposite to the context direction, a trailing Unicode BiDi mark matching the context direction is appended (LRM or RLM).

      Does *not* do HTML-escaping regardless of the value of isHtml.

      Parameters:
      dir - str's direction
      str - The input string
      isHtml - Whether str is HTML / HTML-escaped
      dirReset - Whether to append a trailing Unicode BiDi mark matching the context direction, when needed, to prevent the possible garbling of whatever may follow str
      Returns:
      Input string after applying the above processing.
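The Unicode wrapping described above uses the embedding characters LRE (U+202A), RLE (U+202B) and PDF (U+202C). A minimal standalone sketch follows, assuming a known direction. UnicodeWrapSketch and its Dir enum are hypothetical, the context direction is passed as a parameter here rather than held by a formatter instance, and the exit-direction part of the dirReset check is omitted.

```java
// Hypothetical sketch of wrapping with Unicode BiDi formatting characters.
final class UnicodeWrapSketch {
    enum Dir { LTR, RTL, DEFAULT }

    static final String LRE = "\u202A"; // LEFT-TO-RIGHT EMBEDDING
    static final String RLE = "\u202B"; // RIGHT-TO-LEFT EMBEDDING
    static final String PDF = "\u202C"; // POP DIRECTIONAL FORMATTING
    static final String LRM = "\u200E"; // LEFT-TO-RIGHT MARK
    static final String RLM = "\u200F"; // RIGHT-TO-LEFT MARK

    static String unicodeWrapWithKnownDir(Dir context, Dir dir, String str, boolean dirReset) {
        StringBuilder out = new StringBuilder();
        if (dir != Dir.DEFAULT && dir != context) {
            // Opposite-direction text is embedded: RLE or LRE, the text, then PDF.
            out.append(dir == Dir.RTL ? RLE : LRE).append(str).append(PDF);
        } else {
            out.append(str);
        }
        // Simplified reset: only the given direction is checked here.
        if (dirReset) {
            if (context == Dir.LTR && dir == Dir.RTL) {
                out.append(LRM);
            } else if (context == Dir.RTL && dir == Dir.LTR) {
                out.append(RLM);
            }
        }
        // No HTML escaping is applied, matching the description above.
        return out.toString();
    }

    public static void main(String[] args) {
        String wrapped = unicodeWrapWithKnownDir(Dir.RTL, Dir.LTR, "abc", true);
        // Print the code points so the invisible characters can be inspected.
        wrapped.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```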