
Use soft hyphens to properly display text

The soft hyphen feature of HTML is available to ensure the proper presentation of text on a Web page. Tony Patton describes how to use this rather obscure HTML feature.

Firefox 3 Release Candidate 1 has many Web developers excited about the prospect of a final release in the foreseeable future. An interesting HTML feature to be supported is the soft hyphen, which is used to designate where a word may be broken for display. I didn't know this existed as part of Web standards until the recent information regarding Firefox.

What is a soft hyphen?

A soft hyphen lets the system know where a word may be broken, if needed, for display purposes. The HTML standard states that a hyphen character should be displayed at the end of the line where the break occurs if a line is broken at a soft hyphen. On the other hand, nothing is displayed if the line is not broken at a soft hyphen.

The key to the soft hyphen is its dynamic nature: it is displayed only when necessary, that is, when the text wraps at the point where the soft hyphen is defined. When no break occurs, the soft hyphen is not rendered at all. The HTML 4 specification also says that it should be ignored in operations such as searching and sorting, so it should not prevent a search from matching the word.

Presentation

Soft hyphens give you some control over how text is presented on a Web page. They aid the justification of text by specifying where a word may be split across two lines, and they are one of the more obscure HTML features for making sure text is properly displayed.

You have no control over a user's screen resolution or how the user may resize the browser window, but you can mark up a word so that it breaks at a sensible point if and when it is split across two lines.

Standards

Soft hyphens are not new to Web standards; they are included in the HTML 4 specification. In addition, the ISO Latin 1 character code (also known as ISO 8859-1) contains a character named soft hyphen, abbreviated SHY. It is defined as a graphic character that is imaged by a graphic symbol identical with or similar to that representing hyphen for use when a line break is established within a word.

The ECMA-94 standard includes the soft hyphen as well. It defines it as a graphic character that is imaged by a graphic symbol identical with or similar to that representing hyphen for use when a line break is permitted in the text as presented.

It is interesting to peruse these standards and discover that the soft hyphen has been supported (in theory) for many years; however, as most Web developers know, reality does not always agree. For instance, specifications are not always followed. This was true with soft hyphen support in Firefox before the current release candidate. With Firefox on board, all major browsers including Internet Explorer, Safari, and Opera will support soft hyphens. (Note: The support of soft hyphens in the latest version of Firefox doesn't seem to extend to text within HTML tables, but this may be corrected before its final release.)

Syntax

The soft hyphen is inserted like any other special character in a Web page: the entity begins with an ampersand (&) and ends with a semicolon (;). The syntax for inserting a soft hyphen is the ampersand, followed by the word shy, followed by the semicolon (&shy;). You may also use the numeric equivalent (&#173; in decimal or &#xAD; in hexadecimal).
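As a minimal sketch, the fragment below marks the same break point in one word using the named entity and the two numeric character references. The sample word and the paragraphs are illustrative assumptions, not taken from a real page; all three spellings refer to the same character, U+00AD.

```html
<!-- Illustrative fragment: each form inserts the same soft hyphen (U+00AD). -->
<p>hyphen&shy;ation</p>    <!-- named entity -->
<p>hyphen&#173;ation</p>   <!-- decimal numeric reference -->
<p>hyphen&#xAD;ation</p>   <!-- hexadecimal numeric reference -->
```

The named entity is the easiest to read in source, which is probably why it is the form most often seen in practice.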

I have seen some developers utilize the <br /> element to force a break in the middle of a word, but this does not scale to different resolutions, so the break is present regardless of whether it is necessary.

The following sample Web page uses a soft hyphen within the paragraph on the page. Where it wraps depends on your screen resolution, although it was written to wrap at a resolution of 800x600. You can resize the browser window until the text wraps at the word with the soft hyphen (background) to see the hyphen inserted at the designated position.

<html><head>

<title>Soft Hyphens</title>

</head><body>

<!-- &shy; marks the only point where the word "background" may be hyphenated -->

<p>This is a paragraph of text that will wrap if and when you resize the screen to make the text wrap. The word <span style="color:red;">back&shy;ground</span> has a soft hyphen inserted for proper display, so watch and marvel at the power of the soft hyphen.</p>

</body></html>

The wrapped text appears like this.

This is a paragraph of text that will wrap if and when you resize the screen to make the text wrap. The word back-
ground has a soft hyphen inserted for proper display, so watch and marvel at the power of the soft hyphen.

The hyphen appears at the edge of the first line when displayed in the browser.
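To see the effect without resizing the window, one option is to place the text in a deliberately narrow container. The sketch below is an assumption-laden variant of the sample page: the 6em width and the long sample word are arbitrary choices for illustration, and where the browser actually breaks the word depends on the font and the rendering engine.

```html
<html><head>
<title>Soft Hyphens in a Narrow Column</title>
</head><body>
<!-- The 6em width is an arbitrary assumption, intended to be narrower than the word. -->
<div style="width: 6em; border: 1px solid gray;">
  <!-- Each &shy; is a permitted break point; a hyphen is shown only where a break occurs. -->
  <p>inter&shy;nation&shy;al&shy;iza&shy;tion</p>
</div>
</body></html>
```

Changing the width value is a quick way to watch the break move from one soft hyphen to another.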

Caveats

An interesting (or problematic, depending upon your view) aspect of working with hyphens in Internet Explorer is that it treats any hyphen as a potential word break. For instance, Internet Explorer will break the following text at the hyphen when/if the text wraps at that point.

This is a paragraph of text that will wrap if and when you resize the screen to make the text wrap. The word double-cross contains a hyphen that IE sees as a candidate for a breaking point to wrap text across lines.

You can override this behavior through the use of the non-standard no break tag <nobr>. You enclose the word containing the hyphen within the nobr opening and closing tags. The following paragraph shows its usage.

This is a paragraph of text that will not wrap if and when you resize the screen to force it. The word <nobr>double-cross</nobr> contains a hyphen that IE will not wrap due to its inclusion in the nobr element.

The nobr element is non-standard. It was invented by Netscape, but it is widely supported.
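For pages that need to validate, a standards-based alternative is the CSS white-space property with the value nowrap, which prevents line breaks inside the element it is applied to. The sketch below shows one way to apply it; the class name nobreak is a hypothetical name chosen for this example.

```html
<html><head>
<title>Keeping a Hyphenated Word Together</title>
<style type="text/css">
/* "nobreak" is a hypothetical class name; white-space: nowrap is standard CSS. */
.nobreak { white-space: nowrap; }
</style>
</head><body>
<p>The word <span class="nobreak">double-cross</span> contains a hyphen, but the span keeps it on one line.</p>
</body></html>
```

Using a class keeps the markup valid and lets the same rule be reused for other strings, such as an ISBN, that should never be split.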

Moving ahead

Even though soft hyphens are a small HTML feature, it is nice to see support for them in all major browsers. The next step is ironing out the differences in CSS and so forth.

Have you used soft hyphens in your Web pages? What other features of the HTML specification do you want embraced across the industry? Share your thoughts with the Web Developer community.


Tony Patton began his professional career as an application developer earning Java, VB, Lotus, and XML certifications to bolster his knowledge.


About

Tony Patton has worn many hats over his 15+ years in the IT industry while witnessing many technologies come and go. He currently focuses on .NET and Web Development while trying to grasp the many facets of supporting such technologies in a productio...

10 comments
joeclark

First of all, the soft-hyphen character is extremely problematic from the standpoint of character definitions; check the extensive discussion in _Unicode Explained_ by Korpela. You would conclude from the foregoing that it's also a Unicode character; you should use that, not a character entity. In any event, the Web is not desktop publishing. Hyphenation should be handled by the display engine (tantamount to automatic hyphenation in Quark or InDesign). It is not your job to load up your words with characters that might or might not display properly just to equalize the rag of a block of text. H&J is more valuable in full-justified text (a no-no on the Web), so why are we even talking about it?

MaryWeilage

Our publishing tool (WordPress) wiped out some of the syntax in the first paragraph of the Syntax section. We are working on resolving this issue. I apologize for the inconvenience. Thank you, Mary Weilage

bnb

while in general, hyphenation should be avoided in web pages, sometimes it's necessary, and the browsers i have experience with do a generally lousy job. (even really competent composition programs are unable to consistently hyphenate correctly in all circumstances. consider the different hyphenation between the verb produce and the noun produce; this needs either a competent lexical analyzer or human attention. more important to me, actually, is <nobr>, to suppress hyphenation in, for example, an isbn, where hyphens are part of the conventional presentation. it drives me mad when an (x)html page validates perfectly except for <nobr> errors. i keep hoping this gets recognized in the standard.) if one anticipates a narrow window, and the text contains long technical words, an occasional discretionary hyphen is a good thing. i guess it depends on what sort of content one must work with.

Justin James

... who author HTML, and they are extremely concerned about the precision of formatting, display, etc. These are the folks who use a 1px "spacer" gif in a thousand places to force things to look "just so". I think these are the folks to whom the soft hyphen appeals. While I personally do not agree with that school of thought, and I do not foresee myself using the soft hyphen anywhere, I know that some clients are often quite demanding in some very odd ways, and I *have* been asked to do hyphenation in the past... J.Ja

CreepinJesus

I personally hate text that has loads of hyphens on the edges just so it's justified. I much prefer a normal left-align, where if a word doesn't fit on the line, it starts a new line.

markm

In the 80's I worked with an enterprise word processing/document management system called Atex. The keyboard had a special character called a "discretionary hyphen." That got condensed into "dishy," pronounced "Dish-eeee."

Justin James

Mary - I was wondering about that, and about to email you, thanks for the update! J.Ja

aspatton

The syntax for using soft hyphens is the ampersand (&) followed by shy;

Justin James

If you need valid (X)HTML and an analogue to nobr, use the "white-space" CSS property (I think "nowrap" is the value you want). :) .isbn { white-space: nowrap; } ... should do it! J.Ja

pivert

Those people also freak out when you show their site and start increasing/decreasing the font size or have the site read aloud for the blind. How sexy when you hear "image spacer dot gif".
