
HTML entities

By apotheon
I've noticed something that needs fixing:

When HTML entities are used in a message, they don't register properly when the message is first posted. For instance, if you type &mdash; in a message and post the message, all that shows up is the raw code &mdash;, not an em dash. On the other hand, if you then open the message for editing and repost it, all character entities that previously existed are now properly rendered, but any new character entities are still shown as raw code. This means that for the same code, you'll now have an em dash where previously it said &mdash;, but any new &mdash; still looks like &mdash;.

What seems to be happening is that character entities are only being rendered as the characters they are meant to represent when the message is opened using the edit button. Since they are rendered the way they're meant to be in the text field, that means that they stay that way when the edited message is reposted. New HTML character entities, of course, have not yet been translated by this process.
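To illustrate the suspected behavior, here is a minimal Python sketch. It is only a guess at the mechanism, not the site's actual code: `render_for_display` and `load_into_edit_box` are hypothetical names, and the sketch assumes the display path escapes ampersands while the edit form lets the browser decode entities.

```python
import html

def render_for_display(stored: str) -> str:
    # Assumed display path: escape "&", "<" and ">" so the text shows literally.
    return html.escape(stored, quote=False)

def load_into_edit_box(stored: str) -> str:
    # Assumed edit path: the stored text lands in the form unescaped, so the
    # browser decodes entities; html.unescape stands in for that decoding.
    return html.unescape(stored)

first_post = "one &mdash; two"
# The ampersand is escaped, so the reader sees the raw code "&mdash;".
print(render_for_display(first_post))

# Editing decodes the old entity into a real character; the newly typed
# entity is still raw text and gets escaped again on display.
edited = load_into_edit_box(first_post) + " &mdash; three"
print(render_for_display(edited))
```

If the site works roughly like this, the old entity survives as a real character after an edit, while every newly typed entity is escaped again, which matches the symptoms described.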

I have a tendency to use some character entities in online discussion. I've noticed that some other people on this site do as well, and that sometimes they don't notice that their character entities were not properly rendered (or, being lazy, don't care). I'd like to see all character entities properly rendered on the first pass, if that could be made to happen.


Flaw is the opposite - I think

by house In reply to HTML entities

They've opened up the doors to a bit of html. All I knew was that formatting was available.

I think that you discovered the flaw regarding "allowing that to happen" rather than the "not being able to render it" part.

house - ©2004 Eh?

please clarify

by apotheon In reply to Flaw is the opposite - I ...

I'm not sure what you mean. Are you saying that some HTML is being allowed but not all of it, and that HTML entities are meant to be blocked?

It seems to me that HTML entities should be the first thing allowed, because A) there would have to be code written specifically to stop HTML character entities from rendering properly and B) HTML character entities won't screw with page formatting the way HTML tags can.

I'm guessing this site is run using a MySQL database back-end. Assuming that's the case, it seems the low-overhead way to do that would be to store code AS ENTERED in the database, possibly parsing it for stuff to edit out if you don't trust the people posting to be responsible and competent with their HTML tags (and I don't recommend trusting everyone with all HTML tags by any stretch).
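As a rough sketch of that "store as entered, then parse out what you don't trust" idea, the Python below filters tags against an allowlist and leaves character entities alone. The `ALLOWED_TAGS` set and the `strip_disallowed_tags` name are made up for illustration, and a regex like this is nowhere near a complete HTML sanitizer (it can be fooled by a `>` inside an attribute value, for one).

```python
import re

# Hypothetical allowlist; a real site would choose its own.
ALLOWED_TAGS = {"b", "i", "u"}

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)\b[^>]*>")

def strip_disallowed_tags(text: str) -> str:
    def keep_or_drop(match: re.Match) -> str:
        name = match.group(1).lower()
        if name not in ALLOWED_TAGS:
            return ""  # drop the tag, keep the text around it
        # Re-emit the tag without attributes so nothing rides along.
        slash = "/" if match.group(0).startswith("</") else ""
        return f"<{slash}{name}>"
    return TAG_RE.sub(keep_or_drop, text)

post = '<b onclick="x">hi</b> <script>alert(1)</script> &mdash;'
print(strip_disallowed_tags(post))
```

Note that the entity at the end passes through untouched: nothing in a tag filter needs to know that entities exist at all.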

For everything that has to be edited out by scripts, however, code has to be written. Granted, that code isn't necessarily much (particularly if you're using Perl with regexes), but trying to prevent rendering of character entities with a script would be one of the more annoying tasks to tackle, and prone to failure without putting a heckuva lot of time into it. Unfortunately, after all that, it seems that the net result would be exactly what we've now got (most of the time): occasional messages appearing with code showing rather than the characters the poster intended.
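For comparison, here is roughly what deliberately blocking entities would take, sketched in Python rather than Perl. The pattern and the `neutralize_entities` name are assumptions for illustration; the pattern covers named, decimal and hexadecimal references, and anything it misses would still render.

```python
import re

# Named (&mdash;), decimal (&#8212;) and hexadecimal (&#x2014;) references.
ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")

def neutralize_entities(text: str) -> str:
    # Escape only the leading ampersand so the browser shows the raw code.
    return ENTITY_RE.sub(lambda m: "&amp;" + m.group(0)[1:], text)

print(neutralize_entities("a &mdash; b &#8212; c &#x2014; d & e"))
```

A lone ampersand such as the one in "d & e" is left as-is here, which is one example of the edge cases a blocking script would have to think through.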

Maybe that's just me, though. I suppose there might be other reasons to run things the way they are, of which I'm not aware. All things considered, though, I'd like to be able to insert emdashes, and perhaps the occasional nonbreaking space, by way of HTML character entities.

Clarification - of something I don't fully understand

by house In reply to please clarify

I don't claim to know much about the way that this site is run. All I know is that there was no functionality before beyond plain text. I've only recently discovered that formatting was allowed here.

Keep in mind too that I am a long way from the web development and programming aspect of the IT field. I am more of a "Network Engineer" and a "Microsoft products" end user support type guy.

From my understanding, TR added the smiley and formatting functions just recently while solving a historic issue regarding hyperlinks. Perhaps the formatting was always there through trickery like you have just discovered.

http://tinyurl.com/4w4pd

house

yeah . . .

by apotheon In reply to Clarification - of someth ...

I saw that post you linked, but I have yet to see HTML for formatting work here. I even experimented with using underline tags on something, and it didn't work. I haven't been able to figure out how they're doing it. Maybe they've enabled CSS. I dunno.

Anyone? An idea?

HTML character entities are strings of characters starting with an ampersand and ending with a semicolon in HTML that are used to produce nonstandard characters when a page is parsed by a browser, such as the emdash, endash, n-with-a-tilde, accented characters, umlauted characters, non-breaking spaces, and so on.
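For a concrete look at a few of these, the short Python sketch below decodes some named entities with the standard library's `html.unescape` and prints the code point each one stands for. The list of samples is just an illustrative selection.

```python
import html

# A handful of named entities of the kinds mentioned above.
samples = ["&mdash;", "&ndash;", "&ntilde;", "&eacute;", "&uuml;", "&nbsp;"]

for entity in samples:
    char = html.unescape(entity)  # decode the entity to a single character
    print(f"{entity:10} U+{ord(char):04X}")
```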
