Special Character Remover

By coopcaborojo

I'm working on a migration from COBOL (85) to Oracle. I convert the COBOL data to ASCII, but I'm having problems with special characters in the original data. I was thinking about writing a special-character remover routine in COBOL, but I know it's going to be a pain... Does anyone know a "quick and dirty" way to remove special characters from a text file?



"Strip" from MaresWare

by gralfus In reply to Special Character Remover

This program strips out all unprintable characters:

They have many other nifty programs as well.


by coopcaborojo In reply to "Strip" from MaresWare

I downloaded the program but it doesn't work... When I double-click it, a black window appears and then disappears...


You've been using strings to store binary data

by Tony Hopkinson In reply to Special Character Remover

haven't you!

Come on, admit it!



No I haven't...

by coopcaborojo In reply to You've been using strings ...

Sometimes during data entry, users press keys like /F1/ or /Page Up/, which the ACCEPT statement treats as "trash data" (special characters), and the application stores them in the PIC X fields...


How are they stored?

by klaasvanbe In reply to No I haven't...

Are those characters stored as ASCII or another character set e.g. EBCDIC?
In other words: What's the platform?


COBOL and Text

by LuckyLeatherneck In reply to Special Character Remover

First, define the full text area, like PIC X(132), and redefine it as PIC X(001) OCCURS 132 TIMES.

Then, in your procedure, move your data into the full text field.

Now, within a PERFORM, index through the single-byte occurrences, applying the IF NUMERIC and IF ALPHABETIC class tests to remove anything that's neither numeric (EBCDIC F0 - F9) nor alphabetic (EBCDIC C1 - E9). All that is left is the good text, unless you also want to remove the plus signs, minus signs, etc., which you can easily look for with IF '+' and so on.
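
For readers who would rather prototype the same byte-by-byte check outside COBOL, here is a minimal Python sketch of the idea. The function name `keep_alnum` and the `fill` parameter are made up for illustration, and the sketch assumes the data has already been converted to ASCII. Spaces are kept because the COBOL ALPHABETIC class test also accepts them, and the default replaces rejects with a space so a fixed-width field keeps its length.

```python
def keep_alnum(field: str, fill: str = " ") -> str:
    """Walk the field one character at a time, like indexing an OCCURS table.

    Letters, digits and spaces are kept; everything else becomes `fill`
    (pass fill="" to drop the character instead of blanking it).
    """
    cleaned = []
    for ch in field:
        is_letter_or_digit = ch.isascii() and (ch.isalpha() or ch.isdigit())
        if is_letter_or_digit or ch == " ":
            cleaned.append(ch)
        else:
            cleaned.append(fill)
    return "".join(cleaned)
```

Calling `keep_alnum(record)` blanks the stray bytes in place, while `keep_alnum(record, fill="")` shortens the field, which only makes sense if the target layout is not fixed-width.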

If you are using SPF as the editor for your program, there are ways to have it search your program or text document for non-display characters and automatically replace them with whatever you want -- usually a space.

By the way, EBCDIC to ASCII conversion is only good for display data -- not packed.
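
To illustrate why packed fields cannot go through a character conversion, here is a small Python sketch. It assumes EBCDIC code page 037 (the thread never says which platform or code page is in use) and the usual COMP-3 layout of two digits per byte with the sign in the last nibble. The helper name `unpack_comp3` is hypothetical.

```python
# Display data: each byte is one character, so a code page decode works.
display_bytes = bytes([0xC1, 0xC2, 0xC3, 0xF1, 0xF2, 0xF3])
text = display_bytes.decode("cp037")  # assumes code page 037


def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3) field into an integer.

    Every nibble is a digit except the final one, which is the sign:
    0xD means negative, 0xC or 0xF means positive/unsigned.
    """
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    return -value if sign_nibble == 0x0D else value
```

Packed bytes such as `0x12 0x3C` hold the number +123. Running them through the character decode would treat each byte as a letter or control code, so packed fields need to be unpacked from the raw bytes before any text conversion.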

let me know if this helps.



I will try...

by coopcaborojo In reply to COBOL and Text

I'll let you know...

Thank you for your help!


In Oracle ...

by johnnymegabyte In reply to I will try...

In Oracle you can create a series of variables, each assigned the character code of one of the special characters.
Example: TAB = chr(9)
X_TAB char(1) := chr(9);

Then, use the REPLACE function on your input data string to change the special characters to a blank or null.

You may want to use INSTR to check whether a special character exists first, then perform the REPLACE in a LOOP and work character by character.
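
The same check-then-replace idea can be sketched in Python as a quick way to test it on a sample before writing the PL/SQL. This is not Oracle code. The names below are illustrative, and the list of special characters is an assumption, since the thread only names TAB.

```python
# One named constant per special character, mirroring X_TAB := chr(9).
TAB = chr(9)
FORM_FEED = chr(12)   # assumed example, not named in the thread
ESCAPE = chr(27)      # assumed example, not named in the thread
SPECIAL_CHARS = (TAB, FORM_FEED, ESCAPE)


def strip_special(value: str, replacement: str = "") -> str:
    """Replace each listed special character with `replacement`."""
    for special in SPECIAL_CHARS:
        if special in value:  # plays the role of the INSTR check
            value = value.replace(special, replacement)
    return value
```

Passing `replacement=" "` corresponds to replacing with a blank, and the default empty string corresponds to replacing with null.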

That's how I would do it. And I'm sure there are more ways.


Any language, but Perl will be the quickest

by Justin James Contributor In reply to Special Character Remover

Any language can do this, but Perl will probably get you there quickest. Just use a readline() loop to read the file line by line, and use a regex to retain only the characters you want (the /g flag makes it remove every unwanted character on the line, not just the first):

$line =~ s/[^characters to keep in here]//g;

Then print $line out to a new file, or to the screen and redirect the output.
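
For comparison, the same line-by-line filter looks like this in Python. It is only a sketch: the character class keeps printable ASCII plus the newline, which is an assumption about what counts as "characters to keep", and the function names are made up.

```python
import re

# Anything that is not printable ASCII (space through ~) or a newline.
UNWANTED = re.compile(r"[^\x20-\x7E\n]")


def clean_line(line: str) -> str:
    """Remove every unwanted character from one line."""
    return UNWANTED.sub("", line)


def clean_stream(source, dest) -> None:
    """Copy `source` to `dest` line by line, cleaning each line."""
    for line in source:
        dest.write(clean_line(line))
```

Calling `clean_stream(sys.stdin, sys.stdout)` after `import sys` gives the "print to the screen and redirect the output" variant. Passing two open file objects writes straight to a new file.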



Perl-Love/Hate Relationship

by JohnnySacks In reply to Any language, but Perl wi ...

Perl can be a pain, but it shines for this kind of Sh#t work (cleaning up other people's messes).

All on one command line:
perl -p -ibak -e"s/<expression for chars to remove>//g" <file to operate on>

You're left with the original file renamed with bak appended, and the modified file under the same name as the input file.
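
A rough Python counterpart of that in-place edit uses the standard `fileinput` module, which can also keep a backup copy. The sketch assumes "characters to remove" means anything outside printable ASCII, and the function name `clean_in_place` is hypothetical.

```python
import fileinput
import re

# Anything that is not printable ASCII (space through ~) or a newline.
UNWANTED = re.compile(r"[^\x20-\x7E\n]")


def clean_in_place(path: str, backup_suffix: str = ".bak") -> None:
    """Rewrite `path` without unwanted characters, keeping a backup copy.

    With inplace=True, fileinput moves the original to path + backup_suffix
    and redirects print() into a new file under the original name.
    """
    with fileinput.input(files=[path], inplace=True, backup=backup_suffix) as lines:
        for line in lines:
            print(UNWANTED.sub("", line), end="")
```

As with the Perl one-liner, it is worth running this on a copy first, because the cleaned file replaces the input under its original name.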
