Help request: formatting a Unicode text file

heiyubaiclub

New Member, joined Jan 10, 2020
For example, suppose a Unicode text file contains:
not for easy understanding, but for...

I manually insert a marker glyph (▲) into some of the words:
not fo▲r easy under▲standing, but for...

A program should then format the text so that the output is as follows:
not ___ easy _____________, but for...

That is, the 3-letter word "for" becomes 3 underscores, and the 13-letter word "understanding" becomes 13 underscores.
 


iconv -f ISO-8859-1 -t UTF-8//TRANSLIT input.file -o out.file

I'm not sure which character set you want, but you can list all the supported ones this way:

iconv -l
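If the "Unicode text file" turns out to be UTF-16 (an assumption; some Windows editors label UTF-16 as "Unicode"), a conversion along these lines may be needed before awk or sed can work on it. This is only a sketch: the file names are hypothetical, and the sample file is generated on the spot so the commands can be tried as-is.

```shell
# Hypothetical file names; assumes the input is UTF-16 with a byte-order mark.
# Build a small UTF-16 sample so the example is self-contained.
printf 'not for easy understanding, but for...\n' | iconv -f UTF-8 -t UTF-16 > input.file

# Convert to UTF-8 so that awk and sed see ordinary text.
iconv -f UTF-16 -t UTF-8 input.file > out.file

cat out.file
```

Plain output redirection is used here instead of -o, since not every iconv build accepts that option.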
 
I need a command line that can do this formatting. I've heard that shell and awk can be used to format a lot of text.
 
Sorry, I'm still not exactly sure what you want.

sed -i 's/not fo▲r easy under▲standing/not ___ easy _____________/g' myfile.txt

This replaces every instance of the first string with the second string in myfile.txt.
 
Problem solved!
$ awk -f formatting.awk sample.txt
not ___ easy _____________, but for...
$ more sample.txt
not fo▲r easy under▲standing, but for...
$ more formatting.awk
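{
    # Check every whitespace-separated field on the line.
    for (i = 1; i <= NF; i++)
        if ($i ~ /▲/) {
            gsub(/▲/, "", $i)    # remove the marker glyph
            gsub(/\w/, "_", $i)  # mask each word character (\w is a GNU awk extension)
        }

    print
}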
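
The script above uses \w, which is a GNU awk extension. A minimal sketch of a more portable variant, assuming an awk that supports POSIX bracket expressions, swaps in [[:alnum:]_] instead. The function name mask_marked is hypothetical and only wraps the same logic for use in a pipeline.

```shell
# mask_marked: hypothetical helper name. Reads text on stdin and replaces
# every word that contains the marker ▲ with underscores, one per letter.
mask_marked() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /▲/) {
                gsub(/▲/, "", $i)              # drop the marker itself
                gsub(/[[:alnum:]_]/, "_", $i)  # mask letters, keep punctuation
            }
        print
    }'
}

printf 'not fo▲r easy under▲standing, but for...\n' | mask_marked
```

For the sample line this should print "not ___ easy _____________, but for...". As with the original script, awk rebuilds a line once a field is modified, so runs of spaces or tabs on such a line collapse to single spaces.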
 
Well I am so glad we got that sorted out.....

Welcome to linux.org !!!
 
Yeah. "Problem solved", after they got a solution at another forum, and they didn't even go back there to post this after getting someone else to do it for them.
 
That article makes for an interesting read, top to bottom, Jake :). Nice find - I usually find these duplications, but am still catching up after a 10-day internet outage I had a couple of weeks ago.

I hope the OP took on board some of the comments made by those there, on protocol.

It is thoughtless to post the same issue at multiple venues simultaneously: helpers cannot be expected to know what advice is being given elsewhere, and may invest a lot of time in devising a solution to a problem that has already been solved.

I am glad you have a solution, @heiyubaiclub, but do spare a thought for others in future.

Chris Turner
wizardfromoz
 
