Script to remove certain strings from SRT files?

rado84

Well-Known Member
Joined
Feb 25, 2019
Messages
688
Reaction score
564
Credits
4,122
Can anyone give me a script to remove square brackets (these: [ ]) and everything between them from SRT files? It should of course save the changes, as if I had pressed Ctrl+S. The idea is to put the script into an .sh file and run it on all SRT files in a directory (bulk processing).

For unknown reasons, all English-speaking subtitle makers have the annoying habit of describing every single sound or action, such as "chuckles", "panting", or "chanting", or of writing notes whenever there's music, as if we're imbeciles and we don't know what we're hearing or seeing.
 


dos2unix

Well-Known Member
Joined
May 3, 2019
Messages
2,112
Reaction score
1,728
Credits
15,288
Code:
$ cat test.file
abcde[abcdef]abdef
ghjijk[nhjrki]vbgfh

# Note: this prints the text between the brackets; it does not remove it.
$ cat test.file | cut -f2 -d'[' | cut -f1 -d']'
abcdef
nhjrki
 

osprey

Well-Known Member
Joined
Apr 15, 2022
Messages
1,125
Reaction score
1,102
Credits
10,705
Here is a one-liner that edits the file srtfile.txt in place, removing the square brackets and the text between them:
Code:
[flip@flop ~]$ cat srtfile.txt
the quick brown fox jumps [ whoops ] over the lazy dog

[flip@flop ~]$ sed -i 's/\[[^]]*\]//g' srtfile.txt

[flip@flop ~]$ cat srtfile.txt
the quick brown fox jumps  over the lazy dog
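For the bulk case from the original question, a minimal sketch could wrap the same idea in a loop over every .srt file in a directory. It assumes GNU sed (for `-i` without a backup suffix) and that no bracket group spans more than one line. The function name `strip_square_brackets` is made up for illustration. The pattern matches one bracket group at a time, so dialogue sitting between two groups on the same line is kept.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): remove every [ ... ] group
# from each .srt file in the given directory (default: current directory).
# Assumes GNU sed and that each group opens and closes on the same line.
strip_square_brackets() {
    dir="${1:-.}"
    for f in "$dir"/*.srt; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -e "$f" ] || continue
        # \[[^]]*\] matches a single bracket group; g repeats along the line.
        sed -i 's/\[[^]]*\]//g' "$f"
    done
}
```

To use it as a script, save it to an .sh file, add a last line such as `strip_square_brackets "$1"`, and run it with the directory as the argument. The files are overwritten, so trying it on a copy first is probably wise.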
 

wizardfromoz

Administrator
Staff member
Gold Supporter
Joined
Apr 30, 2017
Messages
9,186
Reaction score
8,138
Credits
39,448
Moving this to Command Line

Wizard
 

SciTecDC

New Member
Joined
Jul 9, 2023
Messages
15
Reaction score
12
Credits
164
rado84 said:
For unknown reasons, all English-speaking subtitle makers have the annoying habit of describing every single sound or action, such as "chuckles", "panting", or "chanting", or of writing notes whenever there's music, as if we're imbeciles and we don't know what we're hearing or seeing.
These .srt files are usually generated automatically by some free software, but unfortunately their authors are often too lazy to search the file for superfluous entries. Unfortunately, typos, wrong translations, bad English and missing parts of sentences are also very common. :rolleyes:
 
rado84

osprey said:
Here is a one-liner that edits the file srtfile.txt in place, removing the square brackets and the text between them:
Code:
[flip@flop ~]$ cat srtfile.txt
the quick brown fox jumps [ whoops ] over the lazy dog

[flip@flop ~]$ sed -i 's/\[[^]]*\]//g' srtfile.txt

[flip@flop ~]$ cat srtfile.txt
the quick brown fox jumps  over the lazy dog
Of everything I was given on the internet today, only the second suggestion, the one with sed, worked. The lines using cat (concatenate) either didn't do anything or kept spitting out an error about not being able to find the file.
So, thanks for the script! ;)

SciTecDC said:
These .srt files are usually generated automatically by some free software, but unfortunately their authors are often too lazy to search the file for superfluous entries. Unfortunately, typos, wrong translations, bad English and missing parts of sentences are also very common.
Nah, the English is just fine, no machine English in them. But for some reason the person who made the subs thinks we all have a room-temperature IQ, so every single sound gets described and wrapped in either square or round brackets. So when the sed script worked, I copied it to another file and made one for round brackets as well. :)
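For anyone who wants both bracket styles in one pass, here is a rough sketch rather than the exact scripts described above. It is written as a filter that reads from standard input, and it also collapses the double spaces left behind. The function name `strip_sound_cues` is hypothetical, and it assumes every group opens and closes on the same line.

```shell
# Hypothetical filter (name is illustrative): strip [ ... ] and ( ... ) groups,
# then tidy the spaces they leave behind.
# Assumes each group opens and closes on the same line.
strip_sound_cues() {
    sed -e 's/\[[^]]*\]//g' \
        -e 's/([^)]*)//g' \
        -e 's/  */ /g' \
        -e 's/^ //' -e 's/ $//'
}
```

Usage would be something like `strip_sound_cues < in.srt > out.srt`. Round brackets sometimes wrap real dialogue asides, so it is worth checking the result before deleting the original.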
 

f33dm3bits

Gold Member
Gold Supporter
Joined
Dec 11, 2019
Messages
6,407
Reaction score
4,864
Credits
47,079
rado84 said:
For unknown reasons, all English-speaking subtitle makers have the annoying habit of describing every single sound or action, such as "chuckles", "panting", or "chanting", or of writing notes whenever there's music, as if we're imbeciles and we don't know what we're hearing or seeing.
Those are actually hearing-impaired subtitles, made for people who are deaf or have hearing problems: sdh = subtitles for the deaf or hard of hearing, hi = hearing impaired, cc = closed captioned. Their filenames usually look something like this:
Code:
en.sdh.srt
en.hi.srt
en.cc.srt
If you check, for example, the listed subtitles for this movie, you will see several files with an "ear" icon next to them. That is the symbol for hearing-impaired subtitles. Just avoid subtitles with that icon or with the tags "sdh", "hi", or "cc" in the filename, and you will not get those "annoying" types of subtitles, and you won't need a script to remove them either.
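If you already have a folder of downloaded subtitles, a small sketch like the following could list the ones without those tags. It assumes the files follow the naming pattern shown above, which real releases may not, and the match is case-sensitive. Both function names are made up for illustration.

```shell
# Hypothetical check (name is illustrative): succeed if the filename carries
# an sdh/hi/cc tag right before the .srt extension.
is_hearing_impaired_sub() {
    case "$1" in
        *.sdh.srt|*.hi.srt|*.cc.srt) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the .srt files in a directory that lack such a tag.
list_plain_subs() {
    for f in "${1:-.}"/*.srt; do
        [ -e "$f" ] || continue   # the glob matched nothing
        is_hearing_impaired_sub "$f" || printf '%s\n' "$f"
    done
}
```

Calling `list_plain_subs /path/to/subs` would then print only the untagged files. A file without a tag can still contain sound descriptions, so this is only a first filter.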
 
rado84

I was wondering what SDH was. I've seen that on some movies. But the problem is that many TV shows and movies come with ONLY SDH subs and no normal subs, so having these scripts is a fast way to turn them into normal subs.
 
