@carlarogers - From your description, I’m assuming the text fields look something like this:
Code:
2,5,"bananas\",15.5,"I don’t know\",17
So you have numeric types (or perhaps boolean types) and then string values in certain places, surrounded by double quotes, but with \" for the closing quotes?
And you want to replace all instances of
\"
at the end of the strings with
"
?
If so, you need to run the following command:
Bash:
sed -i.bak 's/\\\"/\"/g' /path/to/*.csv
Where
/path/to/
is the path to the directory containing the .csv files and
*.csv
is a globbing pattern, which the shell expands to ALL of the .csv files in the specified directory, so sed edits every one of them.
Again, I'm assuming that you want sed to edit ALL of the files in one go. If that is not the case and you only want to edit one at a time, or to only edit specific files, you could list the files individually after the sed command, instead of using a globbing pattern as I have.
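As a rough sketch of the "list the files individually" approach: the snippet below builds three throwaway .csv files in a temporary directory (the file names and contents are made up purely for illustration) and then runs the same sed command against only two of them.

```shell
# Hypothetical files in a scratch directory, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' '1,"apples\",2' > "$workdir/first.csv"
printf '%s\n' '3,"pears\",4'  > "$workdir/second.csv"
printf '%s\n' '5,"plums\",6'  > "$workdir/third.csv"

# Name the files explicitly instead of using *.csv;
# third.csv is not listed, so sed never touches it.
sed -i.bak 's/\\\"/\"/g' "$workdir/first.csv" "$workdir/second.csv"

# Show the results: the first two are fixed, the third is unchanged.
cat "$workdir/first.csv" "$workdir/second.csv" "$workdir/third.csv"
```

Only the files named on the command line get edited (and get a .bak backup); anything else in the directory is left alone.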
In the above
sed
command we use the
-i.bak
option, which tells sed to edit and overwrite the original file (edit in place). We also specify
.bak
as a parameter to the
-i
switch. NOTE: There is no space between
-i
and
.bak
. It's
-i.bak
. The addition of
.bak
tells sed to create a backup of each file it edits, which will be called originalfile.ext.bak (just in case any of my assumptions about your files are incorrect).
So if your file was called
mycsvfile.csv
, then
sed
will create a backup of the original called
mycsvfile.csv.bak
before overwriting the original file.
If you use
sed -i
without specifying a backup extension, it simply overwrites the original file without creating a backup. Normally, before using sed, I take some time to make sure that my search/replace patterns are correct and aren't going to bork things. And if I'm even slightly unsure, I use sed's -i.bak option to edit in place AND create a backup.
I accept no responsibility for borked files, but this way, if any of my assumptions about your files are wrong, you at least end up with an untouched backup of the original files!
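If you want to be extra careful, one possible workflow is to do a dry run first (sed without -i just prints the result and changes nothing), and to copy the .bak file back over the original if the in-place edit turns out to be wrong. This is only a sketch: the scratch directory is an assumption for the demo, and the sample line is the example string from earlier in this post.

```shell
# Scratch copy of the example data, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' '2,5,"bananas\",15.5,"I don’t know\",17' > "$workdir/mycsvfile.csv"

# Dry run: no -i, so the file on disk is not modified.
preview=$(sed 's/\\\"/\"/g' "$workdir/mycsvfile.csv")
printf '%s\n' "$preview"

# Real run: edit in place and keep mycsvfile.csv.bak as a backup.
sed -i.bak 's/\\\"/\"/g' "$workdir/mycsvfile.csv"

# If the result is not what you wanted, restore the original from the backup.
cp "$workdir/mycsvfile.csv.bak" "$workdir/mycsvfile.csv"
```

After the last line, mycsvfile.csv is back to its original contents.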
The search/replace pattern
's/\\\"/\"/g'
can be broken down like this:
'
is the opening single quote, marking the start of the actions sed should take (the single quotes also stop the shell from interpreting the backslashes itself)
s
tells sed that we're doing a search and replace operation
/
is the field separator between the command (search+replace) and the search pattern
\\\"
is the search pattern, i.e. the pattern of characters we're searching for
Where:
\\
is the escape sequence for a literal backslash character
\
\"
is the escape sequence for a literal double quote character
"
/
is the field separator between the search pattern and the replace pattern
\"
is the replace pattern, which is the escape sequence for a literal double quote character
"
/
is another field separator
g
tells sed to perform the search/replace pattern globally***
'
is the closing single quote, marking the end of the operations
sed
should perform
*** Without the
g
at the end,
sed
would only search/replace the first occurrence of the pattern on each line. So if there are multiple strings on a single line, then you'll NEED to specify
g
at the end.
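To see the difference the g flag makes, here is a small sketch that feeds the example line from the top of this post through sed twice, once without g and once with it. The variable names are just for the demo.

```shell
line='2,5,"bananas\",15.5,"I don’t know\",17'

# Without g: only the first \" on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/\\\"/\"/')

# With g: every \" on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/\\\"/\"/g')

printf '%s\n' "$first_only" "$all_matches"
```

The first result still has a stray backslash after "I don’t know", the second does not.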
So essentially it says 'find the pattern
\"
and replace it globally with
"
'
So after running the
sed
command I've posted, my initial example-string would end up looking like this:
Code:
2,5,"bananas",15.5,"I don’t know",17
I've just mocked up a small .csv file and checked it. And the
sed
command I've posted works properly for me. So it should work for you. At least as long as all of my assumptions about your .csv file are correct!
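If you'd like to run a similar check yourself, something along these lines might do. It is a sketch under the same assumptions as the rest of this post: the mock file name is made up, and grep -c -F simply counts the lines that still contain a literal \" sequence.

```shell
# Mock .csv file in a scratch directory, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' '2,5,"bananas\",15.5,"I don’t know\",17' > "$workdir/mock.csv"

# Count lines containing a literal \" before the edit.
# grep -c exits non-zero when nothing matches, hence the "|| true".
before=$(grep -c -F '\"' "$workdir/mock.csv" || true)

sed -i.bak 's/\\\"/\"/g' "$workdir/mock.csv"

# Count again afterwards; this should now be zero.
after=$(grep -c -F '\"' "$workdir/mock.csv" || true)

printf 'before: %s, after: %s\n' "$before" "$after"
```

If the "after" count is not zero on your real files, that's a sign the data doesn't quite match my assumptions.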
If anything is not quite right, please give me a more concrete example of the data you're working with and I can help you to work out an appropriate
sed
command to use.