I understand it's a regular expression, but the problem is that ! is a reserved word in Bash (negation), and parentheses, as in (list), enclose a list which is executed in a subshell. Both of these come directly from the Bash man page.
What you are describing is globbing, or filename expansion, which is a simpler pattern language than regexp.
Every command line command expects parameters as input, cp being no exception.
Commands do not, however, actually expand filename expressions themselves; that is done by the shell, so one wildcard in a cp on the command line may result in hundreds or thousands of parameters to cp.
The supplied cp 'regexp' is not processed by cp, nor is it a valid filename expression for bash. If it were a valid expression for another shell, then it would be expanded and the results would become parameters to cp.
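A quick way to see that the shell, not the command, does the expansion is to count the parameters a command receives. This is only a sketch: the scratch directory and the count_args helper are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical scratch directory, just for this demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir"/one.txt "$demo_dir"/two.txt "$demo_dir"/three.txt

# count_args is a made-up helper: it reports how many parameters it received.
count_args() { echo "$#"; }

# Unquoted: the shell expands the wildcard before the helper runs.
expanded=$(count_args "$demo_dir"/*.txt)
# Quoted: no expansion, so the helper sees the literal pattern as one parameter.
literal=$(count_args "$demo_dir/*.txt")

echo "unquoted: $expanded parameters, quoted: $literal parameter"
rm -rf "$demo_dir"
```

cp is in the same position as count_args here: it only ever sees whatever the shell hands it.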
With regard to even typing the supplied cp on a bash command line, history expansion (part of the 'readline'/history facility) is jumping in, grabbing the ! and interpreting it.
I even tried running the cp as a script (#!/bin/bash), which results in an unexpected token '(' error, basically because there is no space after the !.
I completely fail to see how the supplied cp will work within a Bash environment.
Feel free to modify the following script so that it runs
#!/bin/bash
trap "rm -fr /tmp/cptest" EXIT
cd /tmp
mkdir -p {cptest,cptest/A,cptest/B,cptest/C}
cp -r !(/tmp/cptest/A|/tmp/cptest/B) /tmp/cptest/backup
ls -laR /tmp/cptest/backup
because, when I run it, I see this...
lyall@Lyalls-PC /tmp
$ bash -x ./cp.sh
+ trap 'rm -fr /tmp/cptest' EXIT
+ cd /tmp
+ mkdir -p cptest cptest/A cptest/B cptest/C
./cp.sh: line 5: syntax error near unexpected token `('
./cp.sh: line 5: `cp -r !(/tmp/cptest/A|/tmp/cptest/B) /tmp/cptest/backup'
+ rm -fr /tmp/cptest
lyall@Lyalls-PC /tmp
$
Maybe zsh or csh or some other shell, but I don't use them nor do I have them installed to fiddle with.
Sorry, I'll have to hold my hands up and admit - I didn't try the cp command in Wiz's example. I just assumed it worked because
A. Wiz posted it
and
B. Because Lucan said it worked for him.
And yes, you're correct, it does appear to be missing a space between the ! and the opening parenthesis "(". And perhaps I was slightly inaccurate with some of my wording. I wasn't sure whether it was the shell, or readline, or something else that did the evaluation. But going by the assumption that the command does work (which it does, but only on recent versions of bash, it would seem), something would have obtained a list of files from the pattern when it was evaluated and then fed cp a list of file names, which is pretty much what I alluded to in my original post. In a round-about way!
I'm not fully up to speed with all of the jargon when it comes to writing shell scripts - I'm usually too busy writing scripts to be worried about the correct terminology for things - a bit of a failing on my part. Nobody is perfect. Especially not me!
I was on my phone yesterday (and today), not on a Linux PC, so I didn't have the means to test it myself. AFAIK it is quite new functionality in bash. So if you have a slightly older version of bash it might not work for you even if you add a space after the !.
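For what it's worth, the !(...) form is bash's extended globbing, which is controlled by the extglob shell option and is normally off in scripts. Below is a rough sketch of a version that should run, under a few assumptions of mine: extglob is switched on before the line that uses the pattern is parsed, the pattern uses bare names in the current directory rather than full paths, and the backup directory sits outside the tree being copied (unlike the original test script) so it cannot be copied into itself. The scratch directory is hypothetical.

```shell
#!/bin/bash
# Assumption: the !(...) pattern needs the extglob option, and it has to be
# enabled on an earlier line than the one that uses the pattern.
shopt -s extglob

# Hypothetical scratch tree, loosely mirroring the layout of the test script.
work=$(mktemp -d)
mkdir -p "$work"/cptest/{A,B,C} "$work"/backup

# The pattern is matched against names in the current directory, so this
# sketch changes into cptest and uses bare names rather than full paths.
cd "$work"/cptest || exit 1
cp -r !(A|B) "$work"/backup/

copied=$(ls "$work"/backup)
echo "$copied"
cd / && rm -rf "$work"
```

In this sketch only C ends up in the backup directory, with no subshell or external filtering involved.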
However, if wiz's solution doesn't work for you - there's always more than one way to skin a cat!
Another way of excluding those directories from your example would be to do something like this in your script:
Code:
#!/bin/bash
trap "rm -fr /tmp/cptest" EXIT
cd /tmp
mkdir -p {cptest,cptest/A,cptest/B,cptest/C,cptest/backup}
cd -
cp -r $( \find /tmp/cptest/* | \egrep -v "t/A$|t/B$|t/backup$" ) /tmp/cptest/backup/
ls -laR /tmp/cptest/backup
Which, unlike Wiz's solution, means using a subshell. But it does the job!
I'm sure Lyall understands what's going on in that script, but for the benefit of the non-scripters, there are a few unusual things going on in there:
I have used \find and \egrep to escape/avoid any aliases that the user may have set up for find or egrep.
If the user has set up an alias for egrep which includes line numbers, or any fancy formatting - the output from egrep will not be suitable for us. We just want the plain file-names/directory names to feed to the cp command.
Similarly for find - the user may have an alias set up which sets certain options. We just want to use find, with no options and with a simple path/pattern.
So we have escaped any aliases by prepending a backslash to the find and egrep commands.
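A small sketch of the backslash trick, using a made-up alias of the kind a user might have in their ~/.bashrc. One caveat: bash does not expand aliases in non-interactive scripts unless expand_aliases is set, so the sketch turns it on to make the effect visible. In a plain script the backslash is harmless, and it mainly matters when the line is pasted into an interactive shell.

```shell
#!/bin/bash
# Aliases are normally off in scripts; turn them on so the effect is visible.
shopt -s expand_aliases

# Hypothetical alias that adds line numbers to grep's output.
alias grep='grep -n'

# The plain name picks up the alias, so the output gains a line-number prefix.
with_alias=$(echo "hello" | grep hello)
# The backslash bypasses the alias and runs the real command untouched.
without_alias=$(echo "hello" | \grep hello)

echo "with alias:    $with_alias"
echo "without alias: $without_alias"
```

The line-number prefix in the first result is exactly the sort of decoration that would corrupt a list of file names being fed to cp.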
The cp command itself is fairly straightforward.
But to get our list of input files, we have our "\find | \egrep" command in a subshell.
e.g. this part of the line:
Code:
$( \find /tmp/cptest/* | \egrep -v "t/A$|t/B$|t/backup$" )
This will provide the cp command with the list of source files and directories to copy to the backup folder.
Note that find recurses into subdirectories by default, even with no options set. It only lists the top-level items here because the directories in this test are empty. With real content you would want to add -maxdepth 0 after the paths, so that only the items named on the command line are listed.
The reason for the * at the end of the search path, e.g. "/tmp/cptest/*", is that we don't want /tmp/cptest itself included in the results. We only want to see items that are in /tmp/cptest/, NOT the entirety of /tmp/cptest itself. Otherwise we'd accidentally end up recursively copying everything in cptest/ into the cptest/backup/ directory, which would completely negate what we're trying to do.
So, in this test, find simply lists everything in the top level of /tmp/cptest/
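One thing worth checking is how find behaves once the directories have something in them. By default find descends into every directory it is given, so the list only stays at the top level while the directories are empty. A rough sketch of the difference, using a hypothetical scratch tree with one nested file:

```shell
#!/bin/bash
# Hypothetical scratch tree with a nested file, unlike the empty test directories.
work=$(mktemp -d)
mkdir -p "$work"/cptest/A "$work"/cptest/B "$work"/cptest/C
touch "$work"/cptest/C/nested.txt

# Default behaviour: find descends into C and lists nested.txt as well.
default_count=$(find "$work"/cptest/* | wc -l)

# -maxdepth 0 limits the output to the starting points themselves.
top_count=$(find "$work"/cptest/* -maxdepth 0 | wc -l)

echo "default:" $default_count "entries, -maxdepth 0:" $top_count "entries"
rm -rf "$work"
```

With -maxdepth 0 the nested file drops out of the list, which is what the cp -r line needs, since cp -r already copies the contents of each directory it is handed.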
After getting results from find, we pipe them to \egrep which filters out the files or directories that we want to exclude. The -v option will negate the results to yield only files/directories that DO NOT match the specified regex.
The directories we want to exclude are:
A, B and backup (because the backup directory is also in the main folder we're searching in, so we need to exclude that too)
Taking another look at the regex pattern:
"t/A$|t/B$|t/backup$"
Each exclusion in the regex is separated by a pipe character '|' - which represents a logical OR.
The "t/" part in each refers to the "t/" characters at the end of "cptest/" in the path. And the $ at the end of each one tells \egrep that the pattern is at the end of the line.
So we are looking to exclude any results from find that end in:
t/A
t/B
t/backup
Which will exclude /tmp/cptest/A, /tmp/cptest/B and /tmp/cptest/backup.
We could have put the full paths into our regex:
"/tmp/cptest/A|/tmp/cptest/B|/tmp/cptest/backup"
But I decided to whittle it down to t/A$ etc.
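The filtering step can be tried on its own without touching the filesystem. In this sketch a printf stands in for the find output, and grep -E is used as the current spelling of egrep. The single quotes are my choice, to keep the shell away from the $ anchors.

```shell
#!/bin/bash
# Stand-in for the find output: the four top-level paths from the test tree.
paths=$(printf '%s\n' /tmp/cptest/A /tmp/cptest/B /tmp/cptest/C /tmp/cptest/backup)

# -v keeps only the lines that do NOT match any of the alternatives.
kept=$(printf '%s\n' "$paths" | grep -E -v 't/A$|t/B$|t/backup$')

echo "$kept"
```

Only the path for C survives the filter, so that is the only parameter cp would receive.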
Getting back on point:
The only directories we have in /tmp/cptest/ are A, B, C and backup.
egrep has now filtered and excluded A, B and backup from the results from find - so the only directory that gets passed to the cp command and copied to the /tmp/cptest/backup/ directory is /tmp/cptest/C/
So at the end of the script, /tmp/cptest/backup/ will only contain the C/ directory.
You could also use ls and egrep instead of find in the subshell. But the globbing from Wiz's post is the nicest solution - if you have a recent version of bash!
It should also be possible to do it as a one liner using find.
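A rough sketch of what such a find one-liner might look like, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find both do). It runs against a hypothetical scratch tree rather than /tmp/cptest, and the exclusions are done with ! -name tests instead of a regex.

```shell
#!/bin/bash
# Hypothetical scratch tree with the same four directories as the test script.
work=$(mktemp -d)
mkdir -p "$work"/cptest/A "$work"/cptest/B "$work"/cptest/C "$work"/cptest/backup

# List only the top-level entries, skip A, B and backup, and copy the rest.
find "$work"/cptest -mindepth 1 -maxdepth 1 \
    ! -name A ! -name B ! -name backup \
    -exec cp -r {} "$work"/cptest/backup/ \;

result=$(ls "$work"/cptest/backup)
echo "$result"
rm -rf "$work"
```

This avoids both the subshell and the word-splitting of find's output, so names containing spaces should also survive.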