Question: I recently had to copy a huge number of files, about 7 TB in total, to another location on my CentOS box. I used the ‘cp’ command, but for some reason (maybe a network issue) it exited after copying only a few files and directories. If I rerun it, the command starts all over again and recopies the files that have already been copied, which takes a lot of time and CPU. Can I tell ‘cp’ to skip the files and directories that have already been copied, so that I save time and CPU? Any suggestions?
Answer: You can use either the ‘cp’ command or ‘rsync’ to copy only the files that are still missing.
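Solution 1: Using the cp command:
cp -u --force <src> <dst>
Here the -u (--update) option copies a file only if the source file is newer than the destination file or if the destination file is missing. If <src> is a directory, also add -r so that cp recurses into it, for example: cp -ru --force <src> <dst>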
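To see how -u behaves when a copy is resumed, here is a minimal sketch that assumes GNU cp and GNU touch (as shipped with CentOS). The scratch directory and file names are hypothetical and only stand in for the real <src> and <dst>.

```shell
#!/bin/sh
# Scratch tree standing in for the real source and destination (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dst"

echo "one" > "$work/src/a.txt"
echo "two" > "$work/src/sub/b.txt"
touch -d '2020-01-01 00:00:00' "$work/src/a.txt"   # give the source an old timestamp

# Simulate the interrupted first run: only a.txt reached the destination.
cp "$work/src/a.txt" "$work/dst/a.txt"

# Resume: -r recurses into directories, -u skips files whose destination copy
# is not older than the source, -v prints each item that is actually copied.
copied=$(cp -ruv "$work/src/." "$work/dst/")
echo "$copied"
```

The verbose output should mention sub and b.txt but not a.txt. One caveat: -u compares modification times only, so a file that was cut off midway through the first run has a newer timestamp than its source and is skipped as well. It may be worth checking the last file the interrupted run was writing.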
Note: If the above command prompts you as shown below:
cp: overwrite '<filename>'?
then ‘cp’ is aliased to run in interactive mode. To check whether ‘cp’ is an alias, type the command below:
# type cp
cp is aliased to `cp -i'
Because the alias adds -i, cp prompts for confirmation even though you specified the --force option. To avoid the prompt and copy non-interactively, call cp by its absolute path, which bypasses the alias (on older CentOS releases the path may be /bin/cp):
/usr/bin/cp -u --force <src> <dst>
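The absolute path is not the only way around the alias. The sketch below, which assumes GNU cp and touch and uses hypothetical scratch files, relies on the shell builtin command, which skips alias lookup for the word that follows it.

```shell
#!/bin/sh
# Hypothetical scratch files for the demonstration.
work=$(mktemp -d)
echo "fresh" > "$work/src.txt"
echo "stale" > "$work/dst.txt"
touch -d '2020-01-01 00:00:00' "$work/dst.txt"   # make the destination older than the source

# Simulate the distribution's alias (it only takes effect in shells that expand aliases).
alias cp='cp -i'

# 'command' skips alias lookup, so -i is never added and no prompt appears.
command cp -u --force "$work/src.txt" "$work/dst.txt"

# At an interactive prompt a leading backslash has the same effect:
#   \cp -u --force <src> <dst>

cat "$work/dst.txt"
```

Because the source is newer than the destination here, -u lets the copy go ahead and dst.txt ends up with the contents of src.txt.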
Solution 2: Using the rsync command:
# rsync -av --ignore-existing <src> <dst>
Note: rsync builds a file list before it starts transferring. If the source contains a very large number of files, building that list can take a long time, and you might think the process has hung. Add the --progress option (or -P, which is shorthand for --partial --progress) to see per-file transfer progress, so you can tell that rsync is still working. Lowercase -p is a different option: it preserves permissions and is already implied by -a.
You can also refer to our article on adding a progress bar to the copy and move commands in Linux.
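Before resuming a multi-terabyte copy it can help to preview what rsync intends to do. The following is a small sketch, assuming rsync is installed; the scratch directory and file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch tree: done.txt was "copied earlier", todo.txt was not.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "source copy"  > "$work/src/done.txt"
echo "already here" > "$work/dst/done.txt"
echo "pending"      > "$work/src/todo.txt"

# Preview only: --dry-run lists what would be transferred without writing anything.
rsync -av --ignore-existing --dry-run "$work/src/" "$work/dst/"

# Real run with per-file progress. The trailing slash on src/ means
# "copy the contents of src", not the directory itself.
rsync -av --ignore-existing --progress "$work/src/" "$work/dst/"

cat "$work/dst/done.txt" "$work/dst/todo.txt"
```

Only todo.txt is transferred; done.txt keeps its existing contents because --ignore-existing skips any file that is already present in the destination, whatever its size or timestamp. That also means a file left half-written by the interrupted run is skipped, so it may be safer to delete such a file before resuming.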