Parallelise rsync using GNU Parallel











I have been using an rsync script to synchronize data on one host with the data on another host. The data consists of numerous small files that add up to almost 1.2 TB.



In order to sync those files, I have been using the rsync command as follows:



rsync -avzm --stats --human-readable --include-from proj.lst /data/projects REMOTEHOST:/data/


The contents of proj.lst are as follows:



+ proj1
+ proj1/*
+ proj1/*/*
+ proj1/*/*/*.tar
+ proj1/*/*/*.pdf
+ proj2
+ proj2/*
+ proj2/*/*
+ proj2/*/*/*.tar
+ proj2/*/*/*.pdf
...
...
...
- *


As a test, I picked two of those projects (8.5 GB of data) and executed the command above. Being a sequential process, it took 14 minutes 58 seconds to complete. So, for 1.2 TB of data, it would take several hours.



If I could run multiple rsync processes in parallel (using &, xargs, or parallel), it would save time.



I tried the command below with parallel (after cd-ing to the source directory) and it took 12 minutes 37 seconds to execute:



parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/ ::: .


This should have taken one-fifth of the time, but it didn't. I think I'm going wrong somewhere.



How can I run multiple rsync processes in order to reduce the execution time?
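
A likely reason the parallel attempt above shows so little speedup: ::: . supplies parallel with exactly one argument (the current directory), so only a single rsync job is ever started, regardless of -j 5. A minimal sketch that gives parallel one job per project directory (hypothetical names, and ignoring the include-list filtering for brevity):

cd /data/projects
ls -d proj* | parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/projects/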










Tags: linux, rhel, rsync, gnu-parallel






asked Mar 13 '15 at 6:51 by Mandar Shinde (10 votes)
  • Are you limited by network bandwidth? Disk iops? Disk bandwidth? – Ole Tange, Mar 13 '15 at 7:25

  • If possible, we would want to use 50% of total bandwidth. But parallelising multiple rsyncs is our first priority. – Mandar Shinde, Mar 13 '15 at 7:32

  • Can you let us know your network bandwidth, disk iops, disk bandwidth, and the bandwidth actually used? – Ole Tange, Mar 13 '15 at 7:41

  • In fact, I do not know those parameters. For the time being, we can neglect the optimization part; multiple rsyncs in parallel is the primary focus now. – Mandar Shinde, Mar 13 '15 at 7:47

  • No point in going parallel if the limitation isn't the CPU. It can/will even make matters worse (conflicting disk arm movements on the source or target disk). – xenoid, Nov 22 at 15:55














6 Answers

















Accepted answer by Mandar Shinde (11 votes), answered Apr 11 '15 at 13:53










The following steps did the job for me:




  1. Run rsync with --dry-run first, in order to get the list of files that would be affected.


rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log




  2. Then I fed the output of cat /tmp/transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:


cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log



Here, the --relative option ensured that the directory structure for the affected files, at the source and destination, remains the same (inside the /data/ directory), so the command must be run in the source folder (for example, /data/projects).
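
To make the --relative behaviour concrete, here is a small sketch with hypothetical paths (not from the question):

# without --relative, only the last path component arrives at the destination
rsync -a foo/bar/baz.tar REMOTE-HOST:/data/      # -> /data/baz.tar
# with --relative (-R), the full relative path is recreated under the destination
rsync -aR foo/bar/baz.tar REMOTE-HOST:/data/     # -> /data/foo/bar/baz.tar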






  • That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel, then use rsync's --files-from to get the filenames out of each file and sync them: rm backups.*; split -l 3000 backup.list backups.; ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/ – Sandip Bhattacharya, Nov 17 '16 at 21:22












  • How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done, created directory /data/. – Mike D, Sep 19 '17 at 16:42






  • On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args on the 'inner' transferring rsync too, if any files might have spaces or shell metacharacters in them. – Cheetah, Oct 12 '17 at 5:31
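
Sandip Bhattacharya's split/--files-from suggestion above, expanded into a hedged sketch (the backup.list name, the paths, and the 3000-line chunk size are all illustrative):

# backup.list holds the file names from the dry run, one per line
rm -f backups.*
split -l 3000 backup.list backups.
# one rsync per 3000-file chunk, five chunks at a time
ls backups.* | parallel --line-buffer --verbose -j 5 \
    rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/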


















Answer by Mikhail (7 votes), answered Apr 10 '17 at 3:28













I would strongly discourage anybody from using the accepted answer; a better solution is to crawl the top-level directory and launch a proportional number of rsync operations.



I have a large zfs volume and my source was a cifs mount. Both are linked with 10G, and in some benchmarks they can saturate the link. Performance was evaluated using zpool iostat 1.



The source drive was mounted like:



mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0


Using a single rsync process:



rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod


the io meter reads:



StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.62K      0   130M


In synthetic benchmarks (CrystalDiskMark), sequential-write performance approaches 900 MB/s, which means the link is saturated. 130 MB/s is not very good, and is the difference between waiting a weekend and waiting two weeks.



So, I built the file list and tried to run the sync again (I have a 64-core machine):



cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log


and it had the same performance!



StoragePod  29.9T   144T      0  1.63K      0   130M
StoragePod  29.9T   144T      0  1.62K      0   130M
StoragePod  29.9T   144T      0  1.56K      0   129M


As an alternative I simply ran rsync on the root folders:



rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell


This actually boosted performance:



StoragePod  30.1T   144T     13  3.66K   112K   343M
StoragePod  30.1T   144T     24  5.11K   184K   469M
StoragePod  30.1T   144T     25  4.30K   196K   373M


In conclusion, as @Sandip Bhattacharya brought up, write a small script to enumerate the directories and run parallel over that. Alternatively, pass a file list to rsync. But don't create a new instance for each file.
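
A minimal sketch of that per-directory approach, using this answer's directory layout (the job count of 4 is arbitrary):

cd /mnt/Datahoarder_Mount/Mikhail
# one rsync per top-level directory, four at a time; equivalent to the four manual commands above
ls -d */ | parallel -j 4 rsync -h -r -P -t {} /StoragePod/{}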






Answer by Julien Palard (6 votes), answered May 25 '16 at 14:15













    I personally use this simple one:



    ls -1 | parallel rsync -a {} /destination/directory/


This is only useful when you have more than a few non-nearly-empty directories; otherwise almost every rsync will terminate early and the last one will do all the job alone.
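
If the top-level directories are very uneven in size, going one level deeper may balance the jobs better. A hedged variant for a local destination (the mkdir -p is needed because rsync will not create missing parent directories):

ls -d */* | parallel 'mkdir -p /destination/directory/{//} && rsync -a {} /destination/directory/{//}/'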






Answer by Ole Tange (4 votes), answered Mar 13 '15 at 7:25













A tested way to do parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync




      rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.



The following will start one rsync per big file in src-dir to dest-dir on the server fooserver:



cd src-dir; find . -type f -size +100000 | \
  parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
    rsync -s -Havessh {} fooserver:/dest-dir/{}


The directories created may end up with wrong permissions, and smaller files will not have been transferred. To fix those, run rsync a final time:



      rsync -Havessh src-dir/ fooserver:/dest-dir/ 


If you are unable to push data, but need to pull it, and the files are called digits.png (e.g. 000000.png), you might be able to do:



      seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/






      • Any other alternative in order to avoid find? – Mandar Shinde, Mar 13 '15 at 7:34

      • Limit the -maxdepth of find. – Ole Tange, Mar 17 '15 at 9:20

      • If I use the --dry-run option in rsync, I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process? – Mandar Shinde, Apr 10 '15 at 3:47

      • cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}; rsync -s -Havessh {} fooserver:/dest-dir/{} – Ole Tange, Apr 10 '15 at 5:51

      • Can you please explain the mkdir -p /dest-dir/{//}; part? Especially the {//} thing is a bit confusing. – Mandar Shinde, Apr 10 '15 at 9:49
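
For reference, {//} is GNU parallel's dirname replacement string: it expands to the input with its last path component removed, which is what lets mkdir -p recreate each file's parent directory under /dest-dir before the per-file rsync runs. A quick way to see it:

parallel echo {//} ::: foo/bar/baz.tar
# prints: foo/bar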


















Answer by ingopingo (0 votes), answered Apr 10 '17 at 6:37













For multi-destination syncs, I am using



      parallel rsync -avi /path/to/source ::: host1: host2: host3:


      Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys
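
A minimal sketch of that key setup, reusing the answer's placeholder host names (ssh-keygen is only needed if no key pair exists yet):

ssh-keygen -t ed25519
for h in host1 host2 host3; do ssh-copy-id "$h"; done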






Answer by Sebastjanas (0 votes)













I always google for parallel rsync, as I always forget the full command, but no solution worked the way I wanted: either it involves multiple steps or requires installing parallel. I ended up using this one-liner to sync multiple folders:



        find dir/ -type d|xargs -P 5 -I % sh -c 'rsync -a --delete --bwlimit=50000 $(echo dir/%/ host:/dir/%/)'


-P 5 is the number of processes to spawn; use 0 for unlimited (obviously not recommended).



        --bwlimit to avoid using all bandwidth.



-I % is the argument placeholder, replaced by each directory found by find in dir/.



$(echo dir/%/ host:/dir/%/) prints the source and destination directories, which rsync reads as its arguments. % is replaced by xargs with each directory name found by find.



Let's assume I have two directories in /home: dir1 and dir2. I run find /home -type d|xargs -P 5 -I % sh -c 'rsync -a --delete --bwlimit=50000 $(echo /home/%/ host:/home/%/)'. So the rsync command will run as two processes (two, because /home has two directories) with the following arguments:



rsync -a --delete --bwlimit=50000 /home/dir1/ host:/home/dir1/
rsync -a --delete --bwlimit=50000 /home/dir2/ host:/home/dir2/
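
Note that, like most of the one-liners here, this pipeline breaks on directory names containing whitespace. A hedged null-terminated variant, restricted to the first directory level:

find /home -mindepth 1 -maxdepth 1 -type d -print0 |
    xargs -0 -P 5 -I % rsync -a --delete --bwlimit=50000 %/ host:%/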





        • OK, can you explain $(echo dir/%/ host:/dir/%/) now? Please do not respond in comments; edit your answer to make it clearer and more complete. – Scott, Nov 22 at 16:16











        Your Answer








        StackExchange.ready(function() {
        var channelOptions = {
        tags: "".split(" "),
        id: "106"
        };
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function() {
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled) {
        StackExchange.using("snippets", function() {
        createEditor();
        });
        }
        else {
        createEditor();
        }
        });

        function createEditor() {
        StackExchange.prepareEditor({
        heartbeatType: 'answer',
        convertImagesToLinks: false,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: null,
        bindNavPrevention: true,
        postfix: "",
        imageUploader: {
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        },
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        });


        }
        });














         

        draft saved


        draft discarded


















        StackExchange.ready(
        function () {
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f189878%2fparallelise-rsync-using-gnu-parallel%23new-answer', 'question_page');
        }
        );

        Post as a guest















        Required, but never shown

























        6 Answers
        6






        active

        oldest

        votes








        6 Answers
        6






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes








        up vote
        11
        down vote



        accepted










        Following steps did the job for me:




        1. Run the rsync --dry-run first in order to get the list of files those would be affected.


        rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log




        1. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:


        cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log



        Here, --relative option (link) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside /data/ directory), so the command must be run in the source folder (in example, /data/projects).






        share|improve this answer



















        • 4




          That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them. rm backups.* split -l 3000 backup.list backups. ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
          – Sandip Bhattacharya
          Nov 17 '16 at 21:22












        • How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/.
          – Mike D
          Sep 19 '17 at 16:42






        • 1




          On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.
          – Cheetah
          Oct 12 '17 at 5:31















        up vote
        11
        down vote



        accepted










        Following steps did the job for me:




        1. Run the rsync --dry-run first in order to get the list of files those would be affected.


        rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log




        1. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:


        cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log



        Here, --relative option (link) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside /data/ directory), so the command must be run in the source folder (in example, /data/projects).






        share|improve this answer



















        • 4




          That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them. rm backups.* split -l 3000 backup.list backups. ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
          – Sandip Bhattacharya
          Nov 17 '16 at 21:22












        • How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/.
          – Mike D
          Sep 19 '17 at 16:42






        • 1




          On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.
          – Cheetah
          Oct 12 '17 at 5:31













        up vote
        11
        down vote



        accepted







        up vote
        11
        down vote



        accepted






        Following steps did the job for me:




        1. Run the rsync --dry-run first in order to get the list of files those would be affected.


        rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log




        1. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:


        cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log



        Here, --relative option (link) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside /data/ directory), so the command must be run in the source folder (in example, /data/projects).






        share|improve this answer














        Following steps did the job for me:




        1. Run the rsync --dry-run first in order to get the list of files those would be affected.


        rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log




        1. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:


        cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log



        Here, --relative option (link) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside /data/ directory), so the command must be run in the source folder (in example, /data/projects).







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Apr 13 '17 at 12:36









        Community

        1




        1










        answered Apr 11 '15 at 13:53









        Mandar Shinde

        1,40782747




        1,40782747








        • 4




          That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them. rm backups.* split -l 3000 backup.list backups. ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
          – Sandip Bhattacharya
          Nov 17 '16 at 21:22












        • How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/.
          – Mike D
          Sep 19 '17 at 16:42






        • 1




          On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.
          – Cheetah
          Oct 12 '17 at 5:31














        • 4




          That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them. rm backups.* split -l 3000 backup.list backups. ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
          – Sandip Bhattacharya
          Nov 17 '16 at 21:22












        • How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/.
          – Mike D
          Sep 19 '17 at 16:42






        • 1




          On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.
          – Cheetah
          Oct 12 '17 at 5:31








        4




        4




        That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them. rm backups.* split -l 3000 backup.list backups. ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
        – Sandip Bhattacharya
        Nov 17 '16 at 21:22






        That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them. rm backups.* split -l 3000 backup.list backups. ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
        – Sandip Bhattacharya
        Nov 17 '16 at 21:22














        How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/.
        – Mike D
        Sep 19 '17 at 16:42




        How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/.
        – Mike D
        Sep 19 '17 at 16:42




        1




        1




        On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.
        – Cheetah
        Oct 12 '17 at 5:31




        On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.
        – Cheetah
        Oct 12 '17 at 5:31












        up vote
        7
        down vote













        I would strongly discourage anybody from using the accepted answer, a better solution is to crawl the top level directory and launch a proportional number of rync operations.



        I have a large zfs volume and my source was was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1.



        The source drive was mounted like:



        mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0


        Using a single rsync process:



        rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod


        the io meter reads:



        StoragePod  30.0T   144T      0  1.61K      0   130M
        StoragePod 30.0T 144T 0 1.61K 0 130M
        StoragePod 30.0T 144T 0 1.62K 0 130M


        This in synthetic benchmarks (crystal disk), performance for sequential write approaches 900 MB/s which means the link is saturated. 130MB/s is not very good, and the difference between waiting a weekend and two weeks.



        So, I built the file list and tried to run the sync again (I have a 64 core machine):



        cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log


        and it had the same performance!



        StoragePod  29.9T   144T      0  1.63K      0   130M
        StoragePod 29.9T 144T 0 1.62K 0 130M
        StoragePod 29.9T 144T 0 1.56K 0 129M


        As an alternative I simply ran rsync on the root folders:



        rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
        rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
        rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
        rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell


        This actually boosted performance:



        StoragePod  30.1T   144T     13  3.66K   112K   343M
        StoragePod 30.1T 144T 24 5.11K 184K 469M
        StoragePod 30.1T 144T 25 4.30K 196K 373M


        In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.






        share|improve this answer

























          up vote
          7
          down vote













          I would strongly discourage anybody from using the accepted answer, a better solution is to crawl the top level directory and launch a proportional number of rync operations.



          I have a large zfs volume and my source was was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1.



          The source drive was mounted like:



          mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0


          Using a single rsync process:



          rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod


          the io meter reads:



          StoragePod  30.0T   144T      0  1.61K      0   130M
          StoragePod 30.0T 144T 0 1.61K 0 130M
          StoragePod 30.0T 144T 0 1.62K 0 130M


          This in synthetic benchmarks (crystal disk), performance for sequential write approaches 900 MB/s which means the link is saturated. 130MB/s is not very good, and the difference between waiting a weekend and two weeks.



          So, I built the file list and tried to run the sync again (I have a 64 core machine):



          cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log


          and it had the same performance!



          StoragePod  29.9T   144T      0  1.63K      0   130M
          StoragePod 29.9T 144T 0 1.62K 0 130M
          StoragePod 29.9T 144T 0 1.56K 0 129M


          As an alternative I simply ran rsync on the root folders:



          rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
          rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
          rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
          rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell


          This actually boosted performance:



          StoragePod  30.1T   144T     13  3.66K   112K   343M
          StoragePod 30.1T 144T 24 5.11K 184K 469M
          StoragePod 30.1T 144T 25 4.30K 196K 373M


          In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.






          share|improve this answer























            up vote
            7
            down vote










            up vote
            7
            down vote









            I would strongly discourage anybody from using the accepted answer, a better solution is to crawl the top level directory and launch a proportional number of rync operations.



            I have a large zfs volume and my source was was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1.



            The source drive was mounted like:



            mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0


            Using a single rsync process:



            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod


            the io meter reads:



            StoragePod  30.0T   144T      0  1.61K      0   130M
            StoragePod 30.0T 144T 0 1.61K 0 130M
            StoragePod 30.0T 144T 0 1.62K 0 130M


            This in synthetic benchmarks (crystal disk), performance for sequential write approaches 900 MB/s which means the link is saturated. 130MB/s is not very good, and the difference between waiting a weekend and two weeks.



            So, I built the file list and tried to run the sync again (I have a 64 core machine):



            cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log


            and it had the same performance!



            StoragePod  29.9T   144T      0  1.63K      0   130M
            StoragePod 29.9T 144T 0 1.62K 0 130M
            StoragePod 29.9T 144T 0 1.56K 0 129M


            As an alternative I simply ran rsync on the root folders:



            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell


            This actually boosted performance:



            StoragePod  30.1T   144T     13  3.66K   112K   343M
            StoragePod 30.1T 144T 24 5.11K 184K 469M
            StoragePod 30.1T 144T 25 4.30K 196K 373M


            In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.






            share|improve this answer












            I would strongly discourage anybody from using the accepted answer, a better solution is to crawl the top level directory and launch a proportional number of rync operations.



            I have a large zfs volume and my source was was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1.



            The source drive was mounted like:



            mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0


            Using a single rsync process:



            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod


            the io meter reads:



            StoragePod  30.0T   144T      0  1.61K      0   130M
            StoragePod 30.0T 144T 0 1.61K 0 130M
            StoragePod 30.0T 144T 0 1.62K 0 130M


            This in synthetic benchmarks (crystal disk), performance for sequential write approaches 900 MB/s which means the link is saturated. 130MB/s is not very good, and the difference between waiting a weekend and two weeks.



            So, I built the file list and tried to run the sync again (I have a 64 core machine):



            cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log


            and it had the same performance!



            StoragePod  29.9T   144T      0  1.63K      0   130M
            StoragePod 29.9T 144T 0 1.62K 0 130M
            StoragePod 29.9T 144T 0 1.56K 0 129M


            As an alternative I simply ran rsync on the root folders:



            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
            rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell


            This actually boosted performance:



            StoragePod  30.1T   144T     13  3.66K   112K   343M
            StoragePod 30.1T 144T 24 5.11K 184K 469M
            StoragePod 30.1T 144T 25 4.30K 196K 373M


            In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Apr 10 '17 at 3:28









            Mikhail

            17013




            17013






















                up vote
                6
                down vote













                I personally use this simple one:



                ls -1 | parallel rsync -a {} /destination/directory/


                Which only is usefull when you have more than a few non-near-empty directories, else you'll end up having almost every rsync terminating and the last one doing all the job alone.






                share|improve this answer

























                  up vote
                  6
                  down vote













                  I personally use this simple one:



                  ls -1 | parallel rsync -a {} /destination/directory/


                  Which only is usefull when you have more than a few non-near-empty directories, else you'll end up having almost every rsync terminating and the last one doing all the job alone.






                  share|improve this answer























                    up vote
                    6
                    down vote










                    up vote
                    6
                    down vote









                    I personally use this simple one:



                    ls -1 | parallel rsync -a {} /destination/directory/


                    Which only is usefull when you have more than a few non-near-empty directories, else you'll end up having almost every rsync terminating and the last one doing all the job alone.






                    share|improve this answer












                    I personally use this simple one:



                    ls -1 | parallel rsync -a {} /destination/directory/


                    Which only is usefull when you have more than a few non-near-empty directories, else you'll end up having almost every rsync terminating and the last one doing all the job alone.







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered May 25 '16 at 14:15









                    Julien Palard

                    26635




                    26635






















                        up vote
                        4
                        down vote













                        A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync




                        rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.



                        The following will start one rsync per big file in src-dir to dest-dir
                        on the server fooserver:



                        cd src-dir; find . -type f -size +100000 | 
                        parallel -v ssh fooserver mkdir -p /dest-dir/{//};
                        rsync -s -Havessh {} fooserver:/dest-dir/{}


                        The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:



                        rsync -Havessh src-dir/ fooserver:/dest-dir/ 


                        If you are unable to
                        push data, but need to pull them and the files are called digits.png
                        (e.g. 000000.png) you might be able to do:



                        seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/






                        share|improve this answer























                        • Any other alternative in order to avoid find?
                          – Mandar Shinde
                          Mar 13 '15 at 7:34






                        • 1




                          Limit the -maxdepth of find.
                          – Ole Tange
                          Mar 17 '15 at 9:20










                        • If I use --dry-run option in rsync, I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process?
                          – Mandar Shinde
                          Apr 10 '15 at 3:47






                        • 1




                          cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}; rsync -s -Havessh {} fooserver:/dest-dir/{}
                          – Ole Tange
                          Apr 10 '15 at 5:51










                        • Can you please explain the mkdir -p /dest-dir/{//}; part? Especially the {//} thing is a bit confusing.
                          – Mandar Shinde
                          Apr 10 '15 at 9:49















                        up vote
                        4
                        down vote













                        A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync




                        rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.



                        The following will start one rsync per big file in src-dir to dest-dir
                        on the server fooserver:



                        cd src-dir; find . -type f -size +100000 | 
                        parallel -v ssh fooserver mkdir -p /dest-dir/{//};
                        rsync -s -Havessh {} fooserver:/dest-dir/{}


                        The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:



                        rsync -Havessh src-dir/ fooserver:/dest-dir/ 


                        If you are unable to
                        push data, but need to pull them and the files are called digits.png
                        (e.g. 000000.png) you might be able to do:



                        seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/






                        share|improve this answer























                        • Any other alternative in order to avoid find?
                          – Mandar Shinde
                          Mar 13 '15 at 7:34






                        • 1




                          Limit the -maxdepth of find.
                          – Ole Tange
                          Mar 17 '15 at 9:20










                        • If I use --dry-run option in rsync, I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process?
                          – Mandar Shinde
                          Apr 10 '15 at 3:47






                        • 1




                          cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}; rsync -s -Havessh {} fooserver:/dest-dir/{}
                          – Ole Tange
                          Apr 10 '15 at 5:51










                        • Can you please explain the mkdir -p /dest-dir/{//}; part? Especially the {//} thing is a bit confusing.
                          – Mandar Shinde
                          Apr 10 '15 at 9:49













                        up vote
                        4
                        down vote










                        up vote
                        4
                        down vote









                        A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync




                        rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.



                        The following will start one rsync per big file in src-dir to dest-dir
                        on the server fooserver:



                        cd src-dir; find . -type f -size +100000 | 
                        parallel -v ssh fooserver mkdir -p /dest-dir/{//};
                        rsync -s -Havessh {} fooserver:/dest-dir/{}


                        The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:



                        rsync -Havessh src-dir/ fooserver:/dest-dir/ 


                        If you are unable to
                        push data, but need to pull them and the files are called digits.png
                        (e.g. 000000.png) you might be able to do:



                        seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/






                        share|improve this answer














                        A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync




                        rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.



                        The following will start one rsync per big file in src-dir to dest-dir
                        on the server fooserver:



                        cd src-dir; find . -type f -size +100000 | 
                        parallel -v ssh fooserver mkdir -p /dest-dir/{//};
                        rsync -s -Havessh {} fooserver:/dest-dir/{}


                        The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:



                        rsync -Havessh src-dir/ fooserver:/dest-dir/ 


                        If you are unable to
                        push data, but need to pull them and the files are called digits.png
                        (e.g. 000000.png) you might be able to do:



                        seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/







                        share|improve this answer














                        share|improve this answer



                        share|improve this answer








                        edited Dec 11 '17 at 7:04









                        Ryan Long

                        1032




                        1032










                        answered Mar 13 '15 at 7:25









                        Ole Tange

                        11.8k1448105




                        11.8k1448105












                        • Any other alternative in order to avoid find?
                          – Mandar Shinde
                          Mar 13 '15 at 7:34






                        • 1




                          Limit the -maxdepth of find.
                          – Ole Tange
                          Mar 17 '15 at 9:20










                        • If I use --dry-run option in rsync, I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process?
                          – Mandar Shinde
                          Apr 10 '15 at 3:47






                        • 1




                          cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}; rsync -s -Havessh {} fooserver:/dest-dir/{}
                          – Ole Tange
                          Apr 10 '15 at 5:51










                        • Can you please explain the mkdir -p /dest-dir/{//}; part? Especially the {//} thing is a bit confusing.
                          – Mandar Shinde
                          Apr 10 '15 at 9:49




















up vote
0
down vote

For multi-destination syncs, I am using

parallel rsync -avi /path/to/source ::: host1: host2: host3:

Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys

share|improve this answer

answered Apr 10 '17 at 6:37 by ingopingo
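
Since the command template contains no replacement string, GNU Parallel appends each argument after ::: to it, so the line above runs one rsync per destination host, concurrently:

rsync -avi /path/to/source host1:
rsync -avi /path/to/source host2:
rsync -avi /path/to/source host3:

A bare hostN: destination means the remote user's home directory, which is why the ssh keys mentioned in the hint must already be in place.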






















up vote
0
down vote

I always google for parallel rsync as I always forget the full command, but none of the solutions worked the way I wanted - they either involve multiple steps or require installing parallel. I ended up using this one-liner to sync multiple folders:

find dir/ -type d|xargs -P 5 -I % sh -c 'rsync -a --delete --bwlimit=50000 $(echo dir/%/ host:/dir/%/)'

-P 5 is the number of processes to spawn - use 0 for unlimited (obviously not recommended).

--bwlimit to avoid using all the bandwidth.

-I % is the placeholder that xargs replaces with each argument it reads from find (a directory found in dir/).

$(echo dir/%/ host:/dir/%/) - prints the source and destination directories, which rsync reads as its arguments. % is replaced by xargs with the directory name found by find.

Let's assume I have two directories in /home: dir1 and dir2. I run find /home -type d|xargs -P 5 -I % sh -c 'rsync -a --delete --bwlimit=50000 $(echo /home/%/ host:/home/%/)'. So rsync will run as two processes (two, because /home has two directories) with the following arguments:

rsync -a --delete --bwlimit=50000 /home/dir1/ host:/home/dir1/
rsync -a --delete --bwlimit=50000 /home/dir2/ host:/home/dir2/

share|improve this answer

edited 2 days ago

answered Nov 22 at 15:43 by Sebastjanas (a new contributor)

• OK, can you explain $(echo dir/%/ host:/dir/%/) now? Please do not respond in comments; edit your answer to make it clearer and more complete.
  – Scott
  Nov 22 at 16:16
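
Two caveats may explain the confusion above: find /home -type d also prints /home itself plus any nested subdirectories, and because find emits full paths like /home/dir1, the template /home/%/ actually expands to /home//home/dir1/. A sketch of a tighter variant, assuming GNU find (for -printf) and the same hypothetical host: destination, that emits only the names of the immediate subdirectories and lets -I % substitute them directly, with no sh -c or $(echo ...) indirection:

# Print just the names of /home's immediate subdirectories (dir1, dir2, ...),
# then run up to 5 rsyncs in parallel, one per directory:
find /home -mindepth 1 -maxdepth 1 -type d -printf '%f\n' |
    xargs -P 5 -I % rsync -a --delete --bwlimit=50000 /home/%/ host:/home/%/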


















                                 
