
User guide

Piotr Gregor edited this page Aug 30, 2016 · 21 revisions

##USAGE OF rsyncme

Content

  1. OVERVIEW

  2. EXAMPLES

    2.1 Basic example

    2.2 More advanced example

  3. OPTIONS

  4. LOCAL SYNCHRONIZATION

    4.1 OPTIONS

    4.1.1 -z OPTION

    4.1.2 --leave OPTION

    4.1.3 -l OPTION

    4.1.4 -t OPTION

    4.1.5 -a OPTION

  5. REMOTE SYNCHRONIZATION

##1. OVERVIEW

rsyncme can synchronize data on the same machine (locally) or between endpoints connected over IP. The current version, 0.1.2, supports only local sync; synchronization between machines is planned for version 0.1.3. Local sync offers little speed advantage over simply copying the data, but it is a necessary step in the development process, and inspecting the results locally is a great source of knowledge about the algorithm's behavior.

Every rsyncme command has a constant part: the -x and -y options, each followed by a file name. For example:

#!bash


    rsyncme push -x input.txt -y output.txt

In every command, -x names the file used as the source of changes (the new/current version of the data) and -y names the file holding the old version of the data. The -y file may itself be the target (so it will be changed), or the output may be written to a new file and the -y file either deleted or left intact.

##2. EXAMPLES

2.1 Basic example

Assume two files, calc1.ods and calc2.ods, are present in the ~/Documents directory and you have made some changes to calc1.ods (in this example, the value of a single cell has been changed). You now want to synchronize calc2.ods so that it is the same as the just-changed calc1.ods. You can do this with the push command:

#!bash


    $ rsyncme push -x ~/Documents/calc1.ods -y ~/Documents/calc2.ods

    Local push.

    method      : DELTA_RECONSTRUCTION (block [512])
    bytes       : [15744] (by raw [7040], by refs [8704])
    deltas      : [31] (raw [14], refs [17])
    time        : real [0.002247]s, cpu [0.003514]s
    bandwidth   : [7.007268]MB/s

    OK.

The output shows that the source file x is 15744 bytes and that delta reconstruction was used as the synchronization method (as opposed to sending all bytes literally). This method created 31 delta elements: 14 contained raw bytes, while 17 contained references to matching blocks found in the old version of the data (y), which were reused (in a remote sync these blocks would not be sent over the network). As a result, 7040 bytes of the new version of the file were used literally, and 8704 bytes did not need to be copied from it; in a remote sync they would be copied at the other end of the link from the referenced old version of the file.
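
The figures in this output are consistent with each other, which can be checked with a few lines of Python. The sketch below only redoes the arithmetic on the numbers printed above; the variable names are illustrative and are not part of rsyncme. It assumes that every reference delta covers exactly one full 512-byte block, which is what the numbers suggest.

```python
# Numbers copied from the "Local push" output above.
block_size = 512
total_bytes, raw_bytes, ref_bytes = 15744, 7040, 8704
total_deltas, raw_deltas, ref_deltas = 31, 14, 17

# Every byte of x is covered either by a raw delta or by a reference delta.
assert raw_bytes + ref_bytes == total_bytes
assert raw_deltas + ref_deltas == total_deltas

# Assumption: each reference delta points at one full block of y.
assert ref_deltas * block_size == ref_bytes

# Share of x that was reused from y instead of being taken literally from x.
reused_share = ref_bytes / total_bytes
print(f"reused from y: {reused_share:.1%}")
```

In this run a little over half of the new file could be rebuilt from blocks already present in the old one.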

2.2 More advanced example

Let us synchronize a just-changed version of a file with its previous version, leaving both files unchanged and writing the result to a new file. We also steer the algorithm with three thresholds. The first is a limit on file size: any file smaller than this limit is simply copied, skipping the procedure that searches for similar blocks. The second is a limit on the final segment of the file: if the search procedure is used and no more than this many bytes remain to be processed, they are simply copied and the search ends. The third is the maximum size of each packet of bytes that did not match.

#!bash


    rsyncme push -x current.ods -y previous.ods -z new.ods --leave -a 5120 -t 512 -s 128

    Local push.

    method : DELTA_RECONSTRUCTION (block [512])
    bytes : [5197154] (by raw [3717474], by refs [1479680])
    deltas : [32155] (raw [29265], refs [2890])
    collisions : 1st [287171], 2nd [78], 3rd [0]
    time : real [1.146892]s, cpu [1.304541]s
    bandwidth : [4.531511]MB/s

    copy tail : fired

    OK.
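
The effect of the -s option can be read off this output with some simple arithmetic. The sketch below is only a consistency check on the printed numbers, not rsyncme code; the variable names are illustrative. It assumes the default block size of 512 bytes (as printed in the `method` line) and that raw deltas are limited to the 128 bytes given with -s.

```python
import math

# Values taken from the command line and the output above.
block_size = 512        # printed in the "method" line
send_threshold = 128    # -s 128
raw_bytes, ref_bytes = 3717474, 1479680
raw_deltas, ref_deltas = 29265, 2890

# Reference deltas each stand for one full block of the old file.
assert ref_deltas * block_size == ref_bytes

# If no raw delta carries more than send_threshold bytes, at least
# this many raw deltas are needed to carry all raw bytes.
min_raw_deltas = math.ceil(raw_bytes / send_threshold)
assert raw_deltas >= min_raw_deltas

# Average payload of a raw delta; it stays below the -s limit.
avg_raw_delta = raw_bytes / raw_deltas
assert avg_raw_delta <= send_threshold
print(f"average raw delta: {avg_raw_delta:.2f} bytes")
```

The average raw delta is close to the 128-byte limit, which suggests that most unmatched runs in this file were long enough to fill the queue before a match was found.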

##3. OPTIONS

#!text

push        Set up @x as source of changes, @y as result/output.

pull        Set up @y as source of changes, @x as result/output.

-x          The name of source of changes (if push, see -y if pull)

-y          The name of the target to be synchronized so that it is the same
            as @x; it is the reference file (if push, see -x if pull).

-z          The new name of target.

--leave     Will leave reference file unchanged, producing output as @z.

--force     Will create @y if it doesn't exist.

--version   Will output version of program.

--help      Will print short help information.

-l          The size of block used in synchronization algorithm [in bytes].
            Default used is 512.

-a          Copy all threshold [in bytes]. The file will be sent as raw bytes
            if its size is less than this. (If the copy tail threshold
            value is equal to this value and both are equal to the file size,
            then the file is sent literally.)

-t          Copy tail threshold [in bytes]. Bytes will be sent literally
            instead of performing the rolling match algorithm
            if the number of bytes left to process is less than or equal
            to this value.

-s          Send threshold [in bytes]. This determines the maximum size
            of a raw delta element. Bytes are queued up to this number
            while no match is found in @y. If a match is found before
            the limit is reached and a delta reference to the matching
            block is about to be sent (or EOF is reached), the bytes
            accumulated so far are sent immediately.
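
The queueing behavior described for -s can be sketched in Python. This is a simplified illustration, not rsyncme's implementation: the function names `make_deltas` and `apply_deltas` are hypothetical, and a dictionary keyed by block content stands in for the checksum lookup that the real algorithm performs.

```python
def make_deltas(x: bytes, y: bytes, block_size: int, send_threshold: int):
    """Return a list of ("raw", bytes) and ("ref", offset_in_y) deltas."""
    # Index every full, non-overlapping block of y by its content.
    # (Stand-in for a checksum hashtable; an assumption of this sketch.)
    blocks = {}
    for off in range(0, len(y) - block_size + 1, block_size):
        blocks.setdefault(y[off:off + block_size], off)

    deltas, pending, i = [], bytearray(), 0

    def flush():
        # Send whatever raw bytes have been queued so far.
        if pending:
            deltas.append(("raw", bytes(pending)))
            pending.clear()

    while i < len(x):
        window = x[i:i + block_size]
        if len(window) == block_size and window in blocks:
            flush()                       # match found: queued bytes go first
            deltas.append(("ref", blocks[window]))
            i += block_size
        else:
            pending.append(x[i])          # no match: queue one more byte
            i += 1
            if len(pending) == send_threshold:
                flush()                   # queue is full
    flush()                               # EOF reached
    return deltas


def apply_deltas(deltas, y: bytes, block_size: int) -> bytes:
    """Rebuild x from the deltas and the old file y."""
    out = bytearray()
    for kind, value in deltas:
        out += value if kind == "raw" else y[value:value + block_size]
    return bytes(out)


old = b"AAAABBBBCCCC"
new = b"AAAAxyzzyCCCC"
deltas = make_deltas(new, old, block_size=4, send_threshold=3)
rebuilt = apply_deltas(deltas, old, block_size=4)
```

With a send threshold of 3, the five unmatched bytes in the middle are split into two raw deltas: one sent when the queue fills up, the other sent when the next matching block is found.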

##4. LOCAL SYNCHRONIZATION

If the -i option is not given in the command, the sync is local. The basic command to synchronize file @x and file @y on the same machine is

#!bash

    rsyncme push -x @x -y @y

This command synchronizes file @y with @x: after it completes, @y is the same as @x, and @x is unchanged. The basic command creates the synchronized version of @y under the same name and in the same location, but this can be changed with the -z and/or --leave options.

###4.1 OPTIONS

4.1.1 -z OPTION

#!bash

    rsyncme push -x @x -y @y -z @z

This outputs the new version of @y as @z, i.e. @y becomes @z.

4.1.2 --leave OPTION

#!bash


    rsyncme push -x @x -y @y -z @z --leave

This works like 4.1.1, but the @y file is neither deleted nor changed; it is left intact.

4.1.3 -l OPTION

#!bash


    rsyncme push -x @x -y @y -l block_size

Use a different block size for synchronization. This option influences the number of matches found and the number of raw bytes used. As a general rule, the smaller this value, the more matches are found, but also the less room is left for performance improvement. Example:

#!bash

    rsyncme push -x @x -y @y -l 16

This uses 16 bytes as the synchronization block size.
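
One way to get a feel for the cost side of this trade-off is to count how many blocks a file splits into for a given block size, since each block needs its own checksum and lookup. The sketch below is plain arithmetic under that assumption; `blocks_in_file` is a hypothetical helper and the 1 MiB file size is made up for illustration.

```python
import math


def blocks_in_file(file_size: int, block_size: int) -> int:
    """Number of blocks a file splits into (the last one may be partial)."""
    return math.ceil(file_size / block_size)


file_size = 1024 * 1024  # hypothetical 1 MiB reference file
for block_size in (16, 512, 4096):
    print(block_size, blocks_in_file(file_size, block_size))
```

Going from the default 512-byte block down to 16 bytes multiplies the number of blocks by 32, so finer matching is paid for with more bookkeeping per file.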

4.1.4 -t OPTION

#!bash


    rsyncme push -x @x -y @y -t number_of_bytes

The copy tail threshold sets the number of bytes at the end of the file that are sent literally, as raw data, straight away, without searching for a match in the hashtable of checksums. When the number of bytes left to process is less than or equal to this threshold, the copy tail threshold fires: the remaining bytes are sent literally and the rolling procedure ends. The copy tail threshold cannot be 0, so if it is set explicitly it must be a positive value. If the file size is less than the copy all threshold, the copy all threshold fires instead and all bytes are sent literally. If the file size is equal to both thresholds, the copy tail threshold fires to the same effect: all bytes are sent literally. Example:

    rsyncme push -x @x -y @y -t 80

This uses 80 bytes as the copy tail threshold.

4.1.5 -a OPTION

#!bash


    rsyncme push -x @x -y @y -a number_of_bytes

The copy all threshold sets a trigger on the file size below which the file is sent literally. If the file size is less than this value, or is equal to it and the copy tail threshold is equal to it as well, then the whole file is sent as raw data. Example:

    rsyncme push -x @x -y @y -a 80

This uses 80 bytes as the copy all threshold. All files smaller than 80 bytes are sent straight away, skipping the rolling checksum procedure.
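
The interplay of the two thresholds can be summarized as a pair of predicates. The Python sketch below is one reading of the rules stated in 4.1.4 and 4.1.5, not code taken from rsyncme; all function names are hypothetical.

```python
def copy_all_fires(file_size: int, copy_all: int) -> bool:
    """-a: the file is smaller than the copy all threshold."""
    return file_size < copy_all


def copy_tail_fires(bytes_left: int, copy_tail: int) -> bool:
    """-t: no more than copy_tail bytes remain to be processed."""
    if copy_tail <= 0:
        raise ValueError("copy tail threshold must be positive")
    return bytes_left <= copy_tail


def whole_file_sent_raw(file_size: int, copy_all: int, copy_tail: int) -> bool:
    """True if no block matching takes place at all for this file."""
    # Either copy all fires up front, or copy tail fires before
    # a single byte has been processed (bytes_left == file_size).
    return (copy_all_fires(file_size, copy_all)
            or copy_tail_fires(file_size, copy_tail))


print(whole_file_sent_raw(79, copy_all=80, copy_tail=1))          # below -a
print(whole_file_sent_raw(80, copy_all=80, copy_tail=80))         # equal to both
print(whole_file_sent_raw(5197154, copy_all=5120, copy_tail=512)) # example 2.2
```

The last call uses the values from example 2.2: the file is larger than both thresholds, so block matching runs and copy tail fires only for the final bytes.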

##5. REMOTE SYNCHRONIZATION

Remote synchronization will be implemented in version 0.1.3.
