librsync
2.3.1
# Streaming API {#api_streaming}

A key design requirement for librsync is that it should handle data as
and when the hosting application requires it. librsync can be used
inside applications that do non-blocking IO or filtering of network
streams, because it never does IO directly and never needs to block
waiting for data.

Arbitrary-length input and output buffers are passed to the
library by the application, through an instance of ::rs_buffers_t. The
library proceeds as far as it can, and returns an ::rs_result value
indicating whether it needs more data or space.

All the state needed by the library to resume processing when more
data is available is kept in a small opaque ::rs_job_t structure.
After creation of a job, repeated calls to rs_job_iter() in between
filling and emptying the buffers keep data flowing through the
stream. The ::rs_result values returned may indicate:

- ::RS_DONE: processing is complete
- ::RS_BLOCKED: processing has blocked pending more data
- one of various possible errors in processing (see ::rs_result)

These can be converted to a human-readable string by rs_strerror().

\note Smaller buffers have high relative handling costs. Application
performance will be improved by using buffers of at least 32 KB or so
on each call.

\sa \ref api_whole - Simpler but more limited interface than the streaming
interface.

\sa \ref api_pull - Intermediate-complexity callback interface.

\sa \ref api_callbacks - for reading from the basis file
when doing a "patch" operation.


## Creating Jobs

All streaming librsync jobs are initiated using a `_begin`
function to create a ::rs_job_t object, passing in any necessary
initialization parameters.
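
The lifecycle described so far (set up a job, call the iterate function
between filling and emptying the buffers, stop once it reports completion)
can be sketched with a toy model. This is a minimal, self-contained sketch
and not librsync code: `toy_buffers`, `toy_job_iter()` and `toy_run()` are
hypothetical names, the buffer fields are only modeled on ::rs_buffers_t,
and the "job" merely upper-cases bytes so that the example runs without the
library. A real caller would create the job with a `_begin` function, call
rs_job_iter() at the same point in the loop, and also handle error results.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT librsync's types. */
typedef enum { TOY_DONE, TOY_BLOCKED } toy_result;

typedef struct {
    const char *next_in;  /* next unread input byte */
    size_t avail_in;      /* unread bytes at next_in */
    int eof_in;           /* nonzero once no more input will arrive */
    char *next_out;       /* next free output byte */
    size_t avail_out;     /* free bytes at next_out */
} toy_buffers;

/* One iteration: go as far as the buffers allow, then report status.
 * It never reads or writes anything except the caller's buffers. */
static toy_result toy_job_iter(toy_buffers *b)
{
    while (b->avail_in > 0 && b->avail_out > 0) {
        *b->next_out++ = (char)toupper((unsigned char)*b->next_in++);
        b->avail_in--;
        b->avail_out--;
    }
    if (b->avail_in == 0 && b->eof_in)
        return TOY_DONE;
    return TOY_BLOCKED;   /* needs more input or more output space */
}

/* Driver loop: feed src in pieces of `chunk` bytes and drain the output
 * through a deliberately tiny buffer (real code would use far larger ones). */
static size_t toy_run(const char *src, size_t src_len,
                      char *dst, size_t dst_cap, size_t chunk)
{
    char out[4];
    toy_buffers b = {0};
    size_t fed = 0, written = 0;
    toy_result r;

    assert(chunk > 0);
    do {
        /* Refill input only once the previous piece is fully consumed. */
        if (b.avail_in == 0 && !b.eof_in) {
            size_t n = src_len - fed < chunk ? src_len - fed : chunk;
            b.next_in = src + fed;
            b.avail_in = n;
            fed += n;
            b.eof_in = (fed == src_len);
        }
        b.next_out = out;
        b.avail_out = sizeof out;

        r = toy_job_iter(&b);

        /* Empty whatever this iteration produced. */
        size_t produced = sizeof out - b.avail_out;
        assert(written + produced <= dst_cap);
        memcpy(dst + written, out, produced);
        written += produced;
    } while (r != TOY_DONE);
    return written;
}
```

Calling `toy_run("hello, world", 12, dst, sizeof dst, 5)` returns 12 and
leaves `HELLO, WORLD` in `dst`. Several iterations end in the blocked result
along the way, either because the input piece ran dry or because the output
buffer filled up, which is the situation the caller's loop has to handle.
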
The various jobs available are:

- rs_sig_begin(): Calculate the signature of a file.
- rs_loadsig_begin(): Load a signature into memory.
- rs_delta_begin(): Calculate the delta between a signature and a new
  file.
- rs_patch_begin(): Apply a delta to a basis to recreate the new
  file.

Additionally, the following helper function can be used to get the
recommended signature arguments from the input file's size.

- rs_sig_args(): Get the recommended signature arguments from the file size.

After a signature has been loaded, before it can be used to calculate a delta,
the hashtable needs to be initialized by calling

- rs_build_hash_table(): Initialize the signature hashtable.

The patch job accepts the patch as input, and uses a callback to look up
blocks within the basis file.

You must configure read, write and basis callbacks after creating the
job but before it is run.


## Running Jobs

The work of the operation is done when the application calls
rs_job_iter(). This includes reading from input files via the callback,
running the rsync algorithms, and writing output.

The IO callbacks are only called from inside rs_job_iter(). If any of
them return an error, rs_job_iter() will generally return the same error.

When librsync needs to do input or output, it calls one of the callback
functions. rs_job_iter() returns when the operation has completed or
failed, or when one of the IO callbacks has blocked.

rs_job_iter() will usually be called in a loop, perhaps alternating
librsync processing with other application functions.


## Deleting Jobs

A job is deleted and its memory freed up using rs_job_free().

This is typically called when the job has completed or failed.
It can be called earlier if the application decides it wants to cancel
processing.

rs_job_free() does not delete the output of the job, such as the sumset
loaded into memory. It does delete the job's statistics.


## State Machine Internals

Internally, the operations are implemented as state machines that move
through various states as input and output buffers become available.

All computers and programs are state machines. So why is the
representation as a state machine a little more explicit (and perhaps
verbose) in librsync than in other places? Because we need to be able to
let the real computer go off and do something else, like waiting for
network traffic, while still remembering where it was in the librsync
state machine.

librsync will never block waiting for IO, unless the callbacks do
that.

The current state is represented by the private field
::rs_job_t::statefn, which points to a function with a name like
`rs_OPERATION_s_STATE`. Every time librsync tries to make progress,
it will call this function.

The state function returns one of the ::rs_result values. The
most important values are

* ::RS_DONE: Completed successfully.

* ::RS_BLOCKED: Cannot make further progress at this point.

* ::RS_RUNNING: The state function has neither completed nor blocked but
  wants to be called again. **XXX**: Perhaps this should be removed?

States need to correspond to suspension points. The only place the
job can resume after blocking is at the entry to a state function.

Therefore states must be "all or nothing" in that they can either
complete, or restart without losing information.

Basically every state needs to work from one input buffer to one
output buffer.

States should generally never return ::RS_DONE directly. Instead, they
should call rs__job_done(), which sets the state function to
rs__s_done(). This makes sure that any pending output is flushed out
before ::RS_DONE is returned to the application.
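
The state-function pattern described in this section can be illustrated
with a small self-contained model. This is a sketch under stated
assumptions, not librsync's implementation: every `toy_` name is
hypothetical, the single working state just swaps two input bytes, and
`toy_job_done()` only plays the role that the text above gives to
rs__job_done().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes standing in for ::rs_result values. */
typedef enum { TOY_DONE, TOY_BLOCKED, TOY_RUNNING } toy_result;

typedef struct toy_job toy_job;
typedef toy_result (*toy_statefn)(toy_job *job);

struct toy_job {
    toy_statefn statefn;  /* current state: the only place to resume */
    const char *next_in;  /* caller-supplied input */
    size_t avail_in;
    char *next_out;       /* caller-supplied output space */
    size_t avail_out;
    char pending[2];      /* output produced but not yet delivered */
    size_t pending_len;
    size_t pending_pos;
};

static toy_result toy_s_done(toy_job *job);

/* Working states call this instead of returning TOY_DONE themselves. */
static toy_result toy_job_done(toy_job *job)
{
    job->statefn = toy_s_done;
    return TOY_RUNNING;   /* ask to be called again, now in the done state */
}

/* Working state, all or nothing: with fewer than two input bytes it
 * consumes nothing, so re-entering it later loses no information. */
static toy_result toy_s_swap(toy_job *job)
{
    if (job->avail_in < 2)
        return TOY_BLOCKED;
    job->pending[0] = job->next_in[1];
    job->pending[1] = job->next_in[0];
    job->pending_len = 2;
    job->next_in += 2;
    job->avail_in -= 2;
    return toy_job_done(job);
}

/* Terminal state: report TOY_DONE only after pending output is flushed. */
static toy_result toy_s_done(toy_job *job)
{
    while (job->pending_pos < job->pending_len && job->avail_out > 0) {
        *job->next_out++ = job->pending[job->pending_pos++];
        job->avail_out--;
    }
    return job->pending_pos == job->pending_len ? TOY_DONE : TOY_BLOCKED;
}

/* Put a job into its initial state with empty buffers. */
static void toy_job_init(toy_job *job)
{
    toy_job fresh = {0};
    fresh.statefn = toy_s_swap;
    *job = fresh;
}

/* Each attempt to make progress calls the current state function, and
 * keeps calling for as long as a state asks to be run again. */
static toy_result toy_job_iter(toy_job *job)
{
    toy_result r;

    assert(job->statefn != NULL);
    do {
        r = job->statefn(job);
    } while (r == TOY_RUNNING);
    return r;
}
```

If the caller supplies only one input byte, `toy_job_iter()` reports the
blocked result and leaves that byte unconsumed. Once two bytes are available
but there is no output space, it blocks again, this time in the done state
with the swapped bytes held in `pending`. Only after the caller provides
output space does it report completion, so the application never sees the
done result while output is still waiting to be delivered.
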