Check-in [01e3e3f51e]
Not logged in
Overview

SHA1 Hash:01e3e3f51e6767ccc81b2a8ec5caf9837c2141a6
Date: 2007-09-10 00:43:02
User: drh
Comment:Merge in the delta encoder changes.
Timelines: ancestors | descendants | both | trunk
Other Links: files | ZIP archive | manifest

Tags And Properties
Changes
[hide diffs]

Modified src/delta.c from [4f6e9c70e9] to [e79479fc0f].

@@ -340,15 +340,16 @@
     hash_init(&h, &zOut[base]);
     i = 0;     /* Trying to match a landmark against zOut[base+i] */
     bestCnt = 0;
     while( 1 ){
       int hv;
+      int limit = 250;
 
       hv = hash_32bit(&h) & (MX_LANDMARK-1);
       DEBUG2( printf("LOOKING: %4d [%s]\n", base+i, print16(&zOut[base+i])); )
       iBlock = landmark[hv];
-      while( iBlock>=0 ){
+      while( iBlock>=0 && (limit--)>0 ){
         /*
         ** The hash window has identified a potential match against
         ** landmark block iBlock.  But we need to investigate further.
         **
         ** Look for a region in zOut that matches zSrc. Anchor the search
@@ -404,11 +405,11 @@
       }
 
       /* We have a copy command that does not cause the delta to be larger
       ** than a literal insert.  So add the copy command to the delta.
       */
-      if( bestCnt>0 && base+i>=bestOfst+NHASH ){
+      if( bestCnt>0 ){
         if( bestLitsz>0 ){
           /* Add an insert command before the copy */
           putInt(bestLitsz,&zDelta);
           *(zDelta++) = ':';
           memcpy(zDelta, &zOut[base], bestLitsz);

Deleted test/basic1.test version [d90e14fae8]

Modified www/delta_encoder_algorithm.html from [75d37d5e32] to [2fdda57151].

@@ -128,10 +128,16 @@
 backward from "slide" in the "target", and the candidate location in
 the "origin". This search is constrained on the side of the "target"
 by the "base" (backward search), and the end of the "target" (forward
 search), and on the side of the "origin" by the beginning and end of
 the "origin", respectively.</p>
+
+<p>There are input files for which the hash chains generated by the
+pre-processing step can become very long, leading to long search times
+and affecting the performance of the delta generator. To limit the
+effect such long chains can have the actual search for candidates is
+bounded, looking at most N candidates. Currently N is set to 250.</p>
 
 <p>From the ranges for all the candidates the best (= largest) common
 range is taken and it is determined how many bytes are needed to
 encode the bytes between the "base" and the end of that range. If the
 range extended back to the "base" then this can be done in a single