Hex Artifact Content

Artifact 13f4f929b286b6f5740b70414fd41f8c62493f5d:

File ci_cvs.txt part of check-in [f166b0a63c] - Added first code regarding import from cvs, processing a CVSROOT/history file. Looks good, except that the history I have is incomplete, truncated at the beginning. Extended my notes with results from this experiment, thinking about a possible different method. by aku on 2007-08-31 04:57:33.

0000: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0010: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0020: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0030: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0040: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 0a  ===============.
0050: 0a 46 69 72 73 74 20 65 78 70 65 72 69 6d 65 6e  .First experimen
0060: 74 61 6c 20 63 6f 64 65 73 20 2e 2e 2e 0a 0a 74  tal codes .....t
0070: 6f 6f 73 6c 2f 69 6d 70 6f 72 74 2d 63 76 73 2e  oosl/import-cvs.
0080: 74 63 6c 0a 74 6f 6f 6c 73 2f 6c 69 62 2f 72 63  tcl.tools/lib/rc
0090: 73 70 61 72 73 65 72 2e 74 63 6c 0a 0a 4e 6f 20  sparser.tcl..No 
00a0: 61 63 74 75 61 6c 20 69 6d 70 6f 72 74 2c 20 72  actual import, r
00b0: 69 67 68 74 20 6e 6f 77 20 6f 6e 6c 79 20 77 6f  ight now only wo
00c0: 72 6b 69 6e 67 20 6f 6e 20 67 65 74 74 69 6e 67  rking on getting
00d0: 20 63 73 65 74 73 20 72 69 67 68 74 2e 20 54 68   csets right. Th
00e0: 65 0a 63 6f 64 65 20 75 73 65 73 20 43 56 53 52  e.code uses CVSR
00f0: 4f 4f 54 2f 68 69 73 74 6f 72 79 20 61 73 20 66  OOT/history as f
0100: 6f 75 6e 64 61 74 69 6f 6e 2c 20 61 6e 64 20 61  oundation, and a
0110: 75 67 6d 65 6e 74 73 20 74 68 61 74 20 77 69 74  ugments that wit
0120: 68 20 64 61 74 61 0a 66 72 6f 6d 20 74 68 65 20  h data.from the 
0130: 69 6e 64 69 76 69 64 75 61 6c 20 52 43 53 20 66  individual RCS f
0140: 69 6c 65 73 20 28 63 6f 6d 6d 69 74 20 6d 65 73  iles (commit mes
0150: 73 61 67 65 73 29 2e 0a 0a 53 74 61 74 69 73 74  sages)...Statist
0160: 69 63 73 20 6f 66 20 61 20 72 75 6e 20 2e 2e 2e  ics of a run ...
0170: 0a 09 33 35 31 36 20 63 73 65 74 73 2e 0a 0a 09  ..3516 csets....
0180: 31 35 34 35 20 62 72 65 61 6b 73 20 6f 6e 20 75  1545 breaks on u
0190: 73 65 72 20 63 68 61 6e 67 65 0a 09 20 35 35 38  ser change.. 558
01a0: 20 62 72 65 61 6b 73 20 6f 6e 20 66 69 6c 65 20   breaks on file 
01b0: 64 75 70 6c 69 63 61 74 65 0a 09 20 20 31 33 20  duplicate..  13 
01c0: 62 72 65 61 6b 73 20 6f 6e 20 62 72 61 6e 63 68  breaks on branch
01d0: 2f 74 72 75 6e 6b 20 63 68 61 6e 67 65 0a 09 31  /trunk change..1
01e0: 34 30 32 20 62 72 65 61 6b 73 20 6f 6e 20 63 6f  402 breaks on co
01f0: 6d 6d 69 74 20 6d 65 73 73 61 67 65 20 63 68 61  mmit message cha
0200: 6e 67 65 0a 0a 54 69 6d 65 20 73 74 61 74 69 73  nge..Time statis
0210: 74 69 63 73 20 2e 2e 2e 0a 09 33 32 39 37 20 77  tics .....3297 w
0220: 65 72 65 20 70 72 6f 63 65 73 73 65 64 20 69 6e  ere processed in
0230: 20 3c 3d 20 31 20 73 65 63 6f 6e 64 73 20 28 39   <= 1 seconds (9
0240: 33 2e 37 37 25 29 0a 09 20 32 31 37 20 77 65 72  3.77%).. 217 wer
0250: 65 20 70 72 6f 63 65 73 73 65 64 20 69 6e 20 62  e processed in b
0260: 65 74 77 65 65 6e 20 32 20 73 65 63 6f 6e 64 73  etween 2 seconds
0270: 20 61 6e 64 20 31 34 20 6d 69 6e 75 74 65 73 2e   and 14 minutes.
0280: 0a 09 20 20 20 31 20 77 61 73 20 20 70 72 6f 63  ..   1 was  proc
0290: 65 73 73 65 64 20 69 6e 20 7e 34 31 20 6d 69 6e  essed in ~41 min
02a0: 75 74 65 73 0a 09 20 20 20 31 20 77 61 73 20 20  utes..   1 was  
02b0: 70 72 6f 63 65 73 73 65 64 20 69 6e 20 7e 32 32  processed in ~22
02c0: 20 68 6f 75 72 73 0a 0a 54 69 6d 65 20 66 75 7a   hours..Time fuz
02d0: 7a 20 2d 20 44 69 66 66 65 72 65 6e 63 65 73 20  z - Differences 
02e0: 62 65 74 77 65 65 6e 20 63 73 65 74 73 20 72 61  between csets ra
02f0: 6e 67 65 20 66 72 6f 6d 20 30 20 73 65 63 6f 6e  nge from 0 secon
0300: 64 73 20 74 6f 20 36 36 0a 64 61 79 73 2e 20 4e  ds to 66.days. N
0310: 65 65 64 73 20 73 74 61 74 73 20 61 6e 61 6c 79  eeds stats analy
0320: 73 69 73 20 74 6f 20 73 65 65 20 69 66 20 74 68  sis to see if th
0330: 65 72 65 20 69 73 20 61 6e 20 6f 62 76 69 6f 75  ere is an obviou
0340: 73 20 62 72 65 61 6b 2e 20 45 76 65 6e 0a 73 6f  s break. Even.so
0350: 20 74 68 65 20 74 69 6d 65 73 20 77 69 74 68 69   the times withi
0360: 6e 20 63 73 65 74 73 20 61 6e 64 20 62 65 74 77  n csets and betw
0370: 65 65 6e 20 63 73 65 74 73 20 6f 76 65 72 6c 61  een csets overla
0380: 70 20 61 20 67 72 65 61 74 20 64 65 61 6c 2c 0a  p a great deal,.
0390: 6d 61 6b 69 6e 67 20 74 69 6d 65 20 61 20 62 61  making time a ba
03a0: 64 20 63 72 69 74 65 72 69 75 6d 20 66 6f 72 20  d criterium for 
03b0: 63 73 65 74 20 73 65 70 61 72 61 74 69 6f 6e 2c  cset separation,
03c0: 20 49 4d 48 4f 2e 0a 0a 4c 65 61 76 69 6e 67 20   IMHO...Leaving 
03d0: 74 68 61 74 20 74 6f 70 69 63 2c 20 62 61 63 6b  that topic, back
03e0: 20 74 6f 20 74 68 65 20 63 75 72 72 65 6e 74 20   to the current 
03f0: 63 73 65 74 20 73 65 70 61 72 61 74 6f 72 20 2e  cset separator .
0400: 2e 2e 0a 0a 49 74 20 68 61 73 20 61 20 70 72 6f  ....It has a pro
0410: 62 6c 65 6d 3a 0a 09 54 68 65 20 68 69 73 74 6f  blem:..The histo
0420: 72 79 20 66 69 6c 65 20 69 73 20 6e 6f 74 20 73  ry file is not s
0430: 74 61 72 74 69 6e 67 20 61 74 20 74 68 65 20 72  tarting at the r
0440: 6f 6f 74 21 0a 0a 45 78 61 6d 70 6c 65 73 3a 0a  oot!..Examples:.
0450: 09 54 68 65 20 66 69 72 73 74 20 74 68 72 65 65  .The first three
0460: 20 63 68 61 6e 67 65 73 65 74 73 20 61 72 65 0a   changesets are.
0470: 0a 09 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ..==============
0480: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 2f  ===============/
0490: 75 73 65 72 0a 09 4d 20 7b 57 65 64 20 4e 6f 76  user..M {Wed Nov
04a0: 20 32 32 20 30 39 3a 32 38 3a 34 39 20 41 4d 20   22 09:28:49 AM 
04b0: 50 53 54 20 32 30 30 30 7d 20 65 72 69 63 6d 20  PST 2000} ericm 
04c0: 31 2e 34 20 74 63 6c 6c 69 62 2f 6d 6f 64 75 6c  1.4 tcllib/modul
04d0: 65 73 2f 66 74 70 64 2f 43 68 61 6e 67 65 4c 6f  es/ftpd/ChangeLo
04e0: 67 0a 09 4d 20 7b 57 65 64 20 4e 6f 76 20 32 32  g..M {Wed Nov 22
04f0: 20 30 39 3a 32 38 3a 34 39 20 41 4d 20 50 53 54   09:28:49 AM PST
0500: 20 32 30 30 30 7d 20 65 72 69 63 6d 20 31 2e 37   2000} ericm 1.7
0510: 20 74 63 6c 6c 69 62 2f 6d 6f 64 75 6c 65 73 2f   tcllib/modules/
0520: 66 74 70 64 2f 66 74 70 64 2e 74 63 6c 0a 09 66  ftpd/ftpd.tcl..f
0530: 69 6c 65 73 3a 20 32 0a 09 64 65 6c 74 61 3a 20  iles: 2..delta: 
0540: 30 0a 09 72 61 6e 67 65 3a 20 30 20 73 65 63 6f  0..range: 0 seco
0550: 6e 64 73 0a 09 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  nds..===========
0560: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0570: 3d 3d 2f 63 6d 73 67 0a 09 4d 20 7b 57 65 64 20  ==/cmsg..M {Wed 
0580: 4e 6f 76 20 32 39 20 30 32 3a 31 34 3a 33 33 20  Nov 29 02:14:33 
0590: 50 4d 20 50 53 54 20 32 30 30 30 7d 20 65 72 69  PM PST 2000} eri
05a0: 63 6d 20 31 2e 33 20 74 63 6c 6c 69 62 2f 61 63  cm 1.3 tcllib/ac
05b0: 6c 6f 63 61 6c 2e 6d 34 0a 09 66 69 6c 65 73 3a  local.m4..files:
05c0: 20 31 0a 09 64 65 6c 74 61 3a 20 0a 09 72 61 6e   1..delta: ..ran
05d0: 67 65 3a 20 30 20 73 65 63 6f 6e 64 73 0a 09 3d  ge: 0 seconds..=
05e0: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
05f0: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 2f 63 6d 73  ============/cms
0600: 67 0a 09 4d 20 7b 53 75 6e 20 46 65 62 20 30 34  g..M {Sun Feb 04
0610: 20 31 32 3a 32 38 3a 33 35 20 41 4d 20 50 53 54   12:28:35 AM PST
0620: 20 32 30 30 31 7d 20 65 72 69 63 6d 20 31 2e 39   2001} ericm 1.9
0630: 20 74 63 6c 6c 69 62 2f 6d 6f 64 75 6c 65 73 2f   tcllib/modules/
0640: 6d 69 6d 65 2f 43 68 61 6e 67 65 4c 6f 67 0a 09  mime/ChangeLog..
0650: 4d 20 7b 53 75 6e 20 46 65 62 20 30 34 20 31 32  M {Sun Feb 04 12
0660: 3a 32 38 3a 33 35 20 41 4d 20 50 53 54 20 32 30  :28:35 AM PST 20
0670: 30 31 7d 20 65 72 69 63 6d 20 31 2e 31 32 20 74  01} ericm 1.12 t
0680: 63 6c 6c 69 62 2f 6d 6f 64 75 6c 65 73 2f 6d 69  cllib/modules/mi
0690: 6d 65 2f 6d 69 6d 65 2e 74 63 6c 0a 09 66 69 6c  me/mime.tcl..fil
06a0: 65 73 3a 20 32 0a 09 64 65 6c 74 61 3a 20 30 0a  es: 2..delta: 0.
06b0: 09 72 61 6e 67 65 3a 20 30 20 73 65 63 6f 6e 64  .range: 0 second
06c0: 73 0a 0a 41 6c 6c 20 63 73 65 74 73 20 6d 6f 64  s..All csets mod
06d0: 69 66 79 20 66 69 6c 65 73 20 77 68 69 63 68 20  ify files which 
06e0: 61 6c 72 65 61 64 79 20 68 61 76 65 20 73 65 76  already have sev
06f0: 65 72 61 6c 20 72 65 76 69 73 69 6f 6e 73 2e 20  eral revisions. 
0700: 57 65 20 68 61 76 65 0a 6e 6f 20 63 73 65 74 73  We have.no csets
0710: 20 66 72 6f 6d 20 62 65 66 6f 72 65 20 74 68 61   from before tha
0720: 74 20 69 6e 20 74 68 65 20 68 69 73 74 6f 72 79  t in the history
0730: 2c 20 62 75 74 20 74 68 65 73 65 20 63 73 65 74  , but these cset
0740: 73 20 61 72 65 20 69 6e 20 74 68 65 0a 52 43 53  s are in the.RCS
0750: 20 66 69 6c 65 73 2e 0a 0a 49 20 77 6f 6e 64 65   files...I wonde
0760: 72 2c 20 69 73 20 53 46 20 6d 61 79 62 65 20 72  r, is SF maybe r
0770: 65 6d 6f 76 69 6e 67 20 6f 6c 64 20 65 6e 74 72  emoving old entr
0780: 69 65 73 20 66 72 6f 6d 20 74 68 65 20 68 69 73  ies from the his
0790: 74 6f 72 79 20 77 68 65 6e 20 69 74 0a 67 72 6f  tory when it.gro
07a0: 77 73 20 74 6f 6f 20 6c 61 72 67 65 20 3f 0a 0a  ws too large ?..
07b0: 54 68 69 73 20 61 6c 73 6f 20 61 66 66 65 63 74  This also affect
07c0: 73 20 69 6e 63 72 65 6d 65 6e 74 61 6c 20 69 6d  s incremental im
07d0: 70 6f 72 74 20 2e 2e 2e 20 49 20 63 61 6e 6e 6f  port ... I canno
07e0: 74 20 61 73 73 75 6d 65 20 74 68 61 74 20 74 68  t assume that th
07f0: 65 0a 68 69 73 74 6f 72 79 20 61 6c 77 61 79 73  e.history always
0800: 20 67 72 6f 77 73 2e 20 49 74 20 6d 61 79 20 73   grows. It may s
0810: 68 72 69 6e 6b 20 2e 2e 2e 20 49 20 63 61 6e 6e  hrink ... I cann
0820: 6f 74 20 6b 65 65 70 20 61 6e 20 6f 66 66 73 65  ot keep an offse
0830: 74 2c 20 77 69 6c 6c 0a 68 61 76 65 20 74 6f 20  t, will.have to 
0840: 72 65 63 6f 72 64 20 74 68 65 20 74 69 6d 65 20  record the time 
0850: 6f 66 20 74 68 65 20 6c 61 73 74 20 65 6e 74 72  of the last entr
0860: 79 2c 20 6f 72 20 65 76 65 6e 20 74 68 65 20 66  y, or even the f
0870: 75 6c 6c 20 65 6e 74 72 79 0a 70 72 6f 63 65 73  ull entry.proces
0880: 73 65 64 20 6c 61 73 74 2c 20 74 6f 20 61 6c 6c  sed last, to all
0890: 6f 77 20 6d 65 20 74 6f 20 73 6b 69 70 20 61 68  ow me to skip ah
08a0: 65 61 64 20 74 6f 20 61 6e 79 74 68 69 6e 67 20  ead to anything 
08b0: 6e 6f 74 20 6b 6e 6f 77 6e 20 79 65 74 2e 0a 0a  not known yet...
08c0: 49 20 6d 69 67 68 74 20 68 61 76 65 20 74 6f 20  I might have to 
08d0: 74 72 79 20 74 6f 20 69 6d 70 6c 65 6d 65 6e 74  try to implement
08e0: 20 74 68 65 20 61 6c 67 6f 72 69 74 68 6d 20 6f   the algorithm o
08f0: 75 74 6c 69 6e 65 64 20 62 65 6c 6f 77 2c 0a 6d  utlined below,.m
0900: 61 74 63 68 69 6e 67 20 74 68 65 20 72 65 76 69  atching the revi
0910: 73 69 6f 6e 20 74 72 65 65 73 20 6f 66 20 74 68  sion trees of th
0920: 65 20 69 6e 64 69 76 69 64 75 61 6c 20 52 43 53  e individual RCS
0930: 20 66 69 6c 65 73 20 74 6f 20 65 61 63 68 20 6f   files to each o
0940: 74 68 65 72 0a 74 6f 20 66 6f 72 6d 20 74 68 65  ther.to form the
0950: 20 67 6c 6f 62 61 6c 20 74 72 65 65 20 6f 66 20   global tree of 
0960: 72 65 76 69 73 69 6f 6e 73 2e 20 4d 61 79 62 65  revisions. Maybe
0970: 20 77 65 20 63 61 6e 20 75 73 65 20 74 68 65 20   we can use the 
0980: 68 69 73 74 6f 72 79 20 74 6f 0a 68 65 6c 70 20  history to.help 
0990: 69 6e 20 74 68 65 20 6d 61 74 63 68 75 70 2c 20  in the matchup, 
09a0: 66 6f 72 20 74 68 65 20 70 61 72 74 73 20 77 68  for the parts wh
09b0: 65 72 65 20 77 65 20 64 6f 20 68 61 76 65 20 69  ere we do have i
09c0: 74 2e 0a 0a 57 61 69 74 2e 20 54 68 69 73 20 6d  t...Wait. This m
09d0: 69 67 68 74 20 62 65 20 65 61 73 69 65 72 20 2e  ight be easier .
09e0: 2e 2e 20 54 61 6b 65 20 74 68 65 20 64 65 6c 74  .. Take the delt
09f0: 61 20 69 6e 66 6f 72 6d 61 74 69 6f 6e 20 66 72  a information fr
0a00: 6f 6d 20 74 68 65 20 52 43 53 0a 66 69 6c 65 73  om the RCS.files
0a10: 20 61 6e 64 20 67 65 6e 65 72 61 74 65 20 61 20   and generate a 
0a20: 66 61 6b 65 20 68 69 73 74 6f 72 79 20 2e 2e 2e  fake history ...
0a30: 20 41 63 74 75 61 6c 6c 79 2c 20 74 68 69 73 20   Actually, this 
0a40: 6d 69 67 68 74 20 65 76 65 6e 20 61 6c 6c 6f 77  might even allow
0a50: 0a 75 73 20 74 6f 20 63 72 65 61 74 65 20 61 20  .us to create a 
0a60: 74 6f 74 61 6c 20 68 69 73 74 6f 72 79 20 2e 2e  total history ..
0a70: 2e 20 4e 6f 2c 20 6e 6f 74 20 71 75 69 74 65 2c  . No, not quite,
0a80: 20 74 68 65 20 6d 65 72 67 65 20 65 6e 74 72 69   the merge entri
0a90: 65 73 20 74 68 65 0a 61 63 74 75 61 6c 20 68 69  es the.actual hi
0aa0: 73 74 6f 72 79 20 6d 61 79 20 63 6f 6e 74 61 69  story may contai
0ab0: 6e 20 77 69 6c 6c 20 62 65 20 6d 69 73 73 69 6e  n will be missin
0ac0: 67 2e 20 54 68 65 73 65 20 77 65 20 63 61 6e 20  g. These we can 
0ad0: 6d 69 78 20 69 6e 20 66 72 6f 6d 0a 74 68 65 20  mix in from.the 
0ae0: 61 63 74 75 61 6c 20 68 69 73 74 6f 72 79 2c 20  actual history, 
0af0: 61 73 20 6d 75 63 68 20 61 73 20 77 65 20 68 61  as much as we ha
0b00: 76 65 2e 0a 0a 53 74 69 6c 6c 2c 20 6c 65 74 73  ve...Still, lets
0b10: 20 74 72 79 20 74 68 61 74 2c 20 61 20 66 61 6b   try that, a fak
0b20: 65 20 68 69 73 74 6f 72 79 2c 20 61 6e 64 20 74  e history, and t
0b30: 68 65 6e 20 72 75 6e 20 74 68 69 73 20 73 63 72  hen run this scr
0b40: 69 70 74 20 6f 6e 20 69 74 0a 74 6f 20 73 65 65  ipt on it.to see
0b50: 20 69 66 2f 77 68 65 72 65 20 61 72 65 20 64 69   if/where are di
0b60: 66 66 65 72 65 6e 63 65 73 2e 0a 0a 3d 3d 3d 3d  fferences...====
0b70: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0b80: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0b90: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0ba0: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================
0bb0: 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 0a 0a 0a 4e 6f  ===========...No
0bc0: 74 65 73 20 61 62 6f 75 74 20 43 56 53 20 69 6d  tes about CVS im
0bd0: 70 6f 72 74 2c 20 72 65 67 61 72 64 69 6e 67 20  port, regarding 
0be0: 43 56 53 2e 0a 0a 2d 20 50 72 6f 62 6c 65 6d 3a  CVS...- Problem:
0bf0: 20 43 56 53 20 64 6f 65 73 20 6e 6f 74 20 72 65   CVS does not re
0c00: 61 6c 6c 79 20 74 72 61 63 6b 20 63 68 61 6e 67  ally track chang
0c10: 65 73 65 74 73 2c 20 62 75 74 20 6f 6e 6c 79 20  esets, but only 
0c20: 69 6e 64 69 76 69 64 75 61 6c 0a 20 20 72 65 76  individual.  rev
0c30: 69 73 69 6f 6e 73 20 6f 66 20 66 69 6c 65 73 2e  isions of files.
0c40: 20 54 6f 20 72 65 63 6f 76 65 72 20 63 68 61 6e   To recover chan
0c50: 67 65 73 65 74 73 20 69 74 20 69 73 20 6e 65 63  gesets it is nec
0c60: 65 73 73 61 72 79 20 74 6f 20 6c 6f 6f 6b 20 61  essary to look a
0c70: 74 0a 20 20 61 75 74 68 6f 72 2c 20 62 72 61 6e  t.  author, bran
0c80: 63 68 2c 20 74 69 6d 65 73 74 61 6d 70 20 69 6e  ch, timestamp in
0c90: 66 6f 72 6d 61 74 69 6f 6e 2c 20 61 6e 64 20 74  formation, and t
0ca0: 68 65 20 63 6f 6d 6d 69 74 20 6d 65 73 73 61 67  he commit messag
0cb0: 65 73 2e 20 45 76 65 6e 0a 20 20 73 6f 20 74 68  es. Even.  so th
0cc0: 69 73 20 69 73 20 6f 6e 6c 79 20 68 65 75 72 69  is is only heuri
0cd0: 73 74 69 63 2c 20 6e 6f 74 20 66 6f 6f 6c 70 72  stic, not foolpr
0ce0: 6f 6f 66 2e 0a 0a 20 20 45 78 69 73 74 69 6e 67  oof...  Existing
0cf0: 20 74 6f 6f 6c 3a 20 63 76 73 70 73 2e 0a 0a 20   tool: cvsps... 
0d00: 20 50 72 6f 63 65 73 73 65 73 20 74 68 65 20 6f   Processes the o
0d10: 75 74 70 75 74 20 6f 66 20 27 63 76 73 20 6c 6f  utput of 'cvs lo
0d20: 67 27 20 74 6f 20 72 65 63 6f 76 65 72 20 63 68  g' to recover ch
0d30: 61 6e 67 65 73 65 74 73 2e 20 50 72 6f 62 6c 65  angesets. Proble
0d40: 6d 3a 0a 20 20 53 65 65 73 20 6f 6e 6c 79 20 61  m:.  Sees only a
0d50: 20 6c 69 6e 65 61 72 20 6c 69 73 74 20 6f 66 20   linear list of 
0d60: 72 65 76 69 73 69 6f 6e 73 2c 20 64 6f 65 73 20  revisions, does 
0d70: 6e 6f 74 20 73 65 65 20 62 72 61 6e 63 68 70 6f  not see branchpo
0d80: 69 6e 74 73 2c 0a 20 20 65 74 63 2e 20 43 61 6e  ints,.  etc. Can
0d90: 6e 6f 74 20 75 73 65 20 74 68 65 20 74 72 65 65  not use the tree
0da0: 20 73 74 72 75 63 74 75 72 65 20 74 6f 20 68 65   structure to he
0db0: 6c 70 20 69 6e 20 6d 61 6b 69 6e 67 20 74 68 65  lp in making the
0dc0: 20 64 65 63 69 73 69 6f 6e 73 2e 0a 0a 2d 20 50   decisions...- P
0dd0: 72 6f 62 6c 65 6d 3a 20 43 56 53 20 64 6f 65 73  roblem: CVS does
0de0: 20 6e 6f 74 20 74 72 61 63 6b 20 6d 65 72 67 65   not track merge
0df0: 2d 70 6f 69 6e 74 73 20 61 74 20 61 6c 6c 2e 20  -points at all. 
0e00: 52 65 63 6f 76 65 72 79 20 74 68 72 6f 75 67 68  Recovery through
0e10: 0a 20 20 68 65 75 72 69 73 74 69 63 73 20 69 73  .  heuristics is
0e20: 20 62 72 69 74 74 6c 65 20 61 74 20 62 65 73 74   brittle at best
0e30: 2c 20 6c 6f 6f 6b 69 6e 67 20 66 6f 72 20 6b 65  , looking for ke
0e40: 79 77 6f 72 64 73 20 69 6e 20 63 6f 6d 6d 69 74  ywords in commit
0e50: 0a 20 20 6d 65 73 73 61 67 65 73 20 77 68 69 63  .  messages whic
0e60: 68 20 6d 69 67 68 74 20 69 6e 64 69 63 61 74 65  h might indicate
0e70: 20 74 68 61 74 20 61 20 62 72 61 6e 63 68 20 77   that a branch w
0e80: 61 73 20 6d 65 72 67 65 64 20 77 69 74 68 20 73  as merged with s
0e90: 6f 6d 65 0a 20 20 6f 74 68 65 72 2e 0a 0a 0a 49  ome.  other....I
0ea0: 64 65 61 73 20 72 65 67 61 72 64 69 6e 67 20 61  deas regarding a
0eb0: 6e 20 61 6c 67 6f 72 69 74 68 6d 20 74 6f 20 72  n algorithm to r
0ec0: 65 63 6f 76 65 72 20 63 68 61 6e 67 65 73 65 74  ecover changeset
0ed0: 73 2e 0a 0a 4b 65 79 20 66 65 61 74 75 72 65 3a  s...Key feature:
0ee0: 20 55 73 65 73 20 74 68 65 20 70 65 72 2d 66 69   Uses the per-fi
0ef0: 6c 65 20 72 65 76 69 73 69 6f 6e 20 74 72 65 65  le revision tree
0f00: 73 20 74 6f 20 68 65 6c 70 20 69 6e 20 75 6e 63  s to help in unc
0f10: 6f 76 65 72 69 6e 67 0a 74 68 65 20 75 6e 64 65  overing.the unde
0f20: 72 6c 79 69 6e 67 20 63 68 61 6e 67 65 73 65 74  rlying changeset
0f30: 73 20 61 6e 64 20 67 6c 6f 62 61 6c 20 72 65 76  s and global rev
0f40: 69 73 69 6f 6e 20 74 72 65 65 20 47 2e 0a 0a 54  ision tree G...T
0f50: 68 65 20 70 65 72 2d 66 69 6c 65 20 72 65 76 69  he per-file revi
0f60: 73 69 6f 6e 20 74 72 65 65 20 66 6f 72 20 61 20  sion tree for a 
0f70: 66 69 6c 65 20 58 20 69 73 20 69 6e 20 65 73 73  file X is in ess
0f80: 65 6e 63 65 20 74 68 65 20 67 6c 6f 62 61 6c 0a  ence the global.
0f90: 72 65 76 69 73 69 6f 6e 20 74 72 65 65 20 77 69  revision tree wi
0fa0: 74 68 20 61 6c 6c 20 6e 6f 64 65 73 20 6e 6f 74  th all nodes not
0fb0: 20 70 65 72 74 61 69 6e 69 6e 67 20 74 6f 20 58   pertaining to X
0fc0: 20 72 65 6d 6f 76 65 64 20 66 72 6f 6d 20 69 74   removed from it
0fd0: 2e 20 49 6e 0a 74 68 65 20 72 65 76 65 72 73 65  . In.the reverse
0fe0: 20 74 68 69 73 20 61 6c 6c 6f 77 73 20 75 73 20   this allows us 
0ff0: 74 6f 20 62 75 69 6c 74 20 75 70 20 74 68 65 20  to built up the 
1000: 67 6c 6f 62 61 6c 20 72 65 76 69 73 69 6f 6e 20  global revision 
1010: 74 72 65 65 20 66 72 6f 6d 0a 74 68 65 20 70 65  tree from.the pe
1020: 72 2d 66 69 6c 65 20 74 72 65 65 73 20 62 79 20  r-file trees by 
1030: 6d 61 74 63 68 69 6e 67 20 6e 6f 64 65 73 20 74  matching nodes t
1040: 6f 20 65 61 63 68 20 6f 74 68 65 72 20 61 6e 64  o each other and
1050: 20 65 78 74 65 6e 64 69 6e 67 2e 0a 0a 53 74 61   extending...Sta
1060: 72 74 20 77 69 74 68 20 74 68 65 20 70 65 72 20  rt with the per 
1070: 66 69 6c 65 20 72 65 76 69 73 69 6f 6e 20 74 72  file revision tr
1080: 65 65 20 6f 66 20 61 20 73 69 6e 67 6c 65 20 66  ee of a single f
1090: 69 6c 65 20 61 73 20 69 6e 69 74 69 61 6c 0a 61  ile as initial.a
10a0: 70 70 72 6f 78 69 6d 61 74 69 6f 6e 20 6f 66 20  pproximation of 
10b0: 74 68 65 20 67 6c 6f 62 61 6c 20 74 72 65 65 2e  the global tree.
10c0: 20 41 6c 6c 20 6e 6f 64 65 73 20 6f 66 20 74 68   All nodes of th
10d0: 69 73 20 74 72 65 65 20 72 65 66 65 72 20 74 6f  is tree refer to
10e0: 20 74 68 65 0a 72 65 76 69 73 69 6f 6e 20 6f 66   the.revision of
10f0: 20 74 68 65 20 66 69 6c 65 20 62 65 6c 6f 6e 67   the file belong
1100: 69 6e 67 20 74 6f 20 69 74 2c 20 61 6e 64 20 74  ing to it, and t
1110: 68 72 6f 75 67 68 20 74 68 61 74 20 74 68 65 20  hrough that the 
1120: 66 69 6c 65 0a 69 74 73 65 6c 66 2e 20 41 74 20  file.itself. At 
1130: 65 61 63 68 20 73 74 65 70 20 74 68 65 20 67 6c  each step the gl
1140: 6f 62 61 6c 20 74 72 65 65 20 63 6f 6e 74 61 69  obal tree contai
1150: 6e 73 20 74 68 65 20 6e 6f 64 65 73 20 66 6f 72  ns the nodes for
1160: 20 61 20 66 69 6e 69 74 65 0a 73 65 74 20 6f 66   a finite.set of
1170: 20 66 69 6c 65 73 2c 20 61 6e 64 20 61 6c 6c 20   files, and all 
1180: 6e 6f 64 65 73 20 69 6e 20 74 68 65 20 74 72 65  nodes in the tre
1190: 65 20 72 65 66 65 72 20 74 6f 20 72 65 76 69 73  e refer to revis
11a0: 69 6f 6e 73 20 6f 66 20 61 6c 6c 0a 66 69 6c 65  ions of all.file
11b0: 73 20 69 6e 20 74 68 65 20 73 65 74 2c 20 6d 61  s in the set, ma
11c0: 6b 69 6e 67 20 74 68 65 20 6d 61 70 70 69 6e 67  king the mapping
11d0: 20 74 6f 74 61 6c 2e 0a 0a 54 6f 20 61 64 64 20   total...To add 
11e0: 61 20 66 69 6c 65 20 58 20 74 6f 20 74 68 65 20  a file X to the 
11f0: 74 72 65 65 20 74 61 6b 65 20 74 68 65 20 70 65  tree take the pe
1200: 72 2d 66 69 6c 65 20 72 65 76 69 73 69 6f 6e 20  r-file revision 
1210: 74 72 65 65 20 52 20 61 6e 64 0a 70 65 72 66 6f  tree R and.perfo
1220: 72 6d 73 20 74 68 65 20 66 6f 6c 6c 6f 77 69 6e  rms the followin
1230: 67 20 61 63 74 69 6f 6e 73 3a 0a 0a 2d 20 46 6f  g actions:..- Fo
1240: 72 20 65 61 63 68 20 6e 6f 64 65 20 4e 20 69 6e  r each node N in
1250: 20 52 20 75 73 65 20 74 68 65 20 74 75 70 6c 65   R use the tuple
1260: 20 3c 61 75 74 68 6f 72 2c 20 62 72 61 6e 63 68   <author, branch
1270: 2c 20 63 6f 6d 6d 69 74 20 6d 65 73 73 61 67 65  , commit message
1280: 3e 0a 20 20 74 6f 20 69 64 65 6e 74 69 66 79 20  >.  to identify 
1290: 61 20 73 65 74 20 6f 66 20 6e 6f 64 65 73 20 69  a set of nodes i
12a0: 6e 20 47 20 77 68 69 63 68 20 6d 61 79 20 6d 61  n G which may ma
12b0: 74 63 68 20 4e 2e 20 55 73 65 20 74 68 65 20 74  tch N. Use the t
12c0: 69 6d 65 73 74 61 6d 70 0a 20 20 74 6f 20 6c 6f  imestamp.  to lo
12d0: 63 61 74 65 20 74 68 65 20 6e 6f 64 65 20 6e 65  cate the node ne
12e0: 61 72 65 73 74 20 69 6e 20 74 69 6d 65 2e 0a 0a  arest in time...
12f0: 2d 20 54 68 69 73 20 70 72 6f 63 65 73 73 20 77  - This process w
1300: 69 6c 6c 20 6c 65 61 76 65 20 6e 6f 64 65 73 20  ill leave nodes 
1310: 69 6e 20 4e 20 75 6e 6d 61 70 70 65 64 2e 20 49  in N unmapped. I
1320: 66 20 74 68 65 72 65 20 61 72 65 20 75 6e 6d 61  f there are unma
1330: 70 70 65 64 0a 20 20 6e 6f 64 65 73 20 77 68 69  pped.  nodes whi
1340: 63 68 20 68 61 76 65 20 6e 6f 20 6e 65 69 67 68  ch have no neigh
1350: 62 6f 75 72 69 6e 67 20 6d 61 70 70 65 64 20 6e  bouring mapped n
1360: 6f 64 65 73 20 77 65 20 68 61 76 65 20 74 6f 0a  odes we have to.
1370: 20 20 61 62 6f 72 74 2e 0a 0a 20 20 4f 74 68 65    abort...  Othe
1380: 72 77 69 73 65 20 74 61 6b 65 20 74 68 65 20 6e  rwise take the n
1390: 6f 64 65 73 20 77 68 69 63 68 20 68 61 76 65 20  odes which have 
13a0: 6d 61 70 70 65 64 20 6e 65 69 67 68 62 6f 75 72  mapped neighbour
13b0: 73 2e 20 54 72 61 63 65 20 74 68 65 0a 20 20 65  s. Trace the.  e
13c0: 64 67 65 73 20 61 6e 64 20 73 65 65 20 77 68 69  dges and see whi
13d0: 63 68 20 6f 66 20 74 68 65 73 65 20 6e 6f 64 65  ch of these node
13e0: 73 20 61 72 65 20 63 6f 6e 6e 65 63 74 65 64 20  s are connected 
13f0: 69 6e 20 74 68 65 20 6c 6f 63 61 6c 0a 20 20 74  in the local.  t
1400: 72 65 65 2e 20 54 68 65 6e 20 6c 6f 6f 6b 20 61  ree. Then look a
1410: 74 20 74 68 65 20 69 64 65 6e 74 69 66 69 65 64  t the identified
1420: 20 6e 65 69 67 68 62 6f 75 72 73 20 61 6e 64 20   neighbours and 
1430: 74 72 61 63 65 20 74 68 65 69 72 0a 20 20 63 6f  trace their.  co
1440: 6e 6e 65 63 74 69 6f 6e 73 2e 0a 0a 20 20 49 66  nnections...  If
1450: 20 74 77 6f 20 67 6c 6f 62 61 6c 20 6e 6f 64 65   two global node
1460: 73 20 68 61 76 65 20 61 20 64 69 72 65 63 74 20  s have a direct 
1470: 63 6f 6e 6e 65 63 74 69 6f 6e 2c 20 62 75 74 20  connection, but 
1480: 61 20 6d 75 6c 74 69 2d 65 64 67 65 0a 20 20 63  a multi-edge.  c
1490: 6f 6e 6e 65 63 74 69 6f 6e 20 69 6e 20 74 68 65  onnection in the
14a0: 20 6c 6f 63 61 6c 20 74 72 65 65 20 69 6e 73 65   local tree inse
14b0: 72 74 20 67 6c 6f 62 61 6c 20 6e 6f 64 65 73 20  rt global nodes 
14c0: 6d 61 70 70 69 6e 67 20 74 6f 20 74 68 65 0a 20  mapping to the. 
14d0: 20 6c 6f 63 61 6c 20 6e 6f 64 65 73 20 61 6e 64   local nodes and
14e0: 20 6d 61 70 20 74 68 65 6d 20 74 6f 67 65 74 68   map them togeth
14f0: 65 72 2e 20 54 68 69 73 20 65 78 70 61 6e 64 73  er. This expands
1500: 20 74 68 65 20 67 6c 6f 62 61 6c 20 74 72 65 65   the global tree
1510: 20 74 6f 0a 20 20 68 6f 6c 64 20 74 68 65 20 72   to.  hold the r
1520: 65 76 69 73 69 6f 6e 73 20 61 64 64 65 64 20 62  evisions added b
1530: 79 20 74 68 65 20 6e 65 77 20 66 69 6c 65 2e 0a  y the new file..
1540: 0a 20 20 4f 74 68 65 72 77 69 73 65 2c 20 62 6f  .  Otherwise, bo
1550: 74 68 20 73 69 64 65 73 20 68 61 76 65 20 6d 75  th sides have mu
1560: 6c 74 69 2d 65 64 67 65 20 63 6f 6e 6e 65 63 74  lti-edge connect
1570: 69 6f 6e 73 20 74 68 65 6e 20 61 62 6f 72 74 2e  ions then abort.
1580: 20 54 68 69 73 0a 20 20 6c 6f 6f 6b 73 20 6c 69   This.  looks li
1590: 6b 65 20 61 20 6d 65 72 67 65 20 6f 66 20 74 77  ke a merge of tw
15a0: 6f 20 64 69 66 66 65 72 65 6e 74 20 62 72 61 6e  o different bran
15b0: 63 68 65 73 2c 20 62 75 74 20 74 68 65 72 65 20  ches, but there 
15c0: 61 72 65 20 6e 6f 20 73 75 63 68 0a 20 20 69 6e  are no such.  in
15d0: 20 43 56 53 20 2e 2e 2e 20 57 61 69 74 20 2e 2e   CVS ... Wait ..
15e0: 2e 20 73 6f 72 74 20 74 68 65 20 6e 6f 64 65 73  . sort the nodes
15f0: 20 6f 76 65 72 20 74 69 6d 65 20 61 6e 64 20 66   over time and f
1600: 69 74 20 74 68 65 20 6e 65 77 20 6e 6f 64 65 73  it the new nodes
1610: 0a 20 20 69 6e 20 62 65 74 77 65 65 6e 20 74 68  .  in between th
1620: 65 20 6f 74 68 65 72 20 6e 6f 64 65 73 2c 20 70  e other nodes, p
1630: 65 72 20 74 68 65 20 74 69 6d 65 73 74 61 6d 70  er the timestamp
1640: 73 2e 20 57 65 20 68 61 76 65 20 6f 76 65 72 6c  s. We have overl
1650: 61 70 70 69 6e 67 0a 20 20 2f 20 61 6c 74 65 72  apping.  / alter
1660: 6e 61 74 69 6e 67 20 63 68 61 6e 67 65 73 20 74  nating changes t
1670: 6f 20 6f 6e 65 20 66 69 6c 65 20 61 6e 64 20 6f  o one file and o
1680: 74 68 65 72 73 2e 0a 0a 20 20 41 20 6c 61 73 74  thers...  A last
1690: 20 70 6f 73 73 69 62 69 6c 69 74 79 20 69 73 20   possibility is 
16a0: 74 68 61 74 20 61 20 6e 6f 64 65 20 69 73 20 6f  that a node is o
16b0: 6e 6c 79 20 63 6f 6e 6e 65 63 74 65 64 20 74 6f  nly connected to
16c0: 20 61 20 6d 61 70 70 65 64 0a 20 20 70 61 72 65   a mapped.  pare
16d0: 6e 74 2e 20 54 68 69 73 20 6d 61 79 20 62 65 20  nt. This may be 
16e0: 61 20 6e 65 77 20 62 72 61 6e 63 68 2c 20 6f 72  a new branch, or
16f0: 20 61 67 61 69 6e 20 61 6e 20 61 6c 74 65 72 6e   again an altern
1700: 61 74 69 6e 67 20 63 68 61 6e 67 65 20 6f 6e 0a  ating change on.
1710: 20 20 74 68 65 20 67 69 76 65 6e 20 6c 69 6e 65    the given line
1720: 2e 20 53 79 6d 62 6f 6c 73 20 6f 6e 20 74 68 65  . Symbols on the
1730: 20 72 65 76 69 73 69 6f 6e 73 20 77 69 6c 6c 20   revisions will 
1740: 68 65 6c 70 20 74 6f 20 6d 61 70 20 74 68 69 73  help to map this
1750: 2e 0a 0a 2d 20 57 65 20 6e 6f 77 20 68 61 76 65  ...- We now have
1760: 20 61 6e 20 65 78 74 65 6e 64 65 64 20 67 6c 6f   an extended glo
1770: 62 61 6c 20 74 72 65 65 20 77 68 69 63 68 20 69  bal tree which i
1780: 6e 63 6f 72 70 6f 72 61 74 65 73 20 74 68 65 20  ncorporates the 
1790: 72 65 76 69 73 69 6f 6e 73 0a 20 20 6f 66 20 74  revisions.  of t
17a0: 68 65 20 6e 65 77 20 66 69 6c 65 2e 20 48 6f 77  he new file. How
17b0: 65 76 65 72 20 6e 65 77 20 6e 6f 64 65 73 20 77  ever new nodes w
17c0: 69 6c 6c 20 72 65 66 65 72 20 6f 6e 6c 79 20 74  ill refer only t
17d0: 6f 20 74 68 65 20 6e 65 77 20 66 69 6c 65 2c 0a  o the new file,.
17e0: 20 20 61 6e 64 20 6f 6c 64 20 6e 6f 64 65 73 20    and old nodes 
17f0: 6d 61 79 20 6e 6f 74 20 72 65 66 65 72 20 74 6f  may not refer to
1800: 20 74 68 65 20 6e 65 77 20 66 69 6c 65 2e 20 54   the new file. T
1810: 68 69 73 20 68 61 73 20 74 6f 20 62 65 20 66 69  his has to be fi
1820: 78 65 64 2c 0a 20 20 61 73 20 61 6c 6c 20 6e 6f  xed,.  as all no
1830: 64 65 73 20 68 61 76 65 20 74 6f 20 72 65 66 65  des have to refe
1840: 72 20 74 6f 20 61 6c 6c 20 66 69 6c 65 73 2e 0a  r to all files..
1850: 0a 20 20 52 75 6e 20 6f 76 65 72 20 74 68 65 20  .  Run over the 
1860: 74 72 65 65 20 61 6e 64 20 6c 6f 6f 6b 20 61 74  tree and look at
1870: 20 65 61 63 68 20 70 61 72 65 6e 74 2f 63 68 69   each parent/chi
1880: 6c 64 20 70 61 69 72 2e 20 49 66 20 61 20 66 69  ld pair. If a fi
1890: 6c 65 20 69 73 0a 20 20 6e 6f 74 20 72 65 66 65  le is.  not refe
18a0: 72 65 6e 63 65 64 20 69 6e 20 74 68 65 20 63 68  renced in the ch
18b0: 69 6c 64 2c 20 62 75 74 20 74 68 65 20 70 61 72  ild, but the par
18c0: 65 6e 74 2c 20 74 68 65 6e 20 63 6f 70 79 20 61  ent, then copy a
18d0: 20 72 65 66 65 72 65 6e 63 65 0a 20 20 74 6f 20   reference.  to 
18e0: 74 68 65 20 66 69 6c 65 20 72 65 76 69 73 69 6f  the file revisio
18f0: 6e 20 6f 6e 20 74 68 65 20 70 61 72 65 6e 74 20  n on the parent 
1900: 66 6f 72 77 61 72 64 20 74 6f 20 74 68 65 20 63  forward to the c
1910: 68 69 6c 64 2e 20 54 68 69 73 0a 20 20 73 69 67  hild. This.  sig
1920: 6e 61 6c 73 20 74 68 61 74 20 74 68 65 20 66 69  nals that the fi
1930: 6c 65 20 64 69 64 20 6e 6f 74 20 63 68 61 6e 67  le did not chang
1940: 65 20 69 6e 20 74 68 65 20 67 69 76 65 6e 20 72  e in the given r
1950: 65 76 69 73 69 6f 6e 2e 0a 0a 2d 20 41 66 74 65  evision...- Afte
1960: 72 20 61 6c 6c 20 66 69 6c 65 73 20 68 61 76 65  r all files have
1970: 20 62 65 65 6e 20 69 6e 74 65 67 72 61 74 65 64   been integrated
1980: 20 69 6e 20 74 68 69 73 20 6d 61 6e 6e 65 72 20   in this manner 
1990: 77 65 20 68 61 76 65 20 67 6c 6f 62 61 6c 0a 20  we have global. 
19a0: 20 72 65 76 69 73 69 6f 6e 20 74 72 65 65 20 63   revision tree c
19b0: 61 70 74 75 72 69 6e 67 20 61 6c 6c 20 63 68 61  apturing all cha
19c0: 6e 67 65 73 65 74 73 2c 20 69 6e 63 6c 75 64 69  ngesets, includi
19d0: 6e 67 20 74 68 65 20 75 6e 63 68 61 6e 67 65 64  ng the unchanged
19e0: 0a 20 20 66 69 6c 65 73 20 70 65 72 20 63 68 61  .  files per cha
19f0: 6e 67 65 73 65 74 2e 0a 0a 0a 54 68 69 73 20 61  ngeset....This a
1a00: 6c 67 6f 72 69 74 68 6d 20 68 61 73 20 74 6f 20  lgorithm has to 
1a10: 62 65 20 72 65 66 69 6e 65 64 20 74 6f 20 61 6c  be refined to al
1a20: 73 6f 20 74 61 6b 65 20 41 74 74 69 63 2f 20 66  so take Attic/ f
1a30: 69 6c 65 73 20 69 6e 74 6f 0a 61 63 63 6f 75 6e  iles into.accoun
1a40: 74 2e 0a 0a                                      t...
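
Illustrative sketch (not part of the artifact): the decoded notes above describe a changeset separator that breaks the CVSROOT/history stream on user change, file duplicate, branch/trunk change, and commit message change. A minimal Python sketch of that idea follows. The record layout (Entry), the function name split_changesets, and the order in which the four break conditions are tested are all assumptions made for illustration; the notes do not show the actual code in tools/import-cvs.tcl.

```python
from collections import namedtuple

# Hypothetical record layout. The notes only show history lines such as
# "M {Wed Nov 22 09:28:49 AM PST 2000} ericm 1.4 tcllib/modules/ftpd/ChangeLog";
# branch and commit message would have to come from the RCS files.
Entry = namedtuple("Entry", "user branch message path revision")


def split_changesets(entries):
    """Group consecutive entries into changesets, counting each kind of break.

    Assumption: the four break reasons named in the notes are checked in the
    order user, file duplicate, branch, commit message.
    """
    csets = []          # finished changesets, each a list of Entry
    current = []        # changeset being built
    seen = set()        # paths already present in the current changeset
    breaks = {"user": 0, "file": 0, "branch": 0, "cmsg": 0}

    for e in entries:
        reason = None
        if current:
            last = current[-1]
            if e.user != last.user:
                reason = "user"
            elif e.path in seen:
                reason = "file"      # same file twice cannot be one changeset
            elif e.branch != last.branch:
                reason = "branch"
            elif e.message != last.message:
                reason = "cmsg"
        if reason is not None:
            breaks[reason] += 1
            csets.append(current)
            current, seen = [], set()
        current.append(e)
        seen.add(e.path)

    if current:
        csets.append(current)
    return csets, breaks
```

Called on a list of entries in history order, it returns the list of changesets plus a per-reason break count, comparable in shape to the "breaks on ..." statistics quoted in the notes. It deliberately ignores timestamps, in line with the notes' remark that time is a poor criterion for changeset separation.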