File History

History of tools/cvs2fossil/lib/c2f_pinitcsets.tcl

2008-02-24
04:43:56 [14fdb6d32b] part of check-in [6559f3231e] New command 'state foreachrow' for incremental result processing, using less memory. Converted a number of places in pass InitCSet to this command, and marked a number of other places for possible future use. (By: aku on 2008-02-24 04:43:56) [diff] [annotate]
2008-02-23
06:40:48 [37791a272d] part of check-in [efec424a19] Merged bugfix [b3d61d7829] into the main branch for optimization of memory usage. (By: aku on 2008-02-23 06:40:48) [diff] [annotate]
06:37:54 [37791a272d] part of check-in [383c10f004] Merged bugfix [b3d61d7829] into this semi-abandoned branch just in case we will work on it again. Do it now instead of forgetting it later. (By: aku on 2008-02-23 06:37:54) [diff] [annotate]
06:33:30 [37791a272d] part of check-in [b3d61d7829] Fixed bug made in [f46458d5bd] which prevented the saving of the changesets generated by the breaking of the internal dependencies. (By: aku on 2008-02-23 06:33:30) [diff] [annotate]
2008-02-17
02:06:19 [8351f6b640] part of check-in [f46458d5bd] Reworked the basic structure of pass InitCSets to keep memory consumption down. Now incrementally creates, breaks, saves, and releases changesets, instead of piling them on before saving all at the end. Memory tracking confirms that this changes the accumulating mountain into a near-constant usage, with the expected spikes from the breaking. (By: aku on 2008-02-17 02:06:19) [diff] [annotate]
2008-02-16
06:46:41 [9999ef81cc] part of check-in [27ed4f7dc3] Extended pass InitCsets and underlying code with more log output geared towards memory introspection, and added markers for special locations. Extended my notes with general observations from the first test runs over my example CVS repositories. (By: aku on 2008-02-16 06:46:41) [diff] [annotate]
2008-02-08
21:52:21 [eda30d7ee3] part of check-in [6b78df3861] Merge in changes from Andreas's branch. (By: drh on 2008-02-08 21:52:21) [diff] [annotate]
2008-02-06
05:04:12 [eda30d7ee3] part of check-in [66235f2430] Updated the copyright information of all files touched in the new year. (By: aku on 2008-02-06 05:04:12) [diff] [annotate]
2008-01-30
08:23:36 [97c1e14340] part of check-in [9e1b461b2f] Broke package dependency cycle introduced when moving the cset load code from the InitCsets pass to the cset class. (By: aku on 2008-01-30 08:23:36) [diff] [annotate]
03:23:02 [0ab46f96ec] part of check-in [49dd66f64f] Moved the code loading changesets from state to its proper class. (By: aku on 2008-01-30 03:23:02) [diff] [annotate]
2007-12-02
23:47:45 [87cc4829ad] part of check-in [e288af3995] Fluff: Renamed state methods use/reading/writing to usedb/use/extend for clarity. Updated all callers. Extended state module with code to dump the SQL statements it receives to a file for analysis. Extended the 'use' declarations of several passes. (By: aku on 2007-12-02 23:47:45) [diff] [annotate]
20:04:40 [ced4d957b1] part of check-in [00bf8c198e] The performance was still not satisfying, even with faster recomputing of successors. Doing it multiple times (Building the graph in each breaker and sort passes) eats time. Caching in memory blows the memory. Chosen solution: Cache this information in the database. Created a new pass 'CsetDeps' which is run between 'InitCsets' and 'BreakRevCsetCycles' (i.e. changeset creation and first breaker pass). It computes the changeset dependencies from the file-level dependencies once and saves the result in the state, in the new table 'cssuccessor'. Now the breaker and sort passes can get the information quickly, with virtually no effort. The dependencies are recomputed incrementally when a changeset is split by one of the breaker passes, for its fragments and its predecessors. The loop check is now trivial, and integrated into the successor computation, with the heavy lifting for the detailed analysis and reporting moved down into the type-dependent SQL queries. The relevant new method is 'loops'. Now that the loop check is incremental the pass based checks have been removed from the integrity module, and the option '--loopcheck' has been eliminated. For paranoia the graph setup and modification code got its loop check reinstated as an assert, reusing the changeset report code. Renumbered the breaker and sort passes. A number of places, like graph setup and traversal, loading of changesets, etc. got feedback indicators to show their progress. The selection of revision and symbol changesets for the associated breaker passes was a bit on the slow side. We now keep changeset lists sorted by type (during loading or general construction) and access them directly. (By: aku on 2007-12-02 20:04:40) [diff] [annotate]
05:49:00 [f56c12b551] part of check-in [9c57055025] Performance bugfix. nextmap/premap can still be performance killers and memory hogs. Moved the computation of successor changesets down to the type-dependent code (new methods) and the SQL database, i.e. the C level. In the current setup it was possible that the DB would deliver us millions of file-level dependency pairs which the Tcl level would then reduce to tens of actual changeset dependencies. Tcl did not cope well with that amount of data. Now the reduction happens in the query itself. A concrete example was a branch in the Tcl CVS generating nearly 9 million pairs, which reduced to roughly 200 changeset dependencies. This blew the memory out of the water and the converter ground to a halt, busily swapping. Ok, causes behind us, also added another index on 'csitem(iid)' to speed the search for changesets from the revisions, tags, and branches. (By: aku on 2007-12-02 05:49:00) [diff] [annotate]
2007-11-30
03:57:19 [39a67db2c5] part of check-in [b42cff97e3] Replaced the checks for self-referential changesets in the cycle breaker with a scheme in the changeset class doing checks when splitting a changeset, which is also called by the general changeset integrity code, after each pass. Extended log output at high verbosity levels. Thorough checking of the fragments a changeset is to be split into. (By: aku on 2007-11-30 03:57:19) [diff] [annotate]
2007-11-29
09:16:33 [b1c3a78eda] part of check-in [80b1e8936f] Renamed state table 'csrevision' to 'csitem' to reflect the new internals of changesets. Updated all places where it is used. (By: aku on 2007-11-29 09:16:33) [diff] [annotate]
07:47:50 [37a8f5de58] part of check-in [27f093d23c] More realignment of variable names with their content, in pass 5. (By: aku on 2007-11-29 07:47:50) [diff] [annotate]
06:21:57 [cec097c034] part of check-in [215d2f1ad9] Brought knowledge of the new types to the state definition, changed the creation of the initial changesets to use tags and branches. (By: aku on 2007-11-29 06:21:57) [diff] [annotate]
2007-11-27
08:59:54 [30453369b1] part of check-in [2e07cd7164] Bugfix in the generation of the initial symbol changesets. Keep entries apart per line-of-development. (By: aku on 2007-11-27 08:59:54) [diff] [annotate]
04:26:56 [3b39c5f0f7] part of check-in [8c6488ded2] Continued work on the integrity checks for changesets. Moved callers out of transactions. Two checks are already tripping on bad changesets made by InitCSets (pass 5). (By: aku on 2007-11-27 04:26:56) [diff] [annotate]
02:37:51 [47d9786663] part of check-in [bf83201c7f] Outline for more integrity checks, focusing on the changesets. (By: aku on 2007-11-27 02:37:51) [diff] [annotate]
2007-11-25
07:54:09 [217d875a91] part of check-in [b679ca3356] Code cleanup. Removed trailing whitespace across the board. (By: aku on 2007-11-25 07:54:09) [diff] [annotate]
2007-11-22
03:11:34 [940095b79c] part of check-in [65be27aa69] Modified the API for the construction of changesets a bit, now allowing their construction with the correct id, instead of correcting it later. Updated pass 5 to use this, and fixed bug where the id counter for changesets was left uninitialized, allowing the improper generation of duplicate ids. (By: aku on 2007-11-22 03:11:34) [diff] [annotate]
2007-11-17
00:29:42 [10710518c2] part of check-in [38b967dcf5] Merge aku's CVS import changes into the main line. Fix a small bug in diff.c. (By: drh on 2007-11-17 00:29:42) [annotate]
2007-11-16
08:32:40 [10710518c2] part of check-in [96b7bfb834] Added convenience command to the state package when the sql returns a single row. Added more statistics about revisions, tags, branches, symbols, changesets to various passes. (By: aku on 2007-11-16 08:32:40) [diff] [annotate]
03:52:18 [bb434d64fd] part of check-in [341d96be21] Bugfix. In pass 5, loading the changesets used the type codes instead of the type names. Modified the SQL selecting the data to return the proper names. (By: aku on 2007-11-16 03:52:18) [diff] [annotate]
2007-11-13
05:09:07 [960f3c67b5] part of check-in [24c0b662de] Reworked the in-memory storage of changesets in pass 5 and supporting classes, and added loading of changesets from the persistent state for when the pass is skipped. (By: aku on 2007-11-13 05:09:07) [diff] [annotate]
2007-11-10
23:44:29 [046ff5e25b] part of check-in [08ebab80cd] Rewrote the algorithm for breaking internal dependencies to my liking. The complex part handling multiple splits has moved from the pass code to the changeset class itself, reusing the state computed for the first split. The state is a bit more complex to allow for its incremental update after a break has been done. Factored major pieces into separate procedures to keep the highlevel code readable. Added lots of official log output to help debugging in case of trouble. (By: aku on 2007-11-10 23:44:29) [diff] [annotate]
20:40:06 [8f42ee8f95] part of check-in [95af789e1f] Oops. pass 5 is not complete. Missed the breaking of internal dependencies, this is done in this pass already. Extended pass _2_ and file revisions with code to save the branchchildren (possible dependencies), and pass 5 and changesets with the proper algorithm. From cvs2svn, works, do not truly like it, as it throws away and recomputes a lot of state after each split of a cset. Could update and reuse the state to perform all splits in one go. Will try that next, for now we have a working form in the code base. (By: aku on 2007-11-10 20:40:06) [diff] [annotate]
07:46:20 [aae0715d5d] part of check-in [5f7acef887] Completed pass 5, computing the initial set of changesets. Defined persistent structure and filled out the long-existing placeholder class (project::rev). (By: aku on 2007-11-10 07:46:20) [diff] [annotate]
05:34:26 [60ccdc280e] part of check-in [54d1e3537e] Started on pass 5, computing the initial approximate set of project level revisions, aka 'ChangeSets'. Skeleton of the pass added. (By: aku on 2007-11-10 05:34:26) [annotate]