Ticket UUID: 838bde7990d8e190957cbfe7f15c77322dc54e57
Title: file extracted from Fossil zip archive could have different name
Status: Review
Type: Incident
Severity: Important
Priority:
Subsystem:
Resolution: Open
Last Modified: 2008-12-09 16:08:44
Version Found In: Fossil version [f84bfc31bf] 2008-11-27 02:30:29
Description & Comments:
    @rem issue demonstration
    del .\_fossil_
    del .\test.fossil
    rmdir /Q /S .\testdir
    mkdir .\testdir
    del manifest.uuid
    del manifest
    cd .\testdir
    echo 345678>"1rřřřř R .test"
    cd ..
    fossil new test.fossil
    fossil open test.fossil
    fossil add .\testdir\\"1rřřřř R .test"
    fossil ls
    fossil commit -m "check in a file with accented char/s and spaces in its name" --nosign
    fossil ui
----
On Windows XP I create a file with some accented characters in its name, commit it to the Fossil repository, and save the repository as a zip file. When I extract that file from the Fossil-generated zip with Windows Explorer or Total Commander, I get a file with a different name. Extracting with the unzip program from the Info-ZIP web site yields the correctly named file. In the example above I create a file with "r-caron" characters (U+0159) in its name; after unzipping I get a file with "DEGREE SIGN" (U+00B0) in their place.
Many other characters get twisted in a similar way. I wonder whether something should be marked in the zip file header to say how the stored file name is to be interpreted.
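One possible reading of this symptom, offered as an assumption rather than a diagnosis: the name was stored in the Windows ANSI code page for Central European locales (cp1250) and the extracting program decoded it with an OEM code page such as cp437. The ZIP format does carry a header field for this purpose: general-purpose flag bit 11 (0x800) marks an entry name as UTF-8. The Python sketch below checks both points with the standard library only. The function names are made up for illustration, and nothing here reflects how Fossil itself builds its archives.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general-purpose flag bit 11: entry name is UTF-8


def misread(name, stored_as="cp1250", read_as="cp437"):
    """Decode a name's bytes with a different code page than the one used
    to write them. Both code pages are assumptions chosen to match the report."""
    return name.encode(stored_as).decode(read_as)


def utf8_flag_set(name):
    """Write one entry with Python's zipfile and report whether bit 11 is set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"345678")
    with zipfile.ZipFile(buf) as zf:
        return bool(zf.infolist()[0].flag_bits & UTF8_NAME_FLAG)


if __name__ == "__main__":
    # r-caron (U+0159) written as cp1250, read back as cp437
    print(misread("\u0159") == "\u00b0")                          # True
    # Python's zipfile flags non-ASCII names as UTF-8, ASCII names not
    print(utf8_flag_set("1r\u0159\u0159\u0159\u0159 R .test"))    # True
    print(utf8_flag_set("plain.test"))                            # False
```

An archive whose names are non-ASCII but whose entries lack this flag leaves the choice of code page to the extracting program, which would be consistent with different tools producing different names.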
I tried to check the differences between Explorer-generated zip files and Fossil-generated ones with help of:

anonymous claiming to be kkinnell added on 2008-12-03 02:56:00:
If that is what happened, then the problem is how the programs you are using for unzipping the files are interpreting them. The one that gives you the correct version is using the same encoding that you used when you created the file; the other one is using something else. Fossil itself uses SQLite BLOBs to store its artifacts. The storage doesn't encode the data in any way; it treats it as binary data.

anonymous added on 2008-12-03 04:53:53:
If I modify the test and try to add a file on a path that already has a mix of accented and unaccented characters and spaces, del .\_fossil_

kkinnell added on 2008-12-03 16:20:32:
I can confirm that the encoding used for display of the file name and contents is not the same as the encoding used for the wiki: wiki strings are "entified" as &#ddd; decimals, whereas the display of file path strings depends on the browser's default encoding. When I set my browser to use the default Windows encoding (Western, or Latin-1, which is ISO-8859-1) I get behavior similar to what you describe. I think it is possible you
Please verify that all of your encodings (system, browser, and zip programs) are the same as your input system and see if you still get bad behavior. I can duplicate most of the problems you are having by mismatching the encodings, but I am on a GNU/Linux system and I can't be sure there is not something peculiar to the Windows version causing part of your problem.

anonymous added on 2008-12-04 07:06:01:
If I click the ZIP link, I get: Zipped archive of repository
If I keep the default character encoding on the browser page (Western ISO 8859-1) and click on the directory link, I get a page stating:
So it seems to me that, depending on the encoding of the generated page, Fossil can sometimes fail to show the true content of a directory.
anonymous claiming to be Ilia Frenkel added on 2008-12-09 09:49:26:

kkinnell added on 2008-12-09 15:58:07:
1. File paths within a repository use the bytes supplied by the file system, but do not note the encoding. The browser's encoding determines display.
2. Wiki display entifies characters, and so the display of a file path from within a wikified string may be different than the display in the file browser.
3. Is this a wart, or is this correct behavior? (Mailing list discussion?)
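Points 1 and 2 can be illustrated with a small Python sketch. It assumes, purely for illustration, that the file system supplied the path as UTF-8 bytes; `entify` and `shown_by_browser` are hypothetical helpers and are not taken from Fossil's source.

```python
def entify(text):
    """Replace each non-ASCII character with a decimal &#ddd; reference,
    so the result no longer depends on the page encoding."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


def shown_by_browser(raw_bytes, page_encoding):
    """Decode unlabelled path bytes the way a browser would for a given
    page encoding."""
    return raw_bytes.decode(page_encoding)


if __name__ == "__main__":
    name = "1r\u0159"                 # "1r" followed by r-caron
    raw = name.encode("utf-8")        # assumed bytes from the file system

    print(entify(name))                                   # 1r&#345;
    print(shown_by_browser(raw, "utf-8") == name)         # True
    print(shown_by_browser(raw, "iso-8859-1") == name)    # False
```

The entified form is plain ASCII and reads the same under any page encoding, while the raw bytes only display correctly when the browser's encoding matches the one that produced them.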