In search of a (decent) analyst programmer, project: The Unlimited Compressor
The goal is to take a homogeneous WinZip-type file and create from it a heterogeneous file, which will then be eligible for another round of WinZip compression, even with a gain as low as 1-2%.

What is a homogeneous file?
It is a file where all byte values are represented at roughly the same rate, say 1/180 to 1/300 (the ideal would be 1/256, if the occurrences of each value were perfectly uniform).

What is a heterogeneous file?
It is a file where some values appear more often than others; text files, for example, are essentially composed of occurrences of values from 32 to 128 (or so).

So, here's my method.

First, some statistics: find the value with the most occurrences. As noted above, the most frequent value will occur at a rate of, for example, 1/180. That means you can store the offset to each occurrence of that value in a single byte: since the average gap is around 180, it should rarely exceed 255 (the largest value a byte can hold). Still, you may sometimes end up with a gap of 280, 400, or even 850, so you add a simple escape rule: a stored byte of 255 means the offset is 254 plus another byte of 0 to 254, and if that next byte is again 255, the offset is 254 + 254 + another byte, and so on. The only problem with chaining too many escape bytes is that it adds to the length of the output file (though starting with the most frequent value mitigates this somewhat).
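Here is a minimal sketch of that escape rule in Python, just to pin the idea down (the function names are mine, purely for illustration):

[CODE]
# Minimal sketch of the 254/255 escape rule described above.
# A gap of 0..254 fits in one byte; a byte of 255 means "254 plus
# whatever the next byte says", and the rule chains.

def encode_gap(gap):
    """Encode one gap as a sequence of bytes."""
    out = bytearray()
    while gap > 254:
        out.append(255)   # continuation marker: add 254 and keep reading
        gap -= 254
    out.append(gap)       # final byte, 0..254
    return bytes(out)

def decode_gap(stream):
    """Read one gap back from an iterator over bytes."""
    gap = 0
    b = next(stream)
    while b == 255:
        gap += 254
        b = next(stream)
    return gap + b

# 850 = 254 + 254 + 254 + 88, so it encodes as 255, 255, 255, 88
assert decode_gap(iter(encode_gap(850))) == 850
[/CODE]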
OK, here's the idea: instead of byte values, you store offsets for each byte value, working from the most frequent value down to the least frequent. As you emit offsets, you flag each position of the original file you have just consumed; then on every later pass (there are 256 values from 0 to 255, so you do the job up to 256 times), any flagged position you encounter simply does not count toward the offset for the current value. A Python sketch of this pass structure follows the worked example below.

A quick example, for a file of 20 bytes composed of 6 values (39, 44, 11, 18, 74, 78):

01 39
02 44
03 39
04 11
05 18
06 18
07 11
08 78
09 39
10 11
11 44
12 39
13 18
14 11
15 11
16 11
17 74
18 44
19 78
20 39

First, the statistics (value, count):

39 5
44 3
11 6
18 3
74 1
78 2

Sorted by count (and written to the output file):

1 11
2 39
3 44
4 18
5 78
6 74

First pass, value 11 (a + marks each position scanned):

01 39 +
02 44 +
03 39 +
04 11 + ( offset is 04 - 00 = 4 )
05 18 +
06 18 +
07 11 + ( offset is 07 - 04 = 3 )
08 78 +
09 39 +
10 11 + ( offset is 10 - 07 = 3 )
11 44 +
12 39 +
13 18 +
14 11 + ( offset is 14 - 10 = 4 )
15 11 + ( offset is 15 - 14 = 1 )
16 11 + ( offset is 16 - 15 = 1 )
17 74
18 44
19 78
20 39

So the output file starts with:

4
3
3
4
1
1

Next value (39); a . marks a flagged position, which no longer counts:

01 39 +1
02 44 +
03 39 +2
04 11 . ( here's a flag )
05 18 +
06 18 +
07 11 .
08 78 +
09 39 +4 ( 09 - 03 = 6, minus 1 for the flag at 04 and 1 for the flag at 07 )
10 11 .
11 44 +
12 39 +3-1 = 2
13 18 +
14 11 .
15 11 .
16 11 .
17 74 +
18 44 +
19 78 +
20 39 +8-3 = 5

Adding to the output file:

1
2
4
2
5

Next value (44):

01 39 .
02 44 1
03 39 .
04 11 .
05 18 +
06 18 +
07 11 .
08 78 +
09 39 .
10 11 .
11 44 4
12 39 .
13 18 +
14 11 .
15 11 .
16 11 .
17 74 +
18 44 3
19 78
20 39

Adding to the output file:

1
4
3

Next value (18):

01 39 .
02 44 .
03 39 .
04 11 .
05 18 1
06 18 1
07 11 .
08 78 +
09 39 .
10 11 .
11 44 .
12 39 .
13 18 2
14 11
15 11
16 11
17 74
18 44
19 78
20 39

Adding to the output file:

1
1
2

And so on.
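Here is the promised sketch of those passes in Python, as I understand them; the names are mine, and the only claim is that it reproduces the numbers from the example above:

[CODE]
from collections import Counter

def heterogenize(data):
    """Flag-skipping gap transform: for each byte value, most frequent
    first, emit the gap to each occurrence, counting only positions not
    yet flagged by an earlier pass. Returns (value_order, gaps)."""
    order = [v for v, _ in Counter(data).most_common()]
    flagged = [False] * len(data)
    gaps = []
    for value in order:
        gap = 0
        for pos, byte in enumerate(data):
            if flagged[pos]:
                continue            # flagged positions don't count
            gap += 1
            if byte == value:
                gaps.append(gap)    # distance in unflagged positions
                flagged[pos] = True
                gap = 0
    return order, gaps

# The 20-byte file from the example above:
data = bytes([39, 44, 39, 11, 18, 18, 11, 78, 39, 11,
              44, 39, 18, 11, 11, 11, 74, 44, 78, 39])
order, gaps = heterogenize(data)
print(order)  # [11, 39, 44, 18, 78, 74]
print(gaps)   # [4, 3, 3, 4, 1, 1, 1, 2, 4, 2, 5, 1, 4, 3, 1, 1, 2, 1, 2, 1]
[/CODE]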
[QUOTE="le Redoutable, post: 9293442, member: 7031865"] the goal is to create from a homogeneous Winzip-type file a heterogeneous file which will then be eligible to a new Winzip-file, with a gain even as low as 1-2%; what is a homogeneous file ? it is a file where all Byte values are represented with a ratio of say 1/180 to 1/300 ( the ideal would be 1/256 if occurences of each value were perfectly homogeneous ) what is a heterogeneous file ? it is a file where some values appear more often than some others; for example, text files are essentially composed of occurences of values from 32 to 128 ( or so ) so, here's my method : first, some statistics : find the value with the most occurences; as I printed above, the most occurent value will give a ratio ( for example ) of 1/180; that means you can use offsets for each occurence of that value within a Byte ( because statistics say offsets shouldn't exceed 180, then 255 ( the max value you can print within a Byte ) should rarely get exceeded; still sometimes you may end up with an offset of ( 280, 400, or even 850 ) , so you can easily rule that , if you print an offset of 255 it means the offset is equal to 254 + another Byte of 0 to 254 , which again if equal to 255 means you have an offset of 254+254+ another Byte etc The only problem with appending too much offset values is it adds to the length of the output file ( well, beginning with the most occurent Value somehow mitigates this problem ) ok. here's the idea : in lieu of Byte values you use offsets for each Byte value ( in the order of from the most occurent Byte Value down to the less occurent Byte Value ) then, as you print offsets you put a flag where in the original file you located the said offset; then, each time you check for occurences ( that is, because there are 256 values from 0 to 255 , you will do 256 times the job ) , each time you find a flag you don't add to the offset for the n-th value quickly an example for a file of 20 Bytes , composed of 6 Values ( 39, 44, 11, 18, 74, 78 ): 01 39 02 44 03 39 04 11 05 18 06 18 07 11 08 78 09 39 10 11 11 44 12 39 13 18 14 11 15 11 16 11 17 74 18 44 19 78 20 39 first, the statistics : 39 5 44 3 11 6 18 3 74 1 78 2 sorted ( and printed to the output file ) : 1 11 2 39 3 44 4 18 5 78 6 74 now look at this : 01 39 + 02 44 + 03 39 + 04 11 + ( offset is 04 - 00 = 4 ) 05 18 + 06 18 + 07 11 + ( offset is 07 - 04 = 3 ) 08 78 + 09 39 + 10 11 + ( offset is 10 - 07 = 3 ) 11 44 + 12 39 + 13 18 + 14 11 + ( offset is 14 - 10 = 4 ) 15 11 + ( offset is 15 - 14 = 1 ) 16 11 + ( offset is 16 - 15 = 1 ) 17 74 18 44 19 78 20 39 so the output file looks like : 4 3 3 4 1 1 next value ( 39 ) : 01 39 +1 02 44 + 03 39 +2 04 11 . (here's a flag ) 05 18 + 06 18 + 07 11 . 08 78 + 09 39 +4 ( 09 - 03 , -1-for-flag-at-04 ,-1-for-flag-at-07 ) 10 11 . 11 44 + 12 39 +3-1 = 2 13 18 + 14 11 . 15 11 . 16 11 . 17 74 + 18 44 + 19 78 + 20 39 +8-3 = 5 adding to the output file : 1 2 4 2 5 next value ( 44 ) : 01 39 . 02 44 1 03 39 . 04 11 . 05 18 + 06 18 + 07 11 . 08 78 + 09 39 . 10 11 . 11 44 4 12 39 . 13 18 + 14 11 . 15 11 . 16 11 . 17 74 + 18 44 3 19 78 20 39 adding to the output file : 1 4 3 next value ( 18 ) : 01 39 . 02 44 . 03 39 . 04 11 . 05 18 1 06 18 1 07 11 . 08 78 + 09 39 . 10 11 . 11 44 . 12 39 . 
Note that as you advance to the less common values, the offsets become small, and that's exactly what the program is for: in a huge file you should end up with far more small values (1, 30, 50, and so on) than big ones (190, 220, and so on). That is what I call a heterogeneous file :)

If my vision is correct, you will be able to WinZip the output file, giving birth to a new zip file, which in turn can be re-heterogenized, for even a 1% gain each time, but repeated 1,000 times (or 1,000,000 times if you want to transfer a 1 GB file to a 720 KB floppy, lol).

So, where am I wrong?
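If anyone wants to test the loop, here is a quick experiment sketch using zlib's DEFLATE as a stand-in for WinZip. The container layout (one length byte, then the value table, then escape-coded counts and gaps) is only a guess of mine, not part of the method; it reuses heterogenize and encode_gap from the sketches above:

[CODE]
import os, zlib
from collections import Counter

data = zlib.compress(os.urandom(4096))   # an already-compressed, "homogeneous" input
for step in range(5):
    order, gaps = heterogenize(data)
    counts = Counter(data)
    blob = (bytes([len(order) - 1]) + bytes(order)
            + b"".join(encode_gap(counts[v]) for v in order)
            + b"".join(encode_gap(g) for g in gaps))
    data = zlib.compress(blob, 9)
    print(step, len(blob), len(data))    # watch whether the sizes keep shrinking
[/CODE]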