This is a patched version of zlib, modified to use
Pentium-Pro-optimized assembly code in the deflation algorithm. The
files changed/added by this patch are:
 
README.686
match.S
 
The speedup that this patch provides varies, depending on whether the
compiler used to build the original version of zlib falls afoul of the
PPro's speed traps. My own tests show a speedup of around 10-20% at
the default compression level, and 20-30% using -9, against a version
compiled using gcc 2.7.2.3. Your mileage may vary.
 
Note that this code has been tailored for the PPro/PII in particular,
and will not perform particularly well on a Pentium.
 
If you are using an assembler other than GNU as, you will have to
translate match.S to use your assembler's syntax. (Have fun.)
 
Brian Raiter
breadbox@muppetlabs.com
April, 1998
 
 
Added for zlib 1.1.3:
 
The patches come from
http://www.muppetlabs.com/~breadbox/software/assembly.html
 
To compile zlib with this asm file, copy match.S to the zlib directory
then do:
 
CFLAGS="-O3 -DASMV" ./configure
make OBJA=match.o
 
 
Update:
 
I've been ignoring these assembly routines for years, believing that
gcc's generated code had caught up with them sometime around gcc 2.95
and the major rearchitecting of the Pentium 4. However, I recently
learned that, despite what I believed, this code still has some life
in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
faster than the code produced by gcc 4.1.
 
In acknowledgement of its continuing usefulness, I've altered the
license to match that of the rest of zlib. Share and Enjoy!
 
Brian Raiter
breadbox@muppetlabs.com
April, 2007