pkgsrc

wiz 17962092fa Update to 7.8:
Version 7.8 05-Sep-08
---------------------

1.  Replaced UCP searching code with optimized version as implemented for Ad
    Muncher (http://www.admuncher.com/) by Peter Kankowski. This uses a two-
    stage table and inline lookup instead of a function, giving speedups of 2
    to 5 times on some simple patterns that I tested. Permission was given to
    distribute the MultiStage2.py script that generates the tables (it's not in
    the tarball, but is in the Subversion repository).

2.  Updated the Unicode data tables to Unicode 5.1.0. This adds yet more
    scripts.

3.  Change 12 for 7.7 introduced a bug in pcre_study() when a pattern contained
    a group with a zero quantifier. The result of the study could be incorrect,
    or the function might crash, depending on the pattern.

4.  Caseless matching was not working for non-ASCII characters in back
    references. For example, /(\x{de})\1/8i was not matching \x{de}\x{fe}.
    It now works when Unicode Property Support is available.

5.  In pcretest, an escape such as \x{de} in the data was always generating
    a UTF-8 string, even in non-UTF-8 mode. Now it generates a single byte in
    non-UTF-8 mode. If the value is greater than 255, it gives a warning about
    truncation.

6.  Minor bugfix in pcrecpp.cc (change "" == ... to NULL == ...).

7.  Added two (int) casts to pcregrep when printing the difference of two
    pointers, in case they are 64-bit values.

8.  Added comments about Mac OS X stack usage to the pcrestack man page and to
    test 2 if it fails.

9.  Added PCRE_CALL_CONVENTION just before the names of all exported functions,
    and a #define of that name to empty if it is not externally set. This is to
    allow users of MSVC to set it if necessary.

10. The PCRE_EXP_DEFN macro which precedes exported functions was missing from
    the convenience functions in the pcre_get.c source file.

11. An option change at the start of a pattern that had top-level alternatives
    could cause overwriting and/or a crash. This command provoked a crash in
    some environments:

      printf "/(?i)[\xc3\xa9\xc3\xbd]|[\xc3\xa9\xc3\xbdA]/8\n" | pcretest

    This potential security problem was recorded as CVE-2008-2371.

12. For a pattern where the match had to start at the beginning or immediately
    after a newline (e.g. /.*anything/ without the DOTALL flag), pcre_exec() and
    pcre_dfa_exec() could read past the end of the passed subject if there was
    no match. To help with detecting such bugs (e.g. with valgrind), I modified
    pcretest so that it places the subject at the end of its malloc-ed buffer.

13. The change to pcretest in 12 above threw up a couple more cases when
    pcre_exec() might read past the end of the data buffer in UTF-8 mode.

14. A similar bug to 7.3/2 existed when the PCRE_FIRSTLINE option was set and
    the data contained the byte 0x85 as part of a UTF-8 character within its
    first line. This applied both to normal and DFA matching.

15. Lazy quantifiers were not working in some cases in UTF-8 mode. For example,
    /^[^d]*?$/8 failed to match "abc".

16. Added a missing copyright notice to pcrecpp_internal.h.

17. Made it clearer in the documentation that values returned from
    pcre_exec() in ovector are byte offsets, not character counts.

18. Tidied a few places to stop certain compilers from issuing warnings.

19. Updated the Virtual Pascal + BCC files to compile the latest v7.7, as
    supplied by Stefan Weber. I made a further small update for 7.8 because
    there is a change of source arrangements: the pcre_searchfuncs.c module is
    replaced by pcre_ucd.c.
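
The two-stage table mentioned in change 1 can be illustrated with a minimal
sketch. This is not the MultiStage2.py script and does not reproduce the
layout of the tables in pcre_ucd.c. The block size of 128, the function names
build_two_stage() and lookup(), and the toy property values (0, 1, 2) are all
assumptions made for the example. The idea shown is only the general one:
split a flat per-code-point table into fixed-size blocks, store each distinct
block once, and keep a first-stage index that says which block each range of
code points uses.

```python
# Hypothetical sketch of a two-stage lookup table; names and sizes are
# assumptions for illustration, not taken from PCRE.
BLOCK_SIZE = 128  # assumed block size


def build_two_stage(values, block_size=BLOCK_SIZE):
    """Compress a flat table into (stage1, stage2).

    stage2 holds each distinct block of block_size entries once;
    stage1[i] is the number of the stage2 block used by the i-th block
    of the flat table.
    """
    if len(values) % block_size != 0:
        raise ValueError("table length must be a multiple of block_size")
    stage1 = []
    stage2 = []
    seen = {}  # block contents -> block number in stage2
    for start in range(0, len(values), block_size):
        block = tuple(values[start:start + block_size])
        if block not in seen:
            seen[block] = len(stage2) // block_size
            stage2.extend(block)
        stage1.append(seen[block])
    return stage1, stage2


def lookup(stage1, stage2, code_point, block_size=BLOCK_SIZE):
    """Return the table entry for code_point using two array indexings."""
    block = stage1[code_point // block_size]
    return stage2[block * block_size + code_point % block_size]


# Toy property table for U+0000..U+FFFF: 1 = ASCII upper case,
# 2 = ASCII lower case, 0 = everything else (made-up values).
flat = [0] * 0x10000
for cp in range(ord("A"), ord("Z") + 1):
    flat[cp] = 1
for cp in range(ord("a"), ord("z") + 1):
    flat[cp] = 2

stage1, stage2 = build_two_stage(flat)
```

In this toy table only two distinct blocks exist, so the second stage is far
smaller than the flat table, and a lookup costs two array accesses rather
than a search.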
2008-09-06 14:25:28 +00:00
patch-aa
patch-ab