pkgsrc/converters
rh 880ad47019 Update recode to 3.5
User-visible changes are:

.* Incompatible changes
. + A double dot `..' should now be used instead of a colon `:'.
. + Option --force (-f) is needed to pursue recoding despite errors.
. + There is no more quoting for special characters within charsets names.
. + Auto check (`-a') and popen (`-o') options have been withdrawn.
. + Some charsets and aliases were deleted, see `Charsets & aliases' below.

.* Extended features
. + Program messages are available in localised form for many languages.
. + Long character names are available in French, if LANGUAGE is set to `fr'.
. + A new request syntax allows for recode chaining, and for surfaces.
. + Option --header-file (-h) accepts a language parameter, and Perl is new.
. + Full charset listings now show the UCS-2 value for characters.
. + Option --known=PAIRS (-k) also accepts octal and hexadecimal numbers.
. + Option --list (-l) better sorts charsets and aliases, also fully written.
. + Charset `RFC1345' implements mnemonic+ascii+38, and is now reversible.
. + HTML is not limited anymore to Latin-1, HTML 4.0 entities are supported.

.* New features
. + Euro support.
. + Updated RFC 1345 set of tables, from Keld Simonsen.
. + Some African charsets and transliterated forms.
. + Conversions for ISO 10646 and Unicode.
. + Combining or explosion of UCS-2 diacriticized characters and ligatures.
. + Implementation of surfaces, see `Surfaces & aliases' below.
. + Mixed mode for recoding only comments and strings in C sources or PO files.
. + A stand-alone recoding library gets installed, often as a shared library.
. + Option --find-subsets (-T) lists charsets which are subsets of another.
. + The library may generate testing data, and study character frequencies.

.* Charsets & aliases

. + New ISO 10646 and Unicode charsets
.  - combined-UCS-2: pseudo-charset.
.  - count-characters: pseudo-charset.
.  - dump-with-names: pseudo-charset.
.  - ISO-10646-UCS-2: aliases are UNICODE-1-1, BMP, rune and u2.
.  - ISO-10646-UCS-4: aliases are 10646, ISO-10646, UCS-4 and u4.
.  - UNICODE-1-1-UTF-7: aliases are TF-7 and u7.
.  - UTF-8: aliases are UTF-2, UTF-FSS, FSS_UTF, TF-8 and u8.
.  - UTF-16: aliases are Unicode, TF-16 and u6.

. + RFC 1345.bis matters
.  - Deleted charsets
     dk-us, us-dk (because of &duplicate which `recode' does not handle yet).
.  - New charsets
     baltic (alias is iso-ir-179); CP1250 (1250, ms-ee, windows-1250);
     CP1251 (1251, ms-cyrl, windows-1251); CP1252 (1252, ms-ansi, windows-1252);
     CP1253 (1253, ms-greek, windows-1253);
     CP1254 (1254, ms-turk, windows-1254); CP1255 (1255, ms-hebr, windows-1255);
     CP1256 (1256, ms-arab, windows-1256);
     CP1257 (1257, WinBaltRim, windows-1257);
     CWI (CWI-2, cp-hu); EBCDIC-IS-FRISS (friss);
     GOST_19768-87 with aliases of previous GOST_19768-74;
     IBM256 (256, CP256, EBCDIC-INT1); IBM875 (875, CP875, EBCDIC-Greek);
     IBM1004 (1004, CP1004, os2latin1); IBM1047 (1047, CP1047);
     ISO-8859-13 (ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7, latin7);
     ISO-8859-14 (ISO_8859-14:1998, iso-celtic, iso-ir-199, l8, latin8);
     ISO-8859-15 (ISO_8859-15:1998, iso-ir-203, l9, latin9);
     KOI-7; KOI-8 (GOST_19768-74); KOI8-R; KOI8-RU; KOI8-U;
     macintosh_ce (macce); mac-is;
     NeXTSTEP (next) yet previous `recode' had it outside RFC 1345.
.  - Alias promoted to charset (with previous charset becoming alias)
     ISO-646.basic (with ISO-646.basic:1983); ISO-646.irv (ISO-646.irv:1983);
     ISO_5427-ext (ISO_5427:1981); ISO_5428 (ISO_5428:1980);
     ISO-8859-1 (ISO_8859-1:1987); ISO-8859-2 (ISO_8859-2:1987);
     ISO-8859-3 (ISO_8859-3:1988); ISO-8859-4 (ISO_8859-4:1988);
     ISO-8859-5 (ISO_8859-5:1988); ISO-8859-6 (ISO_8859-6:1987);
     ISO-8859-7 (ISO_8859-7:1987); ISO-8859-8 (ISO_8859-8:1988);
     ISO-8859-9 (ISO_8859-9:1989); ISO-8859-10 (latin6);
     NC_NC00-10 (NC_NC00-10:81); sami (latin-lap).
.  - New aliases
     037 (for charset IBM037); 038 (IBM038); 273 (IBM273); 274 (IBM274);
     275 (IBM275); 278 (IBM278); 280 (IBM280); 281 (IBM281); 284 (IBM284);
     285 (IBM285); 290 (IBM290); 297 (IBM297); 367 (ANSI_X3.4-1968);
     420 (IBM420); 423 (IBM423); 424 (IBM424); 500, 500V1 (IBM500);
     819 (ISO-8859-1); 864 (IBM864); 868 (IBM868); 870 (IBM870);
     871 (IBM871); 880 (IBM880); 891 (IBM891); 903 (IBM903); 905 (IBM905);
     912, CP912, IBM912 (ISO-8859-2); 918 (IBM918); 1026 (IBM1026);
     ECMA-113, ECMA-113:1986 (ECMA-Cyrillic); GOST_19768-74 (KOI8);
     ISO_8859-N (ISO-8859-N) for N = 1 through 10 and 13 through 15;
     ISO_8859-10:1993 (ISO-8869-10); iso-ir-170 (INVARIANT);
     KOI8_L2 (CSN_369103); pclatin2, pcl2 (IBM852); SS636127 (SEN_850200_B).

. + New African charsets
.  - AFRL1-101-BPI_OCIL: aliases are t-francais and t-fra.
.  - AFRFUL-102-BPI_OCIL: aliases are bambara, bra, ewondo and fulfulde.
.  - AFRFUL-103-BPI_OCIL: aliases are t-bambara, t-bra, t-ewondo and t-fulfulde.
.  - AFRLIN-104-BPI_OCIL: aliases are lingala, lin, sango and wolof.
.  - AFRLIN-105-BPI_OCIL: aliases are t-lingala, t-lin, t-sango and t-wolof.

. + Extra miscellaneous charsets
.  - KEYBCS2, Kamenicky.
.  - CORK, T1.
.  - KOI-8_CS2.

. + New HTML pseudo-charsets
.  - HTML_1.1: alias is h1.
.  - HTML_2.0: aliases are RFC 1866, 1866 and h2.
.  - HTML-i18n: alias is RFC 2070.
.  - HTML_3.2: reimplemented; alias is h3.
.  - HTML_4.0: aliases are h4, HTML and h.
.  - Deleted aliases: HTF, 8859, ISO 8859, Entities, SGML, WWW, w3.

.* Surfaces & aliases

. + New MIME encoding surfaces
.  - Base64: aliases are 64 and b64.
.  - Quoted-Printable: aliases are qp and Quote-Printable.

. + New permutation surfaces
.  - 21-Permutation: alias is swabytes.
.  - 4321-Permutation.

. + New end of line surfaces
.  - CR.
.  - CR-LF: alias is cl.

. + New (fully reversible) dump surfaces
.  - Decimal-1: aliases are d and d1.
.  - Decimal-2: alias is d2.
.  - Decimal-4: alias is d4.
.  - Hexadecimal-1: aliases are x and x1.
.  - Hexadecimal-2: alias is x2.
.  - Hexadecimal-4: alias is x4.
.  - Octal-1: aliases are o and o1.
.  - Octal-2: alias is o2.
.  - Octal-4: alias is o4.

. + New miscellaneous surfaces.
.  - data, test7, test8, test15, test16.
2000-01-10 20:54:26 +00:00
..
base64 Initial import of base64-1.0, a simple encoder/decoder for base64 files. 1999-08-16 18:18:19 +00:00
cbmconvert Strip trailing '.', and/or leading '(a|an) ' 2000-01-05 15:37:50 +00:00
kdesupport regen 1999-11-22 20:32:05 +00:00
mpack Strip trailing '.', and/or leading '(a|an) ' 2000-01-05 15:37:50 +00:00
p5-MIME-Base64 more whitespace cleanup (silence, pkglint!) 2000-01-08 04:26:49 +00:00
pkg Import of some more missing categories 1997-11-20 21:01:17 +00:00
psiconv Initial import of psiconv-0.6.1. 1999-12-06 01:58:06 +00:00
recode Update recode to 3.5 2000-01-10 20:54:26 +00:00
uudeview Strip trailing '.', and/or leading '(a|an) ' 2000-01-05 15:37:50 +00:00
uulib Strip trailing '.', and/or leading '(a|an) ' 2000-01-05 15:37:50 +00:00
xdeview Strip trailing '.', and/or leading '(a|an) ' 2000-01-05 15:37:50 +00:00
Makefile Add and enable psiconv 1999-12-06 01:59:22 +00:00