This module provides functions that deal with formatting data with
Content-Type 'text/plain; format=flowed' as described in RFC2646
(http://www.rfc-editor.org/rfc/rfc2646.txt). In a nutshell,
format=flowed text solves the problem in plain text files where it
is not known which lines can be considered a logical paragraph,
enabling lines to be automatically flowed (wrapped and/or joined)
as appropriate when displaying.
In format=flowed, a soft newline is expressed as " \n", while hard
newlines are expressed as "\n". Soft newlines can be automatically
deleted or inserted as appropriate when the text is reformatted.
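
The soft/hard newline rule above can be sketched in a few lines of Python.
This is only an illustration, not the module's implementation: the function
names unflow and flow are hypothetical, and quoting and space-stuffing from
the RFC are deliberately left out.

```python
import textwrap


def unflow(text):
    """Join soft-broken lines (those ending in a space) into paragraphs."""
    paragraphs = []
    current = ""
    for line in text.splitlines():
        current += line
        if not line.endswith(" "):
            # A line without a trailing space ends with a hard newline.
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs


def flow(paragraphs, width=72):
    """Wrap each paragraph, marking inserted breaks as soft (" \\n")."""
    lines = []
    for paragraph in paragraphs:
        wrapped = textwrap.wrap(
            paragraph,
            width=width,
            break_long_words=False,
            break_on_hyphens=False,
        ) or [""]
        # Every line except the last gets a trailing space: a soft newline.
        lines.extend(piece + " " for piece in wrapped[:-1])
        lines.append(wrapped[-1])
    return "\n".join(lines)
```

Because only the inserted breaks carry a trailing space, a reader can rejoin
and rewrap the text at any width without losing paragraph boundaries.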
WWW: http://search.cpan.org/dist/Text-Flowed/
Justification: socialtext dependency
This provides a simple interface to Plucene. Plucene is large and
multi-featured, and it is expected that users will subclass it and tie all the
pieces together to suit their own needs. Plucene::Simple is, therefore,
just one way to use Plucene. It's not expected that it will do exactly
what *you* want, but you can always use it as an example of how to
build your own interface.
WWW: http://search.cpan.org/dist/PluceneSimple/
Justification: socialtext dependency
Quirks: 1/6 test fails
Bastardize provides a magical object into which text can be charged
and then returned in various, slightly modified ways.
Among others, bastardize has the following methods:
  rdct     converts english to hyperreductionist english
           (ex. "english" becomes "")
  pig      pig latin
           (ex. "hi there" becomes "ihay erethay")
  k3wlt0k  a k3wlt0kizer developed originally by Fmh
  rot13    implements rot13 "encryption" in perl
           (ex. "foo bar" becomes "sbb one")
  rev      reverses the arrangement of characters
  censor   attempts to censor text which might be inappropriate
  n20e     performs numerical abbreviations
           (ex. "numerical_abbreviation" becomes "n20e")
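
Three of the transformations listed above are simple enough to sketch in
Python. This is an illustration only, not the Perl module's code: the
function names mirror the method names, and the pig latin rule (move the
leading consonants to the end and append "ay") is an assumption inferred
from the single example given.

```python
import codecs

VOWELS = "aeiou"


def rot13(text):
    """Rotate each ASCII letter 13 places; applying it twice is a no-op."""
    return codecs.encode(text, "rot13")


def pig(text):
    """Move each word's leading consonants to the end and append 'ay'."""
    words = []
    for word in text.split():
        split_at = 0
        while split_at < len(word) and word[split_at].lower() not in VOWELS:
            split_at += 1
        words.append(word[split_at:] + word[:split_at] + "ay")
    return " ".join(words)


def rev(text):
    """Reverse the order of the characters."""
    return text[::-1]


print(rot13("foo bar"))  # sbb one
print(pig("hi there"))   # ihay erethay
```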
WWW: http://search.cpan.org/dist/Text-Bastardize/
This is an XS wrapper around some Unicode Consortium code to check if
a string is valid UTF-8, revised to conform to what expat/Mozilla
think is valid UTF-8, especially with regard to low-ASCII characters.
Note that this module has NOTHING to do with Perl's internal UTF8 flag
on scalars.
This module is for use when you're getting input from users and want
to make sure it's valid UTF-8 before continuing.
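
As a rough illustration of this kind of input check, here is a Python
sketch that relies on the interpreter's strict UTF-8 decoder. The function
name is hypothetical, and the sketch does not reproduce the expat/Mozilla
rules for low-ASCII characters mentioned above.

```python
def is_valid_utf8(data):
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Typical use: reject bad input before doing anything else with it.
for sample in (b"caf\xc3\xa9", b"\xff\xfe", b"\xc0\xaf"):
    print(sample, is_valid_utf8(sample))
```

The strict decoder also rejects overlong encodings and encoded surrogates,
which is usually what is wanted for user-supplied input.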
WWW: http://search.cpan.org/dist/Unicode-CheckUTF8/
The goals of this project are simple:
Create a highly configurable, easily modifiable source code beautifier.
What it does:
* Indent code, aligning on parens, assignments, etc.
* Align on '=' and variable definitions
* Align structure initializers
* Align #define stuff
* Align backslash-newline stuff
* Reformat comments (a little bit)
* Fix inter-character spacing
* Add or remove parens on return statements
* Add or remove braces on single-statement if/do/while/for statements
* Highly configurable - 118 configurable options as of version 0.0.15
WWW: http://uncrustify.sourceforge.net
PR: ports/100604
Submitted by: Dmitry Marakasov <amdmi3 at mail.ru>
- by default, textproc/aspell installs the English dictionaries (no
change);
- thereafter you can install any foreign dictionary;
- when you install a foreign dictionary, i.e. french/aspell or
textproc/da-aspell, it installs only the dictionaries, and depends
upon textproc/aspell for the programs;
- if you don't need the English dictionaries, you can define
WITHOUT_DICTEN or install textproc/aspell-without-dicten;
- add a new port for textproc/en-aspell: if aspell had been installed
without the English dictionaries, they can be added thereafter;
- add a missing port for german/alt-aspell;
- foreign dictionaries are almost independent from textproc/aspell,
and their maintainership is available.
Credits: special thanks to Serge Gagnon <ser_gagnon (at) sympatico.ca>
specification generously provided by Adobe at
http://partners.adobe.com/public/developer/pdf/index_reference.html
The file format is well-supported, with the exception of the
"linearized" or "optimized" output format, which this module can read
but not write. Many specific aspects of the document model are not
manipulable with this package (like fonts), but if the input document
is correctly written, then this module will preserve the model
integrity.
This library grants you some power over the PDF security model. Note
that applications editing PDF documents via this library MUST respect
the security preferences of the document. Any violation of this
respect is contrary to Adobe's intellectual property position, as
stated in the reference manual at the above URL.
WWW: http://search.cpan.org/dist/CAM-PDF/
PR: ports/100182
Submitted by: Gea-Suan Lin <gslin at gslin.org>
ecore data structures and making things generally easy to get around in.
The functions detailed in EXML.h are fairly self-explanatory, and the I/O
interfaces are also generalized and independent (open from a socket, write
to an in-memory XML image).
WWW: http://www.enlightenment.org/
PR: ports/100002
Submitted by: Stanislav Sedov <ssedov at mbsd.msk.ru>
Since JSON is a pure-perl module and JSON::Syck is based on libsyck,
JSON::Syck is supposed to be very fast and memory efficient. See
chansen's benchmark table at
http://idisk.mac.com/christian.hansen/Public/perl/serialize.pl
JSON.pm comes with dozens of ways to do the same thing and lots of
options, while JSON::Syck doesn't. There's only Load and Dump.
Oh, and JSON::Syck doesn't use camelCase method names :-)
Author: Audrey Tang <autrijus@autrijus.org>
Tatsuhiko Miyagawa <miyagawa@gmail.com>
WWW: http://search.cpan.org/dist/JSON-Syck/
PR: ports/100071
Submitted by: Gea-Suan Lin <gslin at gslin.org>
transformations implemented in PDF::FromHTML::Twig.
There is also a command-line utility, html2pdf.pl, that comes with this
distribution.
WWW: http://search.cpan.org/dist/PDF-FromHTML/
PR: ports/100060
Submitted by: Gea-Suan Lin <gslin at gslin.org>
transparently target multiple backends without changing its code.
WWW: http://search.cpan.org/dist/PDF-Writer/
PR: ports/100058
Submitted by: Gea-Suan Lin <gslin at gslin.org>
It is specifically targeted at producing technical documentation
in the field of computer science.
Documents are written in an XML-based markup language and translated
to different formats with XSL transformations. At this time, eCromedos
supports the target formats XHTML and LaTeX. LaTeX output can be
further processed into high-quality printable formats by use of the
TeX typesetting system (http://www.ctan.org).
Author: Tobias Koch <tkoch@ecromedos.net>
WWW: http://www.ecromedos.net/
PR: ports/98895
Submitted by: Nicola Vitale <nivit at email.it>
utility for working with dictionaries in StarDict's format.
Each word in the "list of words" may be a string with a leading '/' to use the
fuzzy search algorithm, or a string containing '?' and '*' to use regexp
search. It works in both interactive and non-interactive modes.
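
The three query styles can be sketched in Python as follows. This is not
sdcv's code: the lookup function is hypothetical, the fuzzy match uses
difflib as a stand-in for whatever algorithm sdcv uses, and '?' and '*'
are treated as shell-style wildcards, which is an assumption.

```python
import difflib
import fnmatch


def lookup(query, words):
    """Dispatch a query to fuzzy, wildcard, or exact matching."""
    if query.startswith("/"):
        # A leading '/' requests fuzzy search on the rest of the query.
        return difflib.get_close_matches(query[1:], words, n=5, cutoff=0.6)
    if "?" in query or "*" in query:
        # '?' matches one character, '*' matches any run of characters.
        return [word for word in words if fnmatch.fnmatchcase(word, query)]
    return [word for word in words if word == query]


words = ["apple", "apply", "maple", "grape"]
print(lookup("app*", words))   # ['apple', 'apply']
print(lookup("?aple", words))  # ['maple']
```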
WWW: http://sdcv.sourceforge.net/
PR: ports/96836
Submitted by: chinsan <chinsan.tw at gmail.com>
parser. It is implemented using the Xerces C++ API, and it provides
access to most of the C++ API from Perl.
WWW: http://xerces.apache.org/xerces-p/
PR: ports/95296
Submitted by: Ken Menzel <kenm@icarz.com>
written in Python.
It is designed to be easy to adapt and extend for your application.
Stuff you can do with the Reverend:
* classify RSS stories
* classify recipes by cuisine
* who do you write like? Shakespeare, Dickens or Austen
* detect the language of a document
* is your code more like Guido's or Peter's
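
To give a feel for the tasks listed above, here is a small self-contained
word-frequency classifier in the naive Bayes style. It is a sketch under
stated assumptions, not Reverend's API: the class name TinyClassifier and
its train/guess methods are hypothetical, class priors are treated as
equal, and add-one smoothing is used for unseen words.

```python
import math
from collections import Counter, defaultdict


class TinyClassifier:
    """Score text against per-label word counts with add-one smoothing."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, label, text):
        self.counts[label].update(text.lower().split())

    def guess(self, text):
        vocabulary = {w for counter in self.counts.values() for w in counter}
        scores = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Sum of log probabilities; higher means a better match.
            scores[label] = sum(
                math.log((counter[word] + 1) / (total + len(vocabulary)))
                for word in text.lower().split()
            )
        return max(scores, key=scores.get)


guesser = TinyClassifier()
guesser.train("english", "the cat sat on the mat")
guesser.train("french", "le chat est sur le tapis")
print(guesser.guess("le chat"))  # french
```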
Author: Amir Bakhtiar <amir@divmod.org>
WWW: http://www.divmod.org/trac/wiki/DivmodReverend
PR: ports/96531
Submitted by: Nicola Vitale <nivit@email.it>
written in Perl and C. The archetypal application is website search, but it
can be put to many different uses.
Features
* Extremely fast and scalable - can handle millions of documents
* Full support for 12 Indo-European languages.
* Support for boolean operators AND, OR, and AND NOT; parenthetical
groupings, and prepended +plus and -minus
* Algorithmic selection of relevant excerpts and highlighting of search terms
within excerpts
* Highly customizable query and indexing APIs
* Phrase matching
* Stemming
* Stoplists
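
The +plus/-minus operators and the stoplist can be illustrated with a toy
inverted index in Python. This is a minimal sketch, not KinoSearch's API:
the function names and the stoplist contents are made up, and stemming,
phrase matching, and scoring are left out.

```python
from collections import defaultdict

STOPLIST = {"the", "a", "of", "and"}  # illustrative only


def build_index(docs):
    """Map each non-stoplisted term to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            if term not in STOPLIST:
                index[term].add(doc_id)
    return index


def search(index, query):
    """+term must match, -term must not, bare terms are OR-ed together."""
    required, prohibited, optional = [], [], []
    for token in query.lower().split():
        term = token.lstrip("+-")
        if not term or term in STOPLIST:
            continue
        if token.startswith("+"):
            required.append(term)
        elif token.startswith("-"):
            prohibited.append(term)
        else:
            optional.append(term)
    if required:
        hits = set.intersection(*(index.get(t, set()) for t in required))
    else:
        hits = set().union(*(index.get(t, set()) for t in optional))
    for term in prohibited:
        hits -= index.get(term, set())
    return hits


docs = {1: "the quick brown fox", 2: "the lazy brown dog", 3: "quick silver"}
index = build_index(docs)
print(search(index, "+brown -lazy"))  # {1}
```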
WWW: http://www.rectangular.com/kinosearch/
PR: ports/96115
Submitted by: Vivek Khera <vivek@khera.org>
XML::RSS::Parser is a lightweight liberal parser of RSS feeds. This parser
is "liberal" in that it does not demand compliance of a specific RSS version
and will attempt to gracefully handle tags it does not expect or understand.
The parser's only requirement is that the file is well-formed XML and
remotely resembles RSS. Roughly speaking, well-formed XML with a channel
element as a direct sibling or the root tag and item elements etc.
There are a number of advantages to using this module rather than just using
a standard parser-tree combination. There are a number of different RSS
formats in use today. In very subtle ways these formats are not entirely
compatible from one to another. XML::RSS::Parser makes a couple assumptions
to "normalize" the parse tree into a more consistent form. For instance,
it forces channel and item into a parent-child relationship.
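
The normalization idea can be sketched in Python with the standard
library's ElementTree. This is not the module's implementation: the
normalize function and its output shape are hypothetical, and the only
point illustrated is that items are attached to the channel whether the
feed nests them (RSS 2.0 style) or lists them as siblings (RSS 1.0 style).

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Text of the first direct child with the given local name, if any."""
    for child in elem:
        if local(child.tag) == name:
            return (child.text or "").strip()
    return None


def normalize(xml_text):
    """Return the channel with its items as children, wherever they sat."""
    root = ET.fromstring(xml_text)
    channel = next((e for e in root.iter() if local(e.tag) == "channel"), None)
    if channel is None:
        raise ValueError("no channel element found")
    items = [e for e in root.iter() if local(e.tag) == "item"]
    return {
        "title": child_text(channel, "title"),
        "items": [
            {"title": child_text(i, "title"), "link": child_text(i, "link")}
            for i in items
        ],
    }
```

Both feed layouts then yield the same structure, so code consuming the
result does not need to know which RSS flavor it was given.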
WWW: http://search.cpan.org/dist/XML-RSS-Parser/
Google SiteMaps.
The Sitemap Protocol allows you to inform search engine
crawlers about URLs on your Web sites that are available
for crawling.
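
A sitemap is a small XML file listing URLs. The Python sketch below builds
one with the standard library; it does not use this module's API. The
build_sitemap function is hypothetical, and the namespace is the one from
the sitemaps.org 0.9 protocol, which is an assumption and may differ from
the schema this module targets.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (sitemaps.org protocol 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a urlset document from (location, last-modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    ("http://example.org/", "2006-07-20"),
    ("http://example.org/about", None),
]))
```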
WWW: http://search.cpan.org/dist/WWW-Google-SiteMap/
the excellent Enchant spellchecker available as a Python module.
The bindings are generated using SWIG. It includes all the functionality
of Enchant with the flexibility of Python and a nice 'Pythonic'
object-oriented interface. It also aims to provide some higher-level
functionality than is available in the C API.
Author: Ryan Kelly <ryan@rfk.id.au>
WWW: http://pyenchant.sourceforge.net/
PR: ports/95284
Submitted by: Nicola Vitale <nivit@email.it>
It provides a lexical scanner and LR parser (constructed by PCCTS),
both of which are efficient and offer good error detection and
recovery; a set of functions for traversing the AST (abstract
syntax tree) generated by the parser; and utility functions for
manipulating strings according to BibTeX conventions.
WWW: http://www.gerg.ca/software/btOOL
PR: ports/94686
Submitted by: Kay Lehmann <kay_lehmann@web.de>
simplifies the process of writing documents and publishing them to
various output formats.
Muse consists of two main parts: an enhanced text-mode for authoring
documents and navigating within Muse projects, and a set of publishing
styles for generating different kinds of output.
WWW: http://www.emacswiki.org/cgi-bin/wiki/MuseMode
PR: ports/93716
Submitted by: Dryice Liu <dryice@dryice.name>
xmldiff uses xmlprpr and diff to display meaningful differences in XML
files in an easy to read format. Output formats available include HTML,
ANSI colour, and regular diff. The coloured modes are particularly
useful for viewing small differences in context within large XML files.
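
The underlying idea, pretty-printing both files so that layout differences
vanish and then running a line diff, can be sketched in Python. This is
not how the xmldiff script itself is written: the function names are
hypothetical, and minidom plus difflib stand in for xmlprpr and diff.

```python
import difflib
import xml.dom.minidom


def pretty(xml_text):
    """Normalize layout: one element per line, blank lines dropped."""
    dom = xml.dom.minidom.parseString(xml_text)
    lines = dom.toprettyxml(indent="  ").splitlines()
    return [line for line in lines if line.strip()]


def xml_diff(old, new):
    """Unified diff of the two documents after pretty-printing."""
    return list(
        difflib.unified_diff(pretty(old), pretty(new), "old", "new", lineterm="")
    )


old = '<a><b x="1"/><c>t</c></a>'
new = '<a>\n  <b x="2"/>\n  <c>t</c>\n</a>'
print("\n".join(xml_diff(old, new)))
```

Only the changed attribute shows up in the output; the fact that one input
is on a single line and the other is indented does not.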
WWW: http://software.decisionsoft.com/tools.html
PR: ports/92947
Submitted by: Paul Chvostek <paul+ports@it.ca>
An XML pretty printer created to format XML that doesn't make use of
mixed content. In the default mode each element is put on a separate
line with consistent indentation. It can also separate attributes onto
individual lines, sort attributes in a specified or alphabetic order,
expand self closing tags, and more.
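
A stripped-down version of the default mode (one element per line,
consistent indentation, attributes in alphabetic order) might look like
this in Python. It is a sketch, not the tool's code: pretty_lines is a
hypothetical name, namespaces are ignored, and, like the tool, it assumes
the input has no mixed content.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def pretty_lines(elem, depth=0, indent="  "):
    """Return one output line per element, attributes sorted by name."""
    pad = indent * depth
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in sorted(elem.attrib.items())
    )
    text = (elem.text or "").strip()
    children = list(elem)
    if not children and not text:
        return [f"{pad}<{elem.tag}{attrs}/>"]
    if not children:
        return [f"{pad}<{elem.tag}{attrs}>{escape(text)}</{elem.tag}>"]
    # No mixed content: text beside child elements is assumed to be layout.
    lines = [f"{pad}<{elem.tag}{attrs}>"]
    for child in children:
        lines.extend(pretty_lines(child, depth + 1, indent))
    lines.append(f"{pad}</{elem.tag}>")
    return lines


root = ET.fromstring('<r b="2" a="1"><x>hi</x><y/></r>')
print("\n".join(pretty_lines(root)))
```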
Note that the distribution calls this tool "xmlpp", but it has been
renamed so as not to conflict with an xmlpp already in the ports tree.
WWW: http://software.decisionsoft.com/tools.html
PR: ports/92946
Submitted by: Paul Chvostek <paul+ports@it.ca>
The po4a (po for anything) project goal is to ease translations
(and more interestingly, the maintenance of translations) using
gettext tools in areas where they were not expected, such as documentation.
This package contains the main libraries of po4a, and the following sub-modules:
- KernelHelp: Help messages of each kernel compilation option.
- Man: Good old manual page format.
- Pod: Perl documentation format.
- Sgml: either debiandoc or docbook DTD.
- Dia: uncompressed Dia diagrams.
- LaTeX: generic TeX or LaTeX format
WWW: http://packages.debian.org/unstable/text/po4a
PR: ports/91532
Submitted by: Meno Abels <meno.abels@adviser.com>
multibyte characters such as UTF-8, EUC-JP, and GB2312, fullwidth
characters such as east Asian characters, combining characters
such as diacritical marks and Thai, and languages which don't
use whitespace between words such as Chinese and Japanese.
WWW: http://packages.debian.org/unstable/perl/libtext-wrapi18n-perl
PR: ports/91532
Submitted by: Meno Abels <meno.abels@adviser.com>
Fakeroot runs a command in an environment where it appears to have
root privileges for file manipulation, by setting LD_PRELOAD to a
library with alternative versions of getuid(), stat(), etc. This
is useful for allowing users to create archives (tar, ar, .deb, .rpm,
etc.) with files in them with root permissions/ownership. Without
fakeroot one would have to have root privileges to create the
constituent files of the archives with the correct permissions and
ownership, and then pack them up, or one would have to construct
the archives directly, without using the archiver.
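
For contrast, the last alternative mentioned above (constructing the
archive directly, without the archiver reading ownership from disk) can be
sketched in Python with the tarfile module. This is not how fakeroot
works; it only shows what building root-owned entries by hand involves.
The function name and the "root" user and group names are illustrative
assumptions.

```python
import io
import tarfile


def build_root_owned_tar(files):
    """Build a tar archive in memory whose entries claim root ownership."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mode = 0o644
            # Ownership is just metadata in the header; no privileges needed.
            info.uid = info.gid = 0
            info.uname = info.gname = "root"
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


archive = build_root_owned_tar({"etc/motd": b"hello\n"})
with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
    for member in tar.getmembers():
        print(member.name, member.uid, member.uname)
```

Doing this by hand for every archive format is what fakeroot avoids: the
ordinary archiver runs unchanged and simply sees root-owned files.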
WWW: http://freshmeat.net/projects/fakeroot
PR: ports/91532
Submitted by: Meno Abels <meno.abels@adviser.com>
This port installs it in the data format for use with the dictd server.
WWW: http://romdict.sourceforge.net/
PR: ports/90620
Submitted by: Ion-Mihai "IOnut" Tetcu <itetcu@people.tecnik93.com>
Its genesis came from the need to use the same data structure as HTML::Template,
but provide Excel files instead. The existing modules don't do the trick, as
they require replication of logic that's already been done within
HTML::Template.
WWW: http://search.cpan.org/dist/Excel-Template/
PR: ports/90044
Submitted by: Espen Tagestad <espen@tagestad.no>
The Heirloom Documentation Tools provide troff, nroff, and related
utilities to format manual pages and other documents for output
on terminals and printers. They are portable and enhanced versions
of the utilities released by Sun as part of OpenSolaris, and, for
pic, grap, mpm, and some minor parts, by Lucent as part of Plan 9.