pkgsrc/textproc/miller/Makefile

# $NetBSD: Makefile,v 1.14 2017/08/14 21:22:55 wiz Exp $

DISTNAME=	mlr-5.2.2
PKGNAME=	${DISTNAME:S/mlr/miller/}
CATEGORIES=	devel
MASTER_SITES=	${MASTER_SITE_GITHUB:=johnkerl/}
GITHUB_PROJECT=	miller
GITHUB_RELEASE=	v${PKGVERSION_NOREV}

MAINTAINER=	pkgsrc-users@NetBSD.org
HOMEPAGE=	https://github.com/johnkerl/miller/
COMMENT=	Command-line CSV processor
LICENSE=	2-clause-bsd

BUILD_DEPENDS+=	asciidoc-[0-9]*:../../textproc/asciidoc

GNU_CONFIGURE=	yes
USE_LIBTOOL=	yes
TEST_TARGET=	check

.include "../../mk/bsd.pkg.mk"
Updated miller to 5.2.2. 5.2.2 This bugfix release delivers a fix for #147 where a memory allocation failed beyond 4GB. 5.2.1 Fix non-x86/gcc7 build error 2017-08-14 23:22:55 +02:00			`# $NetBSD: Makefile,v 1.14 2017/08/14 21:22:55 wiz Exp $`
Import miller-2.0.0 as textproc/miller. Miller is like sed, awk, cut, join, and sort for name-indexed data such as CSV. With Miller, you get to use named fields without needing to count positional indices. This is something the Unix toolkit always could have done, and arguably always should have done. It operates on key-value-pair data while the familiar Unix tools operate on integer-indexed fields: if the natural data structure for the latter is the array, then Miller's natural data structure is the insertion-ordered hash map. This encompasses a variety of data formats, including but not limited to the familiar CSV. (Miller can handle positionally-indexed data as a special case.) 2015-08-28 11:27:10 +02:00
Updated miller to 5.2.2. 2017-08-14 23:22:55 +02:00			`DISTNAME= mlr-5.2.2`
Update miller to 3.4.0. Use release tarball and drop autotools dependencies. Changes in 3.4.0: JSON, reshape, regex captures, and more Primary features: JSON is now a supported format for input and output. Miller handles tabular data, and JSON supports arbitrarily deeply nested data structures, so if you want general JSON processing you should use jq. But if you have tabular data represented in JSON then Miller can now handle that for you. Please see the reference page and the FAQ. Reshape is a standard data-processing idiom, now available in Miller: http://johnkerl.org/miller/doc/reference.html#reshape Incidentally (not part of this release, but new since the last release) Miller is now available in FreeBSD's package manager: https://www.freshports.org/textproc/miller/. A full list of distributions containing Miller may be found here. Miller is not yet available from within Fedora/CentOS, but as a step toward this goal, an SRPM is included in this release (see file-list below). DSL enhancements for mlr put and mlr filter: Regex captures \0 through \9: http://johnkerl.org/miller/doc/reference.html#Regex_captures Ternary operator in expression right-hand sides: e.g. mlr put '$y = $x < 0.5 ? 0 : 1' Boolean literals true and false Final semicolon is now allowed: e.g. mlr put '$x=1;$y=2;' Environment variables are now accessible, where environment-variable names may be string literals or arbitrary expressions: mlr put '$home = ENV["HOME"]' or mlr put '$value = ENV[$name]'. While records are still string-to-string maps for input and output, and between then statements, types are preserved between multiple statements within a put. Example: mlr put '$y = string($x); $z = $y . $y' works as expected, without requiring mlr put '$y = string($x); $z = string($y) . string($y)' as before. Bug fixes: Mixed-format join, e.g. CSV file joined with DKVP file, was incorrectly computing default separators (IRS, IFS, IPS). This resulted in records not being joined together. 
Segmentation violation on non-standard-input read of files with size an exact multiple of page size and not ending in IRS, e.g. newline. (This is less of a corner case than it sounds: for example, leave a long-running program running with output redirected to a file, then in a sleep-and-process loop, have Miller process that file. The former program's stdio library will likely be doing block-sized buffered I/O, where block sizes will often be multiples of system page size and the block will almost surely not end in a newline.) Acknowledgements: Big thank-yous to @gregfr and @aaronwolen for feature requests including reshape and regex captures, and to @jungle-boogie for his work getting Miller into FreeBSD. Also, ongoing thanks to @0-wiz-0 for his past work on configure support, making it possible for Miller to be put to use in multiple operating systems. 3.3.2 Bootstrap sampling, EWMA, merge-fields, isnull/isnotnull functions (released by @johnkerl on Jan 11) Bootstrap sampling in mlr bootstrap: http://johnkerl.org/miller/doc/reference.html#bootstrap. Compare to reservoir sampling in mlr sample: http://johnkerl.org/miller/doc/reference.html#sample. Exponentially weighted moving averages in mlr step -a ewma: principally useful for smoothing of noisy time series, e.g. finely sampled system-resource utilization to give one of many possible examples. Please see http://johnkerl.org/miller/doc/reference.html#step. "Horizontal" univariate statistics in mlr merge-fields, compared to mlr stats which is "vertical". Also allows collapsing multiple fields into one, such as in_bytes and out_bytes data fields summing to bytes_sum. This can also be done easily using mlr put. However, mlr merge-fields allows aggregation of more than just a pair of field names, and supports pattern-matching on field names. Please see http://johnkerl.org/miller/doc/reference.html#merge-fields for more information. 
isnull and isnotnull functions for mlr filter and mlr put. stats1, stats2, merge-fields, step, and top correctly handle not only missing fields (in the row-heterogeneous-data case) but also null-valued fields. Minor memory-management improvements. 2016-02-18 11:07:48 +01:00			`PKGNAME= ${DISTNAME:S/mlr/miller/}`
Import miller-2.0.0 as textproc/miller. 2015-08-28 11:27:10 +02:00			`CATEGORIES= devel`
			`MASTER_SITES= ${MASTER_SITE_GITHUB:=johnkerl/}`
			`GITHUB_PROJECT= miller`
Update miller to 3.4.0. Use release tarball and drop autotools dependencies. 2016-02-18 11:07:48 +01:00			`GITHUB_RELEASE= v${PKGVERSION_NOREV}`
Import miller-2.0.0 as textproc/miller. 2015-08-28 11:27:10 +02:00
			`MAINTAINER= pkgsrc-users@NetBSD.org`
Switch github HOMEPAGEs to https. 2017-07-31 00:32:10 +02:00			`HOMEPAGE= https://github.com/johnkerl/miller/`
Import miller-2.0.0 as textproc/miller. 2015-08-28 11:27:10 +02:00			`COMMENT= Command-line CSV processor`
			`LICENSE= 2-clause-bsd`

Update miller to 2.1.1. Changes: v2.1.1 Incremental read-performance increase for CSV format While #51 is still underway, already there is nearly a 2x read-performance increase in v2.1.1 over v2.1.0. v2.1.0 Minor enhancements and bug fixes Highlights: travis-CI integration (thanks @SikhNerd!); hour-minute-second functions; fixed pretty-print alignment of UTF-8 data. 2015-09-04 15:46:37 +02:00			`BUILD_DEPENDS+= asciidoc-[0-9]*:../../textproc/asciidoc`

Update miller to 3.2.2: Many changes; speed ups, autoconf support, .... 2015-12-30 00:43:18 +01:00			`GNU_CONFIGURE= yes`
			`USE_LIBTOOL= yes`
Update miller to 3.4.0. Use release tarball and drop autotools dependencies. 2016-02-18 11:07:48 +01:00			`TEST_TARGET= check`
Import miller-2.0.0 as textproc/miller. 2015-08-28 11:27:10 +02:00
			`.include "../../mk/bsd.pkg.mk"`