Data using simple preprocessing
parent 68e5f235a1
commit b5f573f927
504139
data/eu_train_simple.tsv
Normal file
File diff suppressed because it is too large
4
preprocessing/preprocess-simple.sh
Normal file
@@ -0,0 +1,4 @@
+#!/usr/bin/env sh
+
+sed -E 's|^--?||;s|^"?–||;s|—||;s|―||;s|^_||;s|^ ||' < "${1:-/dev/stdin}" |
+uniq | paste - -
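To illustrate what the pipeline in preprocess-simple.sh does, here is a minimal sketch that runs the same sed, uniq, and paste stages on a few made-up lines. The function name `preprocess_simple` and the sample input are assumptions for illustration only; they are not part of the commit, and the real input format of the training data is not shown here.

```sh
#!/usr/bin/env sh
# Hypothetical wrapper around the same pipeline, reading from stdin.
preprocess_simple() {
    # sed: strip one or two leading hyphens, a leading en dash (optionally
    #      preceded by a double quote), the first em dash and the first
    #      horizontal bar on the line, a leading underscore, then one
    #      leading space.
    # uniq: drop adjacent duplicate lines.
    # paste - -: join each pair of consecutive lines with a tab.
    sed -E 's|^--?||;s|^"?–||;s|—||;s|―||;s|^_||;s|^ ||' | uniq | paste - -
}

# Made-up sample input: four lines become two tab-separated rows.
printf '%s\n' '-- first line' '_second line' '- third' 'fourth' | preprocess_simple
```

With this sample the result is two rows, `first line<TAB>second line` and `third<TAB>fourth`. Because `uniq` runs before `paste`, an adjacent duplicate line in the input would be dropped and shift the pairing of every later line, which is worth keeping in mind when checking the generated TSV.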