Hi!
I would like to build a minimal program that splits a text into tokens. Its output should ideally be identical to the result of:
gposttl input_file | awk '{print $1}'
which prints just the tokens (the first column of the gposttl output), one per line.
An example:
That's a word and this is a number 123.456. I would like to write... but I can't resist. Let's hope John's forgiveness will embrace this fool act, hopefully.
And the output:
That
's
a
word
and
this
is
a
number
123.456
.
I
would
like
to
write
...
but
I
can
n't
resist
.
Let's
hope
John
's
forgiveness
will
embrace
this
fool
act
,
hopefully
.
My question is not about how to do it, but rather about how to test whether my program is correct.
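To make the kind of check I have in mind concrete, here is a minimal sketch in Python. It assumes the reference output of the command above and my program's output are both available as text with one token per line. The function names are made up for illustration. It diffs the two token lists and treats an empty diff as a pass, which only shows agreement on the inputs tried, not correctness in general.

```python
import difflib


def read_tokens(text):
    """Split text into tokens, one per line, ignoring blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare_tokens(expected, actual):
    """Return unified-diff lines between two token lists (empty if identical)."""
    return list(
        difflib.unified_diff(
            expected, actual, fromfile="reference", tofile="mine", lineterm=""
        )
    )


if __name__ == "__main__":
    # Hypothetical data: reference tokens vs. a tokenizer that fails to split "can't".
    reference = read_tokens("I\ncan\nn't\nresist\n.\n")
    mine = read_tokens("I\ncan't\nresist\n.\n")
    for line in compare_tokens(reference, mine):
        print(line)
```

In practice the two strings would come from files, for example the saved output of the gposttl command and the saved output of my own program, and the check would be run over many input texts rather than one.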