
Here is an example that works. The trick is to add N - 1 spaces to both the front and the back of the input string, so that it is padded. After that, it is simply a matter of taking the appropriate slices and putting them in the list that is returned.

```
#!/usr/bin/env python
# File: n-gram.py
def N_Gram(N, text):
    NList = []                       # start with an empty list
    if N > 1:
        space = " " * (N - 1)        # N - 1 spaces of padding
        text = space + text + space  # pad both the front and the back
    # append the slices [i:i+N] to NList
    for i in range(len(text) - (N - 1)):
        NList.append(text[i:i+N])
    return NList                     # return the list

# test code
for i in range(5):
    print(N_Gram(i + 1, "text"))

# more test code
nList = N_Gram(7, "Here is a lot of text to print")
for ngram in nList:
    print('"' + ngram + '"')
```

The function N_Gram outputs exactly what you seem to want.
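
If you prefer something more compact, the same pad-and-slice idea can be sketched with a list comprehension. Note that the name `padded_ngrams` and the `pad` parameter are made up for this illustration; they are not part of the script above.

```python
def padded_ngrams(n, text, pad=" "):
    """Return every n-character slice of text after padding both ends.

    Hypothetical helper for illustration; `pad` is assumed to be a
    single character.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    padding = pad * (n - 1)            # empty string when n == 1
    padded = padding + text + padding
    # a string of length L contains L - n + 1 windows of width n
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(padded_ngrams(2, "text"))  # [' t', 'te', 'ex', 'xt', 't ']
```

Making the padding character a parameter lets you use a visible marker such as `"_"` while debugging, so the leading and trailing n-grams are easier to tell apart.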

Good luck and happy coding.

_____

René


How serious is the sparse data problem? Investigate the performance of n-gram taggers as n increases from 1 to 6. Tabulate the accuracy score. Estimate the training data required for these taggers, assuming a vocabulary size of 10^5 and a tagset size of 10^2. Please help me to solve this exercise!
