Dear All,

I am a total newbie to Python and to programming in general. I know I'd find more materials for Python 2, but Python 3 was a deliberate choice.

That said, I have gone through:

http://www.daniweb.com/forums/thread173960-2.html

and tried to assemble my spell checker, and ended up with the following code:

#!/usr/bin/python3
# Filename: spellcheck2.py

correct = []
unknown = []

dict_file = open("DictionaryE.txt", "r").readlines()

for i in range(len(dict_file)):
    dict_file[i] = dict_file[i][0:len(dict_file[i])-2] #eliminate \n, line characters in the dictionary

input_text = open("text.txt", "r").read()
input_text = input_text.lower() #avoid problems with CAPS

list_words = input_text.split(' ')

print(list_words)

for word in list_words:
    if word in dict_file:
        correct.append(word)
    else:
        unknown.append(word)

print()
print("Correct words are: ")
print()
for x in range(len(correct)):
    print(x+1, '\t', correct[x])
 
print()
print("Unknown words are: ")
print()
for z in range(len(unknown)):
    print(z+1, '\t', unknown[z])

Please find attached the files with the dictionary and the sample text.

I really cannot understand why some words that certainly are in the dictionary (like "very" and "among") end up in the unknown words list. Any help would be welcome.

As a secondary issue, I couldn't figure out how to use multiple separators (punctuation in addition to spaces) with the "split" method for strings.

Thanks in advance for any help.


Yeti

Attachments
a
aah
aahed
aahing
aahs
aardvark
aardvarks
aardwolf
ab
abaci
aback
abacus
abacuses
abaft
abalone
abalones
abandon
abandoned
abandonedly
abandonee
abandoner
abandoners
abandoning
abandonment
abandonments
abandons
abase
abased
abasedly
abasement
abaser
abasers
abases
abash
abashed
abashedly
abashes
abashing
abashment
abashments
abasing
abatable
abate
abated
abatement
abatements
abater
abaters
abates
abating
abatis
abatises
abator
abattoir
abattoirs
abbacies
abbacy
abbatial
abbe
abbes
abbess
abbesses
abbey
abbeys
abbot
abbotcies
abbotcy
abbots
abbotship
abbotships
abbott
abbr
abbrev
abbreviate
abbreviated
abbreviates
abbreviating
abbreviation
abbreviations
abbreviator
abbreviators
abc
abdicable
abdicate
abdicated
abdicates
abdicating
abdication
abdications
abdicator
abdomen
abdomens
abdominal
abdominally
abduct
abducted
abducting
abduction
abductions
abductor
abductors
abducts
abeam
abecedarian
abecedarians
abed
aberdeen
aberrance
aberrancies
aberrancy
aberrant
aberrantly
aberrants
aberration
aberrational
aberrations
abet
abetment
abets
abettal
abettals
abetted
abetter
abetters
abetting
abettor
abettors
abeyance
abeyances
abeyancies
abeyancy
abeyant
abhor
abhorred
abhorrence
abhorrences
abhorrent
abhorrently
abhorrer
abhorrers
abhorring
abhors
abidance
abide
abided
abider
abiders
abides
abiding
abidingly
abidingness
abigail
abilene
abilities
ability
abiotic
abject
abjection
abjectly
abjectness
abjuration
abjurations
abjuratory
abjure
abjured
abjurer
abjurers
abjures
abjuring
ablate
ablated
ablates
ablating
ablation
ablations
ablatival
ablative
ablatively
ablatives
ablaze
able
ableness
abler
ables
ablest
ablings
abloom
ablush
abluted
ablution
ablutionary
ablutions
ably
abnegate
abnegated
abnegates
abnegating
abnegation
abnegations
abnegator
abnegators
abner
abnormal
abnormalities
abnormality
abnormally
abnormals
abo
aboard
abode
aboded
abodes
aboding
aboil
abolish
abolishable
abolished
abolisher
abolishers
abolishes
abolishing
abolishment
abolition
abolitionary
abolitionism
abolitionist
abolitionists
abominable
abominably
abominate
abominated
abominates
abominating
abomination
abominations
abominator
abominators
aboral
aboriginal
aboriginally
aborigine
aborigines
aborning
abort
aborted
aborter
aborters
abortifacient
aborting
abortion
abortional
abortionist
abortionists
abortions
abortive
abortively
abortiveness
abortogenic
aborts
abound
abounded
abounding
abounds
about
above
aboveboard
aboveground
aboves
abracadabra
abradant
abradants
abrade
abraded
abrader
abraders
abrades
abrading
abraham
abrasion
abrasions
abrasive
abrasively
abrasiveness
abrasives
abreact
abreacted
abreacting
abreaction
abreacts
abreast
abridge
abridged
abridgement
abridgements
abridger
abridgers
abridges
abridging
abridgment
abridgments
abroad
abrogate
abrogated
abrogates
abrogating
abrogation
abrogations
abrogative
abrogator
abrogators
abrupt
abrupter
abruptest
abruptly
abruptness
abs
abscam
abscess
abscessed
abscesses
abscessing
abscise
abscised
abscises
abscising
abscissa
abscissae
abscissas
abscission
abscissions
abscond
absconded
absconder
absconders
absconding
absconds
absence
absences
absent
absented
absentee
absenteeism
absentees
absenter
absenters
absentia
absenting
absently
absentminded
absentmindedly
absentmindedness
absents
absinth
absinthe
absinthes
absinths
absolute
absolutely
absoluteness
absoluter
absolutes
absolutest
absolution
absolutions
absolutism
absolutist
absolutistic
absolutists
absolvable
absolve
absolved
absolver
absolvers
absolves
absolving
absorb
absorbability
absorbable
absorbed
absorbencies
absorbency
absorbent
absorbents
absorber
absorbers
absorbing
absorbingly
absorbs
absorption
absorptions
absorptive
abstain
abstained
abstainer
abstainers
abstaining
abstains
abstemious
abstemiously
abstemiousness
abstention
abstentionism
abstentionist
abstentions
abstentious
abstinence
abstinent
abstinently
abstract
abstracted
abstractedly
abstractedness
abstracter
abstracters
abstracting
abstraction
abstractionism
abstractionist
abstractionists
abstractions
abstractly
abstractness
abstractor
abstractors
abstracts
abstricts
abstruse
abstrusely
abstruseness
abstruser
abstrusest
absurd
absurder
absurdest
absurdities
absurdity
absurdly
absurdness
absurds
absurdum
abt
abubble
abundance
abundances
abundant
abundantly
abusable
abusage
abuse
abused
abuser
abusers
abuses
abusing
abusive
abusively
abusiveness
abut
abutment
abutments
abuts
abuttal
abuttals
abutted
abutter
abutters
abutting
abuzz
abyes
abysm
abysmal
abysmally
abysms
abyss
abyssal
abysses
abyssinia
abyssinian
abyssinians
ac
acacia
acacias
academe
academes
academia
academias
academic
academical
academically
academician
academicians
academicianship
academicism
academics
academies
academy
acadia
acanthi
acanthus
acanthuses
acapulco
accede
acceded
accedence
acceder
acceders
accedes
acceding
accelerable
accelerando
accelerant
accelerate
accelerated
accelerates
accelerating
acceleration
accelerations
accelerative
accelerator
accelerators
accelerometer
accelerometers
accent
accented
accenting
accents
accentual
accentuate
accentuated
accentuates
accentuating
accentuation
accentuator
accept
acceptability
acceptable
acceptableness
acceptably
acceptance
acceptances
acceptant
acceptation
accepted
acceptedly
acceptee
acceptees
accepter
accepters
accepting
acceptive
acceptor
accepts
access
accessability
accessed
accesses
accessibility
accessible
accessibleness
accessibly
accessing
accession
accessions
accessories
accessorily
accessoriness
accessors
accessory
accidence
accident
accidental
accidentally
accidentalness
accidentals
accidents
accidie
accidies
acclaim
acclaimed
acclaimer
acclaimers
acclaiming
acclaims
acclamation
acclamations
acclimate
acclimated
acclimates
acclimating
acclimation
acclimatization
acclimatize
acclimatized
acclimatizer
acclimatizes
acclimatizing
acclivities
acclivitous
acclivity
accolade
accolades
accommodate
accommodated
accommodates
accommodating
accommodatingly
accommodation
accommodational
accommodations
accommodative
accommodatively
accommodativeness
accommodator
accommodators
accompanied
accompanies
accompaniment
accompaniments
accompanist
accompanists
accompany
accompanying
accompanyist
accompli
accomplice
accomplices
accomplis
accomplish
accomplishable
accomplished
accomplisher
accomplishers
accomplishes
accomplishing
accomplishment
accomplishments
accord
accordable
accordance
accordant
accordantly
accorded
accorder
accorders
according
accordingly
accordion
accordionist
accordionists
accordions
accords
accost
accostable
accosted
accosting
accosts
account
accountability
accountable
accountableness
accountably
accountancy
accountant
accountants
accountantship
accounted
accounter
accounters
accounting
accounts
accouter
accoutered
accoutering
accouterment
accouterments
accouters
accoutred
accoutrement
accoutres
accoutring
accredit
accreditation
accredited
accreditee
accrediting
accreditment
accredits
accrete
accreted
accretes
accreting
accretion
accretionary
accretions
accruable
accrual
accruals
accrue
accrued
accruement
accrues
accruing
acct
accts
acculturate
acculturation
acculturational
acculturative
accumulable
accumulate
accumulated
accumulates
accumulating
accumulation
accumulations
accumulative
accumulatively
accumulativeness
accumulator
accumulators
accuracies
accuracy
accurate
accurately
accurateness
accurse
accursed
accursedly
accursedness
accurst
accusable
accusal
accusals
accusant
accusation
accusations
accusative
accusatively
accusativeness
accusatives
accusatorial
accusatorially
accusatory
accusatrix
accusatrixes
accuse
accused
accuser
accusers
accuses
accusing
accusingly
accusive
accusor
accustom
accustomed
accustoming
accustoms
ace
aced
acerb
acerbate
acerbated
acerbates
acerbating
acerber
acerbest
acerbic
acerbities
acerbity
acerola
acerose
acerous
aces
acetaldehyde
acetaminophen
acetanilide
acetate
acetates
acetic
acetified
acetifies
acetify
acetifying
acetone
acetones
acetonic
acetylcholine
acetylene
acetylsalicylic
ache
ached
achene
achenes
achenial
aches
achier
achiest
achievable
achieve
achieved
achievement
achievements
achiever
achievers
achieves
achieving
achilles
achiness
aching
achingly
achoo
achordate
achromat
achromatic
achromatically
achromatism
achromats
achy
acid
acidhead
acidheads
acidic
acidifiable
acidification
acidified
acidifier
acidifiers
acidifies
acidify
acidifying
acidities
acidity
acidly
acidness
acidophilus
acidoses
acidosis
acidotic
acids
acidulate
acidulated
acidulates
acidulating
acidulation
acidulous
acidulously
acidulousness
acidy
acing
acknowledge
acknowledgeable
acknowledged
acknowledgedly
acknowledgement
acknowledgements
acknowledger
acknowledgers
acknowledges
acknowledging
acknowledgment
acknowledgments
aclu
acme
acmes
acne
acned
acnes
acoin
acolyte
acolytes
aconite
aconites
acorn
acorns
acoustic
acoustical
acoustically
acoustics
acquaint
acquaintance
acquaintances
acquaintanceship
acquaintanceships
acquainted
acquainting
acquaints
acquiesce
acquiesced
acquiescence
acquiescent
acquiescently
acquiesces
acquiescing
acquiesence
acquirable
acquire
acquired
acquirement
acquirements
acquirer
acquirers
acquires
acquiring
acquisition
acquisitions
acquisitive
acquisitively
acquisitiveness
acquit
acquits
acquittal
acquittals
acquitted
acquitter
acquitting
acre
acreage
acreages
acred
acres
acrid
acrider
acridest
acridities
acridity
acridly
acr
Python is an interpreted general-purpose high-level programming language whose design philosophy emphasizes code readability Python aims to combine remarkable power with very clear syntax and its standard library is large and comprehensive Its use of indentation for block delimiters is unusual among popular programming languages

Maybe better?

for i in range(len(dict_file)):
    #dict_file[i] = dict_file[i][0:len(dict_file[i])-2] #eliminate \n, line characters in the dictionary
    dict_file[i] = dict_file[i].strip()
list_words = input_text.strip().split(' ')
Unknown words are: 
()
(1, '\t', 'general-purpose')
(2, '\t', 'high-level')

Python 2.5 Linux
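
A slightly tidier version of the same fix, as a sketch only. The function names load_dictionary and check_words are made up for illustration, and io.StringIO stands in for the real DictionaryE.txt so the snippet runs on its own. Storing the words in a set instead of a list should also make each "in" test much faster on a large dictionary.

```python
import io


def load_dictionary(lines):
    """Return the set of stripped, lower-cased words from an iterable of lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def check_words(words, dictionary):
    """Split words into (correct, unknown) lists, keeping their original order."""
    correct, unknown = [], []
    for word in words:
        if word in dictionary:
            correct.append(word)
        else:
            unknown.append(word)
    return correct, unknown


# Stand-in for open("DictionaryE.txt"): StringIO yields lines like a real file.
fake_file = io.StringIO("among\nlanguages\nvery\n")
dictionary = load_dictionary(fake_file)

correct, unknown = check_words("very clear among".split(), dictionary)
print("Correct:", correct)
print("Unknown:", unknown)
```

With a real file, something like "with open("DictionaryE.txt") as f: dictionary = load_dictionary(f)" would also close the file when done.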


Thank you so much -ordi- (also for the speed in the reply!!).

That indeed did the trick. I will go through the documentation on "strip" to try to get a better handle on it. I presume that the problem was with strange characters in between the sample words, right?

Cheers,


Yeti

Yeah, strip() removes that.
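
To be a bit more precise, the likely culprit is the slice in the original loop rather than anything in the sample text. Assuming the script runs under Python 3 (as the shebang line says), a file opened in text mode has its line endings translated to a single '\n', so cutting two characters off each line removes the newline and the last letter of the word. The helper name chop_two below is only for illustration:

```python
def chop_two(line):
    """Reproduce the original slice: drop the last two characters of a line."""
    return line[0:len(line) - 2]


line = "very\n"  # what readlines() returns for a dictionary entry in text mode

print(repr(chop_two(line)))   # the newline AND the final letter are gone
print(repr(line.strip()))     # only the surrounding whitespace is gone
print(repr(line.rstrip("\n")))  # removes just the trailing newline, if any
```

That would explain why "very" and "among" ended up as unknown: the list held their truncated forms instead. strip() does not care how many whitespace characters are at the end of the line, so it works either way.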

This is not a good way to do it, but it works:

input_text = input_text.replace('-', ' ')

list_words = input_text.strip().split()
Unknown words are: 
()

Maybe the approach described here is better: http://reliablybroken.com/b/2010/04/split-a-file-on-any-character-in-python/

or http://www.daniweb.com/forums/thread338875.html
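
For the multiple-separators question, one option from the standard library is re.split, which splits on a pattern instead of a single string. This is a minimal sketch, not taken from the links above; the names SEPARATORS and split_words are made up, and treating every ASCII punctuation character as a separator is an assumption (it also breaks "high-level" into two words).

```python
import re
import string

# Character class matching any whitespace or ASCII punctuation character.
# re.escape() protects characters such as ']' and '\' inside the brackets.
SEPARATORS = re.compile("[" + re.escape(string.punctuation) + r"\s]+")


def split_words(text):
    """Lower-case text and split it on runs of whitespace or punctuation."""
    pieces = SEPARATORS.split(text.lower())
    # A separator at the start or end of the text leaves an empty string behind.
    return [piece for piece in pieces if piece]


print(split_words("Python is high-level, general-purpose!"))
```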


Well, I actually found your solution quite intelligent.

However, just to avoid writing a separate "replace" line for each punctuation character, I used "string.punctuation" like this:

import string

for i in string.punctuation:
    input_text = input_text.replace(i, ' ')

This is certainly not the best way to do it, but at least I understand it and, for the moment, that is better than simply copying this stuff:

http://stackoverflow.com/questions/265960/best-way-to-strip-punctuation-from-a-string-in-python
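
For later reference, the same idea as the loop above can be written in one pass with str.translate in Python 3. This is only a sketch of that technique, not the exact code from the link; the names PUNCT_TO_SPACE and words_without_punctuation are made up for illustration.

```python
import string

# Translation table mapping every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def words_without_punctuation(text):
    """Lower-case text, turn punctuation into spaces, and split on whitespace."""
    return text.lower().translate(PUNCT_TO_SPACE).split()


print(words_without_punctuation("It's clear, very clear."))
```

Calling split() with no argument also takes care of the double spaces that the replacement leaves behind.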

Thanks once again -ordi-
