
How to print words that only contain letters from a list?


I have recently been trying to write a program in Python 3 that reads a text file containing 23005 words. The user then enters a string of 9 characters, which the program uses to form words and compare them to the ones in the text file.

I want to print words that contain between 4 and 9 letters and that also contain the letter in the middle of my list. For example, if the user enters the string "anitsksem", then the fifth letter, "s", must be present in the word.

Here is how far I have gotten on my own:

# Open selected file & read
filen = open("svenskaOrdUTF-8.txt", "r")

# Read all rows and store them in a list
wordList = filen.readlines()

# Close File
filen.close()

# letterList index
i = 0
# List of letters that user will input
letterList = []
# List of words that are our correct answers
solvedList = []

# User inputs 9 letters that will be stored in our letterList
string = input(str("Ange Nio Bokstäver: "))
userInput = False

# Checks if user input is correct
while userInput == False:
   # if the string is equal to 9 letters
   # insert letter into our letterList.
   # also set userInput to True
    if len(string) == 9:
        userInput = True
        for char in string:
            letterList.insert(i, char)
            i += 1

    # If string not equal to 9 ask user for a new input
    elif len(string) != 9:
        print("Du har inte angivit nio bokstäver")
        string = input(str("Ange Nio Bokstäver: "))

# For each word in wordList
# and for each char within that word
# check if said word contains a letter from our letterList
# if it does and meets the requirements to be a correct answer
# add said word to our solvedList

for word in wordList:
    for char in word:
        if char in letterList:
            if len(word) >= 4 and len(word) <= 9 and letterList[4] in word:
                print("Char:", word)
                solvedList.append(word)

The issue I run into is that instead of printing words which contain only letters from my letterList, it prints words which contain at least one letter from my letterList. This also means that some words are printed multiple times, for example if the word contains several letters from letterList.

I've been trying to solve these problems for a while but I just can't seem to figure it out. I have also tried using permutations to create all possible combinations of the letters in my list and then comparing them to my word list, but I felt that solution was too slow given the number of combinations that must be created.

Also, since I'm fairly new to Python, I would really appreciate any general tips you have to share.


Solution

  • You get the same word multiple times mainly because you iterate over each character of a word and, whenever that character is in letterList, you print and append the whole word.

    Instead, iterate word by word rather than character by character, and use a with context manager so the file is closed automatically:

    with open('american-english') as f:
        for w in f:
            w = w.strip()
            cond = all(i in letterList for i in w) and letterList[4] in w
            if 4 <= len(w) <= 9 and cond:
                print(w)
    

    Here cond keeps the if statement short, all(..) checks that every character in the word is in letterList, and w.strip() removes surrounding whitespace such as the trailing newline.

    Additionally, don't use insert to populate letterList when the input is 9 letters long. Just pass the string to list and the list is created with the same contents, but faster:

    This:

    if len(string) == 9:
        userInput = True
        for char in string:
            letterList.insert(i, char)
            i += 1
    

    Can be written as:

    if len(string) == 9:
        userInput = True
        letterList = list(string)
    

    With these changes, the initial open and readlines calls are no longer needed, nor is the initialization of letterList.
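
Putting the pieces above together, here is a minimal sketch of the filter as a function that can be tried without the word file. The name find_words, its parameters, and the sample words are hypothetical, chosen only for illustration. It assumes the required letter is the middle character of the input (index 4 for 9 letters) and uses a set for the membership test.

```python
def find_words(lines, letters, min_len=4, max_len=9):
    """Return words made only of `letters` that contain the middle letter."""
    allowed = set(letters)                  # set membership tests are fast
    required = letters[len(letters) // 2]   # middle letter (index 4 of 9)
    matches = []
    for line in lines:
        word = line.strip()                 # drop the trailing newline
        if not min_len <= len(word) <= max_len:
            continue                        # wrong length, skip early
        if required in word and all(ch in allowed for ch in word):
            matches.append(word)
    return matches


# Hypothetical in-memory sample standing in for the real word file.
sample = ["sten\n", "mask\n", "katt\n", "is\n", "minst\n", "sold\n", "anis\n"]
print(find_words(sample, "anitsksem"))
```

To run it against the real file, pass the open file object as lines, since iterating over a file yields one line at a time. Like the all(..) check above, this only tests which letters occur, so a word may use a letter more often than it appears in the input.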