I have a list of file names in python like this:
HelloWorld.csv
hello_windsor.pdf
some_file_i_need.jpg
san_francisco.png
Another.file.txt
A file name.rar
I am looking for an IntelliJ style search algorithm where you can enter whole words or simply the first letter of each word in the file name, or a combination of both. Example searches:
hw -> HelloWorld.csv, hello_windsor.pdf
hwor -> HelloWorld.csv
winds -> hello_windsor.pdf
sf -> some_file_i_need.jpg, san_francisco.png
sfin -> some_file_i_need.jpg
file need -> some_file_i_need.jpg
sfr -> san_francisco.png
file -> some_file_i_need.jpg, Another.file.txt, A file name.rar
file another -> Another.file.txt
fnrar -> A file name.rar
You get the idea.
Are there any Python packages that can do this? Ideally they'd also rank matches by 'frecency' (how often and how recently the files have been accessed) as well as by how strong the match is.
I know pylucene is one option, but it seems very heavyweight given that the list of file names is short and I have no interest in searching the contents of the files. Are there any other options?
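You can do this with regular expressions (Python's re module) by writing a small function. The function below splits the search term on whitespace and keeps a file name only if every word matches it: the letters of the word must appear in the file name in the same order, case-insensitively, with any characters allowed in between.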
import re

def intellij_search(search_term, file_list):
    # Split the search term into whitespace-separated words
    words = search_term.split()
    # Collect the matching file names here
    matching_files = []
    for file_name in file_list:
        # Track whether every word has matched this file name so far
        matches_all_words = True
        # Iterate over each word in the search term
        for word in words:
            # Build a pattern that matches the word's characters in order,
            # with anything in between. Each character is escaped so that
            # '.', '(' and similar are treated literally.
            pattern = '.*'.join(map(re.escape, word))
            # Check if the file name matches the pattern
            if not re.search(pattern, file_name, re.IGNORECASE):
                # This word does not match, so reject the file name
                matches_all_words = False
                break
        # If the file name matches all words in the search term, keep it
        if matches_all_words:
            matching_files.append(file_name)
    # Return the matching file names
    return matching_files

files = ['HelloWorld.csv', 'hello_windsor.pdf', 'some_file_i_need.jpg', 'san_francisco.png', 'Another.file.txt', 'A file name.rar']

# Example usage:
print(intellij_search('hw', files))
print(intellij_search('sf', files))
print(intellij_search('Afn', files))
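Note that the '.*'.join(...) pattern accepts the letters anywhere in the name as long as they are in order, so 'hw' would also match a name such as 'show.txt'. If that is too loose, a stricter variant is sketched below. It assumes a "word" starts at punctuation, a space, or a camelCase boundary, that the extension counts as a word, and that words may be skipped between pieces of a search token. The names split_words, token_matches and prefix_search are made up for this illustration.

```python
import re

# Assumption: words are runs of letters or digits, split at punctuation,
# spaces, and camelCase boundaries (e.g. "HelloWorld" -> "Hello", "World").
_WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_words(file_name):
    """Return the lowercase words of a file name, extension included."""
    return [w.lower() for w in _WORD_RE.findall(file_name)]


def token_matches(token, words, start=0):
    """True if token can be cut into pieces, each a prefix of a later word.

    Pieces must use words in order, but words may be skipped.
    """
    if not token:
        return True
    for i in range(start, len(words)):
        word = words[i]
        # Try each prefix of this word that agrees with the token so far.
        for k in range(1, min(len(token), len(word)) + 1):
            if word[:k] != token[:k]:
                break
            if token_matches(token[k:], words, i + 1):
                return True
    return False


def prefix_search(search_term, file_list):
    """Keep file names for which every whitespace-separated token matches."""
    tokens = search_term.lower().split()
    results = []
    for name in file_list:
        words = split_words(name)
        # Tokens are checked independently, so their order does not matter.
        if all(token_matches(t, words) for t in tokens):
            results.append(name)
    return results


files = ['HelloWorld.csv', 'hello_windsor.pdf', 'some_file_i_need.jpg',
         'san_francisco.png', 'Another.file.txt', 'A file name.rar']
print(prefix_search('hwor', files))
print(prefix_search('file another', files))
```

Because each token is matched on its own, 'file another' finds Another.file.txt even though the words appear in the opposite order in the name.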
I am not sure whether this is exactly what you are looking for.
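For the frecency part of the question, here is one possible sketch using only the standard library. It assumes you record access timestamps yourself in a dict mapping file name to a list of Unix times (the access_log argument is hypothetical), and it uses exponential decay with a one-week half-life as the scoring rule. That rule is just one common choice, not a standard definition of frecency.

```python
import time

WEEK_SECONDS = 7 * 24 * 3600


def frecency(access_times, now, half_life=WEEK_SECONDS):
    """Sum of decayed accesses: a fresh access counts 1.0 and its weight
    halves every half_life seconds (an assumed scoring rule)."""
    return sum(0.5 ** ((now - t) / half_life) for t in access_times)


def rank_by_frecency(matches, access_log, now=None):
    """Sort matching file names from highest to lowest frecency score.

    access_log is a hypothetical dict: file name -> list of Unix timestamps.
    Names with no recorded accesses score 0 and end up last.
    """
    now = time.time() if now is None else now
    return sorted(
        matches,
        key=lambda name: frecency(access_log.get(name, []), now),
        reverse=True,
    )
```

To take match strength into account as well, you could multiply or add a match score to the frecency value inside the sort key; how to weight the two is a tuning decision.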