java linear-search

Is there a reason .contains() would not work with scanner?


I am working on a linear search problem that takes a file of names and compares it to a phonebook file of names and numbers. My only task right now is to count how many of those names appear in the phonebook file. Everything works as expected up until the if statement in my main method, but for the life of me, I cannot figure out what I am doing wrong. Through testing, I can print out all the lines in both files, so I know I am reading the files correctly. The output should be 500 / 500, as all the names are in the phonebook file of over a million lines. Please help.

package phonebook;

import java.util.Objects;
import java.util.Scanner;
import java.io.File;
import java.io.FileNotFoundException;

public class Main {
    final static String NAME_PATH = "C:\\Users\\{user}\\Downloads\\find.txt";
    final static String PHONEBOOK_PATH = "C:\\Users\\{user}\\Downloads\\directory.txt";

    private static String[] namesList(File file) {
        int count = 0;
        try (Scanner scanner = new Scanner(file)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
            String[] names = new String[count];
            Scanner sc = new Scanner(file);
            for (int i = 0; i < count; i++) {
                names[i] = sc.nextLine();
            }
            return names;
        } catch (FileNotFoundException e) {
            System.out.printf("File not found: %s", NAME_PATH);
            return null;
        }
    }

    private static String timeDifference(long timeStart, long timeEnd) {
        long difference = timeEnd - timeStart;
        long minutes = (difference / 1000) / 60;
        long seconds = (difference / 1000) % 60;
        long milliseconds = difference - ((minutes * 60000) + (seconds * 1000));
        return "Time taken: " + minutes + " min. " + seconds + " sec. " +
                milliseconds + " ms.";
    }

    public static void main(String[] args) {
        File findFile = new File(NAME_PATH);
        File directoryFile = new File(PHONEBOOK_PATH);
        String[] names = namesList(findFile);
        int count = 0;
        try (Scanner scanner = new Scanner(directoryFile)) {
            System.out.println("Start searching...");
            long timeStart = System.currentTimeMillis();
            for (int i = 0; i < Objects.requireNonNull(names).length; i++) {
                while (scanner.hasNextLine()) {
                    if (scanner.nextLine().contains(names[i])) {
                        count++;
                        break;
                    }
                }
            }
            long timeEnd = System.currentTimeMillis();
            System.out.print("Found " + count + " / " + names.length + " entries. " +
                    timeDifference(timeStart, timeEnd));
        } catch (FileNotFoundException e) {
            System.out.printf("File not found: %s", PHONEBOOK_PATH);
        }
    }
}

Output:

Start searching...
Found 1 / 500 entries. Time taken: 0 min. 0 sec. 653 ms.
Process finished with exit code 0

Solution

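  • The problem is how you are searching. The Scanner is opened once, outside the loop over the names, so it never returns to the start of the phonebook: each search resumes where the previous one stopped. If you want to search iteratively, you need to restart the iteration for each name. Otherwise, you are merely searching forward in the phonebook. If the second name in the name list appears before the first name in the phonebook, the Scanner runs to the end of the file without finding it, and every later name fails immediately because hasNextLine() is already false. That is why you find only one name.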
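
A minimal sketch of restarting the iteration for each name, assuming the same Scanner-based approach as in the question. The class name RestartSearch and the method countFound are hypothetical names used only for illustration. The Scanner is created inside the loop over the names, so every search begins at the first line of the phonebook.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class RestartSearch {
    // Counts how many of the given names occur in at least one line of the file.
    // Hypothetical helper: a fresh Scanner is opened for every name, so each
    // search starts again from the first line of the phonebook.
    static int countFound(String[] names, File directoryFile) throws FileNotFoundException {
        int count = 0;
        for (String name : names) {
            try (Scanner scanner = new Scanner(directoryFile)) {
                while (scanner.hasNextLine()) {
                    if (scanner.nextLine().contains(name)) {
                        count++;
                        break; // stop at the first match for this name
                    }
                }
            }
        }
        return count;
    }
}
```

This fixes the count, but it re-reads the file once per name, which is the cost the rest of this answer tries to avoid.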

    However, repeatedly reading the phonebook file is a costly endeavor. Instead, load the phonebook into memory (as you have done for the name list) and then you can iteratively search that list for each element in the name list. The following examples assume you are using List rather than arrays, and they use for-each loops (rather than the Stream API) to make it obvious what is going on.

    List<String> names = loadNames();
    // each phonebook entry contains the name and the phone number in one string
    List<String> phonebook = loadPhonebook();
    int numFound = 0;
    
    for (String name : names) {
      for (String entry : phonebook) {
        if (entry.contains(name)) {
          ++numFound;
          break; // count each name at most once
        }
      }
    }
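
The loadNames() and loadPhonebook() helpers above are left undefined. One possible way to back both of them, sketched here under the assumption that each file is plain UTF-8 text with one entry per line, is a single helper built on Files.readAllLines. The names PhonebookLoader and loadLines are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class PhonebookLoader {
    // Hypothetical helper: reads every line of a text file into memory.
    // Files.readAllLines(Path) decodes the file as UTF-8 and closes it afterwards.
    static List<String> loadLines(Path path) throws IOException {
        return Files.readAllLines(path);
    }
}
```

With this in place, loadNames() could simply return PhonebookLoader.loadLines(Paths.get(NAME_PATH)), and loadPhonebook() could do the same with PHONEBOOK_PATH. Each file is then read exactly once.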
    

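    However, this is still an expensive task because you are repeatedly doing nested iterations. Depending on the format of the phonebook file you should be able to parse out the names and store these in a TreeSet. Then each lookup takes O(log n) time instead of a scan of the whole phonebook; a HashSet would make it constant time on average. Note that a set lookup is an exact match on the whole name, not a substring match as with String.contains().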

    List<String> names = loadNames();
    // phonebookNames are just the names - the phone number has been stripped away
    TreeSet<String> phonebookNames = loadPhonebookNames();
    int numFound = 0;
    
    for (String name : names) {
      if (phonebookNames.contains(name)) {
        ++numFound;
      }
    }
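
How loadPhonebookNames() parses the names depends on the file format, which the question does not show. The sketch below ASSUMES each entry looks like "<number> <full name>" (for example "5551234 Alice Jones") and strips everything up to the first space. The class and method names are hypothetical, and a HashSet is used here in place of the TreeSet to get constant-time lookups on average.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PhonebookNames {
    // ASSUMPTION: each entry is "<number> <full name>", separated by one space.
    // Entries without a space are skipped rather than guessed at.
    static Set<String> parseNames(List<String> entries) {
        Set<String> names = new HashSet<>();
        for (String entry : entries) {
            int firstSpace = entry.indexOf(' ');
            if (firstSpace >= 0) {
                // Keep only the part after the number.
                names.add(entry.substring(firstSpace + 1).trim());
            }
        }
        return names;
    }
}
```

If the real file puts the name first or uses another separator, only the indexOf/substring step needs to change.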
    

    Presumably, your assignment will eventually want to use the phone number for something so you probably don't want to drop that on the floor. Instead of parsing out just the name, you can capture the name and the phone number using a Map (key=name, value=phone number). Then you can count the presence of names thusly.

    List<String> names = loadNames();
    // phonebook is a Map of phone number values keyed on name
    Map<String,String> phonebook = loadPhonebook();
    int numFound = 0;
    
    for (String name : names) {
      if (phonebook.containsKey(name)) {
        ++numFound;
      }
    }
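
For completeness, here is one way the Map could be built. It is a sketch only, again ASSUMING entries of the form "<number> <full name>"; the names PhonebookMap and parse are hypothetical. If a name occurs more than once, this version keeps the first number it sees.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PhonebookMap {
    // ASSUMPTION: each entry is "<number> <full name>", separated by one space.
    static Map<String, String> parse(List<String> entries) {
        Map<String, String> phonebook = new HashMap<>();
        for (String entry : entries) {
            // Split into at most two parts: the number and the rest of the line.
            String[] parts = entry.split(" ", 2);
            if (parts.length == 2) {
                // Key = name, value = phone number; keep the first number per name.
                phonebook.putIfAbsent(parts[1].trim(), parts[0]);
            }
        }
        return phonebook;
    }
}
```

Once a name is known to be present, phonebook.get(name) returns its number without another search.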