javaarrayssortinghashdisk

Ordering an Array with Hash Values


I have read data from a file and have taken every line from the file and then inserted them into an array. I need to convert these strings into bytes and write them to a disk, based hash file.

What I want to do is take every string with the same hash value and write them to the same sector on my disk. So far, what I have done is ordered them based on their hash value, which didn't work out very well towards the end of the array as there are 1000 elements and the largest hash value my function returns is 249.

Linear probing caused a lot of string to be out of place, so using this array to write to my sectors won't work very well. How should I go about this?

Here is my code of what I've done so far if I have not been being clear:

private void importFile(String dataFile) {
  String line = null;
  theDisk.clearDisk();

  try {
    BufferedReader bufferedReader = new BufferedReader(new FileReader(dataFile));

    // List to hold the lines 
    List<String> list = new ArrayList<>();

    while((line = bufferedReader.readLine()) != null){
      list.add(line);
    }

    String[] strArray = list.toArray(new String[0]);
    String[] orderedArray = new String[strArray.length];

    for(int i = 0; i < strArray.length; i++) {
      String current = strArray[i];
      // Use email as key
      String key = current.substring(0,current.indexOf(','));
      int index = hashFunc3(key);

      if(orderedArray[index] == null) {
        orderedArray[index] = current;
      } else {
        while(orderedArray[index] != null) {
          index = index+1;
        }
        orderedArray[index] = current;
      }
    }

    // Always close files.
    bufferedReader.close();     
  }

  catch(FileNotFoundException ex) {
    System.out.println("Unable to open file '" + dataFile + "'");
  }

  catch(IOException ex) {
    System.out.println("Error reading file '" + dataFile + "'");
  }
}

Solution

  • I’d suggest using an ArrayList of ArrayLists rather than an array. This will allow you to put lines with the same hash into the same inner ArrayList. Use the hash as index in the outer ArrayListto find the correct inner list. For initialization, fill up the outer list with empty ArrayLists (to avoid IndexOutOfBoundsException or NPE when filling into the inner list).

            // No need to put the lines into a list first;
            // just sort them by hash as we read them
            List<List<String>> orderedList = new ArrayList<>(maxHash3 + 1);
            // add empty array lists to ordered list to hold the lines
            for (int ix = 0; ix <= maxHash3; ix++) {
                orderedList.add(new ArrayList<>());
            }
    
            while((line = bufferedReader.readLine()) != null){
                  // Use email as key
                  String key = line.substring(0,line.indexOf(','));
                  int index = hashFunc3(key);
                  // add line to inner ArrayList
                  orderedList.get(index).add(line);
            }
    

    The above uses:

    private static final int maxHash3 = 249;
    

    Now you may do:

            // to write the lines to disk you may for instance do something like this:
            for (List<String> bucket : orderedList) {
                for (String currentLine : bucket) {
                    // write currentLine to file
                }
            }
    

    We might have used an array of ArrayList instead, but mixing arrays and collections doesn’t always work too well.