Tags: php, file, random, strlen, word-list

Get random word of specific length from wordlist


I am writing a simple PHP function that reads word-list.txt and returns a random word (the words are separated by newlines). The word must be at most $maxlength characters long. As written, the function picks a word and, if it is too long, keeps picking a new one until its length is less than or equal to $maxlength. The problem is that the script fails with a fatal error because the maximum execution time is exceeded. Here is the code:

function GetWord($maxlength) {
    $file_content = file('word-list.txt');
    $nword = $file_content[array_rand($file_content)];

    while(mb_strlen($nword) > $maxlength) {
        $nword = $file_content[array_rand($file_content)];
    }

    return $nword;
}

The only alternative that I could think of is putting the wordlist into a database and having a column with each corresponding word's length. That would allow me to select the word choices based on their length. I am trying to avoid having to use a database however, so I want to find out what is wrong with my script. Any help is greatly appreciated. Thanks!


Solution

  • The following class sorts the word list by length once, when it is instantiated; after that, every lookup for a random word takes only O(1) time:

    class RandomWord {
        private $words;
        private $boundaries;
    
        private static function sort($a, $b){
            return strlen($a) - strlen($b);
        }
    
        function __construct($file_name) {
            $this->words = file($file_name, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    
            // Sort the words by their lengths
            usort($this->words, array('RandomWord', 'sort'));
    
            // Mark the length boundaries
            $last = strlen($this->words[0]);
    
            foreach($this->words as $key => $word) {
                $length = strlen($word);
    
                if ($length > $last) {
                    for($i = $last; $i < $length; $i++) {
                        // In case the lengths are not continuous
                        //    we need to mark the intermediate values as well
                        $this->boundaries[$i] = $key - 1;
                    }
                    $last = $length;
                }
            }
        }
    
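        public function get($max_length) {
            if (isset($this->boundaries[$max_length])) {
                // Pick among the words whose length is <= $max_length
                return $this->words[rand(0, $this->boundaries[$max_length])];
            }
    
            if ($max_length < strlen($this->words[0])) {
                // Even the shortest word is too long
                return null;
            }
    
            // $max_length is at least the longest length, so any word qualifies
            return $this->words[array_rand($this->words)];
        }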
    }
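
    To make the boundary bookkeeping concrete, here is a minimal sketch that rebuilds the same map for a four-word in-memory list. The helper name build_boundaries and the sample words are made up for illustration; the logic is meant to mirror the constructor above.

    ```php
    <?php
    // Hypothetical helper: builds the same boundary map as the constructor,
    // but from an in-memory list so the result can be inspected.
    function build_boundaries(array $words): array {
        // Sort by length, shortest first
        usort($words, function ($a, $b) {
            return strlen($a) - strlen($b);
        });

        $boundaries = [];
        $last = strlen($words[0]);

        foreach ($words as $key => $word) {
            $length = strlen($word);
            if ($length > $last) {
                // Every length from $last up to $length - 1 ends at the previous index
                for ($i = $last; $i < $length; $i++) {
                    $boundaries[$i] = $key - 1;
                }
                $last = $length;
            }
        }

        return [$words, $boundaries];
    }

    // Sample data (assumed): lengths 1, 2, 4 and 5, with no word of length 3
    list($sorted, $boundaries) = build_boundaries(['tree', 'a', 'house', 'to']);
    print_r($boundaries);
    ```

    For this sample list the map is [1 => 0, 2 => 1, 3 => 1, 4 => 2]: length 3 has no words of its own, so it shares the boundary of length 2. There is no entry for the longest length (5), which is why get() falls back to picking from the whole list for that value and anything larger.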
    

    Use it like this:

    $r = new RandomWord("word-list.txt");
    $word1 = $r->get(6);
    $word2 = $r->get(3);
    $word3 = $r->get(7);
    ...
    

    Update: I have now tested it, and it works.
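
    For comparison, here is a minimal sketch of a simpler variant that filters the list on each call, so a lookup is O(n) rather than O(1). It also avoids two likely causes of the timeout in the question: file() without FILE_IGNORE_NEW_LINES leaves the trailing newline on every word, so each one measures one character too long, and the while loop never terminates when no word fits. The function name get_word_filtered is hypothetical, and strlen is used as in the class above; mb_strlen could be swapped in for multibyte words.

    ```php
    <?php
    // Hypothetical alternative: filter the list by length, then pick one entry.
    // Returns null when no word fits instead of looping forever.
    function get_word_filtered(string $file_name, int $max_length): ?string {
        // FILE_IGNORE_NEW_LINES strips the trailing newline, so lengths are exact
        $words = file($file_name, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        if ($words === false) {
            return null; // the file could not be read
        }

        // Keep only the words that are short enough (original keys are preserved)
        $candidates = array_filter($words, function ($word) use ($max_length) {
            return strlen($word) <= $max_length;
        });

        if (empty($candidates)) {
            return null; // no word is short enough
        }

        // array_rand returns a random key of the filtered array
        return $candidates[array_rand($candidates)];
    }
    ```

    Calling get_word_filtered("word-list.txt", 6) would then return a random word of at most six characters, or null if the list contains none.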