java network-programming intellij-idea url backend

How can I detect the age recommendation for accessing a certain website?


Introduction to the problem:

Currently, my app can detect whether a page or profile exists and whether the link can be accessed at all. Now I am trying to make the program safer by also checking the age restrictions of the platform the user is about to visit, because the app must not send the user to a platform with inappropriate content.

The problem:

I tried reading the whole page with a BufferedReader, line by line, and checking each line for tokens that usually indicate age restrictions, but this approach does not work well: it is neither reliable nor efficient.

Code related to the problem:

//Link Validation
public static boolean checkAbilityToCreate(String link)  {
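    // Parentheses make the intended grouping explicit (&& binds tighter than ||)
    if ((link.startsWith("http://") || link.startsWith("https://"))
            && !link.equals("https://example.com")) {

        // Expects "host.tld/path" after the scheme
        Pattern linkPattern = Pattern.compile
                ("([\\w.-]+)\\.([\\w .-]){2,}/(.+)");
        // substring avoids the ArrayIndexOutOfBoundsException that
        // split("//")[1] throws when nothing follows the scheme
        String checkOn = link.substring(link.indexOf("//") + 2);
        Matcher matcher = linkPattern.matcher(checkOn);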
        
        if (matcher.find()) {
            try {
                URI uri = new URI(link);
                URL url = uri.toURL();
            } catch (Exception e) {
                return false;
            }
            return valid_Hyper_Text_Transfer_Protocol(link);
        }
    }
    return false;
}

private static boolean valid_Hyper_Text_Transfer_Protocol(String httpToCheck) {
    
    try {
        URL url = new URL(httpToCheck);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.connect();
        
        if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
            return ContentValidator.isValidContent(connection);
        }
        
    } catch (IOException e) {
        //TODO - handle
    }
    
    return false;
}

private static class ContentValidator {
    public static boolean isValidContent(HttpURLConnection connection) {
        
        // try-with-resources closes the reader on every path, including the early return
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            StringBuilder content = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                content.append(line);
                // StringBuilder.indexOf avoids copying the whole buffer for every line
                if (content.indexOf("page isn't available") >= 0 ||
                        content.indexOf("page not found") >= 0) {
                    return false;
                }

                //another if statement that was looking in the content for age 
                //restrictions
            }

            return true;
        } catch (IOException e) {
            return false;
        }
    }
}

The code above is how I validate a link. The isValidContent method used to contain the token checks that looked for any indicator of an age recommendation on each line (I removed them because they did not solve the problem).

What I ask for: I would appreciate any suggestion or documentation that would help me implement this feature.


Solution

  • After several people told me the same thing, I have to accept that there are only two ways to build this feature: either maintain a list of the top sites I don't want the user to visit, or write a better pattern-recognition algorithm, still based on tokens, that detects whether a page carries an age recommendation.
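
For illustration, here is a rough sketch of how both options could look in Java. Everything in it is an assumption rather than a finished filter: the class name AgeGate, the method names, and the two blocked hosts are made up. The markers are the "Restricted To Adults" (RTA) label and the rating meta tag, which some sites add voluntarily, so a page without them is not necessarily safe.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: a host blocklist plus a token check for self-declared adult labels.
final class AgeGate {

    // Placeholder hosts; a real list would come from a maintained source.
    private static final Set<String> BLOCKED_HOSTS =
            Set.of("adult.example", "casino.example");

    // Lower-case markers that some sites publish voluntarily.
    private static final List<String> AGE_MARKERS = List.of(
            "rta-5042-1996-1400-1577-rta",   // RTA label
            "content=\"adult\"");            // as in <meta name="rating" content="adult">

    // True if the host is on the blocklist or is a subdomain of a blocked host.
    // Links that cannot be parsed are treated as blocked (fail closed).
    static boolean isBlockedHost(String link) {
        String host;
        try {
            host = URI.create(link).getHost();
        } catch (IllegalArgumentException e) {
            return true;
        }
        if (host == null) {
            return true;
        }
        host = host.toLowerCase(Locale.ROOT);
        for (String blocked : BLOCKED_HOSTS) {
            if (host.equals(blocked) || host.endsWith("." + blocked)) {
                return true;
            }
        }
        return false;
    }

    // True if the page source contains one of the known markers (case-insensitive).
    // Plain substring matching: single-quoted attributes are not detected.
    static boolean hasAgeMarker(String html) {
        String lower = html.toLowerCase(Locale.ROOT);
        for (String marker : AGE_MARKERS) {
            if (lower.contains(marker)) {
                return true;
            }
        }
        return false;
    }
}
```

With something like this, isBlockedHost(link) could run before the connection is opened, and hasAgeMarker(content.toString()) could take the place of the removed check inside isValidContent. Comparing hosts rather than whole links means subdomains and arbitrary paths of a blocked site are covered as well.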