I have an array (converted from a string) that contains words with non-standard letters (letters not used in English, like ć, ä, ü). I don't want to replace those characters; I want to remove every word that contains them.
from [Adam-Smith, Christine, Müller, Roger, Hauptstraße, X Æ A-12]
to [Adam-Smith, Christine, Roger]
This is what I have so far:
<?php
$tags = "Adam-Smith, Christine, Müller, Roger, Hauptstraße, X Æ A-12";
$tags_array = preg_split("/\,/", $tags);
$tags_array = array_filter($tags_array, function($value){
return strstr($value, "a") === false;
});
foreach($tags_array as $tag) {
echo "<p>".$tag."</p>";
}
?>
I have no idea how to delete words that contain anything other than [a-z, A-Z, 0-9] and [(), "", -, +, &, %, @, #] characters. Right now the code deletes every word that contains an "a". How can I achieve this?
You can use a negated character class that matches any character outside your allowed set, then drop every tag in which it finds a match:
$raw = 'Adam-Smith, Christine, Müller, Roger, Hauptstraße, X Æ A-12, johnny@knoxville, some(person), thing+asdf, Jude "The Law" Law, discord#124123, 100% A real person, shouldntadd.com';
$regex = '/[^A-Za-z0-9\s\-\(\)\"\+\&\%\@\#]/';
$tags = array_map('trim', explode(',', $raw));
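// array_filter() preserves the original keys, so array_values() reindexes the result from 0.
$tags = array_values(array_filter($tags, function ($tag) use ($regex) {
    // Keep the tag only if it contains no character outside the allowed set.
    return !preg_match($regex, $tag);
}));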
var_dump($tags);
Yields:
array(9) {
[0]=>
string(10) "Adam-Smith"
[1]=>
string(9) "Christine"
[2]=>
string(5) "Roger"
[3]=>
string(16) "johnny@knoxville"
[4]=>
string(12) "some(person)"
[5]=>
string(10) "thing+asdf"
[6]=>
string(18) "Jude "The Law" Law"
[7]=>
string(14) "discord#124123"
[8]=>
string(18) "100% A real person"
}
If you also want to allow a full stop (for example, if the tags may contain email addresses or domain names), add \. inside the character class, just before the closing ].
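As a rough sketch of that variation, the snippet below uses the same regex with a full stop added inside the brackets. The shortened input string is only an illustrative assumption, not part of the original data set.

```php
<?php
// Same allow-list as the regex in the answer, plus a literal full stop inside the class.
$regex = '/[^A-Za-z0-9\s\-\(\)\"\+\&\%\@\#\.]/';

// Hypothetical sample input, trimmed down for illustration.
$raw = 'johnny@knoxville, shouldntadd.com, Müller';
$tags = array_map('trim', explode(',', $raw));

// Drop any tag containing a disallowed character; array_values() reindexes
// the result because array_filter() keeps the original keys.
$tags = array_values(array_filter($tags, function ($tag) use ($regex) {
    return !preg_match($regex, $tag);
}));

var_dump($tags);
```

With the full stop allowed, shouldntadd.com should now pass the filter, while Müller is still rejected because ü is not in the allowed set.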