Description
In your sample text, the unwanted string resembles an email address but conveniently ends in jpg. A negative lookahead can therefore exclude any match whose token ends in one of the image file extensions.
(?!\S*\.(?:jpg|png|gif|bmp)(?:[\s\n\r]|$))[A-Z0-9._%+-]+@[A-Z0-9.-]{3,65}\.[A-Z]{2,4}
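To see what the lookahead changes, here is a minimal Python sketch that runs the pattern with and without it. The sample addresses, the names `EMAIL_NOT_IMAGE`, `EMAIL_PLAIN` and `find_emails`, and the case-insensitive flag are assumptions made for illustration. They are not taken from the question.

```python
import re

# Pattern from this answer. re.IGNORECASE is an assumption: the character
# classes list only upper-case letters, so lower-case input needs the flag.
EMAIL_NOT_IMAGE = re.compile(
    r"(?!\S*\.(?:jpg|png|gif|bmp)(?:[\s\n\r]|$))"
    r"[A-Z0-9._%+-]+@[A-Z0-9.-]{3,65}\.[A-Z]{2,4}",
    re.IGNORECASE,
)

# The same pattern without the negative lookahead, for comparison.
EMAIL_PLAIN = re.compile(
    r"[A-Z0-9._%+-]+@[A-Z0-9.-]{3,65}\.[A-Z]{2,4}",
    re.IGNORECASE,
)

# Hypothetical sample text: two addresses and one image file name.
SAMPLE = "alice@example.com banner@site2x.jpg bob@mail.example.org"


def find_emails(text):
    """Return every match that does not end in an image extension."""
    return EMAIL_NOT_IMAGE.findall(text)


print(find_emails(SAMPLE))
# ['alice@example.com', 'bob@mail.example.org']
print(EMAIL_PLAIN.findall(SAMPLE))
# ['alice@example.com', 'banner@site2x.jpg', 'bob@mail.example.org']
```

The lookahead is tried at every starting position, so it rejects each position inside a token that ends in .jpg, .png, .gif or .bmp before the email part of the pattern is attempted.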
Example
Live Demo
https://regex101.com/r/mU7bO3/2
Sample Text
[email protected] [email protected] [email protected]
Sample Matches
[email protected]
[email protected]
Explanation
NODE                     EXPLANATION
----------------------------------------------------------------------
(?!                      look ahead to see if there is not:
----------------------------------------------------------------------
  \S*                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (0 or more times (matching the
                           most amount possible))
----------------------------------------------------------------------
  \.                       '.'
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    jpg                      'jpg'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    png                      'png'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    gif                      'gif'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    bmp                      'bmp'
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    [\s\n\r]                 any character of: whitespace (\n, \r,
                             \t, \f, and " "), '\n' (newline), '\r'
                             (carriage return)
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             a "line"
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
)                        end of look-ahead
----------------------------------------------------------------------
[A-Z0-9._%+-]+           any character of: 'A' to 'Z', '0' to '9',
                         '.', '_', '%', '+', '-' (1 or more times
                         (matching the most amount possible))
----------------------------------------------------------------------
@                        '@'
----------------------------------------------------------------------
[A-Z0-9.-]{3,65}         any character of: 'A' to 'Z', '0' to '9',
                         '.', '-' (between 3 and 65 times (matching
                         the most amount possible))
----------------------------------------------------------------------
\.                       '.'
----------------------------------------------------------------------
[A-Z]{2,4}               any character of: 'A' to 'Z' (between 2
                         and 4 times (matching the most amount
                         possible))
----------------------------------------------------------------------
What language are you using? –
Why do you limit the TLDs to 4 characters? See: http://www.iana.org/domains/root/db – Toto
I don't know the language. It's a piece of software (written by someone else) that processes text/HTML files based on a regex, which the user can change. Evidently the default regex is outdated. Thanks, Toto. – Melchester
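Following up on Toto's point about TLD length, the sketch below shows how the {2,4} limit truncates a longer TLD, and one possible relaxation. The upper bound of 63 is an assumption based on the maximum length of a DNS label. The sample address and the names `ORIGINAL` and `RELAXED` are hypothetical, and the pattern is compiled with Python's `re` module, which may behave differently from the engine in the software mentioned above.

```python
import re

# Lookahead shared by both variants: skip tokens ending in an image extension.
NOT_IMAGE = r"(?!\S*\.(?:jpg|png|gif|bmp)(?:[\s\n\r]|$))"

# Original pattern: TLD limited to 2-4 letters.
ORIGINAL = re.compile(
    NOT_IMAGE + r"[A-Z0-9._%+-]+@[A-Z0-9.-]{3,65}\.[A-Z]{2,4}",
    re.IGNORECASE,
)

# Assumed relaxation: allow up to 63 letters, the DNS label maximum.
RELAXED = re.compile(
    NOT_IMAGE + r"[A-Z0-9._%+-]+@[A-Z0-9.-]{3,65}\.[A-Z]{2,63}",
    re.IGNORECASE,
)

text = "info@example.museum"  # hypothetical address with a 6-letter TLD

print(ORIGINAL.findall(text))  # ['info@example.muse']  (TLD cut off after 4 letters)
print(RELAXED.findall(text))   # ['info@example.museum']
```

Because the pattern is not anchored, the original version does not reject the longer TLD. It silently returns a truncated address, which is arguably worse than no match.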