Regex to extract the top level domain from a URL
I want to extract the top level domain from a URL in my logs:
<182>jul 28 13:52:34 proxysquid1 logger: 1501249953.155 0 192.168.4.27 tcp_miss/503 2408 post http://xxxxx.ddns.net:1220/is-ready - direct/154.68.5.134 text/html
The part I want is:
ddns
I tried this regex:
([\da-z\.-]+)\.([a-z\.])
but I got:
xxxxx.ddns
Could someone help me solve this?
Thanks
You've mixed up the terms here. The TLD (top level domain) is the last segment of a domain name, the part that follows the final "dot" (e.g. .com, .net, etc.).
What you're searching for is the second level domain (or SLD).
I've edited the regex from daveo's answer so that the match is returned in the first capture group:
(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)?([-a-zA-Z0-9@:%_\+~#=]*)\.[a-z]{2,6}\b(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)
Here is a demo: https://regex101.com/r/x2luio/1
Explanation:
(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)? - the optional first part before the SLD (the subdomains).
([-a-zA-Z0-9@:%_\+~#=]*) - the capturing group, where the domain you want is returned.
\.[a-z]{2,6} - matches the TLD (add a capture group here if you want to capture it).
\b(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*) - the rest of the regex, which should match the port and/or the rest of the URL (/example/page/).
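To try the pattern outside regex101, here is a minimal Python sketch. It assumes the character classes are written as a-zA-Z and that taking the first match in the line is good enough; the names SLD_PATTERN and extract_sld are made up for this illustration.

```python
import re

# The pattern from the answer, split across lines for readability.
# Group 1 holds the second-level domain.
SLD_PATTERN = re.compile(
    r"(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)?"   # optional subdomains
    r"([-a-zA-Z0-9@:%_\+~#=]*)"               # group 1: second-level domain
    r"\.[a-z]{2,6}\b"                         # TLD (not captured)
    r"(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)"      # port and/or rest of the URL
)


def extract_sld(line):
    """Return group 1 of the first match in the line, or None if nothing matches."""
    match = SLD_PATTERN.search(line)
    return match.group(1) if match else None


# The log line from the question.
log_line = (
    "<182>jul 28 13:52:34 proxysquid1 logger: 1501249953.155 0 192.168.4.27 "
    "tcp_miss/503 2408 post http://xxxxx.ddns.net:1220/is-ready "
    "- direct/154.68.5.134 text/html"
)

print(extract_sld(log_line))  # ddns
```

The timestamps and IP addresses earlier in the line are skipped because the TLD part only accepts lowercase letters after the dot, so the first match is the URL's host.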
It's worth pointing out that the regex will not give the right result if you're testing a domain with an SLD and ccTLD (country code TLD) 'combo', for example .co.uk or .co.it. Both are used as the end of the domain for commercial and general websites; however, both will return co as the SLD.
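One possible way around the .co.uk case is to skip the regex for that step and compare the end of the hostname against a list of known two-part suffixes. The sketch below is only an illustration: TWO_LABEL_SUFFIXES is a hypothetical, deliberately tiny set holding the two examples above, and registrable_label is a made-up helper name. Real data would need a complete list such as the Public Suffix List.

```python
from urllib.parse import urlsplit

# Hypothetical, incomplete set of two-part suffixes (assumption for this
# sketch). A real deployment would need a full list.
TWO_LABEL_SUFFIXES = {"co.uk", "co.it"}


def registrable_label(url):
    """Return the label just left of the suffix, or None if there is none."""
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.split(".")
    # Treat the last two labels as the suffix when they form a known combo,
    # otherwise only the last label (the plain TLD).
    suffix_len = 2 if ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES else 1
    if len(labels) <= suffix_len:
        return None
    return labels[-suffix_len - 1]


print(registrable_label("http://www.example.co.uk/page"))        # example
print(registrable_label("http://xxxxx.ddns.net:1220/is-ready"))  # ddns
```

This expects a full URL with a scheme, so the URL would first have to be pulled out of the log line.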