Python: URL parsing issue while adding a trailing slash

python,url,url-parsing

You could pattern match on the last substring to distinguish known domains from file extensions. It's not too difficult to enumerate at least the basic top-level domains such as .com, .gov, and .org. If you are familiar with regular expressions, you can match on a pattern like '\.com$' (with the dot escaped so that it matches a literal dot). Otherwise,...
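As a rough illustration of the idea above, here is a minimal sketch. The function name `add_trailing_slash`, the short TLD list, and the rule "anything else ending in a dot plus letters or digits is a file" are all assumptions made for the example, not part of the original answer. It also ignores query strings and fragments.

```python
import re

# Hypothetical, deliberately incomplete list of top-level domains.
KNOWN_TLDS = ("com", "gov", "org", "net", "edu")

# Matches a last segment that ends in one of the known TLDs, e.g. "example.com".
TLD_PATTERN = re.compile(r"\.(?:%s)$" % "|".join(KNOWN_TLDS), re.IGNORECASE)

# Assumption: any other segment ending in ".<letters/digits>" is a file name.
FILE_PATTERN = re.compile(r"\.[A-Za-z0-9]+$")


def add_trailing_slash(url):
    """Append '/' unless the URL already ends in one or points at a file."""
    if url.endswith("/"):
        return url
    # The last substring is everything after the final '/'.
    last = url.rsplit("/", 1)[-1]
    if TLD_PATTERN.search(last):
        # Looks like a bare host such as "example.com": add the slash.
        return url + "/"
    if FILE_PATTERN.search(last):
        # Looks like a file such as "index.html": leave it alone.
        return url
    # No dot-suffix at all, e.g. "/docs": treat it as a directory.
    return url + "/"


print(add_trailing_slash("http://example.com"))
print(add_trailing_slash("http://example.com/index.html"))
print(add_trailing_slash("http://example.com/docs"))
```

One known weakness of this approach: a file whose name happens to end in a listed TLD (for instance an old DOS-style `.com` executable) would be misclassified as a domain, so the list and the rules need tuning for the URLs you actually expect.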

Java URL Class getPath(), getQuery() and getFile() inconsistent with RFC3986 URI Syntax

java,url,url-parsing,rfc3986

Here is some executable code based on your fragments: import java.net.MalformedURLException; import java.net.URL; public class URLExample { public static void main(String[] args) throws MalformedURLException { printURLInformation(new URL("https://www.somesite.com/?param1=val1")); printURLInformation(new URL("https://www.somesite.com?param1=val1")); } private static void printURLInformation(URL url) { System.out.println(url); System.out.println("Path:\t" + url.getPath()); System.out.println("File:\t" + url.getFile());...

PHP - remove http/www from message (except for the host domain) to disable clickable links

php,html,url-parsing,input-sanitization

For anyone looking for an answer - I posted a related (more specific) question which solved the problem: PHP - remove words (http|https|www|.com|.net) from string that do not start with specific words