I was testing this in prompt mode:

import re
def checkurl(url):
 check = re.match(r'^(:?https?://)?[^\/#?&]+\.[^\/#?&]+.*$',url)
 if check == None:
  print “NNOOOOOO!!!!”
 else:
  print “YESSSS!!!!!”


checkurl("javascript:void(0);")
checkurl("/account/general?ru=https%3a%2f%2fwww.bing.com%3a443%2fsearch%3fq%3dukraine%2bcrisis&FORM=SEFD")
checkurl("/?FORM=Z9FD1")
checkurl("http://choice.microsoft.com")

I have no idea why this won't work. Ignore the regex; it's only supposed to weed out relative paths and fragments of strings. I can't get the function to work at all.

I couldn't get the function syntax right, which had me confused.

There are only 10 lines of code there, 6 if you exclude the test calls. I know I should've been clearer, but it would've taken 30 seconds to figure out that something was wrong. If nothing was, then I must've screwed something up on my end, either Python itself or my use of it, in which case say so. Nope. Just ask an open-ended question and pretend it helps.

I thought maybe this was just some small quirk and a second set of eyes would easily spot the error. Nope.

It's OK, I found a workaround. Thread closed.

I don't agree with that. Python's error messages are extremely helpful. The error message here was

  File "foo.py", line 6
    print “NNOOOOOO!!!!”
          ^
SyntaxError: invalid syntax

The compiler shows the exact position in the source file where there is an error. This is much more efficient than our eyes.
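To illustrate that point, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2). The helper name `locate_syntax_error` is made up for this example. It passes a source string to the built-in `compile()` and returns the line, column and offending text stored on the resulting `SyntaxError`. The wording of the message itself differs between Python versions, so the sketch does not rely on it.

```python
def locate_syntax_error(source, filename="foo.py"):
    """Return (lineno, offset, text) for the first syntax error, or None.

    `filename` is only used in the error report; nothing is read from disk.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        # The exception records where the parser gave up.
        return err.lineno, err.offset, err.text
    return None


# Same mistake as in the thread: curly quotes instead of straight ones.
bad_source = 'import re\nprint(\u201cNNOOOOOO!!!!\u201d)\n'
print(locate_syntax_error(bad_source))
```

The returned line number points at the `print` line, which is exactly where the curly quotes are.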

Except I still don't know what went wrong. I managed to get this to work when I manually typed out the characters:

>>> import re
>>> def checkurl(url):
...  check = re.match(r'^(:?https?://)?[^\/#?&]+\.[^\/#?&]+.*$',url)
...  if check is None:
...   print "NNOOO!!!"
...  else:
...   print "YESSS!!!"
... 
>>> 
>>> 
>>> checkurl("javascript:void(0);")
NNOOO!!!
>>> checkurl("/?FORM=Z9FD1")
NNOOO!!!

The only thing I can conclude is that some non-UTF-8 character is being inserted because I'm using Mac OS X and a GUI. But IDK! vi doesn't pick up on anything. It does what it wants.

There is no such thing as a non-UTF-8 character. UTF-8 is not a set of characters; it is an encoding, a way to represent Unicode code points as a sequence of bytes. On the other hand, there is a set of valid characters for Python source code. Your code contained the Unicode code points U+201C and U+201D (curly quotes) instead of the accepted U+0022 double-quote character. This can happen if you write code with a word processor instead of a text editor.
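If the editor does not make such characters visible, a few lines of Python can point them out. This is only a sketch in Python 3 syntax, and `find_non_ascii` is a hypothetical helper name, not something from the standard library. It reports every character above the ASCII range together with its code point and Unicode name.

```python
import unicodedata


def find_non_ascii(source):
    """Yield (line_number, column, code_point, name) for each non-ASCII character."""
    for line_number, line in enumerate(source.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                name = unicodedata.name(char, "UNKNOWN")
                yield line_number, column, "U+%04X" % ord(char), name


# The offending line from the original post, with its curly quotes.
sample = 'print \u201cNNOOOOOO!!!!\u201d'
for hit in find_non_ascii(sample):
    print(hit)
```

Run against the pasted code, this should list the two quotation marks and nothing else.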

By the way, the encoding should be declared at the top of your program with a comment such as

# -*-coding: utf8-*-

Make sure that your editor saves the file as UTF-8.
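A small sketch of why the saved encoding matters, again in Python 3 syntax. The choice of cp1252 as the "other" encoding is just an assumption for the example, and `decodes_as_utf8` is a made-up helper. The same character, U+201C, becomes different bytes under different encodings, and bytes written in one encoding are not necessarily valid in another.

```python
left_quote = "\u201c"  # LEFT DOUBLE QUOTATION MARK

# One character, two different byte representations.
utf8_bytes = left_quote.encode("utf-8")     # three bytes in UTF-8
cp1252_bytes = left_quote.encode("cp1252")  # a single byte in cp1252

print(utf8_bytes, cp1252_bytes)


def decodes_as_utf8(data):
    """Return True if the byte string is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as_utf8(utf8_bytes))
print(decodes_as_utf8(cp1252_bytes))
```

So a file saved as cp1252 but declared as UTF-8 can fail to decode, which is why the declaration and the editor setting need to agree.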

Edited 2 Years Ago by Gribouillis
