I want to print a string, but my code is not giving me the right string.

line="\python\001tag\file.txt"
str=re.search(r"[(0-9)+]",line) (dont use raw_string here)

print str.group()

This gives nothing. I want to extract 001 from there.

Note: I don't want to use a raw string, because the user is getting the path from another source. Is it possible to replace each single backslash with a double backslash to solve this problem?

Suppose another person passes the string like this, "\Python\001tag\file.txt", without writing it as a raw string. I want to search the string that person passed and extract only the number "001" from it. How can I do that in this case?

I think you misunderstand the meaning of raw strings. The important point is the sequence of characters that the string actually contains. You can see it by converting the string to a list. For example

>>> line = "\python\001tag\file.txt"
>>> list(line)
['\\', 'p', 'y', 't', 'h', 'o', 'n', '\x01', 't', 'a', 'g', '\x0c', 'i', 'l', 'e', '.', 't', 'x', 't']
>>> line = r"\python\001tag\file.txt"
>>> list(line)
['\\', 'p', 'y', 't', 'h', 'o', 'n', '\\', '0', '0', '1', 't', 'a', 'g', '\\', 'f', 'i', 'l', 'e', '.', 't', 'x', 't']
>>> line = "\\python\\001tag\\file.txt"
>>> list(line)
['\\', 'p', 'y', 't', 'h', 'o', 'n', '\\', '0', '0', '1', 't', 'a', 'g', '\\', 'f', 'i', 'l', 'e', '.', 't', 'x', 't']

In the first case, because of the way a literal \ is interpreted in source code, Python interprets the sequence \001 as a single character with octal code 001, which is \x01 in hexadecimal. In the same way, \f is interpreted as the formfeed character \x0c. In the second case (raw string), the literal backslash is not special, and it is always interpreted as a backslash character. In the third case, we escape every backslash, and we obtain the same result as in the second case.

For literal file names on Windows, one would choose the second or third way.
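Escape processing only applies to string literals written in source code. A path that arrives at run time (read from a file, typed by a user, and so on) already contains real backslash characters and needs no doubling. Here is a minimal sketch of that idea; the io.StringIO object is only a stand-in I am assuming for whatever outside source delivers the path.

```python
import io
import re

# Stand-in for an outside source such as a file or user input (an assumption
# for illustration). The raw literal is used only so that the simulated
# source really contains backslashes.
source = io.StringIO(r"\python\001tag\file.txt")

# Text read at run time is not re-interpreted: \001 stays as four characters.
line = source.readline()

found = re.search(r"\d+", line).group()
print(found)
```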

For regular expressions, my advice is to always use raw strings, because you are writing strings in the regular expression language, where the backslash has its own special role. That role should not get mixed up with the special role the backslash plays in ordinary Python string literals.
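A small sketch of how the two roles can collide, using the \b sequence: in a normal string literal it is the backspace character, while in the regular expression language it means a word boundary. The sample text is made up for illustration.

```python
import re

text = "tag 001 end"  # made-up sample text

# Normal literal: "\b" becomes the backspace character (\x08) before the
# regex engine ever sees it, so the pattern looks for literal backspaces.
no_match = re.search("\b001\b", text)

# Raw literal: the regex engine receives backslash + b, a word boundary.
match = re.search(r"\b001\b", text)

print(no_match)       # None
print(match.group())  # 001
```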

If you want the 001 in the line above, the correct regex is one of

r"[0-9]+"
r"\d+"
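
As a rough sketch, here is the extraction with the second pattern, applied to the path written with doubled backslashes. The original pattern [(0-9)+] is a character class, so it matches only one character out of (, a digit, ) and +. The variable names are my own choice (avoiding str, which shadows the built-in type).

```python
import re

# Doubled backslashes: the string contains backslash, '0', '0', '1', ...
line = "\\python\\001tag\\file.txt"

# \d+ matches one or more consecutive digits.
match = re.search(r"\d+", line)
if match is not None:
    number = match.group()
    print(number)  # 001
```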