Hi,
I'm trying to work out this problem in my code where i am trying to replace a keyword with another word. The problem I am running into is that when the keyword is present inside of another word.
Eg,
"I'm trying to debug a problem but the debugger is not working"
I want to replace "debug" with "fix" but not have "debugger" replaced with "fixger"
Thanks.

Recommended Answers

All 6 Replies

This regular expression searches for "debug" not followed by an alphanumeric character or underscore

import re

pat = re.compile(r"debug(?!\w)")

sentence = "I'm trying to debug a problem but the debugger is not working"
print pat.sub("fix", sentence)

However it wouldn't work for "I'm trying to debug a problem but the variable my_debug has a wrong value" ;)

In the problem i am doing i need to apply it to a variable, which contains the keyword. Is it somehow possible to apply the re.compile to a variable? I am also using .replace to replace the keyword with the new word if this makes any difference.

I've also been looking around and from what I've found the answer may have something to do with $, \b, \Z and others \characters but i don't know how to apply these to the variable.

The argument of re.compile is a string which represents a regular expression, (or a variable having this value). However, the value returned by re.compile is not a string, but a regular expression object, which doesn't have a method replace . That's why I'm using sub . See http://docs.python.org/lib/node46.html#l2h-393 I think you should describe your problem more completely. Will the keyword change ? etc.

Ok,
I'll try to describe it better.
I have function that has a text which contains keywords, a keyword which i must replace with word from a list of keywords.

I've solved how to replace the keywords with the keywords in a given list, but there are keywords that are present in the text that are similiar eg OUR_HERO and OUR_HEROINE, and when i try replace OUR_HERO to say James Bond OUR_HEROINE also gets replaced with James BondINE.

Hope this is clearer

There is a technique from the world of parsing and compilation, which uses a dictionary for the keywords. In the text, you look for all identifiers, and if the identifier is in the dictionary (it's a keyword), then it's replaced by a certain value. Here is a way to implement this

import re

kwords = {
  "OUR_HERO" : "James Bond",
  "debug" : "fix",
  "foo": "bar",
}

# a regular expression matching all identifiers
identifier = re.compile(r"[A-Za-z_]\w*")

def repl(match_obj):
  # this function will be called for each identifier in the text during substitution
  word = match_obj.group(0)
  return kwords.get(word, word)

text = """
THis is the text i'm trying to debug with
the help of OUR_HERO and OUR_HEROIN, but it's
a damned foo bar.
"""

print identifier.sub(repl, text)

You can use escape sequence \b to match exact words.

import re

s = "I'm trying to debug a problem but the debugger is not working"

target = 'debug'
repl = 'fix'

patt = re.compile(r'\b%s\b' % target)
print patt.sub(repl, s)

Gribouillis has a very nice solution. I am going to use his example to show a different approach.

text = """
This is the text i'm trying to debug with
the help of OUR_HERO and OUR_HEROIN, but it's
a damned foo bar.
"""
kwords = {"OUR_HERO" : "James Bond",
          "debug" : "fix",
          "foo": "bar",
          'this':'it'}

for key in kwords:
    patt = re.compile(r'\b%s\b' % key, re.I)
    text = patt.sub(kwords[key], text)
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.