Replaces every occurrence of the pattern, or only the first occurrence if the regular expression contains a subexpression (between \( and \)) anywhere, since a repeated replace is not what one would expect in that case. Lengths and offsets are kept in an int here, so the string size is limited to what an int can hold; the C standard guarantees only 32767 for that, approximately 32 kbytes, but otherwise this replace should be enough for anything needed in real life.

/* replace using posix regular expressions */
#include <stdio.h>
#include <string.h>
#include <regex.h>

/* buf and rp must both be writable buffers of at least size bytes;
   returns 0 on success, 1 if a result would not fit */
int rreplace (char *buf, int size, regex_t *re, char *rp)
{
    char *pos;
    int sub, so, n;
    regmatch_t pmatch [10]; /* offsets are kept in int, so size is int */

    if (regexec (re, buf, 10, pmatch, 0)) return 0; /* no match */
    /* expand \1 .. \9 in the replace pattern from the first match */
    for (pos = rp; *pos; ) {
        if (*pos == '\\' && *(pos + 1) > '0' && *(pos + 1) <= '9') {
            so = pmatch [*(pos + 1) - '0'].rm_so;
            n = pmatch [*(pos + 1) - '0'].rm_eo - so;
            if (so < 0 || strlen (rp) + n - 1 > (size_t) size) return 1;
            memmove (pos + n, pos + 2, strlen (pos) - 1);
            memmove (pos, buf + so, n);
            pos += n; /* continue right after the inserted text */
        } else
            pos++;
    }
    sub = pmatch [1].rm_so; /* no repeated replace when sub >= 0 */
    /* REG_NOTBOL: ^ must not match again in the middle of the line */
    for (pos = buf; !regexec (re, pos, 1, pmatch, pos > buf ? REG_NOTBOL : 0); ) {
        n = pmatch [0].rm_eo - pmatch [0].rm_so;
        pos += pmatch [0].rm_so;
        /* the + 1 leaves room for the terminating '\0' */
        if (strlen (buf) - n + strlen (rp) + 1 > (size_t) size) return 1;
        memmove (pos + strlen (rp), pos + n, strlen (pos) - n + 1);
        memmove (pos, rp, strlen (rp));
        pos += strlen (rp);
        if (sub >= 0) break;
    }
    return 0;
}

int main (int argc, char **argv)
{
    char buf [FILENAME_MAX], rp [FILENAME_MAX];
    regex_t re;

    /* both the expression and the replace pattern are required */
    if (argc < 3 || strlen (argv [2]) >= FILENAME_MAX) return 1;
    if (regcomp (&re, argv [1], REG_ICASE)) return 1; /* nothing to free */
    for (; fgets (buf, FILENAME_MAX, stdin); printf ("%s", buf))
        if (rreplace (buf, FILENAME_MAX, &re, strcpy (rp, argv [2])))
            goto err;
    regfree (&re);
    return 0;
err:    regfree (&re);
    return 1;
}

If you want to test it, run it in a bash-like shell, giving the regular expression and the replace pattern as arguments, preferably in the form $'expression' if you want to use escape sequences such as \t and \n (inside $'...' a back-reference has to be written as \\1, because bash reads \1 there as an octal character code). Then enter the text; if your expression is correct, the replaced line will appear. End the input with ctrl-d.

In a basic regular expression you can use . for any character, a range or set like [a-z0-9;-] or an inverted one like [^ ], and * or \{n,m\} to repeat a character, range or subexpression. A subexpression is written between \( and \), and you can refer to it with \n both in the expression and in the replace pattern, where n is the number of the subexpression, as in \1.

Remember that a regular expression is evaluated from left to right, so a construct like a* should follow some other characters rather than start the expression: * means zero or more, so a* alone also matches the empty string and does not determine any particular place.
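To see which part of a line each subexpression captures, a small sketch along these lines may help. The helper name group_span and its return convention are made up for this illustration and are not part of the snippet; it only uses the standard regcomp, regexec and regfree calls with basic regular expression syntax.

```c
#include <regex.h>

/* Hypothetical helper: match the basic regular expression `pattern`
   against `text` and store the offsets of subexpression `group`
   (0 means the whole match) in *so and *eo.
   Returns 0 on a match, 1 on no match, -1 on a bad pattern or group. */
int group_span (const char *pattern, const char *text, int group,
                int *so, int *eo)
{
    regex_t re;
    regmatch_t pm [10];
    int rc;

    if (group < 0 || group > 9) return -1;
    if (regcomp (&re, pattern, 0)) return -1;  /* 0 = basic syntax */
    rc = regexec (&re, text, 10, pm, 0);
    regfree (&re);
    /* rm_so is -1 for a subexpression that took no part in the match */
    if (rc != 0 || pm [group].rm_so < 0) return 1;
    *so = (int) pm [group].rm_so;
    *eo = (int) pm [group].rm_eo;
    return 0;
}
```

For example, with the pattern \([a-z]*\)=\([0-9]\{1,3\}\) and the text key=123; the first subexpression covers offsets 0 to 3 and the second covers 4 to 7. In C source every backslash of the pattern has to be doubled.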


I tested what a* does at the beginning of the expression in a repeated replace. And guess what it does? It fills the string with the replace pattern until the buffer is full; the code detects that and exits with an error. But it is fine to use it at the beginning of the expression if the search is not repeated. I'm really sorry for writing so many comments.
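The reason is that a* also matches the empty string, so the scan position never moves past the text it has just inserted. One common guard, shown here only as a sketch and not as what the snippet does, is to step one character forward after an empty match. The helper name count_matches is made up for this illustration.

```c
#include <regex.h>

/* Hypothetical helper: count the matches of a basic regular expression
   in text, scanning left to right like a repeated replace would.
   After an empty match the scan steps one character forward so that
   it cannot stall. Returns -1 if the pattern does not compile. */
int count_matches (const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t pm [1];
    const char *pos = text;
    int count = 0;

    if (regcomp (&re, pattern, 0)) return -1;
    while (!regexec (&re, pos, 1, pm, pos > text ? REG_NOTBOL : 0)) {
        count++;
        pos += pm [0].rm_eo;                 /* continue after the match */
        if (pm [0].rm_eo == pm [0].rm_so) {  /* the match was empty */
            if (*pos == '\0') break;         /* end of text, stop */
            pos++;                           /* step over one character */
        }
    }
    regfree (&re);
    return count;
}
```

With this guard a* finds three (empty) matches in bb, one before each character and one at the end, instead of looping at the first position.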


Oh, I'm sorry: according to my limits.h the largest string size is 2147483647, some 2 GB, so I guess that is large enough for most things...
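The limit comes from INT_MAX in limits.h, which the C standard only requires to be at least 32767 and which is 2147483647 on typical current systems. A tiny sketch of a length check against it follows; the helper name fits_in_int is made up for this illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: 1 if a string of len characters can be handled
   by code that keeps lengths and offsets in an int, 0 otherwise. */
int fits_in_int (size_t len)
{
    return len <= (size_t) INT_MAX;
}
```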


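There may be an off-by-one in the size check inside the replace loop of rreplace (line 26 of the original listing). It could be relaxed to:

      if (strlen (buf) - n + strlen (rp) > size) return 1;

Note, however, that this form lets the new string grow to exactly size characters, which leaves no room for the terminating '\0' when size is the full size of the buffer, as it is in main. In that case the check with + 1 is the safe one.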
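To check the arithmetic of this size test: after the replace the string has strlen (buf) - n + strlen (rp) characters and needs one more byte for the terminating '\0'. The sketch below assumes that size is the total size of the buffer in bytes, as in main; the helper name replace_fits is made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the size test: len_buf is strlen (buf),
   n is the length of the matched text (n <= len_buf), len_rp is
   strlen (rp), and size is the total size of the buffer in bytes.
   Returns 1 if the result, including the terminating '\0', fits. */
int replace_fits (size_t len_buf, size_t n, size_t len_rp, size_t size)
{
    return len_buf - n + len_rp + 1 <= size;
}
```

With a 5 byte buffer, replacing 1 character of a 3 character string with 2 characters gives 4 characters plus the terminator, which just fits; replacing it with 3 characters gives 5 characters plus the terminator, which does not.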