What does "escaping" in a string really mean?


The escape-sequence replacements are:

\n is replaced by the newline character
\r is replaced by the carriage-return character
\t is replaced by the tab character
\$ is replaced by the dollar sign itself ($)
\" is replaced by a single double-quote (")
\\ is replaced by a single backslash (\)
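The list above is for PHP double-quoted strings, but most of the same sequences exist in C++ string literals (not \$, since the dollar sign has no special meaning there). Here is a minimal sketch of the "replacement" idea; the function names are made up for illustration, and the numeric codes assume an ASCII-compatible character set.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the literal below is ten characters of source text
// between the quotes, but the resulting string holds only five characters.
std::string escaped_sample() {
    return "\n\r\t\"\\";
}

// Each two-character escape sequence became one character in the string.
void check_escaped_sample() {
    const std::string s = escaped_sample();
    assert(s.size() == 5);
    assert(s[0] == 10);  // \n -> newline (ASCII assumed)
    assert(s[1] == 13);  // \r -> carriage return
    assert(s[2] == 9);   // \t -> tab
    assert(s[3] == 34);  // \" -> double quote
    assert(s[4] == 92);  // \\ -> one backslash
}
```

Calling check_escaped_sample() from main should pass silently, which is the whole point: the backslash never ends up in the string unless you write it twice.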

Does it mean "translate"/"convert"/"substitute", or does it mean "ignore this single quote and don't take it as the ending single quote"? I thought the former, but some Google results mention the latter. Mostly, though, Googling brings up irrelevant results, such as links related to mysqli_real_escape_string.

It's a term from the days when we used terminals. I used to tap the escape key before typing something to get another function or character.

Today it's used when you have a language that, for example, expects strings in quote marks. So how do you put a quote mark inside the string? (See your question!)

Every language that displays human-readable text has "non-printable" characters that must be "displayed" sometimes. When coding, you need to somehow be able to display those characters (in the code, not during execution) in a human-readable way or you'd go insane trying to read the code. Thus a language needs to be able to tell the compiler or interpreter what is to be seen as parsable code versus "string literals", which are NOT to be parsed for any special meaning in the language.

For accuracy's sake, I'll point out that the above is an over-generalization. You could design an entire language where there are no string literals, but most of the main ones have to have them.

For those that need to have string literals, there must be something that tells the compiler (or interpreter) "This is a string literal". That's usually done by surrounding the string literal with single or double quotes, but there are exceptions.

So if I am the, let's say, C++ compiler and I come across double quotes...

string str = "This is a string.";

Easy enough. I have a bunch of characters. The ones INSIDE the quotes are NOT to be parsed as code. The ones OUTSIDE the quotes ARE to be parsed as code. I throw the quotes away. They serve simply to denote what is to be parsed as code and what is not. str contains This is a string.

Now suppose I want to DISPLAY quotes or for whatever reason have quotes as part of my string variable. The code below, in C++, will make the parser unhappy.

string str = "These are quotes: "";

Without escape characters, I (the C++ parser) see THREE quote characters, and I need them in pairs when denoting the boundaries of string literals, so I have to throw an error when I get to the third double quote. Now WITH an escape character, the backslash, no error...

string str = "These are quotes: \"";

str contains These are quotes: "

The backslash tells the parser to NOT treat the quotes immediately after it as denoting the BEGINNING or END of the string literal, but instead as PART of the string literal.
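To see that the escaped quote really does end up as PART of the string, here is a small sketch. The function names are invented for this post; the second function uses a C++11 raw string literal, which is one of those "exceptions" where the delimiters are different and no backslash is needed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: same literal as in the example above.
std::string quotes_escaped() {
    return "These are quotes: \"";
}

// Hypothetical helper: raw string literal (C++11). The literal is delimited
// by R"( and )", so a lone double quote inside needs no escaping.
std::string quotes_raw() {
    return R"(These are quotes: ")";
}

void check_quotes() {
    const std::string s = quotes_escaped();
    assert(s.size() == 19);       // 18 characters of text plus the quote
    assert(s.back() == '"');      // the quote is data, not a delimiter
    assert(s == quotes_raw());    // both spellings give the same string
}
```

Either way the compiler stores the same 19 characters; the escape only changes how you spell them in the source file.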

In summary, every language must be parsed into meaningful sections that can be turned into "tokens" (roughly, anything meaningful to the language). A string literal is a token. To that end, every language has rules and keywords and characters that denote where tokens start and end. However, all of those keywords and characters can be PART of a token rather than denoting the END of a token, and that requires "escaping" that keyword or character so that it is interpreted as such. In C++ it's done with a backslash. Thus \\ is an "escaped" backslash, which means "treat this as a single backslash character and not an 'escape'".
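If it helps, here is roughly what "I, the compiler" do when reading a string literal token. This is only a toy sketch, not how any real C++ compiler is written: the function name read_string_literal is made up, and it handles just four escapes (\" \\ \n \t).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical toy lexer step: reads one double-quoted literal starting at
// src[pos] and returns its decoded contents.
std::string read_string_literal(const std::string& src, std::size_t pos = 0) {
    if (pos >= src.size() || src[pos] != '"')
        throw std::invalid_argument("expected opening quote");
    std::string out;
    for (std::size_t i = pos + 1; i < src.size(); ++i) {
        const char c = src[i];
        if (c == '"') return out;             // unescaped quote: token ends
        if (c != '\\') { out += c; continue; } // ordinary character
        if (++i == src.size()) break;          // backslash at end of input
        switch (src[i]) {                      // character after the backslash
            case '"':  out += '"';  break;
            case '\\': out += '\\'; break;
            case 'n':  out += '\n'; break;
            case 't':  out += '\t'; break;
            default: throw std::invalid_argument("unknown escape sequence");
        }
    }
    throw std::invalid_argument("unterminated string literal");
}

void check_read_string_literal() {
    // Source text: "These are quotes: \""
    assert(read_string_literal(R"("These are quotes: \"")")
           == "These are quotes: \"");
    // Source text: "a\\b" decodes to a, one backslash, b
    assert(read_string_literal(R"("a\\b")") == "a\\b");
}
```

Notice that both meanings from the question show up in the same loop: the backslash stops the quote from ending the token, and it also triggers a substitution (\n becomes a newline).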

\$ is replaced by the dollar sign itself ($)

Note that languages vary. The above is needed in PHP but not C++ because the dollar sign has special meaning in PHP, but not C++.

Note: A quick Google search suggests I may have been using the word "parse" incorrectly, and what I am referring to above as parsing may actually be the "lexer" stage.

Hmm. Well I lost my post yet again. Anyway, to the OP, your link...

has some problems in this example regarding the parsing and escaping. The point is made just fine regarding the $ and how it parses in single vs. double quotes. But the output it shows is incorrect (run it and see; the exclamation point does not disappear, among other things). You should delete the print "<br />"; line, change \\n to \n, and run it from the command line rather than in the web/HTML to get any interesting results. As always, experiment.

   $variable = "name";
   $literally = 'My $variable will not print!\\n';

   print "<br />";

   $literally = "My $variable will print!\\n";

Escaping is a pretty big subject - sanitising data for queries and possibly re-casting datatypes. Backslashing certain "sensitive" characters, like $ for creating literals. Backslashing single and double quotes within those types of quotes... htmlentities and htmlspecialchars can also be used - especially on < and >.

However - try not to escape with backslashes or htmlentities when storing data in a DB, as this may make things difficult when querying: a search for "O'Dowd" will not match a stored "O\'Dowd" or "O&#8217;Dowd". In most cases, simply use them when displaying data on the screen.
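A quick way to convince yourself of the search problem, sketched in C++ to match the earlier examples in this thread. The function backslash_escape_quotes is a made-up stand-in for whatever escaping was applied before storage, not a real library call.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: puts a backslash in front of every single quote,
// the way over-eager escaping might before storing a value.
std::string backslash_escape_quotes(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        if (c == '\'') out += '\\';  // insert the extra backslash
        out += c;
    }
    return out;
}

void check_stored_value() {
    const std::string typed  = "O'Dowd";                        // what the user searches for
    const std::string stored = backslash_escape_quotes(typed);  // what ended up stored
    assert(stored == "O\\'Dowd");  // seven characters, including a real backslash
    assert(stored != typed);       // so an exact-match search fails
}
```

The stored value now contains a literal backslash that the search term does not, so a plain comparison never matches.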

Thanks guys!

I figured this out myself. Escaping means different things at different times. Sometimes it means translating, converting, or substituting one thing for another (e.g. converting \n to a newline).
Sometimes it means do not take this/these single quote(s) as the closing single quote in a string (eg. echo 'get lost you \'bird brain\' dude').
I figured all this out myself as tutorials weren't that clear thus complicating things. I wanted an experienced person to confirm my guesses. And, it's just been done!