I want to start building an IDE, or at least add a syntax highlighter to my program. At the moment it's a simple text editor with basic copy/paste functions.

How does Notepad++ do it? Is there a way I can integrate that into my own program? If not then I'll write it on my own. Also how does codeblocks compile our programs. I read that VC++ took 17 years to be created so how come codeblocks is so new and yet it compiles sourcecode :S

Also if codeblocks is using a backend compiler can I use that same thing in my program or is it going to take me years too?

Edited by triumphost


I have made a few compilers for esoteric programming languages (I have no idea why, but I find them fun to write :P), and even a simple compiler can be very tricky; a C++ compiler would be near impossible to write on your own. Luckily, you can invoke g++ from the command line, so you can compile your code without writing the compiler yourself. Syntax highlighting is not as tricky, but it is still a challenge. I have written one highlighter that turns a C++ source file into a properly highlighted HTML output file. I took a string as input, made an array of the same length holding an enum of code states, and then wrote a function that runs through the string and sets each array element to the right highlighting state. Here is an example:

#include <iostream>
using namespace std;
int main()
{
    cout<<"Hello World!"<<endl;
    return 0;
}


In the state array, every character of the source gets one letter, where p=PREPROCESSOR, k=KEYWORD, o=OPERATOR, c=REGULAR CODE, s=STRING LITERAL, n=NUMERIC LITERAL. If you need more help I could give you some source code, but I think you should try to figure it out yourself.
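The state-array idea described above could be sketched roughly as follows. This is not the poster's code, only a minimal illustration: the function name `classify`, the tiny keyword list, and the operator set are assumptions, and comments, character literals, and multi-line directives are deliberately ignored.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Returns one state letter per source character, using the legend above:
// p=preprocessor, k=keyword, o=operator, c=regular code, s=string, n=number.
// Simplifications (assumptions of this sketch): any '#' starts a directive,
// and comments and character literals are not recognised.
std::string classify(const std::string& src) {
    static const std::set<std::string> keywords = {
        "using", "namespace", "int", "return", "if", "else", "for", "while"};
    static const std::string operators = "+-*/%=<>!&|^~?:;,.(){}[]";

    std::string states(src.size(), 'c');  // default state: regular code
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char ch = static_cast<unsigned char>(src[i]);
        if (src[i] == '#') {
            // Preprocessor directive: runs to the end of the line.
            while (i < src.size() && src[i] != '\n') states[i++] = 'p';
        } else if (src[i] == '"') {
            // String literal: runs to the closing quote, skipping escapes.
            states[i++] = 's';
            while (i < src.size() && src[i] != '"') {
                if (src[i] == '\\' && i + 1 < src.size()) states[i++] = 's';
                states[i++] = 's';
            }
            if (i < src.size()) states[i++] = 's';  // the closing quote
        } else if (std::isdigit(ch)) {
            // Numeric literal (suffixes such as 10u or 0x1F are swallowed too).
            while (i < src.size() &&
                   std::isalnum(static_cast<unsigned char>(src[i])))
                states[i++] = 'n';
        } else if (std::isalpha(ch) || src[i] == '_') {
            // Identifier: marked as keyword only if it is in the list.
            const std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_'))
                ++i;
            if (keywords.count(src.substr(start, i - start)))
                for (std::size_t j = start; j < i; ++j) states[j] = 'k';
        } else {
            if (operators.find(src[i]) != std::string::npos) states[i] = 'o';
            ++i;
        }
    }
    return states;
}
```

With this sketch, `classify("int x = 42;")` yields `kkkcccocnno`, one letter per input character. An editor would then walk the two strings in parallel and switch colour whenever the state letter changes.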


How does Notepad++ do it

Notepad++ uses Scintilla as a back-end for the syntax highlighting and other coding-related features (code completion, etc.). Most IDEs and enhanced text editors (like Notepad++, emacs, vim, etc.) use only a limited number of libraries that do the actual syntax highlighting. There is no reason to reinvent the wheel: many syntax highlighting engines are available that take raw text and produce marked-up output with all the colouring in it, and others handle the entire text editing too.

Most projects like this just start off simple and build up over the years. Basic syntax highlighting is quite simple: just look for the keywords of a given language (e.g., in C++, the list could be "for if else while do void int float double ...") and highlight them with the configured color or font. Once you do that, you can start checking indentation, then record variable and type names and highlight those too, and so on, until you have a complete system.
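As a rough illustration of the keyword-lookup step, the sketch below splits the text into words and wraps any word found in a keyword set in `<b>` tags, escaping the characters that HTML treats specially. The function name `highlight_keywords` and the choice of `<b>` as the markup are assumptions made for the example; a real editor would apply its own colours or fonts instead.

```cpp
#include <cctype>
#include <set>
#include <string>

// Wraps every whole word that appears in `keywords` in <b>...</b> and
// escapes &, < and > so the result can be placed inside an HTML <pre> block.
// Strings and comments are not treated specially in this sketch, so a
// keyword inside a string literal would be highlighted as well.
std::string highlight_keywords(const std::string& src,
                               const std::set<std::string>& keywords) {
    std::string out;
    std::string word;  // identifier characters collected so far

    // Emits the pending word, highlighted if it is a keyword.
    auto flush = [&]() {
        if (word.empty()) return;
        if (keywords.count(word))
            out += "<b>" + word + "</b>";
        else
            out += word;
        word.clear();
    };

    for (char c : src) {
        if (std::isalnum(static_cast<unsigned char>(c)) || c == '_') {
            word += c;  // still inside a word
        } else {
            flush();    // a non-word character ends the current word
            switch (c) {
                case '&': out += "&amp;"; break;
                case '<': out += "&lt;";  break;
                case '>': out += "&gt;";  break;
                default:  out += c;       break;
            }
        }
    }
    flush();  // the input may end in the middle of a word
    return out;
}
```

Because whole words are compared, `printf` is left alone even though it contains `int`. Matching on substrings instead is a common first mistake with this approach.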

Also how does codeblocks compile our programs.

Codeblocks does not compile programs itself. When you install Codeblocks on Windows with the bundled installer, it also installs MinGW, which is a minimalist port for Windows of GCC (GNU Compiler Collection, which includes a C++ compiler amongst many other things). GCC is one of the oldest and most robust compiler suites out there, and it is free and open-source (obviously, it's GNU). MinGW exists because GCC is primarily developed as a Unix (or Unix-like) tool.

Codeblocks, like most other IDEs, can also work with many different compilers. A compiler is basically just a console program that takes the source files and some options (include directories, libraries to link, compilation options, etc.) as command-line arguments and produces the binaries (executables, DLLs, or whatever). The IDE simply invokes the compiler with your source files, picks up its output (warnings, errors, etc.), and shows it to you. Most IDEs can be configured to do this with any compiler. The main C++ compilers are GCC (g++), ICC (the Intel C++ Compiler), MSVC (the Microsoft Visual C++ compiler, called cl.exe), Clang, Comeau, and the Digital Mars compiler.
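The "invoke the compiler and pick up its output" step might look something like the sketch below. It assumes a platform that provides `popen`/`pclose` (POSIX) or `_popen`/`_pclose` (Windows C runtime); the names `RunResult` and `run_and_capture` are made up for the example, and the command is passed to the system shell as-is, so it must already be quoted correctly.

```cpp
#include <stdio.h>
#include <string>

#ifdef _WIN32
#define IDE_POPEN _popen
#define IDE_PCLOSE _pclose
#else
#define IDE_POPEN popen
#define IDE_PCLOSE pclose
#endif

// What the IDE gets back from one compiler run.
struct RunResult {
    int status;          // value returned by pclose; 0 normally means success
    std::string output;  // everything the command printed
};

// Runs `command` through the shell and collects its output. Compilers
// print diagnostics on stderr, so "2>&1" merges stderr into stdout.
RunResult run_and_capture(const std::string& command) {
    RunResult result{-1, ""};
    FILE* pipe = IDE_POPEN((command + " 2>&1").c_str(), "r");
    if (!pipe) return result;  // the shell could not be started

    char buffer[4096];
    while (fgets(buffer, sizeof buffer, pipe)) result.output += buffer;

    result.status = IDE_PCLOSE(pipe);
    return result;
}
```

An IDE could call `run_and_capture("g++ -Wall -o my_project.exe my_source1.cpp")` and show `output` in a text box below the editor. Note that this blocks until the compiler finishes, so a real GUI would run it on a worker thread.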

I read that VC++ took 17 years to be created so how come codeblocks is so new and yet it compiles sourcecode :S

The compiler, MSVC, probably did take a very long time to produce, especially given the sometimes dubious competence of Microsoft programmers. GCC and ICC and others are even older. Creating a good compiler is not a small task. Also, you have to understand that programming languages evolve and compilers evolve along with them. For example, the 2008 version of MSVC was pretty much the first reasonably standard-compliant C++ compiler from Microsoft (for the 2003 standard for C++), and now there is a new standard for C++ (called C++11), so they have a load of work to update it again. This is a natural evolution, and it is also why there aren't that many compilers, and why many compilers share the same front-ends (which translate your code into an intermediate representation) or back-ends (which translate that intermediate representation into machine code). It is a monumental task to create a compiler, especially a production-grade one. Companies that provide compilers do so for a good reason: Microsoft needs one to allow seamless Windows development, because most other compilers are Unix-based, and Intel needs one so that people who want top performance can make full use of its hardware.

Also if codeblocks is using a backend compiler can I use that same thing in my program or is it going to take me years too?

You can use the compiler directly, either from the command line or from within your program (e.g. with a system() call). In fact, many experienced programmers don't use an IDE for compiling at all. There is also more to it than just an IDE and a compiler: there is the IDE (or text editor), then the build system, and then the compiler (and then the linker, but that is usually part of the compiler invocation). The build system is what handles the invocation of the compiler. It is usually a pseudo-scripting language that lets you configure your build (or "project"), and it then does all the monkey-work of invoking the compiler correctly (for example, the vcproj files are just script files for the MSVC build system). Many projects rely only on a build system, because that doesn't require a particular IDE, a particular compiler, or even a particular OS; all you need is the source code and the build script and you're good to go. Personally, I much prefer that, and most open-source projects are done like that too.

So, if you build your own IDE program, you probably want to simply create the menus necessary to configure an off-the-shelf build system; then you won't even have to worry about the compiler. I highly recommend CMake, which is one of the most popular and powerful build systems.

Edited by mike_2000_17: typo


K well I designed my Interface already:

I looked at CMake. I'm not sure, but it seems like it's for building my project. I don't want to build my project; I want my project to build the code in the text editor. I've looked at G++ commands and used a couple, but my commands are extremely long!

ProjectFolder>g++ main.cpp -I"Libraries/ZLib" -I"Libraries/Curl" -I"Libraries/Boost" -I"........\MinGW\include" -L"C:\MinGW\lib\libgdi32.a" -L"Libraries\ZLib\libz.a" -L"curl" -L"ws2_32" -L"wldap32" -L"winmm" -L"Libraries\Boost\lib\libboost_regex-mgw46-mt-1_47.a" -L"Libraries\Boost\lib" -l"gdiplus"

Like that but that's only HALF my command :S.. It also gives me a ton of unresolved symbols and external stuff missing. I'm not sure if order matters or not but that command is so long that there has to be an easier or another way.

How can I get my program to do that for me?

Edited by triumphost


How can I get my program to do that for me?

That's what CMake does!

This is the typical process from writing some code in an IDE to getting it into an executable:

  • Use the text-editor (or text-editing part of the IDE) to write some code and save it to files (headers and cpp files, or others);
  • Use the configuration menus to set up things like:

    • Include-paths to look for the #included headers;
    • Compilation options (optimizations, language standard, and other more advanced things);
    • External libraries to be linked to the project (the .lib, .a, .dll and .so files needed by the project);
    • Warnings to be enabled / disabled (should enable all by default);
    • Whether it is a Debug build or a Release build;
    • Destination directory for the final executable or DLL, and directories for intermediate output files (object files);
    • etc.
  • Click on the "Build" button (or "Run") which triggers the IDE to:

    • Generate a build script from all the given configurations;
    • Execute the build script through a back-end console or terminal (or shell);
    • Redirect the output of that back-end console to display it within the IDE in one way or another (either a simple reprinting of it to a text-box, often below the coding area, or parse it to collect the errors and warnings into some list-view or something like that);
  • If the build is successful, run the executable (if any).
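For the "parse it to collect the errors and warnings" option in the list above, one possible sketch is shown here. It assumes the `file:line:column: severity: message` layout that GCC and Clang print by default; other compilers (MSVC, for instance) use a different layout and would need a different pattern. The names `Diagnostic` and `parse_diagnostics` are invented for the example.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// One entry for the IDE's error list.
struct Diagnostic {
    std::string file;
    int line;
    int column;
    std::string severity;  // "error", "warning", "note" or "fatal error"
    std::string message;
};

// Extracts the lines of the form "file:line:column: severity: message"
// from captured compiler output. Lines that do not match (source
// excerpts, "In file included from ..." and so on) are skipped.
std::vector<Diagnostic> parse_diagnostics(const std::string& compiler_output) {
    // The lazy (.+?) keeps Windows paths such as C:\src\main.cpp intact,
    // because the drive-letter colon is not followed by a line number.
    static const std::regex pattern(
        R"((.+?):(\d+):(\d+): (fatal error|error|warning|note): (.*))");

    std::vector<Diagnostic> found;
    std::istringstream lines(compiler_output);
    std::string line;
    std::smatch m;
    while (std::getline(lines, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF
        if (std::regex_match(line, m, pattern)) {
            found.push_back({m[1].str(), std::stoi(m[2].str()),
                             std::stoi(m[3].str()), m[4].str(), m[5].str()});
        }
    }
    return found;
}
```

Each `Diagnostic` can then become one row in a list view, and a double-click on the row can jump the editor to `file` at `line` and `column`.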

Once you get the text-editing part working, with code-highlighting and whatever else you want, then you have to worry about the remaining steps.

First, of course, you have to create some menus and all that to allow the user to input all the configurations. That's easy enough, mostly drag-dropping some GUI components on a dialog box and writing some code to record all these parameters. Then, you have to (1) find a way to save/load those configurations to a file such that the user can reload them later, and (2) generate a build script from these parameters.

Saving and loading the configurations can be done in a variety of ways. That's not a huge issue.
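One of the simplest of those ways is a plain `key=value` text file, sketched below. The struct `BuildSettings` and its fields are hypothetical stand-ins for whatever the configuration dialogs collect, and the format makes no attempt to handle values that contain newlines.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical set of options collected from the IDE's dialogs.
struct BuildSettings {
    std::string exec_file_name;
    std::string warning_flags;
    std::vector<std::string> source_files;
};

// Writes one "key=value" line per setting; repeated keys form a list.
void save_settings(std::ostream& out, const BuildSettings& s) {
    out << "exec=" << s.exec_file_name << '\n';
    out << "warnings=" << s.warning_flags << '\n';
    for (const std::string& f : s.source_files) out << "source=" << f << '\n';
}

// Reads the format written by save_settings. Unknown keys and lines
// without '=' are ignored, so older files keep loading after new keys
// are added.
BuildSettings load_settings(std::istream& in) {
    BuildSettings s;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;
        const std::string key = line.substr(0, eq);
        const std::string value = line.substr(eq + 1);
        if (key == "exec") s.exec_file_name = value;
        else if (key == "warnings") s.warning_flags = value;
        else if (key == "source") s.source_files.push_back(value);
    }
    return s;
}
```

Both functions take stream references, so the same code works with a `std::ofstream`/`std::ifstream` for a project file on disk or with a `std::stringstream` in a test.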

Generating the build script can be as simple as constructing a string which is the command to run the compiler. That's easy, you just do something like this (say that you have a struct called config that holds all the configuration parameters):

std::stringstream ss;
ss << "g++ " << config.warning_flags << " -o " << config.exec_file_name << " ";
for(std::size_t i = 0; i < config.source_files.size(); ++i)
  ss << config.source_files[i] << " ";

At the end you have a string (within the stringstream) that you can use as a system command:

system(ss.str().c_str()); // resulting in: "g++ -Wall -o my_project.exe my_source1.cpp my_source2.cpp ... "

And that's pretty much it, for a very crude build script.
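A slightly fuller version of the same idea, with include paths and libraries, could look like the sketch below. With g++, `-I` adds an include directory, `-L` adds a directory to search for libraries, and `-l` names a library to link (`-lz` looks for `libz.a` or a shared equivalent in the search directories). Passing a `.a` file to `-L` does nothing useful, which is a common cause of unresolved symbols. Libraries are placed after the source files because the GNU linker resolves symbols from left to right. The struct `CompileConfig`, its fields, and the naive quoting helper are assumptions of this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical configuration gathered from the IDE's dialogs.
struct CompileConfig {
    std::string warning_flags;              // e.g. "-Wall"
    std::string exec_file_name;             // e.g. "my_project.exe"
    std::vector<std::string> source_files;  // e.g. "main.cpp"
    std::vector<std::string> include_dirs;  // passed with -I
    std::vector<std::string> library_dirs;  // passed with -L (directories)
    std::vector<std::string> libraries;     // passed with -l (names only)
};

// Naive quoting so that paths with spaces survive the shell. It does not
// escape quote characters inside the path itself.
std::string quoted(const std::string& s) { return "\"" + s + "\""; }

// Builds a single g++ command line from the configuration.
std::string build_command(const CompileConfig& c) {
    std::ostringstream ss;
    ss << "g++";
    if (!c.warning_flags.empty()) ss << ' ' << c.warning_flags;
    for (const std::string& d : c.include_dirs) ss << " -I" << quoted(d);
    ss << " -o " << quoted(c.exec_file_name);
    for (const std::string& f : c.source_files) ss << ' ' << quoted(f);
    // Library options go last so the linker has already seen the code
    // that refers to them.
    for (const std::string& d : c.library_dirs) ss << " -L" << quoted(d);
    for (const std::string& l : c.libraries) ss << " -l" << l;
    return ss.str();
}
```

For one source file, one include directory, one library directory and the libraries `z` and `ws2_32`, this produces `g++ -Wall -I"Libraries/ZLib" -o "app.exe" "main.cpp" -L"Libraries/Boost/lib" -lz -lws2_32`.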

In reality, this is way too naive and won't be seriously useful to anyone. A proper build script generator would need to inspect the system to find a proper compiler, figure out the command needed to invoke it, scan system folders for installed libraries to link to, do some intermediate steps like caching configurations in some hidden build-directory (so as not to redo the same things again), and so on.

The reason why I suggested CMake is that it takes care of all these steps. All you have to do is generate a file called "CMakeLists.txt" in the source directory and call CMake on the source directory (e.g. "cmake C:\Projects\MyProject\src"). The advantage is that generating the CMake file is very simple: you don't have to worry about a lot of the details of the platform (available compilers, system libraries, etc.) because CMake does that for you, and all you need to put in the CMake file are exactly the kinds of things that the user would provide you with, e.g., include-paths, external libraries, compilation options, etc. So, you can almost literally just save what the user gives you into a CMakeLists.txt file and then invoke CMake on that, avoiding all the nasty steps.
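Generating such a file from the user's settings might look like the sketch below. The CMake commands it emits (`cmake_minimum_required`, `project`, `add_executable`, `target_include_directories`, `target_link_libraries`, `target_compile_options`) are standard, but the struct `CMakeProject`, the helper names, and the minimum version 3.10 are assumptions chosen for the example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical project description collected from the IDE's dialogs.
// Paths should use forward slashes: a backslash inside a quoted CMake
// string starts an escape sequence.
struct CMakeProject {
    std::string name;                          // also used as target name
    std::vector<std::string> source_files;
    std::vector<std::string> include_dirs;
    std::vector<std::string> libraries;        // e.g. "z", "ws2_32"
    std::vector<std::string> compile_options;  // e.g. "-Wall"
};

// Emits "command(target PRIVATE item item ...)" unless the list is empty.
static void emit_target_command(std::ostringstream& out,
                                const std::string& command,
                                const std::string& target,
                                const std::vector<std::string>& items,
                                bool quote_items) {
    if (items.empty()) return;
    out << command << '(' << target << " PRIVATE";
    for (const std::string& item : items) {
        if (quote_items) out << " \"" << item << '"';
        else out << ' ' << item;
    }
    out << ")\n";
}

// Returns the text of a CMakeLists.txt for a single executable target.
std::string make_cmakelists(const CMakeProject& p) {
    std::ostringstream out;
    out << "cmake_minimum_required(VERSION 3.10)\n";
    out << "project(" << p.name << " CXX)\n\n";
    out << "add_executable(" << p.name;
    for (const std::string& s : p.source_files) out << " \"" << s << '"';
    out << ")\n";
    emit_target_command(out, "target_include_directories", p.name,
                        p.include_dirs, true);
    emit_target_command(out, "target_link_libraries", p.name,
                        p.libraries, false);
    emit_target_command(out, "target_compile_options", p.name,
                        p.compile_options, false);
    return out.str();
}
```

The IDE would write the returned text to CMakeLists.txt in the source directory and then run CMake on that directory. The compile options are written through unchanged, so flags such as `-Wall` tie the file to GCC-style compilers; a more careful generator would pick them per compiler.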

BTW, the command you showed was really short. In many projects, the resulting command-line that invokes the compiler can fill an entire screen or more of the command prompt. But nobody actually writes out the compiler command manually: either you do it through an IDE, or you use a build-system like CMake, or Make, or autoconf, or MSBuild, or bjam, or qmake, or one of a plethora of other build-systems. And no respectable build-system lacks the ability to handle highly customized build scripts, meaning that most build-systems are basically full-blown script interpreters, and CMake is amongst those that I like the best. All I'm saying is that for your IDE project, you should not program your own build-system for that IDE, but simply make your IDE a thin shell around an already existing build-system (most IDEs are built like that too: Visual Studio uses MSBuild, QDevelop uses qmake (which is very similar to CMake), etc.).

Edited by mike_2000_17: typo
