Hi,
I want to parse PHP entities (classes, variables, functions, methods, and properties) and save information about them to a database (function signature, variable name, etc.). I'm looking for techniques to achieve that, in any language. So far I'm thinking of using regex to match things, but there might be better ways. I want to do almost the same thing ctags does.

Is there any technique other than using regex to match patterns?
Thanks!

I want to do something like:
open PHP file >>> parse entities >>> save to database (line number, signature, file path, etc.)
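
For the last step of that pipeline, here is a minimal sketch in Python using the standard `sqlite3` module. The table name `entities`, its columns, and the helpers `save_entities` and `find_by_name` are all hypothetical names chosen for illustration, and the rows are typed in by hand rather than produced by a parser.

```python
import sqlite3

def save_entities(conn, rows):
    """Store (name, kind, signature, filepath, line) tuples.

    The `entities` table and its columns are assumptions for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entities ("
        "name TEXT, kind TEXT, signature TEXT, filepath TEXT, line INTEGER)"
    )
    conn.executemany("INSERT INTO entities VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

def find_by_name(conn, name):
    """Return (kind, filepath, line) for every stored entity with this name."""
    cur = conn.execute(
        "SELECT kind, filepath, line FROM entities WHERE name = ?", (name,)
    )
    return cur.fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path to persist the data
    save_entities(conn, [
        ("YaddaClass", "class", "class YaddaClass", "ctags.php", 2),
        ("getYadda", "method", "public function getYadda($input)", "ctags.php", 8),
    ])
    print(find_by_name(conn, "getYadda"))
```

Keeping the storage step separate from the parsing step means the parser can be swapped later without touching the database code.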

An example in PHP is fine, no problem with that :)
Thanks!

I think you have to write a lexical analyzer for that, providing rules to identify lexemes. Try Lex together with Yacc (Lex generates the lexer, Yacc the parser) -> http://en.wikipedia.org/wiki/Yacc

Can you explain why I need that? I don't need syntax highlighting. Let me illustrate with an example. I have a file like this one:

<?php
class YaddaClass{
    private $variable;
    public function __construct(){
    
    }
    
    public function getYadda($input){
    
    }

}

?>

Now I need a parser to give me something like:

YaddaClass	ctags.php	2;"	c
YaddaClass::__construct	ctags.php	4;"	m
YaddaClass::getYadda	ctags.php	8;"	f
variable	ctags.php	3;"	p

To do syntax highlighting, you first have to identify the syntax to be highlighted. I think that identification is the first step you have to take too. You do not have to do the highlighting itself. You just need to identify the keywords, variable names, functions, etc.

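Using Java Reflection we can inspect the fields (member variables) of classes and get or set them at runtime. PHP has a Reflection API as well (ReflectionClass and related classes), but it works on code that has been loaded, not on source files you only want to scan. My best guess is that you have to implement a lexical analyzer.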
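
To make the lexical analyzer idea concrete, here is a rough sketch in Python: a regex-driven tokenizer plus a small state machine that tracks brace depth and the current class. It is not a full PHP lexer and it is not how ctags is implemented. The names `TOKEN_RE`, `extract_entities`, and `format_tags` are made up for this example, and it deliberately ignores heredocs, `function &name`, and multiple properties declared on one line.

```python
import re

# Patterns are tried in order. Comments and strings come first so that
# keywords inside them are skipped instead of being taken for declarations.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|\#[^\n]*|/\*.*?\*/)
  | (?P<string>'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*")
  | (?P<variable>\$[A-Za-z_]\w*)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<newline>\n)
  | (?P<other>.)
""", re.VERBOSE | re.DOTALL)

MODIFIERS = ("public", "private", "protected", "var", "static")

def extract_entities(source):
    """Return (name, kind, line) tuples; kind is 'c', 'm', 'f' or 'p'."""
    entities = []
    line = 1
    depth = 0              # current brace nesting
    class_name = None      # class whose body we are inside, if any
    class_depth = None     # brace depth of that class body
    pending = None         # 'class' or 'function' keyword just seen
    modifier_seen = False  # a visibility keyword precedes the next variable
    for m in TOKEN_RE.finditer(source):
        kind, text = m.lastgroup, m.group()
        if kind in ("comment", "string", "newline"):
            line += text.count("\n")
        elif kind == "word":
            lower = text.lower()
            if pending == "class":
                class_name, class_depth, pending = text, depth + 1, None
                entities.append((text, "c", line))
            elif pending == "function":
                pending = None
                if class_name:
                    entities.append((class_name + "::" + text, "m", line))
                else:
                    entities.append((text, "f", line))
            elif lower in ("class", "function"):
                pending, modifier_seen = lower, False
            elif lower in MODIFIERS:
                modifier_seen = True
        elif kind == "variable":
            # Only variables directly in a class body count as properties.
            if modifier_seen and class_name and depth == class_depth:
                entities.append((text[1:], "p", line))
            modifier_seen = False
        elif not text.isspace():
            pending = None  # e.g. anonymous function or Foo::class
            if text in ";{}":
                modifier_seen = False
            if text == "{":
                depth += 1
            elif text == "}":
                depth -= 1
                if class_name and depth < class_depth:
                    class_name = class_depth = None
    return entities

def format_tags(entities, filename):
    """Render entities as ctags-style lines with line-number addresses."""
    return ['%s\t%s\t%d;"\t%s' % (name, filename, line, kind)
            for name, kind, line in entities]

SAMPLE = """<?php
class YaddaClass{
    private $variable;
    public function __construct(){

    }

    public function getYadda($input){

    }

}

?>
"""

if __name__ == "__main__":
    print("\n".join(format_tags(extract_entities(SAMPLE), "ctags.php")))
```

The tokenizer only classifies text. The loop after it decides what a token means from the tokens before it, which is the part that plain pattern matching on whole lines tends to get wrong. If you stay in PHP, its built-in tokenizer extension (`token_get_all`) can replace the regex part.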

I don't need any highlighting. I need to parse files and get entities, like in the example I showed above. I need to know how to do it the way ctags does, but not everything that ctags does!

Yes, I know that you don't want any syntax highlighting. That is why I said that you just need to identify the keywords, variable names, functions, etc. using a lexical analyzer. ctags itself is a good option if you can read the tags file line by line and split each line to pick out the useful parts.
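
Reading the tags file that way could look roughly like this in Python. The sketch assumes tab-separated lines whose third field is a line number followed by `;"`, as in the sample output earlier in the thread. Tags files that use `/pattern/` addresses need more careful splitting, so those lines are skipped here. The function name `parse_tag_line` and the dictionary keys are made up for the example.

```python
def parse_tag_line(line):
    """Split one tags-file line into a dict, or return None if it is skipped."""
    line = line.rstrip("\n")
    # Lines starting with !_TAG_ are metadata written by ctags, not tags.
    if not line or line.startswith("!_TAG_"):
        return None
    fields = line.split("\t")
    if len(fields) < 3:
        return None
    name, path, address = fields[0], fields[1], fields[2]
    # Assumption: the address looks like 2;" (a line number), not a /pattern/.
    number = address.rstrip('"').rstrip(";")
    if not number.isdigit():
        return None
    kind = fields[3] if len(fields) > 3 else ""
    return {"name": name, "file": path, "line": int(number), "kind": kind}

def parse_tags(lines):
    """Parse an iterable of lines (for example an open file) into a list of dicts."""
    return [tag for tag in map(parse_tag_line, lines) if tag is not None]

if __name__ == "__main__":
    sample = [
        'YaddaClass\tctags.php\t2;"\tc\n',
        'variable\tctags.php\t3;"\tp\n',
    ]
    for tag in parse_tags(sample):
        print(tag)
```

Each resulting dict maps directly onto a database row, so this route avoids writing a parser at all, at the cost of depending on ctags being installed.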

Good then. I want to write something similar to ctags but very lightweight, and I wanted to know the tricks behind it so that I don't fall into a ditch someone has fallen into before :)

I wrote a lexical analyzer in C++ for a simple user-defined language. Since PHP is much more complex, you had better use a lexical analyzer generator. Read http://dinosaur.compilertools.net/

You mean ctags is a lexical analyzer?

Actually, I have no idea how ctags is implemented, but it seems like an advanced lexical analyzer that can handle multiple languages.

So is there any good tutorial for a complete newbie to this area? I have not done this before!