Hiya, I'm currently working on a little Java program that scans through a directory, reads the PHP files in it, and dumps out a list of function names for each file. Does anyone know how I can modify my regex to exclude functions that are inside /* */ comments? I can work out how to match functions inside comments using the following with the DOTALL constant:

(/\*).*?function (.+).*?(\*/)

but I've tried and I can't seem to invert it so that it only matches functions outside comments. Is anyone able to help? Please...

Btw, here is my draft code so far...

import java.io.*;
import java.util.*;
import java.util.regex.*;

public class ScanDir {
	FilenameFilter phpFileFilter;

	public static void main(String[] args) {

		new ScanDir().listOfFiles("/var/www/website/functions/");

	}

	public ScanDir() {
		
		// Set up the filter to include dirs and .php files
		phpFileFilter = new FilenameFilter() {
			public boolean accept(File path, String name) {
				File f = new File(path, name);
				if(f.isDirectory()) {
					return true;
				} else if(name.endsWith(".php")) {
					return true;
				} else {
					return false;
				}	
			}
		};
	}

	public void listOfFiles(String path) {
		File directory = new File(path);

		File[] files = directory.listFiles(phpFileFilter);
		if(files == null) {
			// Might not be a directory or directory does not exist
		} else {
			// Loop through directory listings
			for(int i=0; i<files.length; i++) {
				if(files[i].isDirectory()) {
					this.listOfFiles(files[i].getAbsolutePath());
				} else {
					fetchFunctions(files[i]);
				}
			}
		}
	}

	public void fetchFunctions(File file) {

		// Reads a php file and parses it for functions before printing it
		BufferedReader inputStream = null;

		ArrayList<String> functions = new ArrayList<String>();

		try {
			try {
				StringBuilder text = new StringBuilder();
				inputStream = new BufferedReader(new FileReader(file));
				int c;

				while ((c = inputStream.read()) != -1) {
					text.append( (char) c);
				}

				functions.addAll( scanTextForFunctions(text.toString()) );

			} finally {
				if(inputStream != null) {
					inputStream.close();
				}
			}

		} catch (IOException e) {
			System.out.println("Cannot process file! " + e.getMessage());
		}

		System.out.println();
		System.out.println(file.getName() + " functions");
		System.out.println("-------------------------------------");
		for(String f : functions) {
			System.out.println(f);
		}
	}

	public ArrayList<String> scanTextForFunctions(String text) {

		// Hunts down functions in the php text

		Pattern pattern = Pattern.compile("function (.+) \\{");
		Matcher matcher = pattern.matcher(text);
		ArrayList<String> functions = new ArrayList<String>();

		while(matcher.find()) {
			functions.add(matcher.group(1));
		}

		return functions;
	}

}

All 2 Replies

If you first run the text through an expression that separates the uncommented code sections from the commented out sections, you can then parse only the uncommented sections.
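As a rough sketch of that two-pass idea (one possible reading of the suggestion, not a definitive implementation): first replace every /* */ block with a space, then run the function regex over what is left. The class name CommentStripSketch, the helper stripBlockComments, and the tightened function pattern are all made up for illustration and are not from the original code. It also assumes that /* never appears inside a PHP string literal and it ignores // and # line comments, so treat it as a starting point only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentStripSketch {

    // Matches a /* ... */ block. DOTALL lets '.' cross line breaks, and the
    // reluctant quantifier (.*?) stops at the first closing */.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Matches "function name(" (optionally "function &name(") and captures
    // only the name. This pattern is an assumption, not the original regex.
    private static final Pattern FUNCTION =
            Pattern.compile("\\bfunction\\s+&?\\s*(\\w+)\\s*\\(");

    // Pass 1: drop the commented-out sections, leaving a space in their place
    // so the tokens on either side of a comment do not run together.
    static String stripBlockComments(String text) {
        return BLOCK_COMMENT.matcher(text).replaceAll(" ");
    }

    // Pass 2: look for function declarations in the uncommented text only.
    static List<String> scanTextForFunctions(String text) {
        List<String> functions = new ArrayList<String>();
        Matcher matcher = FUNCTION.matcher(stripBlockComments(text));
        while (matcher.find()) {
            functions.add(matcher.group(1));
        }
        return functions;
    }

    public static void main(String[] args) {
        String php = "<?php\n"
                + "/* function hidden() {\n"
                + "} */\n"
                + "function visible($a) {\n"
                + "}\n";
        System.out.println(scanTextForFunctions(php)); // prints [visible]
    }
}
```

Splitting the work this way keeps each regex simple. Trying to express "not inside a comment" in a single pattern usually ends up much harder to read and to get right.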

Thank you Ezzaral, I feel a bit silly now lol. I spent so long trying to work out the perfect regex pattern that does it all that I hadn't thought of splitting it up!

Richard
