Hi guys, a while ago I built an application to save strings to a file https://www.daniweb.com/programming/software-development/threads/501460/application-to-save-wordsentences-to-file and I was wondering: what if I wanted to be able to search those strings? Say I choose a search key, like "bring", and search all the strings (and when I say strings I really mean sentences) for that key; when there is a match I display the whole sentence containing that key.
I'm looking into sorting and searching algorithms now, but they all seem to work with ints, like the binary search (which could be a good aid in this project). The thing is though, if I have multiple sentences, how would I sort them first?
Say I have "Sentence number 3", "Sentence number 1", "Sentence number 2", etc

If the term you are searching for can be anywhere in the sentence, then sorting isn't going to be relevant, and binary search won't help. If you will always be searching for whole words, then maybe you could parse out all the words in each sentence and build some kind of map, but it's probably not worth the effort. You are stuck with a simple exhaustive search.

If your dataset is small enough to fit in memory then all you need is to loop through all the entries using something like contains(String searchString) to check each entry - just a few lines of code and execution times well within "almost instant".
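For comparison, here is a rough sketch of the "map of words" idea mentioned above, in case you are curious what it would involve. The class name WordIndex and the choice to lower-case and split on non-word characters are my own assumptions, not anything from your application. It only finds whole words, which is exactly the limitation described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: maps each lower-cased word to the sentences containing it.
class WordIndex {

    static Map<String, List<String>> build(List<String> sentences) {
        Map<String, List<String>> index = new HashMap<>();
        for (String sentence : sentences) {
            // A Set so a word repeated in one sentence adds that sentence only once.
            Set<String> words = new HashSet<>(
                    Arrays.asList(sentence.toLowerCase().split("\\W+")));
            for (String word : words) {
                if (word.isEmpty()) {
                    continue; // split can yield "" when a sentence starts with punctuation
                }
                index.computeIfAbsent(word, k -> new ArrayList<>()).add(sentence);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "Please bring the book",
                "I will bring it tomorrow",
                "Nothing here");
        Map<String, List<String>> index = build(sentences);
        System.out.println(index.get("bring")); // whole word: both sentences
        System.out.println(index.get("brin"));  // partial word: no entry
    }
}
```

That is a fair amount of code for something a plain contains loop does well enough on a small file.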

Cool thanks, I don't think I'll implement the search, I was more curious than anything.
So the binary search can also be implemented with strings not only with ints
Cheers

A binary search can be implemented for any data type or class that can be sorted into a known order, so Strings are definitely possible. The problem here is that binary search works when you know the whole value you are searching for, not when you just know one part of the value that's not necessarily at the start of the value.
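A minimal sketch of that point, using the three example sentences from the first post. Collections.sort and Collections.binarySearch are standard library calls; the class and method names are just made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class StringBinarySearchDemo {

    // Returns the index of an exact match in a sorted list, or a negative number if absent.
    static int findExact(List<String> sortedSentences, String wholeSentence) {
        return Collections.binarySearch(sortedSentences, wholeSentence);
    }

    public static void main(String[] args) {
        List<String> sentences = new ArrayList<>(Arrays.asList(
                "Sentence number 3", "Sentence number 1", "Sentence number 2"));
        Collections.sort(sentences); // natural (lexicographic) String order

        System.out.println(sentences);
        // The whole value is known, so binary search finds it.
        System.out.println(findExact(sentences, "Sentence number 2") >= 0);
        // Only a fragment is known, so binary search cannot find it.
        System.out.println(findExact(sentences, "number 2") >= 0);
    }
}
```

So sorting Strings is no harder than sorting ints; it just doesn't help with "contains" style searches.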
Why not implement the search anyway? It's only a few lines and the practice will be good for you.

OK I understand, thanks.
That's OK, I can implement that. It's just that I was thinking of doing another application, and this might be a bit more than a few lines, because if I implement the functionality I need to implement the GUI as well. That's fine, will do it; you're right in the end, it's a good exercise :-)!
OK, so a few questions:
-can I use the already developed application to save to file rather than redoing it? https://www.daniweb.com/programming/software-development/threads/501460/application-to-save-wordsentences-to-file
-the way I envisage this to work is this: the GUI (I know I have to do the functionality first and then the GUI but I might as well clarify a few things) will have, at the bottom of the window after the current buttons, a label explaining what to do, an extra input field for the keyword to be typed in, a search button to search for the keyword and a new textArea which I can use to display the returned strings.
About the functionality now.

loop through all the entries using something like contains(String searchString) to check each entry

When you say entries, you mean the entries stored in the file presumably?
The thing is though, shouldn't I first somehow save all the entries in the file in some kind of array and then loop through it, or is that not necessary and I can loop through the file as it is?
When I loop through them, whichever way we do it, I will have to find the keyword, presumably with contains(String searchString), but how do I then retrieve just the sentence where the keyword is contained?
thanks

  1. If you have code that does what you want then re-use it. That's just "working smart".
    But in this case I think your existing code writes 1 word per line, whereas to find sentences you can simplify that to write one sentence per line.

  2. For the search read the whole file into an array or List, one sentence per entry - you already said it won't be very big. Then loop thru it, roughly like

    for (String sentence : allSentences) {
        if (sentence.contains(searchString)) {
            System.out.println(sentence); // or display it in the GUI
        }
    }

thanks

But in this case I think your existing code writes 1 word per line, whereas to find sentences you can simplify that to write one sentence per line.

It writes one sentence per line, or rather, whatever the input in one line, so it should be fine:

private class ProcessButtonHandling implements ActionListener{
        public void actionPerformed(ActionEvent event){
            stringInput = input.getText();//copy text from textArea to string
            scannerInput = new Scanner(stringInput);            
            try{    
                fileWriter = new BufferedWriter( new FileWriter(file,true));
            }           
            catch(SecurityException securityException){//if you don't have write access
                System.err.println("You don't have write access to this file");
                System.exit(1);
            }
            catch(FileNotFoundException fileNotFoundException){
                System.err.println("Error opening or creating the file");
                System.exit(1);
            }
            catch(IOException ioexception){
                System.err.println("General Error with IO");
                System.exit(1);
            }
            while(scannerInput.hasNext()){
                stringInput = scannerInput.nextLine();
                try{
                    fileWriter.append(stringInput + separator);
                }
                catch(IOException ioexception){
                    System.err.println("General Error with IO");
                    ioexception.printStackTrace();
                    System.exit(1);
                }
                //System.out.printf("Enter your sentence or end of file - ctrl+z or Enter+ctrl+d\n");
            }
            CloseFile();
            Clear();
        }//end of actionPerformed

    }//end of inner class

Ooops, yes you're right. So no need to change or re-write anything there.

A quick one sorry, what is the best way to add that functionality discussed without relying on the event generated by the button (so that I test it before adding the GUI part)?

read the whole file into an array or List, one sentence per entry - Then loop thru it

Without a button that calls the function, when would it be a good time to create this array and loop through it? I'm thinking to put the functionality in a function anyway, so that I can call it when I add the button

EDIT: Ah also one more thing. What if a keyword I use to search is in more than one sentence? I need to be able to return all the sentences that contain that keyword. I say that because I was thinking to have a function that saves the sentence in the array, search for the keyword and return the relevant sentence/s, so the function will return a string:
public String searchWord(){}

The search will be so fast that you can do it for every character the user types. Of course after only 1 char you will get very many hits, so you will want some kind of limit. Pseudo code maybe like this:

// during startup one-time load the file into allSentences

List<String> getMatchingSentences(String search) {
    create temp empty list
    for (String sentence : allSentences) {
        if sentence contains search {
            if  list is > 10 entries // arbitrary limit
                return  null or maybe an empty list // too many matches
            add sentence to list
        }
    }
    return list
}

so now in your event handler for any typing in the entry field you can just say

List<String> matches = getMatchingSentences(entryField.getText())
if matches != null  display the matches

it will look like magic when you run it - once you have typed a few letters the matches will start to display and get fewer with every letter you type. The matches will display as fast as you can type, with no detectable delay.
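In case it helps, here is one way that pseudo code could look as real Java. Treat it as a sketch: the class name SentenceSearch is invented, the list is passed in as a parameter so it can be tried without the GUI, I chose the "empty list" option rather than null for too many matches, and treating an empty search string as "no matches" is my own addition (every sentence contains the empty string).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SentenceSearch {

    static final int MAX_MATCHES = 10; // arbitrary limit, as in the pseudo code

    static List<String> getMatchingSentences(List<String> allSentences, String search) {
        List<String> matches = new ArrayList<>();
        if (search.isEmpty()) {
            return matches; // assumption: nothing typed yet means nothing to show
        }
        for (String sentence : allSentences) {
            if (sentence.contains(search)) {
                if (matches.size() >= MAX_MATCHES) {
                    return Collections.emptyList(); // too many matches
                }
                matches.add(sentence);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> allSentences = Arrays.asList(
                "Please bring the book",
                "I will bring it tomorrow",
                "Nothing here");
        // Simulate the user typing "bring" one character at a time.
        String typed = "";
        for (char c : "bring".toCharArray()) {
            typed += c;
            System.out.println(typed + " -> " + getMatchingSentences(allSentences, typed));
        }
    }
}
```

Because it is a plain method with no GUI code in it, you can test it from main first and hook it up to the text field later.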

Ah OK, I think this is even better, so I don't need to implement a button as it is real time (the results textArea could have an event handler so every time I type something the results are shown, even if this might be a bit more complicated to implement, but I can try).
With reading and writing on a file though, I think there could be a problem. In my current application I write to a file using a Scanner scannerInput, so I suppose now I need another variable of type Scanner, one that allows me to read from file, but I don't suppose I can do something like this:

private Scanner scannerOutput;//to output data from the file
...
public void fileInArray(){
        scannerOutput = new Scanner(new File(workingDir, fileName));        
    }

considering that I've already created a File object for that same file

fileName = "sample.txt";
file = new File(workingDir, fileName); 

in the constructor to deal with writing data to the file?
What I'm trying to say is, since a file sample.txt already exists I can't risk creating another one as it will overwrite the current one, presumably. So, when I read from file rather than creating a new one with scannerOutput = new Scanner(new File(workingDir, fileName));, how do I make sure that I open and use the old one?
I had a look on the net for "reading an existing java file" but they all insist on the above approach, so maybe by using that syntax you don't create a new file?

There's nothing wrong with
Scanner inputScanner = new Scanner(new File(workingDir, fileName));

Yes you're creating a new File object in Java's heap, but that just refers to the same existing real file when you use it for input only like that ... or you could re-use your existing File object to create the new Scanner. No problem.

On the other hand, this is 2016 so you should probably be learning the "new" (as in "new since Java 1.4 somewhere back in the dark ages") NIO classes. In particular the Files class (that's Files not File) has a readAllLines method that reads all the lines in a text file into a List<String> - just what you need!
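A small sketch of readAllLines in use. The class and method names are invented, and it writes its own temporary file so it can be run anywhere; in your application you would pass the path of your existing sample.txt instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class ReadAllLinesDemo {

    // One list entry per line of the file; opening and closing is handled for you.
    static List<String> loadSentences(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Temporary file standing in for the real data file.
        Path path = Files.createTempFile("sample", ".txt");
        Files.write(path,
                Arrays.asList("Sentence number 1", "Sentence number 2"),
                StandardCharsets.UTF_8);

        List<String> allSentences = loadSentences(path);
        System.out.println(allSentences.size() + " sentences: " + allSentences);

        Files.delete(path);
    }
}
```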

Cool thanks.

In particular the Files class (that's Files not File) has a readAllLines method that reads all the lines in a text file into a List<String> - just what you need!

Yes thanks, readAllLines is the one I was looking at myself after a bit of googling, as it seems the quickest and easiest way to copy everything from a file. And yes, the NIO classes are a very new thing for me, and somewhat confusing at the same time, because there are io classes and then nio classes which look in fact the same, but I'm sure they do different things (the NIO classes just use static members or something like that?)
I'm just going to code the

I think NIO is a mixed bag. It was created because of problems and limitations in the original file handling (eg support of symbolic links, OS-dependent attributes etc), but its design is weird in an OO language because it's all static methods just like we wrote in the 1970's. Oh well.

Sorry, didn't have much time to look at this. OK, I made some small changes: the idea was that before looping through all the sentences in the file, I'd copy them over onto an ArrayList, but I'm getting some errors. I used the readAllLines method and, since I haven't modified the GUI yet, I wanted to print the list's elements in the console.
I've added the various import statements (full code available here http://pastebin.com/w86c2ZVi ):

import java.nio.file.Files;//for readAllLines()
import java.nio.file.Paths;//used to get the path of the file
import java.util.ArrayList;//to work with the array list to save the sentences in an ArrayList 
import java.util.List;
import java.util.Collection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
...
private ArrayList< String > allSentences = new ArrayList< String >();//stores all the sentences in the file
private final static Charset ENCODING = StandardCharsets.UTF_8;
...
public SentenceRecorder(){
    allSentences = fileInArray();//copying content of file in arrayList
    printSentenceArray();
   }
public ArrayList<Object> fileInArray(){//save file text into array
        try{
            scannerOutput = new Scanner(new File(workingDir, fileName));
        }
        catch(FileNotFoundException fileNotFoundException){
            System.err.println("Error opening the file");
            System.exit(1);
        }
        Path path = Paths.get(fileName);
        return Files.readAllLines(path, ENCODING);

    }
    public void printSentenceArray(){
        System.out.println("Array contains: ");
        for(int count = 0; count < allSentences.size(); count++){
            System.out.printf("%s", allSentences.get(count));
        }

    }

but the compiler is complaining:

G:\JAVA\GUI\2015\createFrames\files\withSearch>javac *.java
SentenceRecorder.java:71: error: incompatible types
                allSentences = fileInArray();//copying content of file in arrayList
                                          ^
  required: ArrayList<String>
  found:    ArrayList<Object>
SentenceRecorder.java:151: error: cannot find symbol
                Path path = Paths.get(fileName);
                ^
  symbol:   class Path
  location: class SentenceRecorder
2 errors

G:\JAVA\GUI\2015\createFrames\files\withSearch>

I'm trying to copy the elements of the array list in an ArrayList variable by doing this allSentences = fileInArray();
And this, admittedly, I've found it somewhere (but unfortunately I must have closed the tab in my browser by mistake and I can't find the URL where I found this anymore):

Path path = Paths.get(fileName);
return Files.readAllLines(path, ENCODING);

The syntax seems a bit strange I have to say...

required: ArrayList<String> found: ArrayList<Object>

You declare your method as returning a list of <Object>, but the receiving variable is a list of <String>. You can't assign Objects to Strings.

readAllLines returns a List<String> so that's how you should declare your return type and your receiving variable.

Ah...mmm that was a silly mistake...sorry about that.
The other error though seems a bit more...er, how to put it, difficult.
I'm importing the nio class import java.nio.file.Paths; and that seems to be enough to cover both Path and Paths, although Paths appears to be an interface, is that a problem?
From memory a "cannot find symbol" error occurs because an identifier is used without being declared, but here I'm declaring a variable of type Path when I use it, inside the - now amended - fileInArray() method:

public ArrayList<String> fileInArray(){//save file text into array
        try{
            scannerOutput = new Scanner(new File(workingDir, fileName));
        }
        catch(FileNotFoundException fileNotFoundException){
            System.err.println("Error opening the file");
            System.exit(1);
        }
        Path path = Paths.get(fileName);
        return Files.readAllLines(path, ENCODING);
        ...

So, could it be that path needs to be declared as a private variable inside the constructor?

I'm importing the nio class import java.nio.file.Paths; and that seems to be enough to cover both Path and Paths, although Paths appears to be an interface, is that a problem?

Yes, it is. importing Paths does not import Path. Why would it?
You need to import them both, or just import java.nio.file.*;

Yes, it is. importing Paths does not import Path. Why would it?

Right, I don't know why I thought that. Is there a general rule then that we should follow when using import statements, like when is it better to import only one or more than one classes as opposed to import them all with xxx.*?

Like in the case discussed above, I amended it to be import java.nio.file.*;, but how about this instead:

import java.nio.file.Files;//for readAllLines()
import java.nio.file.Paths;
import java.nio.file.Path;

Also, slightly different thing:

readAllLines returns a List<String> to that's how you should declare your return type and your receiving variable.

I think I need to change a few more things as well then, like, I declared allSentences as
private ArrayList< String > allSentences = new ArrayList< String >();
but that is wrong then, it should be private List< String > allSentences = new ArrayList< String >();
I've always thought, quite clearly erroneously, that ArrayList and List were effectively, if not the same thing, at least compatible types, meaning I could use them interchangeably.
Same with the method signature, it needs to be
public List<String> fileInArray(){...}
as opposed to
public ArrayList<String> fileInArray(){...}
Let me try and see if it works

imports:
In general the best thing is to only import the classes you need. That avoids problems where, unknown to you, there are classes in different packages with the same name. Having said that, in a big gui most people will import javax.swing.* rather than every single Swing class they use.

List vs ArrayList

List is an interface, ArrayList is a class that implements List. You can (often do) declare a variable of type List because all you care about is that it has all the List methods. But if you want to create an instance you have to decide what class to use, eg ArrayList, LinkedList, Vector etc.
Now in the case of readAllLines the doc says it returns a List. Of course, in reality it will have to return a real instance of ArrayList, LinkedList or whatever, but you won't know which at compile time, and maybe it will change with the next point release of Java. So in your code you can only declare your method's return type as List, and you can only assign that to variables of type List.
(This is a common thing in the Java API - a method will be declared as returning a List, or a Collection, or some other interface or abstract class. The API implementation will then choose a suitable concrete class to create its variables, depending on trade-offs that you know nothing about.)
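A quick illustration of declaring against the interface. The class name and the countContaining helper are made up for this example; the point is that the same method accepts an ArrayList, a LinkedList, or whatever readAllLines happens to return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class ListInterfaceDemo {

    // Parameter is the interface type, so any List implementation is accepted.
    static int countContaining(List<String> sentences, String key) {
        int count = 0;
        for (String sentence : sentences) {
            if (sentence.contains(key)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The variable type is List; the concrete class is chosen only at creation.
        List<String> asArrayList = new ArrayList<>(
                Arrays.asList("Sentence number 1", "Another one"));
        List<String> asLinkedList = new LinkedList<>(asArrayList);

        System.out.println(countContaining(asArrayList, "Sentence"));
        System.out.println(countContaining(asLinkedList, "Sentence"));

        // The reverse does not compile, because a List is not necessarily an ArrayList:
        // ArrayList<String> wrong = asLinkedList;
    }
}
```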

OK cool, thanks.
Talking about return types, in here the compiler complained of an uncaught exception - I'm afraid I'm not terribly good with exceptions, meaning that I'm never sure when to use them, but in any case I added one as requested:

try{
            return Files.readAllLines(path, ENCODING);
        }
        catch(IOException ioexception){
            System.err.println("General Error with IO");
            ioexception.printStackTrace();
            System.exit(1);
        }

but it's now saying

G:\JAVA\GUI\2015\createFrames\files\withSearch>javac *.java
SentenceRecorder.java:160: error: missing return statement
        }
        ^
1 error

Now, readAllLines() returns a List<String> so both the signature and the returned data are compatible, that leaves me with only one explanation: is it wrong to include a return statement inside a try block?
Sorry, this is what the whole method looks like now:

public List<String> fileInArray(){//save file text into array
        try{
            scannerOutput = new Scanner(new File(workingDir, fileName));
        }
        catch(FileNotFoundException fileNotFoundException){
            System.err.println("Error opening the file");
            System.exit(1);
        }
        Path path = Paths.get(fileName);
        try{
            return Files.readAllLines(path, ENCODING);
        }
        catch(IOException ioexception){
            System.err.println("General Error with IO");
            ioexception.printStackTrace();
            System.exit(1);
        }
    }

Yes, that's a well-known case of the compiler being stupid.
On line 16 (the System.exit(1) in the second catch block) it just sees a call to a method in the System class, but it doesn't realise that that will terminate the program. So it thinks your code will drop through to lines 17 and 18 and reach the end of the method without returning a valid value.
To get round that you can just stick a return null; after your System.exit. That will keep the compiler happy.
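Here is the same situation boiled down to a tiny stand-alone case. It uses Integer.parseInt instead of file reading purely to keep it short, and the names are invented; the shape (return inside try, System.exit inside catch) is the same as in your method.

```java
class MissingReturnDemo {

    static int parseOrExit(String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            System.err.println("Not a number: " + text);
            System.exit(1);
            // Never reached at run time, but without this line javac reports
            // "missing return statement", because it treats System.exit
            // like any other method that returns normally.
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrExit("42"));
    }
}
```

So a return inside a try block is fine; the compiler just needs every path it believes possible to end in a return or a throw.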

ps What's that unused Scanner doing in there?

To get round that you can just stick a return null; after your System.exit. That will keep the compiler happy.

OK done.

ps What's that unused Scanner doing in there?

The idea was that, as I used a scannerInput to write to file, I'd use a scannerOutput to read from file, but it hasn't been necessary so far, so I'll comment that out for now; it might be useful later, not sure. To be completely honest with you, I don't understand how we've opened the file and copied the info over to the List. I thought that this Path path = Paths.get(fileName); was just retrieving the file name and this return Files.readAllLines(path, ENCODING); merely reading through the file, so evidently this line also opens the file magically, or I simply missed it altogether, which, you know, might be entirely possible since the program is now throwing an awful lot of runtime errors at me:

G:\JAVA\GUI\2015\createFrames\files\withSearch>java SentenceRecorderTest
General Error with IO
java.nio.file.NoSuchFileException: sample.txt
        at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsFileSystemProvider.newByteChannel(Unknown Source)
        at java.nio.file.Files.newByteChannel(Unknown Source)
        at java.nio.file.Files.newByteChannel(Unknown Source)
        at java.nio.file.spi.FileSystemProvider.newInputStream(Unknown Source)
        at java.nio.file.Files.newInputStream(Unknown Source)
        at java.nio.file.Files.newBufferedReader(Unknown Source)
        at java.nio.file.Files.readAllLines(Unknown Source)
        at SentenceRecorder.fileInArray(SentenceRecorder.java:153)
        at SentenceRecorder.<init>(SentenceRecorder.java:71)
        at SentenceRecorderTest.main(SentenceRecorderTest.java:5)

If I put back the scannerOutput line with the try and catch block I only get a "can't open the file" error instead.
I presume that one way or another we have to open the file before copying the content over to a List?
Here is the full code http://pastebin.com/DRPQSbuB (put it in pastebin so I don't pollute the post too much)

readAllLines does indeed magically find the file, open it, read all its lines into a List, and close the file again. You just have to create the right Path object and call readAllLines passing that Path.

You get FileNotFound or NoSuchFileException because Java can't find your file. Either it doesn't exist or (more probably) it's looking in the wrong folder.

Try printing your path.toAbsolutePath() which will show you exactly where it is looking
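Something along these lines, as a sketch (the class and method names are made up). A relative name like "sample.txt" is resolved against the directory you started java from, which the user.dir system property reports.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class WhereIsMyFile {

    // Turns a relative file name into the full path Java will actually use.
    static Path resolve(String fileName) {
        return Paths.get(fileName).toAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println("Looking for: " + resolve("sample.txt"));
        System.out.println("Working directory: " + System.getProperty("user.dir"));
    }
}
```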

readAllLines does indeed magically find the file, open it, read all its lines into a List, and closes the file again.

Ahah, I love that!

You get FileNotFound or NoSuchFileException because Java can't find your file. Either it doesn't exist

Oh, that's it, the file doesn't exist! When I created the new folder for the "with search" version of the project, I copied only the java files and left out the generated text file, because it would be created when I ran the application. Well, that was the case when I ran the old application without the search. In the new version, with the search, the application tries to access the file even if it hasn't been created yet. That might be a problem then, because it means that the first time you run the application, when the file still doesn't exist, if you press the search button (which I haven't added yet), it will try to find something in a file that doesn't exist. Now, as I haven't created the GUI yet, I think it's OK for now; in fact I've copied the old text file into the new folder and the application works (it prints the content of the file in the console as I wanted), but I have to bear this in mind when the search button is in place.
One thing though: there is no line separator in the console output, everything runs together, and the readAllLines method doesn't seem to be able to take an extra parameter for a separator - which is the way I did it with the append method instead:
fileWriter.append(stringInput + separator);
The API, though, says that the readAllLines method does recognise line terminators, but it doesn't say how to use them (I could reuse my separator here), and since it's not an iteration I can't inject the separator anywhere. I'm just thinking: when I eventually loop through the sentences to find the search term, it will be the separator that allows me to recognise a portion of text as a sentence and therefore return it, as opposed to returning the whole chunk of text, won't it? The separator will act like a kind of delimiter, in a sense.

readAllLines should create a List with one entry for each line in the input file. It recognises all the normal ways of separating lines for all normal operating systems. You can check that by printing the list's contents.
That means that if you have written your sentences to the file with sensible separators then readAllLines will give you each sentence in its own entry in the List - no need for you to worry about separators.

It would be sensible to deal properly with the file not being created yet - maybe catch the file not found and return an empty List<>, or (better) use Files.isReadable to check that things are OK before trying to read the file, thus avoiding the exception.

Uhm, no when I print the list in the console there is no space:

G:\JAVA\GUI\2015\createFrames\files\withSearch>java SentenceRecorderTest
Array contains:
jojosentence 1This is another sentenceAnd another oneAnd more

That means that if you have written your sentences to the file with sensible separators then readAllLines will give you each sentence in its own entry in the List

Well I have; if you remember I'm getting the system separator, but for some reason it doesn't seem to be working in the console. I don't know whether it will make any difference when I print it inside a text field, maybe it will. Let's leave this for now and see what happens when the file text is displayed in the text field.

use Files.isReadable to check that things are OK before trying to read the file. thus avoiding the exception.

Sounds good. Now I can code it in such a way that if Files.isReadable returns false it will print a message in the console; later I can display that message inside the textField.
I'll also try to implement the search now: pass a keyword and, if found in the text, return the sentence. Will see how that goes.

I would not try to move on until I was sure the data is what it should be.
I don't know how to interpret that output since I can't see the code that created it. Just print the list's contents without trying to interpret or format it yourself, as in
System.out.println(allSentences);

eh, darn formatting! You're absolutely right, it's printing OK now :-), with System.out.println(allSentences.get(count));
I was also reflecting on something.
Basically, I've added the isReadable() method, but the compiler still wants me to handle the exception, as it came back with an error:

G:\JAVA\GUI\2015\createFrames\files\withSearch>javac *.java
SentenceRecorder.java:163: error: unreported exception IOException; must be caught or declared to be thrown
                        return Files.readAllLines(path, ENCODING);
                                                 ^
1 error

so, the code is quite clunky now:

public List<String> fileInArray(){//save file text into array            
        Path path = Paths.get(fileName);            
        if(Files.isReadable(path)){//if the file exists etc
            try{
                return Files.readAllLines(path, ENCODING);
                //searchWord("sentence");
            }
            catch(IOException ioexception){
                System.err.println("General Error with IO");
                ioexception.printStackTrace();
                System.exit(1);
            }

            return null;            
        }
        else{
            System.out.println("The file is empty. You need to save something in it first.");
            return null;
        }           
    }
    public void printSentenceArray(){
        System.out.println("Array contains: ");
        for(int count = 0; count < allSentences.size(); count++){
            //System.out.printf("%s", allSentences.get(count));
            System.out.println(allSentences.get(count));
        }           
    }

It seems overkill to have an if statement with a try/catch block and then an else statement

if(Files.isReadable(path)){
    try{...}
    catch{...}
}
else{...}

And moreover, I need to think how I'm gonna fit the keyword functionality in between. I was thinking to call a searchWord() method from somewhere there, but the structure is making it rather difficult, maybe I should call it inside the try block before return Files.readAllLines(path, ENCODING);

Like the doc says - isReadable gives you the right answer at the instant you call it, but there's no guarantee the file will still be there and readable when you (later) try to read it, so you still need to be prepared to handle the exception. However, in that case something weird has gone wrong. So it makes sense to do both:

if (NOT file is readable) - it just hasn't been written yet - you can deal with this and carry on...
but then...
you try to read it and get an exception - that's weird. No way to deal with it, just diagnose and exit.

Personally I would test for NOT readable because the code is cleaner

if (not readable) {
    show message
    return null (or empty List)
}
// file looks OK, carry on...
try {
    readAllLines
    ...
} catch {
   wtf?  printstacktrace & exit
}

searching does not belong anywhere in that code. You just need to load allSentences once, when you initialise the program.
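Putting those pieces together, a sketch of how it could be laid out. SentenceStore and its method names are invented for the illustration, and I picked the "empty List" option over null so callers never have to check for null. Loading happens once in the constructor; searching is a separate method that only touches the list in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class SentenceStore {

    private final List<String> allSentences;

    // Load once, when the program initialises.
    SentenceStore(Path path) {
        allSentences = load(path);
    }

    static List<String> load(Path path) {
        if (!Files.isReadable(path)) {
            // Normal on a first run: nothing has been saved yet.
            System.out.println("No readable file yet at " + path.toAbsolutePath());
            return new ArrayList<>();
        }
        // File looks OK, carry on...
        try {
            return Files.readAllLines(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Readable a moment ago but the read failed: diagnose and exit.
            e.printStackTrace();
            System.exit(1);
            return new ArrayList<>(); // unreachable at run time; keeps the compiler happy
        }
    }

    // Searching is separate from loading and uses only the in-memory list.
    List<String> search(String key) {
        List<String> matches = new ArrayList<>();
        for (String sentence : allSentences) {
            if (sentence.contains(key)) {
                matches.add(sentence);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        SentenceStore store = new SentenceStore(Paths.get("sample.txt"));
        System.out.println(store.search("sentence"));
    }
}
```

If sample.txt isn't there yet, this just prints the message and an empty result instead of crashing.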
