I have it coded to where a small list can have dupes removed rather efficiently. But, the larger the list gets, the longer it takes to the point that it is better to not even try. A list of 15,000 lines would takes hours and hours... probably longer to remove a lot of duplicates.

Here is what I have done:

procedure TMainForm.RemoveDuplicatesButtonClick(Sender: TObject);
var
  i, j: integer;
  label
    Recheck;
  begin
    Recheck:
    for i := 0 to ListView1.Items.Count - 1 do
      for j := 0 to ListView1.Items.Count - 1 do
        begin
        if i = j then continue;
        if ListView1.Items[i].Caption = ListView1.Items[j].Caption then
          begin
          ListView1.Items.Delete(j);
            goto Recheck;
          end;
          ListView1.Columns[0].Caption := inttostr(ListView1.items.Count);
          end;
          CurrentStatusLabel.caption := 'Removed Duplicates';
    end;

While this does work, like I said, it gets too slow for large lists. Is there a way to somehow load the list into memory to do this procedure? I know an array would be much faster, nut I'm not that far yet. This is in Tlistview of course Duoas as you well know by now.

Also, When I try to give a correct count of duplicates removed. It is not correct so I changed the label caption to just "Removed Duplicates".

Thanks as always for any help. :)

Recommended Answers

All 14 Replies

The TListView is going to be really slow no matter what. However, there are a couple of things you can do to help.

In your loop on line 9, you are starting at zero again. You don't need to check things already known not to be duplicates. Say instead: for j := i+1 to ListView1.Items.Count-1 do In the same vein, when you goto recheck you are starting all over. Since the items are ordered, you might want to work backwards: for j := ListView1.Items.Count-1 downto i+1 do When a duplicate is found and removed, you just continue on your merry way and ignore all the stuff you have already checked (and possibly removed).

You might want to consider using the ListView1.FindCaption method instead of the doing the inner loop yourself. That way you can just loop until FindCaption returns nil. I think that would probably add a time savings since the class can usually access its members faster than you can using the classes property accessors.

You can count how many you removed by incrementing a counter whenever you actually remove a duplicate and at no time else. Set the counter to zero before you do anything else. I'm sure you had something very close to this when you tried before.

Good luck.

I added this to see what was happening when I when backwards in the large list.

CurrentStatusLabel.caption := 'Removed' + inttostr(j) + 'Duplicates';
 Application.ProcessMessages;

It counts backwards and goes all the way to down to zero still. Should I have done more than just change that line? I looked at the syntax for ListView1.FindCaption and I couldn't get that to compile. I will read up on that.

This is the now changed code:

procedure TMainForm.RemoveDuplicatesButtonClick(Sender: TObject); 
var
  i, j: integer;
  label
    Recheck;
  begin
     Recheck:
    for i := 0 to ListView1.Items.Count - 1 do
       for j := ListView1.Items.Count-1 downto i+1 do
        begin
        if i = j then continue;
        if ListView1.Items[i].Caption = ListView1.Items[j].Caption then
          begin
          ListView1.Items.Delete(j);
            goto Recheck;
          end;
          ListView1.Columns[0].Caption := inttostr(ListView1.items.Count);
          CurrentStatusLabel.caption := 'Removed' + inttostr(j) + 'Duplicates';
          Application.ProcessMessages;
        end;
        //CurrentStatusLabel.caption := 'Removed Duplicates';   // this doesn't give the correct number
    end;

and my count is still off... I know it is because I havent done the count properly. I think...

Thanks for your help as always.

Sorry, I should have made myself more clear.

Try this:

procedure TMainForm.RemoveDuplicatesButtonClick(Sender: TObject); 
  var
    i, j, num_removed: integer;
  begin
    num_removed := 0;
    for i := 0 to ListView1.Items.Count - 1 do
      for j := ListView1.Items.Count-1 downto i+1 do
        begin
        if i = j then continue;  // <-- this line shouldn't be necessary
        if ListView1.Items[i].Caption = ListView1.Items[j].Caption then
          begin
          ListView1.Items.Delete(j);
          inc( num_removed )
          end;
        Application.ProcessMessages;
        end;
    ListView1.Columns[0].Caption := inttostr(ListView1.items.Count);
    CurrentStatusLabel.caption := 'Removed ' + inttostr(num_removed) + ' Duplicates';
  end;

I didn't actually test this code --it's late and I'm going to bed now... Also, with the revised j loop I don't think you need to test for i = j, it should never happen. (Alas, my brain is shot right now...)

Good luck.

commented: Always an enormous help to me and has never given up in his efforts to help me. No amount of thanks descirbes my appreciation for this individual! +1

Hello Sqidd,

May I suggest that you get hold of Danny Thorpe's book on Delphi programming? The last time I checked it was out of print but available second hand on Amazon at over $100. A lot of money but money very well spent.

There are a number of things wrong with your code.

  • For starters you actually use a GOTO statement. In over a decade as a Delphi programmer I have never used a GOTO. Quite apart from structural issues, GOTOs generate very poor quality assembler code. NEVER use them.
  • Secondly, what you are doing is little short of a selection sort. With 15000 items it will be relatively slow.
  • You wouldn't normally notice the lack of speed. The problem is that with what you are doing you are deleting listview items one at a time. Each such deletion triggers multiple messages that culminate with a control redraw. Messages are slow. Redraws are slooow and get sloooooooower as the number of items in your listview increase.

As a general obserrvation when you find that you have issues with speed it is better to take a step back and examine your assumptions rather than spend time trying to make it all go faster. In the present instance the two questions that spring to mind right away are

  1. Why not prevent the listview from being populated with duplicates to begin with by checking before adding new items?
  2. What on earth is the point of a listview that contains 1000s of items? Surely no one is ever going to look at all those items!

That said, here is what I suggest.

  • Don't attempt to cleanup the listview directly - and for future reference always try to avoid cleaning up the content of visual controls by manipulating them directly. It is far better to do this offscreen and then update the contents of the control
  • First put all your listview items into a sorted stringlist that is setup to ignore duplicates.
  • Then clear the listview and populate it back from the stringlist.

There is a little more to doing this. The core code is shown below

procedure TMaster.DoCleanUp(Sender:TObject);
var i,j,k,ACount:Integer;
    AStrings:TStringList;
    AList:TListItems;
begin
  AList:=lvOne.Items;
  {get a pointer to the listview items. This is faster than attempting
   to dereference using lvOne.Items each time}
  ACount:=AList.Count;//how many items in the list
  AStrings:=TStringList.Create;//create a temporary stringlist
  with AStrings do
  try
    Sorted:=True;
    Duplicates:=dupIgnore;
    //make it a sorted stringlist and ignore duplicates
    for i:=0 to ACount - 1 do AddObject(AList[i].Caption,TObject(i));
    {Add every item in the listview to the stringlist. dupIgnore +
     Sorted ensures that duplicate entries are ignored. We want to
     add entries back to the listview in their original order which
     will not be preserved in the stringlist since it is sorted. So
     we store the original indices in the Objects array of the
     stringlist. Note the TObject(i) typecast.}

    AList.Clear;
    {We have a safe copy of all the listview entries we want to retain so
     why keep the listview entries? Just clear the listview. This issues
     just one LVM_DELETEALLITEMS message as opposed to multiple
     LVM_DELETEITEM messages}
    j:=0;
    {To repopulate the listview we want to start finding its original
     entries starting from the first one, at index 0}
    for i:=0 to ACount - 1 do
    begin
      {The stringlist contains fewer entries than in the original listview
       but nevertheless we have to run this loop as many times as there
       were members in the listview since we need to identify them by
       their original position with the help of the stringlist.Objects
       array}
      k:=IndexOfObject(TObject(j));
      //where is the j'th entry in the listview?
      inc(j);//know that now so next time we want the (j + 1)th entry
      if (k >= 0) then AList.Add.Caption:=Strings[k];
      {The j'th entry could have been a duplicate & may no longer
      exist. Only if it does - add it back to the listview}
    end;
  finally Free end;//don't forget to destroy the stringlist!
  ShowMessage(IntToStr(AList.Count));
end;

The test project I created for this is in the ZIP attachment to this message. In my tests, on a not especially fast computer, the cleanup of a 10000 item listview took just a few seconds. To me this still isn't the ideal solution. I would try to recast the problem so the cleanup isn't necessary in the first place.

Hope this has been useful

commented: above my level in programming but noneless a great effort and explanation with code to review as well. Nice job i must say +1

Yes, that is a much faster way of removing duplicates in an already slow listview procedure. As you mentioned, Line 9 was completely unnecessary and has been since removed.

Thank you for your help. :)

Hello ExplainThat,

I will also take a look at what you have written as well. I am VERY new to programming. So with that said. I may not be able to implement this procedure. But I will try. Thank you for you suggestion.

I just now saw the zip file enclosed in your post ExplainThat, I will take a look at it as I was unable to implement your suggestion. Thank you for your help. I am also trying to disable the RemoveDuplicatesButton if more than 1,000 lines are loaded into ListView. The only thing it does is disable at this point but still performs the procedure. lol

I will get this one I think. Cant be too hard. Or can it?

Thanks again. :)

EDIT -

here is something weird that I cant seem to get right... I have made it to where if Listview has more than 1,000 lines, the button that performs the procedure is still on... once clicked, it greys out and the removal of duplicates does not occur.. I am trying to figure out why it isnt greyed out after 1,000 lines have loaded. I thought I had it right.

It does do what It is supposed to do by not allowing the procedure to continue, but the button has to be clicked for it to grey out. This is annoying and would possibly confuse the user. Why isnt this right?

procedure TMainForm.RemoveDuplicatesButtonClick(Sender: TObject);
  var
    i, j, num_removed: integer;
    begin
    if ListView1.items.Count > 1000 then  //{after 1,000 lines I thought that the next line would gray out the button.}
    RemoveDuplicatesButton.Enabled := False  //{this line prevents the procedure from continuing but wont gray out until the button is clicked.}
    else
    begin
    num_removed := 0;
    begin
    for i := 0 to ListView1.Items.Count - 1 do
      for j := ListView1.Items.Count-1 downto i+1 do
        begin
        if ListView1.Items[i].Caption = ListView1.Items[j].Caption then
          begin
          ListView1.Items.Delete(j);
          inc( num_removed )
          end;
        Application.ProcessMessages;
        end;
    ListView1.Columns[0].Caption := inttostr(ListView1.items.Count);
    CurrentStatusLabel.caption := 'Removed ' + inttostr(num_removed) + ' Duplicates';
  end;
    end;
    end;

i am sure is is something incredibly stupid on my part as usual... :(

Another thing that i havent been able to find anywhere is how to add an icon other than the default icon after the code has been built. I can add an icon in the form header, but not in the built *.exe...

Any ideas on where to look to read up on that? Im sure it is rather simple... So simple actually that noone has written about it, or that I have seen.

The ButtonClick event procedures only get called if the user clicks the button. So nothing will grey if the 'set disabled' code is in the button click procedure. The place to grey the button is in the load procedure: after loading the file, check to see how many items there are. If more than 10,000, then disable the appropriate buttons...

In Windows applications, things only occur when the user clicks something, or types something, or etc. The only exceptions are system events (which you don't need to worry about) and timer events (if you drop a TTimer on your form).

To set the application icon, go to project options --> Application. There should be a spot to set the program icon and the program title.

The answers to your two questions

  • The right place to get the button to disable would be the TListView.OnInsert event. All you need is something like this: btnCleanUp.Enabled:=(lvOne.Items.Count < 999)
  • I think what you want is to change the icon that appears for your application in Windows Explorer? You can do this from the Application tab in the dialog box that pops up when you click on Project.Options.

Before you spend too much time trying to write more code, might I suggest that you do some or all of the following?

  • Familiarize yourself with Delphi stock objects such as TList, TStringList, TFileStream, TMemoryStream etc. You are likely to spend too much time reinventing the wheel if you don't use these objects. They are all in the Classes unit.
  • Make sure you understand how the with keyword works. Also play around with using try..finally..end and try..except..end blocks. As a general rule try..finallys should outnumber try..excepts by a proportion of 10:1 or more.
  • When you use any object - visual or otherwise - make sure you explore the methods, properties and events for that object.
  • Finally, like I suggested earlier, get the Danny Thorpe book. It is the best Delphi programming investment you could ever make.

Good luck!

Squidd it ocurs to me that even if you want to pursue your current solution there is at least one thing you can do to improve it even further -

I note that you are incrementing num_removed each time you remove an item from your listview. This is totally unnecessary. It would be far better to note the number of items in the listview before you start the clean up and then check it again after the cleanup. Something like this

var befLoop:Integer;
befLoop:=lvOne.Items.Count;
//Your cleanup code goes here.
ShowMessage(Format('%d items removed',[befLoop - lvOne.Items.Count]);

b.t.w. given that you are new to this game might I suggest that you get into the habit of naming your controls in a consistent and meaningful way? ListView1 is not a very descriptive name is it? The control serves a purpose in your app so give it a name that reflects that purpose. Also, adopt a naming convention. I suggest a Hugarian notation

buttons - btn...
listviews - lv...

etc.

When you are working on complex forms, such a convention will help you be more productive and efficient.

thank you for your advice.. and yes I agree that proper formatting and a consistent naming of controls and items is the best way to go.

I am looking into everything you mentioned and then some.

Thank you for all of your help.

I have the Icon I want for the exe now. Thanks for the help on that. :)

I also, have the button to disable and grey out ofter the line count is over 1,500. I simply did this @ lines 16 and 17 on load:

procedure TMainForm.LoadListButtonClick(Sender: TObject);  // load/open file
  var
    txt : TextFile;
    Buffer : String;
      begin
        if openDialog1.Execute then
        begin
        AssignFile(txt, openDialog1.FileName);
        Reset(txt);
        while NOT EOF(txt) do
          begin
            ReadLn(txt, Buffer);
            ListView1.Items.Add.Caption := trim(Buffer);
            ListView1.Columns[0].Caption := inttostr(ListView1.items.Count);
            CurrentStatusLabel.Caption := 'Loading File...';
              if ListView1.items.Count >= 1500 then
              RemoveDuplicatesButton.Enabled := False;
            application.ProcessMessages;
          end;
            CurrentStatusLabel.Caption := 'File Loaded Succesfully';
            CloseFile(txt);
      end;
  end;

Works perfectly... I dont think I am going to be able to implement your code ExplainThat... It is just above my level if understanding i believe. But I will continue to read and try to use that later. Thanks for the effort nonetheless, it was very interesting to study. I still cant figure out how you are actually creating that list... lol But again, it is over my head at this point.

I wonder is joining smaller lists into one large list is as much of a hassle as splitting the lists were. I would like to add that feature and alphabatize a list as well.

PLus I am getting errors when no number or a letter is added to split files. I am trying to fix that now too... not much left on this program.. i am pretty darn proud of it. :D

Getting the list to alphabetize wasnt hard at all... one line of code solved that.

Getting the list to numerically sort is a different story however... It goes like so:

1
10
11
12
2
3

and so on...

Anyway thanks for all the help. :)

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.