I have this code, which finds the duplicate rows. I want to check whether the top row has a weight greater than a given minimum support. If so, generate its subsets.

If row 1, i.e. 11110011, has weight > min_sup, then generate its subsets; if not, jump to the next row.

Here 1 represents the presence of an item and 0 represents its absence, so row one has items 1,2,3,4,9,12, and the subsets I need are {1,2,3}, {1,3,4}, {1,4,9}, {1,9,12}, {1,2}, {1,3}, etc.

Can anyone help?

``````
package javaapplication42;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Smalldataset {

    public static void main(String[] args) {

        String[] line = new String[55];
        int n, c, d, swap, i = 0, j = 0, min_suport, no_of_selected_item = 0;
        boolean[] skiplist = new boolean[12]; // true = column meets the minimum support
        int[] length = new int[55];
        int[] sum = new int[12];
        int[][] m = new int[55][12];
        int no_of_row = 55, no_of_col = 12;
        // Assumption: the post never shows how br is opened; here the input
        // file path is taken from the first command-line argument.
        BufferedReader br = null;
        try {
            br = new BufferedReader(new FileReader(args[0]));

            min_suport = (60 * no_of_row) / 100;

            String sCurrentLine;

            // Stop at no_of_row lines so line[] cannot overflow
            while (i < no_of_row && (sCurrentLine = br.readLine()) != null) {
                System.out.println(sCurrentLine);
                line[i] = sCurrentLine;
                i++;
            }

            // Parse each '0'/'1' character into the matrix
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 12; j++) {
                    m[i][j] = Integer.parseInt(String.valueOf(line[i].charAt(j)));
                }
            }

            // Sum of columns
            for (j = 0; j < 12; j++) {
                sum[j] = 0;
                for (i = 0; i < 55; i++) {
                    sum[j] = sum[j] + m[i][j];
                }
                if (sum[j] < min_suport)
                    skiplist[j] = false;
                else {
                    skiplist[j] = true;
                    no_of_selected_item++;
                }
            }
            for (i = 0; i < 12; i++) {
                System.out.print(" " + sum[i]);
            }
            int matrix2[][] = new int[56][no_of_selected_item + 1];
            System.out.println();
            System.out.println("First Frequent Item Set");
            j = 0;
            for (i = 0; i < 12; i++) {
                if (skiplist[i]) {
                    matrix2[0][j] = i + 1; // row 0 holds the item (column) numbers
                    j++;
                    System.out.print(" " + sum[i]);
                }
            }

            System.out.println();
            System.out.println("Total no of frequent itemset :" + no_of_selected_item);

            // Copy only the frequent columns. matrix2[0][j] is the 1-based source
            // column; the original indexed m and skiplist with the compacted j.
            for (j = 0; j < no_of_selected_item; j++) {
                int srcCol = matrix2[0][j] - 1;
                for (i = 1; i < 56; i++) {
                    matrix2[i][j] = m[i - 1][srcCol];
                }
            }
            System.out.println("New Value of array");
            for (i = 0; i < no_of_selected_item; i++)
                System.out.print("  " + matrix2[0][i]);
            System.out.println();
            System.out.println("----------------------------------");
            for (i = 1; i < 56; i++) {
                length[i - 1] = 0;
                for (j = 0; j < no_of_selected_item; j++) {
                    System.out.print("  " + matrix2[i][j]);
                    // length of row (number of 1's)
                    if (matrix2[i][j] == 1)
                        length[i - 1]++;
                }
                matrix2[i][no_of_selected_item] = length[i - 1];
                System.out.println("  " + length[i - 1]);
            }

            // Sorting: bubble sort rows 1..55 by length
            n = 55;
            for (c = 0; c < (n - 1); c++) {
                for (d = 0; d < n - c - 1; d++) {
                    // For descending order use <
                    if (matrix2[d + 1][no_of_selected_item] < matrix2[d + 2][no_of_selected_item]) {
                        // Swap the whole rows, including the length column
                        for (i = 0; i <= no_of_selected_item; i++) {
                            swap = matrix2[d + 1][i];
                            matrix2[d + 1][i] = matrix2[d + 2][i];
                            matrix2[d + 2][i] = swap;
                        }
                    }
                }
            }
            int unique[][] = new int[56][no_of_selected_item + 2];
            // Sorted array display and creating new array
            for (i = 0; i < 56; i++) {
                for (j = 0; j <= no_of_selected_item; j++) {
                    System.out.print(matrix2[i][j]);
                    unique[i][j] = matrix2[i][j];
                }
                unique[i][no_of_selected_item + 1] = 1; // repeat count starts at 1
                System.out.println("  " + matrix2[i][no_of_selected_item]);
            }

            int no_trans = 56;
            // Removing redundancy: fold each duplicate row into its first occurrence
            for (i = 1; i < 56; i++) {
                if (unique[i][no_of_selected_item + 1] != 0) {
                    for (j = i + 1; j < 56; j++) {
                        if (matrix2[i][no_of_selected_item] == matrix2[j][no_of_selected_item]
                                && unique[j][no_of_selected_item + 1] != 0) {
                            for (int k = 0; k < no_of_selected_item; k++) {
                                if (matrix2[i][k] == matrix2[j][k]) {
                                    if (k == no_of_selected_item - 1) {
                                        unique[i][no_of_selected_item + 1]++;
                                        unique[j][no_of_selected_item + 1] = 0;
                                    }
                                } else
                                    break;
                            }
                        }
                    }
                }
            }
            System.out.println("ok");
            for (i = 0; i < no_trans; i++) {
                if (unique[i][no_of_selected_item + 1] != 0) {
                    for (j = 0; j < no_of_selected_item; j++) {
                        System.out.print(unique[i][j]);
                    }
                    // Here j == no_of_selected_item: row length, then repeat count
                    System.out.println("  " + unique[i][j] + "  " + unique[i][j + 1]);
                }
            }
        } catch (IOException e) {
            System.out.println("Error :" + e);
        } finally {
            try {
                if (br != null) br.close();
            } catch (IOException ex) {
                System.out.println("Error :" + ex);
            }
        }
    }
}
``````

I'm confused by your explanation. How do you go from 11110011 to 1,2,3,4,9,12? How is "minimum support" defined? What kind of comparison do you want to do? Integer? Can you try to re-explain it? Also, what problems are you having?

``````
Actually, the code is for finding frequent items. Here 1 represents the presence of an item and 0 represents its absence. 1, 2, 3, ... are the names of the items (the column names).

items->  1 2 3 4 9 12
t1       1 1 1 1 0 1
t2       1 1 1 1 1 1

no_of_times_rows_repeated
t1   35
t2   20

This means transaction t1 has items 1,2,3,4,12 and t2 consists of items 1,2,3,4,9,12.

Say, for example, minimum_support=30.

I want to check for each row:

if(no_of_times_rows_repeated>minimum_support)
generate subsets of the items which have a 1 under them

Example: if(no_of_times_rows_repeated[t1]>minimum_support) //true
then generate subsets of 1,2,3,4,12:
1,2,3
1,3,4
1,4,12
2,3,4 etc.
``````
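One possible way to do the check described above, as a minimal sketch rather than a drop-in fix: collect the names of the items that have a 1 in the row, then let every bitmask from 1 to 2^n - 1 pick out one non-empty subset. The class name `SubsetSketch`, the method `subsetsOfRow`, and the arrays `itemNames`, `rows`, and `repeatCount` are hypothetical names made up for this illustration; the values are the t1/t2 example from the post, and the sketch assumes a row never has more than about 30 selected items (the posted code has at most 12 columns).

```java
import java.util.ArrayList;
import java.util.List;

class SubsetSketch {

    /** Returns every non-empty subset of the items whose flag in row is 1. */
    static List<List<Integer>> subsetsOfRow(int[] itemNames, int[] row) {
        // Collect the names of the items that are present in this row
        List<Integer> present = new ArrayList<>();
        for (int k = 0; k < row.length; k++) {
            if (row[k] == 1) {
                present.add(itemNames[k]);
            }
        }
        List<List<Integer>> result = new ArrayList<>();
        int n = present.size();
        // Each mask from 1 to 2^n - 1 selects exactly one non-empty subset
        for (int mask = 1; mask < (1 << n); mask++) {
            List<Integer> subset = new ArrayList<>();
            for (int bit = 0; bit < n; bit++) {
                if ((mask & (1 << bit)) != 0) {
                    subset.add(present.get(bit));
                }
            }
            result.add(subset);
        }
        return result;
    }

    public static void main(String[] args) {
        // Example data taken from the post (hypothetical variable names)
        int[] itemNames = {1, 2, 3, 4, 9, 12};
        int[][] rows = {
            {1, 1, 1, 1, 0, 1},   // t1
            {1, 1, 1, 1, 1, 1}    // t2
        };
        int[] repeatCount = {35, 20};
        int minimumSupport = 30;

        for (int r = 0; r < rows.length; r++) {
            // Only rows repeated more often than the minimum support are expanded
            if (repeatCount[r] > minimumSupport) {
                System.out.println("t" + (r + 1) + ": " + subsetsOfRow(itemNames, rows[r]));
            }
        }
    }
}
```

With the example numbers, only t1 passes the check (35 > 30), so only the subsets of 1,2,3,4,12 are printed, single items included. In the posted program the same idea could presumably be applied to each row of `unique`, using the item numbers in row 0 as the names and the last column as the repeat count.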