Hello,

I am designing for work a testing environment which uses run-time randomly generated data to test our company's product.

The problem is as follows: I have an inequality of the sort of x(1+y(1+z(1+p+q))) <= MAX, when x,y,z,p,q,MAX are integers, MAX is given as a method parameter and the method needs to find x,y,z,p,q to suffice the inequality. The problem is that I need the solution to be in a uniform probability space, meaning that each solution of the x,y,z,p,q has the same probability to come up when providing the same MAX many times.

If possible, it would be great if you can think of a polynomial solution, if not possible O(n^2) will suffice as well. The algorithm needs to be run with high MAX values and big number of iterations on each test, so good complexity is vital.

Can you think of a way to produce uniform probability space for every type of linear equation?

I would appreciate any guidance on this matter, I am stuck on this algorithm for some time now.

Thank you!

Recommended Answers

All 23 Replies

I'm not sure that I understand what you are after. The equation can have many solutions. Do you want to calculate the definition space and randomly select variables x,y... inside that space? What language are you using? It should be easy to find an algorithm that produces random uniform variables.

I'm not sure that I understand what you are after. The equation can have many solutions. Do you want to calculate the definition space and randomly select variables x,y... inside that space? What language are you using? It should be easy to find an algorithm that produces random uniform variables.

You have a large solution space for the inequality, and I want one solution from all the possible solution to be returned - and I want that solution to be randomly picked with a uniform probability. Meaning that if I have for a given MAX n solutions, I want that each solution will have 1/n chance of being returned.

I'm not sure that I understand what you are after. The equation can have many solutions. Do you want to calculate the definition space and randomly select variables x,y... inside that space? What language are you using? It should be easy to find an algorithm that produces random uniform variables.

Missed your second question - I am using Java.

OK i get it now. It should be easy to find the solution space for each variable. Then you can just use the built in uniform random number generator in java to pick one solution. Check the java api for java.math and java.math.BigInteger. Theres even a constructor for uniform random number generation for BigInteger.

BigInteger(int numBits, Random rnd)

Constructs a randomly generated BigInteger, uniformly distributed over the range 0 to (2numBits - 1), inclusive.

OK i get it now. It should be easy to find the solution space for each variable. Then you can just use the built in uniform random number generator in java to pick one solution. Check the java api for java.math and java.math.BigInteger. Theres even a constructor for uniform random number generation for BigInteger.

BigInteger(int numBits, Random rnd)

Constructs a randomly generated BigInteger, uniformly distributed over the range 0 to (2numBits - 1), inclusive.

Since I can't see it, I'll be happy if you could elaborate how to find the solution space.

I think I wasn't clear on the parameters though. x,y,z,p,q,MAX are all non-negative integers. x and MAX are positive integers. you can look at it as in a tree, where x is the root and the rest are the children. Thus, if y is zero, it's children z,p and q must be also zero. If z is zero, it means that p and q are zero. I hope that this simplifies the solution space. In the tree analogy, MAX is the total number of assets possible in the tree.

Take a look at it this way, your equation boils down to X * (1 + SomeNumber) <= MAX. So X can range from 0 to MAX/(1+SomeNumber). This means the rest of the values (y, z, p, q) don't matter, you can put anything you want in there, as there is a range of X values that will provide a solution (x,y,z,p,q >= 0).

So all you need is to pick random y,z,p,q values, then an x value from the range 0 .. MAX/(1+Y*(1+Z*(1+P+Q))).

You can't really get a uniform distribution because Y, Z, P and Q can range from 0 to infinity. You'll have to place some arbitrary limits on the values.

Are your sure you can't get a uniform distribution and that variables can go to inf.? The set 0 < x(1+y(1+z(1+p+q))) < MAX, is closed i.e. there are a finite number of integer solution. Take this 2D case as an example...

http://www.wolframalpha.com/input/?i=0+%3C%3D+x%281%2By%29+%3C%3D+100

You can choose a random x and y from that set and you have your answer. The 5D case is exactly the same. You should just be able to randomize a variable at a time, the solution space changes, and then take the next variable, and the next, until you have all variables.

Say you get a random number from 0 to MAX, ex. x=2. Then you have y(1+z(1+p+q)) = MAX/2 left, etc.

Anything wrong with my reasoning?

Are your sure you can't get a uniform distribution and that variables can go to inf.? The set 0 < x(1+y(1+z(1+p+q))) < MAX, is closed i.e. there are a finite number of integer solution. Take this 2D case as an example...

http://www.wolframalpha.com/input/?i=0+%3C%3D+x%281%2By%29+%3C%3D+100

You can choose a random x and y from that set and you have your answer. The 5D case is exactly the same. You should just be able to randomize a variable at a time, the solution space changes, and then take the next variable, and the next, until you have all variables.

Say you get a random number from 0 to MAX, ex. x=2. Then you have y(1+z(1+p+q)) = MAX/2 left, etc.

Anything wrong with my reasoning?

Hmmm, the thing is that choosing a uniform distribution for x, any results above MAX/2 ensures that no y,z,p,q will be chosen. Same goes afterwards for y,z etc'. If I fix the range for x to be [1, MAX/2] then all the other possibilities x=[MAX/2,MAX] will not be included, thus I am missing cases from the solution space...

Note. my equation was wrong, should say y(1+z(1+p+q)) = 1.

How about randomizing the order in which you pick the variables to randomize? Take for example a list with 5 elements [1,2,3,4,5] which represent the variables x,y... randomize the ordering of the list, and then generate random numbers for each variable in that order.

If your'e unsure about whether some algorithm truly picks random solutions uniformly you could implement it and run it on a small testset a couple of million times and se if the distribution is uniform or not.

The problem with randomizing the order in which the variables are picked is that you can generate values for which no solution is possible. You really need to start at X and work your way through the variables. You could get some form of uniformity if you are testing for a specific MAX value. Generate a value for X that is uniform across all possible X values (for that specific MAX), do the same for Y values (based on your X value), etc.

I agree with Momerath regarding the order of the generation - after running some test scenarios I've encountered reaching no integer solution when not starting from x. The problem with Momerath's solution though is that for a given MAX choosing uniformly x from [0,MAX], 50% of the solutions, that is [MAX/2,MAX] will result in y=z=p=q=0, meaning that tactic is surely not uniform. On the other hand, not giving x all the possibilities, let's say cutting it's initial distribution to [0,MAX/2], we will clearly not cover the entire wanted solution space...

If you change the order in which you generate the variables you have to take into account the relationship between the variables. Of course there is a way yo choose random variables in any given order, what is stopping you? You just have to make sure you pick values inside the space. Could you give the exact mathematical relation between the variables?

x has to be strictly positive, y,z,p,q can be zero. If y is zero, then z,p,q must be zero as well. if z equals zero, then p,q are zero as well (think of it as levels in a tree, where x is top, y beneath, z beneath y and p,q are at the same level). You can assume that MAX is strictly positive as well.

Is that all the relations? If y is not 0 can z,p,q be anything, or must they be equal and larger or something?

if y is not zero, z,p,q can be whatever they want, as long as the main inequality holds.

If they can be anything, then you have an infinite search space. No matter what valid values you put for x/y/z (such that there is a solution), p and q are additive, which means you can cancel the effects of one by adjusting the value of the other. For example, lets say that the inner part (1+p+q) must be 3 for there to be a solution. You could set p=1, q=1 or p=2, q=0, or p=3,q=-1, etc. Since that search space is infinite, there is no way to evenly distribute your queries into it. I believe the best you'll be able to do is what I suggested before. Pick an X that holds true from an even distribution of X, then a Y that holds true from an even distribution of Y in that X search space. Repeat for Z. Then any p/q value that still keeps the inequality true.

wasn't all variables positive integers? If not then there are an infinite number of solutions.

edit.

"x,y,z,p,q,MAX are all non-negative integers".

Ok, I apologize and will try to clarify: x is a positive integer, MAX also. y,z,p,q are non-negative, meaning integers that can be positive or zero. if y is zero, then z,p,q are zero as well. If z is zero, then p,q are zero. If y is not zero, z,p,q can be in the range of zero and above, complete integers. If z is not zero, then p,q can be zero and above, complete integers.

Hope that clarifies.

Sorry, I was looking at the first definition of x,y,z,p,q,Max where you said they were integers, not the later redefinition.

You are going to have to generate all possible solutions for a given MAX value to be able to randomly pick one for an even distribution. I don't see any other way. Which brings up the question: Why does it have to be even?

Sorry, I was looking at the first definition of x,y,z,p,q,Max where you said they were integers, not the later redefinition.

You are going to have to generate all possible solutions for a given MAX value to be able to randomly pick one for an even distribution. I don't see any other way. Which brings up the question: Why does it have to be even?

I am creating a test environment that generates random data at runtime. Given MAX, the random data is generated from the populated variables x,y,z,p,q. In order to cover all possibilities, I want the variables to be generated in a uniform fashion.

This is more troublesome than I thought.
If we define 5 groups of weight, such that in group 1 X get's the highest value, group 2 has Y getting the highest value, and the same goes for Z,P,Q.
Do you think that it is possible, giving MAX value, to get a uniform distribution on those groups, and inside those groups getting something that can be as close as possible to a uniform distribution (but not necessary uniform)?

Hope that this simplifies the question enough.

Let's say we have n random variabes t that are drawn from uniform distributions. The lower boundary of all those distributions is 0, but each distribution has its own upper boundary B(n). That is: t(n) is drawn from the uniform distribution [0, B(n)].

1) What is the probability that in one trial t(1) is the smallest of all t?
1) What is the probability that in one trial t(1) is the greatest of all t?

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.