Hi all,

I have a question about NumPy's randn() function, which generates random samples from the standard normal distribution. I want to add random samples from this function to my data, and I want these samples to lie in the range -1 to 1.

I am using this code:

y = randn(100,20)
a = X + y

X is an existing data matrix (100 rows and 20 columns) with values normalised between 0 and 1.
a is the data matrix with the random samples y added to each cell.

But when I do this, a.max() always comes out greater than 1. Is there any way to get y in the range -1 to 1, or some way to specify a range in the function itself?

Sorry, in the last line I meant to ask: is there any way to get 'a' in the range -1 to 1?

If you only want to add some random noise to your data, you could add a normal random sample with a small standard deviation and clip the result to the interval [-1.0, 1.0], like this:

import numpy
from numpy.random import randn

sigma = 0.1  # standard deviation of the added noise
a = numpy.clip(X + sigma * randn(100, 20), -1.0, 1.0)

However, due to the clipping, your array won't be exactly a normal perturbation of the initial array, and this distortion increases with the value of sigma. Mathematically speaking, it is impossible to require both a normal distribution and bounds on the possible values, but if sigma is small enough, the number of clipped values will be small and the distortion can be neglected.
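To get a feel for how much the clipping matters, here is a minimal sketch that counts the fraction of clipped cells for a small and a large sigma. The matrix X is only a stand-in (uniform values in [0, 1)), and the helper name add_clipped_noise is made up for this illustration; neither comes from the original post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the run is repeatable

# Stand-in for the real data (assumption): 100 x 20 values in [0, 1).
X = rng.random((100, 20))

def add_clipped_noise(X, sigma, rng, low=-1.0, high=1.0):
    """Add N(0, sigma^2) noise to X, clip to [low, high],
    and return the clipped array plus the fraction of cells that were clipped."""
    noisy = X + sigma * rng.standard_normal(X.shape)
    clipped = np.clip(noisy, low, high)
    fraction_clipped = np.mean((noisy < low) | (noisy > high))
    return clipped, fraction_clipped

a_small, frac_small = add_clipped_noise(X, 0.1, rng)
a_large, frac_large = add_clipped_noise(X, 1.0, rng)

print("sigma=0.1: fraction clipped =", frac_small)
print("sigma=1.0: fraction clipped =", frac_large)
```

With data like this, the clipped fraction should come out much larger for sigma = 1.0 than for sigma = 0.1, which is the distortion described above.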

Short and sweet.
Good explanation.

Thanks for the excellent description.

Perfect. Is there any way to use lognormal in the same way as randn, to draw samples from a log-normal distribution?

I suppose so. Write numpy.random.lognormal(mean, sigma, (100,20)). But can you explain what your matrix is, why you want to add random samples, and why the result should be in [-1,1]?
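As a rough sketch of that call: note that mean and sigma are the parameters of the underlying normal distribution (the log of the samples), not of the log-normal samples themselves. The values 0.0 and 0.25 below are arbitrary choices for illustration.

```python
import numpy as np

mean = 0.0    # mean of the underlying normal (illustrative value)
sigma = 0.25  # standard deviation of the underlying normal (illustrative value)

# 100 x 20 matrix of log-normal samples, same shape as the data matrix.
y = np.random.lognormal(mean, sigma, (100, 20))

# Log-normal samples are strictly positive and have no upper bound,
# so they would still need clipping if a bounded result is required.
print(y.shape, y.min() > 0)

# Taking the log recovers normally distributed values with the given mean and sigma.
log_y = np.log(y)
print(log_y.mean(), log_y.std())
```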

Thanks.
The data is a series of outputs from a program (values between 0 and 1 generated from a log-normal distribution). I checked the performance of the model on this data, and now I want to add some random noise to it and check the performance again to see the effect of the noise. As the original values are in the range 0 to 1, I need to add noise in the range 0 to 1. Once random samples are generated in the range [-1,1], I can take absolute values to get the range [0,1].
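The absolute-value idea can be sketched as follows, assuming a placeholder X (uniform in [0, 1)) and sigma = 0.1, neither of which comes from the thread. One thing to keep in mind: the absolute value of normal noise is never negative, so it only pushes values upward (its mean is roughly 0.8 * sigma), and the sum X + noise can still exceed 1 unless it is clipped again.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the run is repeatable

X = rng.random((100, 20))  # placeholder for the real data in [0, 1) (assumption)
sigma = 0.1                # illustrative noise level

# Normal noise clipped to [-1, 1], then folded into [0, 1] with abs().
noise = np.abs(np.clip(sigma * rng.standard_normal(X.shape), -1.0, 1.0))

# The sum can exceed 1, so clip once more to keep the result in [0, 1].
a = np.clip(X + noise, 0.0, 1.0)

print("noise range:", noise.min(), noise.max())
print("noise mean :", noise.mean())
print("a range    :", a.min(), a.max())
```

If an upward-only shift is not wanted, clipping X + sigma * noise directly to [0, 1] without abs() keeps the perturbation symmetric for values away from the bounds.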