Let's assume a couple of things.

I created a nice website on some server-side platform (Node.js, ASP, Java, C#, whatever). It works nicely, but I'd like an estimate of how much hardware I'd need. I could run it on one hosted server and measure the impact on that server (memory, processor cycles, networking, etc.). But that's 1 person on 1 rented server. I'd like to test how it would behave if 10,000 people were on each of 10,000 servers.

Is there any way to find that out before releasing it?

Primarily, I'd like to check that there are no "bumps" in performance. For example, 1 user requesting an expensive sequence won't be noticeable, maybe a small bump in the machine's graphs. But if 10,000 people request the same resource, a seemingly "nothing-to-see-here" imperfection will turn into a nightmare.

Like I said: how can I profile what could happen if 10,000 people got on 10,000 different servers?

All hypothetical, we all know I can barely write regular JavaScript and C# xD

Edited 1 Week Ago by Aeonix

I don't know your design or why this is a concern.

Just wondering.

If I look at prior questions like https://www.daniweb.com/programming/web-development/threads/506651/how-to-make-this-script-animation-jitter-less, that one is client-side, and a server has nothing to do with the issue you asked about there.

These two are totally unrelated.

About 5 years ago a run-of-the-mill web server could easily hit 10K requests a second. If your site is lucky to get 10K requests, why would a 1-second delay matter?

It's not about a delay of seconds; it's about finding where the real performance choke points are in real-life scenarios. Even if it's a 1-second delay, what about millions of active users?

It's not about a real project; it's more that I'm interested. The question is rather simple: how do you know your server's performance doesn't suck if you haven't tried masses of users requesting many things at the same time? Your script may handle 10 users easily, but with a million users requesting the same thing the server might choke. Think about storage bottlenecks: the processor, memory and network may handle it, but you may find that storage delays responses in the long run. I'd like to know if there's a realistic way to predict that, or to stress-test for it.

Write some scripts that hammer the site and all the underlying services involved with work to do. At some point there will be something in the chain that can't keep up. A denial of service attack on your own site. Hopefully you can find that breaking point and figure out that that is X user requests per Y time without having to find X human users.
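A minimal sketch of that kind of hammering script, in Python with only the standard library. Everything named here (the stand-in server, `hammer`, `ramp`, the step sizes) is an assumption made for illustration, not something from the site in question; point `url` at a test deployment you own rather than the built-in stand-in. It raises concurrency step by step and reports failures and 95th-percentile latency per step, so the step where those numbers jump is roughly the "X requests per Y time" breaking point.

```python
import http.client
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StandInHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real site, so the sketch runs by itself."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet while hammering


def start_stand_in_server():
    """Start the stand-in server on a free local port; return (server, url)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StandInHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}/"


def timed_get(url):
    """Return (succeeded, seconds) for a single GET request."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            resp.read()
            ok = resp.status == 200
    except (OSError, http.client.HTTPException):
        ok = False  # timeouts, refused connections, HTTP errors
    return ok, time.perf_counter() - start


def hammer(url, total_requests, concurrency):
    """Send total_requests GETs using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_get, [url] * total_requests))
    latencies = sorted(sec for ok, sec in results if ok)
    p95 = latencies[int(0.95 * (len(latencies) - 1))] if latencies else None
    return {
        "concurrency": concurrency,
        "succeeded": len(latencies),
        "failures": len(results) - len(latencies),
        "p95_seconds": p95,
    }


def ramp(url, steps=(1, 5, 20), requests_per_step=50):
    """Raise concurrency step by step and collect one result row per step."""
    return [hammer(url, requests_per_step, c) for c in steps]


if __name__ == "__main__":
    server, url = start_stand_in_server()
    for row in ramp(url):
        print(row)
    server.shutdown()
    server.server_close()
```

Threads on one machine only go so far; past a point the load generator becomes the bottleneck, so watch the client's CPU as well as the server's.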

Write some scripts that hammer the site and all the underlying services involved with work to do. At some point there will be something in the chain that can't keep up.

Well that's a question. If there are any recommendations or thoughts or something.

A denial of service attack on your own site.

That is an illegitimate mass of data, nothing like what real users will send.

If there are any recommendations or thoughts or something

Whatever is involved in making your website deliver what it needs. If you make database queries/deletions/insertions, try to make a bunch of them at the same time and figure out when/how it can no longer keep up. If you're using sockets, pipes, whatever, same thing: write a program that does that and hammer it till it breaks. Accesses a file? Write a program that spawns a thousand processes that all need to access that one file. Get familiar with libcurl and hammer your site with requests from that. Et cetera. Open up the task manager and figure out which processes are involved in a request. Write a program that spawns thousands of processes that hit those processes as hard as you can. At some point, you'll find the weak link/bottleneck.
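To make the "bunch of insertions at the same time" idea concrete, here is a rough sketch that uses SQLite purely as a stand-in for whatever database the site really talks to. The table name `hits`, the worker counts and the `busy_timeout` value are all made up for the example. Each simulated client opens its own connection and commits one insert at a time; "database is locked" errors and falling inserts-per-second are the signs that the storage layer has stopped keeping up.

```python
import os
import sqlite3
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def make_db(path):
    """Create the hypothetical `hits` table used by the stress run."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS hits (worker INTEGER, n INTEGER)")
    conn.commit()
    conn.close()


def writer(path, worker_id, inserts, busy_timeout):
    """One simulated client: its own connection, one commit per insert."""
    ok = failed = 0
    conn = sqlite3.connect(path, timeout=busy_timeout)
    try:
        for n in range(inserts):
            try:
                conn.execute("INSERT INTO hits VALUES (?, ?)", (worker_id, n))
                conn.commit()
                ok += 1
            except sqlite3.OperationalError:
                # Typically "database is locked": the writers are contending.
                conn.rollback()
                failed += 1
    finally:
        conn.close()
    return ok, failed


def hammer_db(path, workers, inserts_each, busy_timeout=5.0):
    """Run `workers` concurrent writers and report throughput and failures."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(writer, path, w, inserts_each, busy_timeout)
            for w in range(workers)
        ]
        results = [f.result() for f in futures]
    elapsed = time.perf_counter() - start
    ok = sum(r[0] for r in results)
    failed = sum(r[1] for r in results)
    rate = ok / elapsed if elapsed > 0 else float("inf")
    return {"workers": workers, "ok": ok, "failed": failed, "inserts_per_second": rate}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "stress.db")
        make_db(db_path)
        for workers in (1, 4, 16):
            print(hammer_db(db_path, workers, inserts_each=50))
```

The same shape (N workers, each doing the real operation in a loop, then total successes divided by elapsed time) carries over to files, sockets or pipes; only the body of `writer` changes.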

illegitimate

Ooookay. Show me a website that has ten thousand users on each of 10,000 servers, like in your hypothetical, that releases the website without doing denial-of-service tests on their own website, and I'll show you a website that will very quickly become a victim of a denial-of-service attack. And that's for malicious, intentional denials of service. They also do denial-of-service attacks on their own sites as part of stress testing, detecting bugs, improving efficiency, etc., etc.

Regarding masses of data that will supposedly never occur in real life, I invite you to open up Wireshark and look at all the traffic that passes between the servers and your computer on some sites, and that doesn't include all the data that's processed completely on the server side or between servers. Don't forget all the traffic that passes between the ad servers and your website before your site is even allowed to load. And that's assuming everyone and everything is playing nice: no process refuses to yield its turn and its resources until it's sure it's done, nothing re-sends all the data over and over because it didn't get an ACK or whatever, and there are no bugs/endless loops anywhere. All a denial-of-service attack does is do that on purpose for maximum havoc. I've lost count of how many ACCIDENTAL fork bombs, etc. I've written that have crashed systems.

Comments
Some good thoughts.

Ooookay. Show me a website that has ten thousand users on each of 10,000 servers, like in your hypothetical, that releases the website without doing denial-of-service tests on their own website

I think you misunderstood me. 10,000 users who legitimately use the website and send real queries are not the same as a self-induced DoS attack. That's what I meant: the first sends normal data, the second is just garbage spam (assuming there's no upper-layer leak). One is input that the script I've written interprets; the other is just spam with no actual direction (maybe it has one, but it's not regular input).

[snip]

Well, that's not what I meant. I meant stress induced by real users, not illegitimate ones. You can't really prepare for a DoS attack; if someone gets your IP, you're pretty much done for, unless someone like Cloudflare (and a couple of others) has some way to "mitigate" it. The question is not "how can I find where my website has the hardest time"; it's "how can I find where my website will start spiking and bottlenecking itself under legitimate use". It's more "I have finished a great website/program; how can I know it works smoothly with 10,000 people?".
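One way to keep a stress test "legitimate" in this sense is to generate the load from a model of real sessions instead of firing identical garbage requests. The sketch below is only that, a sketch: the paths, the weights in `TRAFFIC_MIX`, the session lengths and the think times are invented placeholders that would normally come from your own access logs. It builds a reproducible list of user sessions, each a sequence of (path, think time) pairs, which a load generator could then replay against a test deployment.

```python
import random

# Hypothetical traffic mix: (path, relative weight).
# Replace with proportions measured from your own logs.
TRAFFIC_MIX = [
    ("/", 50),
    ("/search?q=shoes", 30),
    ("/product/42", 15),
    ("/checkout", 5),
]


def build_session(rng, mix=TRAFFIC_MIX, min_steps=3, max_steps=8,
                  mean_think_seconds=4.0):
    """Return one simulated visit as a list of (path, think_seconds) pairs."""
    paths = [path for path, _ in mix]
    weights = [weight for _, weight in mix]
    steps = rng.randint(min_steps, max_steps)
    session = []
    for path in rng.choices(paths, weights=weights, k=steps):
        # Exponentially distributed pause between clicks (an assumption,
        # chosen because it is a common simple model of user think time).
        think = rng.expovariate(1.0 / mean_think_seconds)
        session.append((path, think))
    return session


def build_workload(users, seed=0):
    """Build a reproducible workload: one session per simulated user."""
    rng = random.Random(seed)
    return [build_session(rng) for _ in range(users)]


if __name__ == "__main__":
    workload = build_workload(users=10_000, seed=1)
    total_requests = sum(len(session) for session in workload)
    print(f"{len(workload)} users, {total_requests} requests in total")
    print("first session:", workload[0])
```

Fixing the seed means two runs replay exactly the same sessions, so a change in the numbers points at the server and not at a different random workload.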


I don't have a view of your site, what it does and so on. Going from 10K a second to millions will obviously have you designing for the cloud. I keep running into folk who can flip out (as in burgers) a dedicated web server, but it doesn't scale. Scaling takes an entirely different mindset and is rarely, if ever, a one-person show.

With that, I want to segue into the concept of Minimum Viable Solutions (or Products).

It happens all the time that a new webmaster, or let's back that up to just "designer", will get lost on the way to the bazaar. While it's all well and good that folk bring up testing, I've seen designers miss the market while they fret about scaling.

Well, we have a terminology difference. I'm using the idea of mimicking a denial-of-service attack in a broad sense, to mean intentionally "breaking" the product in order to find the weak point or the "bottleneck", as you put it: the stuff that Quality Control people do. Yes, they'll test it for "normal", "legitimate" usage, sure, but you can bet that they'll zero in on the weak spots, because they're trying to break it, and the way you break things quickly is to focus on the edge cases, not the "normal" cases. In your scenario, they'll almost certainly attack it in a different manner than your normal "denial of service" attacks, but their core personality is the same as a hacker's, and as that programming professor's who tells you your program doesn't work because it didn't handle XYZ case, then laughs at you when you protest that no one would ever think to enter that ridiculous XYZ input. The guys designing industrial laundry machines told their QA guy to stop being stupid because no one would ever put a dog into a dryer to dry it off. Those warning labels are on there for a reason.

End result is it's annoying as hell, but it makes you a better programmer and it catches bugs way faster. Call it "stress testing" if you like. The overall point is that it's sometimes difficult to tell the difference between a QA engineer you submit your work to and a hacker. It sure FEELS like they're both trying to make your life miserable. Then you argue about whether something's worth fixing, but at least you're having the discussion. This mindset will help you simulate your hundreds of millions of users. But actually it really won't, because your "Hello World" website will get a whole lot more complicated when it gets sprinkled across the 10,000 servers. As RProffitt pointed out, it won't scale.

Completely contradicting myself from before (actually not really a contradiction), RProffitt also brings up an excellent point about people who have to get everything just perfect and handle every little thing before they release, and by the time they do, the window of opportunity has closed. As the saying goes, "The best is the enemy of the good". Minimum viable solution indeed.

Comments
What we have here is not a failure to communicate.