0

Hi All,
I am having a system which receive data from around 30K gps . Each unit will be inserting concurrently around 1 minute.
1.What tweak should I add to cater for this? I target my ram to be 48Gb with 24 cpus.

  1. Once the data grow should I do partition or leave it . I am worried as the data grow the insert will be slow?
  2. What should I anticipate and handling for heavy traffic?
  3. I plan to have replication data so that the reading can be done from the slave? But what I am worried is that if I delete data from the master then it will effect slave too right?
  4. Is possible to delete the partition from master but not from the slave?
3
Contributors
36
Replies
91
Views
5 Months
Discussion Span
Last Post by newbie14
Featured Replies
  • 1

    I wonder if by gps you meant GPS? If so I can share a little but not all about a fleet tracking system that I was a team member on. The GPS data would only go down to about 5 second intervals on the product so this was more of … Read More

  • 1
    cereal 1,524   5 Months Ago

    Just an addition. To test the queries you can try mysqlslap: https://dev.mysql.com/doc/refman/5.7/en/mysqlslap.html Read More

  • 1

    @newbie. We never saw it. I didn't want to bring this up but how is your internet connection concurrent? That is, at the data center we used the packets would arrive one after another. Yes, to a human it seems concurrent but it really isn't. So to test load you … Read More

1

I wonder if by gps you meant GPS? If so I can share a little but not all about a fleet tracking system that I was a team member on. The GPS data would only go down to about 5 second intervals on the product so this was more of a bottleneck in comms (communication) than the SQL and app that filed the data.

As to heavy traffic this system still runs on a virtual host and you are 100X more than what we use on a fleet of about 1K vehicles.

My view is you are overdesigning. Or maybe you didn't try it on a simple server yet.

0

Yes I mean gps device. Our interval will be one minute data. The comm server will be java. I plan to have multiple comm server. I have tried so far with around 2K devices. So i want to prepare now for bigger traffic. What I am worry can mysql handle this traffic?

0

I'm going with yes. Why? On a virtual host with MySQL and some JAVA code filing data away, 1K vehicles seem fine. The fleet uses cellular modems to send in location and the virtual host is on some hosting company.

I can only share so much about this but you with dedicated machine and 100 times the machine we use the only reason I could see it fail is a coding/config error. Or "Pilot error."

0

What java tios will you give I am doing connection pooling n prepared statement. What else will you help out based on your experience?

0

@newbie14. Sorry but your last post needs clarifying. I think I shared enough so far that you seem to be headed to overkill but then I don't know your code and system you went with.

0

@rprofitt. What I mean is my how I collect the gps data is via a java application. In the java application I do connection pooling and prepared statement. That is what I mean so for no clarification. Should I put my code snippet?

0

Sorry newbie14 but your questions are unclear. For example while I worked on a team that setup and have a live GPS tracker with so many features your questions are unclear and maybe you are not an English speaker. You wrote "What java tios ..." and didn't explain what tios means. It's not a word I encountered in Java or during our development cycle.

So back to the top. Your hardware specification of " ram to be 48Gb with 24 cpus. " sounds like sheer overkill to me. The system we ended up with is a virtually hosted app and handled thousands of vehicles. I can't imagine how many your server will handle but it must close to over a million.

0

@rproffitt sorry for my earlier post I got some typo error. Let me share below is how my tcp communicator snippet code looks like. Tcp communicator decodes the GPS location and inserts that data into the database and before inserting we do some selects and updates but all I do it using prepared statements. Right now, one thread of TCP communicator serves one device request.Immediately after creating the thread we get one connection from the pool. After decoding the GPS data is where we perform the multiple select, update and insert for each data. As number of devices are increasing, the number of concurrent connections to our Mysql database are also increasing. We are now anticipating 30 to 50 thousand devices pumping data every minute.Currently below is how the snippet of the whole tcp communicator looks like. I know eventually we will facing both insert bottleneck into the database. What will the best solution to over come this scenario? Will Java also be able to handle this many concurrency ?

public class comm8888 {
    HikariDataSource connectionPool = null;
    private Socket receivedSocketConn1;
    ConnectionHandler(Socket receivedSocketConn1) {
      this.receivedSocketConn1=receivedSocketConn1;
    }
    Connection dbconn = null;
    public void run() { // etc
     DataOutputStream w = null;
     DataInputStream r = null;  
     String message="";
     receivedSocketConn1.setSoTimeout(60000);
       dbconn = connectionPool.getConnection();
     dbconn.setAutoCommit(false);
     try {
         w = new DataOutputStream(new BufferedOutputStream(receivedSocketConn1.getOutputStream()));
         r = new DataInputStream(new BufferedInputStream(receivedSocketConn1.getInputStream()));
         while ((m=r.read()) != -1){
             //multiple prepared based sql select,update and insert here.
         }
     }
     finally{
        try {
            if ( dbconn != null ) {
              dbconn.close();
            }
        }
        catch(SQLException ex){
             ex.printStackTrace();
        }
        try{
           if ( w != null ){
                w.close();
                r.close();
                receivedSocketConn1.close();
            }
        }
        catch(IOException ex){
           ex.printStackTrace(System.out);
        }
      }
   }
}

    public static void main(String[] args) {
      new comm8888();
    }
    comm8888() {
      try {

          HikariConfig config = new HikariConfig();
                config.setJdbcUrl("jdbc:mysql://localhost:3306/testdata"); 
                config.setUsername("****"); 
                config.setPassword("****");      
                config.setMaximumPoolSize(20);      
          connectionPool = new HikariDataSource(config); // setup the connection pool           
       }
          catch (Exception e) {
                e.printStackTrace(System.out);
         }
          try 
          {
                   final ServerSocket serverSocketConn = new ServerSocket(8888);                
                   while (true){
                            try {
                                    Socket socketConn1 = serverSocketConn.accept();
                                    new Thread(new ConnectionHandler(socketConn1)).start();                     
                            }
                            catch(Exception e){
                                e.printStackTrace(System.out);
                            }
                        }
          } 
          catch (Exception e) {
             e.printStackTrace(System.out);

          }

    }
} 
0

@N so you need to do load testing. My bet is you won't find this to bottleneck to RAM or CPU. And if we had a few thousand with a virtual host, and I'm repeating, the hardware you listed looks good from a million clients if the internet and disk bandwidth holds up.

0

@rprofitt I wonder how to do a parallel 50 thousand load testing ? What tool is available but if I write my own socket client that will be sequential not parallel. If you notice at the tcp communicator I will be doing a number select and updates and finally only do the insert. So in one minute can it handle up to nearly 50 thousand of these type of select,update and insert queries? How many thousands you had and what was your hardware specs? I am sure I will need to do some tweak at the db level like innodb tweak and query indexing ? Over time the db will increase its size and insert will slow down right?

0

Slow downs in most MySQL systems I've dealt with are usually caused by simple things like SELECT * statements.

I'm a little amazed you are approaching this scale of a system without a programming team. Here we had about a dozen working on the product at one point. I guess there are companies that would hope you as the lone programmer will do it all.

You certainly crushed the RAM and CPU count.

0

@rprofitt did you face slow down while insert multiple concurrent data which already have hundred of millions of data. Any idea or experience handling those traffic ?We are a team not just alone programmer. The issue now we are looking into solution for this as we anticipate the growth in near future we dont want to be caught later and not knowing that to do when we are facing the traffic so we are doing research and study from now onwards. I dont get you when you said you crushed the ram and cpu you mean its too much for the ram?

0

@cereal thank you I have been reading about it. The problem is that I cant send the same data for the 50K connection which I need to make I need to simulate as different device id for that purpose can mysqlslap handle this?

1

@newbie. We never saw it. I didn't want to bring this up but how is your internet connection concurrent? That is, at the data center we used the packets would arrive one after another. Yes, to a human it seems concurrent but it really isn't.

So to test load you just load up the input and see how many packets your system can process per second. Bingo, you have your throughput numbers. My bet is your internet feed or drives will saturate before the 24 CPUs do.

0

@rpfrofitt Wow this something new to me. I thought all the while the data is concurrent at the data center which what you said makes sense though.

How can I create multiple client via java I have this code snippet to do a stress test. The problem this is a sequential test how to make it simulate as parallel test?

public class stress1 extends Thread {
public static void main(String[] argv) {
    try {
        for (int i = 0; i < 50; i++) {
        Socket socket = new Socket("192.168.2.102", 8000);
        PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
        BufferedReader in = new BufferedReader(new InputStreamReader(
        socket.getInputStream()));
        out.println("$ACTEST35,30061104075130528955N10024852E000068*03A1*");
        System.out.println(in.readLine() + i);
        out.close();
        in.close();
        socket.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
   }
  }
}
0

@N, there you go with an example over an IP sending data. That information will arrive in sequence and not parallel. You seem stuck on this and I don't have a good explainer to get you unstuck.

Forget parallel for many reasons like how IP stacks buffer and more then focus on throughput. Test and find the answer to the simpler question of "How many requests per second can my system process?"

0

@rproffitt ok I will go with your advice. So in other words everything is buffered at the socket level is that what your trying to explain to me? So how to test for this "How many requests per second can my system process?" What about the code snippet which I have put some to do the stress test? So if its trully not parallel then is all sequential why do we need multithread just would like to understand moer details?

Edited by newbie14

0

@N, I am not trying that hard to get you unstuck from the usual new to this view or parallel processing. You can imagine what would happen to the internet if there was no buffer in systems today. I've given you a view to where your monster machine is overkill by sharing a little about the system the company here went with.

I've seen this before and usually the persons that go down this path burn out as they pile a million bucks into servers and run out of cash before the company lifted off.

Here the company went with a cloud so it could scale later. The group is a lot more seasoned and we knew to not overdesign this. We went with a simple desktop server during demos, later it fit on a laptop and finally went into a virtual host in a cloud system. It runs fine with thousands of clients.

PS. Get a measure of throughput. How many transactions per second can you achieve. Then think where the bottleneck is. With 24 CPUs it won't be the CPUs.

Edited by rproffitt: Clarifying

0

@rprofitt actually I am trying to get unstuck from the parallel processing view. Let me understand at the lowest level say now I got 10 devices trying to make connection to my socket server. So what is happening is that all the data is line up in the buffer right. So then how is the socket start here.

Socket socketConn1 = serverSocketConn.accept();
                                    new Thread(new ConnectionHandler(socketConn1)).start();    

Can I say that is this done in sequential method? So is it safe for the device to keep holding the socket and not close for a long time or is better to close?

Actually the monster machine we have not setup we are planning to setup not yet on the cloud too.

Can you give me more tips on this " Get a measure of throughput. How many transactions per second can you achieve. " How get this metrics any method or can I work based on my stress test script ? Please give more hints as I am willing to learn and change accordingly.

0

AHH, now that's where I can see a design blow up. We went with UDP rather than TCP to avoid a pile of open sockets. Using UDP meant we developed our own protocol and you may not ask me for that. But if you were to try to open thousands of TCP sockets I can see where you would blow up.

0

@rproffitt ok I got it unfortunately the device itself support tcp only so not much of choice left here. No problem I wont ask you about your udp. I want to first understand so you said every tcp which comes in is buffered rite. Then it pick up the socket.accept and create a new socket thread is that correct? There after the device is connected and send the data over the socket which have been build ? Am I right here? So thereafter the socket is build is it still sequentially done right. So how I do a stress test on how many thousand of socket it can handle in one second or one minute to know the throughput?

0

@N, TCP only? You're headed towards a small disaster. I can't know what product you are using and if your device's coder learned about this and took steps to mitigate it. We went UDP and our own protocol to competely avoid that issue plus drop communication costs.

Look at the byte count on a simple (?) open close of a TCP socket and then how may byte (cost) of the UDP send. Since our system is cost averse we needed UDP and then our own protocol. There are companies that go TCP and you see it affect the monthly fees.

-> About the testing. I see you have replies above and it appears you are in the market for more team members. In short (again) start testing. You have no CPU limitation from what I see from your top post.

0

@rpfroffitt yes I know tcp is kind of more costly cause we need to setup the communication etc. Before going further on testing I want know more the sequential buffering for tcp I am not so clear on that? Can you bried me more on that kind of confuse?

I know about testing need tips what and how to test? Kind of very lost here?

0

I see the testing tips were given so my advice is to craft up your tests now as best you can.

-> I want to broach another subject as well. We found that no single client went over a few thousand vehicles. This meant we could spread the load to other virtual servers and databases. Let me be clear.

NO CUSTOMER wanted their data mixed into a "central" database. They wanted their data to be their data. This is another factor in why you'll never get over a few thousand clients per database.

0

@rproffitt seriously I am still lost on the understanding of the sequential part when you say that at the socket.accept and how they get it running in multithreaded. The issue I am still not clear is the testing how to craft up the test still lost at these 2 points.

I agree this not a single client exactly is more like a semi govt application which has many mini customer and later we need to do some big data stuff on this data. That is the reason we need one single central database.

0

@n. There are so many priors about handling multiple connections that I won't broach that here. For such a big design, I think you need to do more research, code more, test more and talk to my contact on the system I worked on.

You could have left out a big secret like this was your homework, thesis or something else. For such a big system you never find a team of one.

0

@rprofitt yes I know there many prior to handling multiple connections e.g. bandwidth, servers etc. Yes I will be willing to talk to your contact or to you on personal email. For now I just want to do a load test and throughput test to find what will be exact server requirements when we reach that traffic and scale. So that is what to be exact I am trying to learn and understand.

I told you this not a team of one. We have designer, programmer, map guys, server guys etc. So definitely not a single person running is crazy no way to achieve it either.

0

@n, thanks for telling me about the team. I do think you may run into some awful issue with TCP/IP only. My contact went pretty deep why but for the most part it was all about cost. I saw issues with holding a TCP connection open and then the mobile client moving from tower to tower. I never heard if the OSes and carriers found a way to fix it so your apps don't have to handle that.

Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.