Re: Urgent, Tomcat CPU fills up to 100% after months without problems

Lari Hotari -
You might have a bug in your app that causes an infinite loop.
You can take a thread dump on Unix by sending a SIGQUIT (3) signal to the Java process PID.

For example: kill -3 <pid of java process running tomcat>

The thread dump goes to catalina.out by default. SIGQUIT doesn't terminate the Java process, so you can normally use this method in production environments.

-Lari


13.01.2014 18:33, Christian Rommel wrote:
Hello,

We need urgent help and are also willing to pay if someone is kind enough to help us over TeamViewer; my Skype is "rommel.megusta".

We have been running Grails 2.2.3 on Tomcat for months. Now, suddenly, all 32 cores go up to 100% after five minutes; rebooting doesn't help.

Even after shutting down Apache, the CPU stays at 100%, although no more requests can get in.

We have no idea and are desperate.

Bye Chris

Gil-2
If your application was running correctly before, you most probably have a hardware problem. Before you start digging into your application, try running it on another server.

Regards

Gil


Graeme Rocher-2
It could also be that the volume of data in your database has increased, resulting in larger query results and memory exhaustion.

Cheers


--
Graeme Rocher
Grails Project Lead
SpringSource
Lari Hotari -

Did you already take thread dumps while the problem was occurring?
It's one of the best ways to find out what's going on in the JVM when it goes crazy.

You can use "kill -3 PID" to make the JVM write the thread dump to its stdout (which goes to catalina.out on Tomcat by default), or "jstack PID" to print the dump in the terminal where you run it.
Don't post the dump to the mailing list. Upload it to gist/pastebin/dropbox and just link to it. It might also contain sensitive information, so it might not be a good idea to share it publicly.

Usually it's worth taking a few consecutive dumps a few seconds apart. This way you can diff the dumps to see what changes and what stays the same. You might have to do the diffing manually on a per-thread basis, but that's another story. First get hold of the dumps...
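
If taking and diffing jstack dumps by hand is awkward, the same "compare two dumps" idea can be sketched in Java from inside the JVM. This is only an illustration and not something from the thread: the class StuckThreadFinder and its method names are made up, and it inspects the JVM it runs in, so it would have to be called from within the application. It takes two snapshots with ThreadMXBean.dumpAllThreads and lists the threads that are RUNNABLE with an identical stack in both.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Grails or Tomcat.
class StuckThreadFinder {

    // Capture state and stack of every live thread in this JVM, keyed by thread id.
    static Map<Long, ThreadInfo> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, ThreadInfo> byId = new HashMap<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            byId.put(info.getThreadId(), info);
        }
        return byId;
    }

    // Names of threads that are RUNNABLE in both snapshots with the same non-empty stack.
    static List<String> unchangedRunnable(Map<Long, ThreadInfo> first,
                                          Map<Long, ThreadInfo> second) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<Long, ThreadInfo> entry : second.entrySet()) {
            ThreadInfo before = first.get(entry.getKey());
            ThreadInfo after = entry.getValue();
            if (before == null) {
                continue; // thread was started between the two snapshots
            }
            boolean bothRunnable = before.getThreadState() == Thread.State.RUNNABLE
                    && after.getThreadState() == Thread.State.RUNNABLE;
            StackTraceElement[] stack = after.getStackTrace();
            if (bothRunnable && stack.length > 0
                    && Arrays.equals(before.getStackTrace(), stack)) {
                suspects.add(after.getThreadName());
            }
        }
        return suspects;
    }

    // Take two snapshots pauseMillis apart and compare them.
    static List<String> findSuspects(long pauseMillis) {
        Map<Long, ThreadInfo> first = snapshot();
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return unchangedRunnable(first, snapshot());
    }

    public static void main(String[] args) {
        // Prints one candidate thread name per line.
        findSuspects(2000).forEach(System.out::println);
    }
}
```

Threads blocked in native I/O calls such as socket reads also report RUNNABLE with an unchanged stack, so treat the output as a list of candidates to look up in the full dump, not as a verdict.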

(Unrelated question: why are you running with such a huge heap? It's usually better to run several smaller Tomcat instances and put a load balancer such as Apache mod_proxy_balancer (http://httpd.apache.org/docs/current/mod/mod_proxy_balancer.html) in front of them.)

A very common cause of 100% CPU / infinite-loop problems is unsynchronized concurrent use of java.util.HashMap:
http://stackoverflow.com/questions/13695832/explain-the-timing-causing-hashmap-put-to-execute-an-infinite-loop
Of course your problem can be something different. The thread dump will most likely point you in the right direction.
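
If the dump does show threads spinning inside HashMap.get or HashMap.put, the usual remedy is to switch the shared map to java.util.concurrent.ConcurrentHashMap (or to guard every access with a lock). A minimal sketch, assuming a map that several request threads write to; SharedCacheDemo and fillConcurrently are hypothetical names used only for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; stands in for any map shared between threads.
class SharedCacheDemo {

    // Fill one shared map from several threads and return the final entry count.
    static int fillConcurrently(int threads, int entriesPerThread) {
        // ConcurrentHashMap tolerates concurrent writers; a plain HashMap does not.
        Map<Integer, Integer> cache = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int offset = t * entriesPerThread; // each writer gets its own key range
            pool.execute(() -> {
                for (int i = 0; i < entriesPerThread; i++) {
                    cache.put(offset + i, i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // 8 writers with 10,000 distinct keys each.
        System.out.println(fillConcurrently(8, 10_000));
    }
}
```

With a plain HashMap in place of the ConcurrentHashMap, the same code may lose entries or, on some JVM versions, leave writer threads looping inside the map, which is the failure mode described in the linked question.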

Lari


14.01.2014 00:46, Christian Rommel wrote:
Hello,

Thank you all for the ideas so far. The problem seems (my guess) to be related to the data source pool: dataSourceUnproxied.numActive keeps growing and growing without the load changing. At the beginning we have around 5 open connections; this grows to 50 and more.

After a clean Tomcat startup, maybe 4 of the 32 cores are busy. Over the next 15 minutes or so (the time varies), all the other cores get busy as well, with the number of open connections growing at the same time.

grails 2.2.3
ubuntu 12.04
java oracle 1.7.0_45-b18
mysql 5.5.34
tomcat 7.0.42

Here is the DataSource config.

We are using the plugin ":jdbc-pool:7.0.47" (same problem with the standard pool).

dataSource {
    pooled = true
    url = "**"
    driverClassName = "com.mysql.jdbc.Driver"
    dialect = 'org.hibernate.dialect.MySQL5InnoDBDialect'
    username = **
    password = **

    properties {
        maxWait = 60000
        maxActive = 100
        removeAbandoned = true
        removeAbandonedTimeout = 30000
        logAbandoned = true
        minEvictableIdleTimeMillis = 1800000
        timeBetweenEvictionRunsMillis = 1800000
        numTestsPerEvictionRun = 3
        testOnBorrow = true
        testWhileIdle = true
        testOnReturn = true
        validationQuery = "SELECT 1"
    }
}

CATALINA_OPTS="-server -XX:MaxPermSize=1024m -XX:+UseG1GC -Dfile.encoding=UTF-8 -Djava.awt.headless=true -Xms16384M -Xmx16384M -XX:+DisableExplicitGC -XX:+UseCodeCacheFlushing -XX:CodeCacheFlushingMinimumFreeSpace=45m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=1101 -Djava.rmi.server.hostname=127.0.0.1"


We appreciate any ideas.

Thank you
Bye Chris 



Tobias Kraft
In reply to this post by Graeme Rocher-2
Hi,

Do you open database connections manually in some places? Maybe the connections aren't released in error cases.

You should always wrap the work in a try block and close the connection inside a finally block.

Sql sql = new Sql(myDataSource)
try {
  // run the query and map each row; an exception here still reaches finally
  retVal = sql.rows(query)?.collect { row -> row2User(row) }
}
finally {
  // runs even if the query throws, so resources are always released
  sql.close()
}
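
In plain Java, the same guarantee as the try/finally pattern above is usually written with try-with-resources. The sketch below is only an illustration: CountingPool is a made-up stand-in for a real connection pool that does nothing but count borrowed handles, so it runs without a database. With real JDBC the resource would be a java.sql.Connection obtained from the DataSource.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a connection pool; it only counts borrowed handles.
class CountingPool {
    private final AtomicInteger active = new AtomicInteger();

    // A borrowed "connection"; close() hands it back to the pool.
    final class Handle implements AutoCloseable {
        @Override
        public void close() {
            active.decrementAndGet();
        }
    }

    Handle borrow() {
        active.incrementAndGet();
        return new Handle();
    }

    // Number of handles currently borrowed and not yet closed.
    int numActive() {
        return active.get();
    }

    // Leaky pattern: if work throws, close() is skipped and the handle stays active.
    void runLeaky(Runnable work) {
        Handle handle = borrow();
        work.run();
        handle.close();
    }

    // Safe pattern: try-with-resources closes the handle on every path.
    void runSafely(Runnable work) {
        try (Handle handle = borrow()) {
            work.run();
        }
    }

    public static void main(String[] args) {
        CountingPool pool = new CountingPool();
        Runnable failing = () -> { throw new IllegalStateException("query failed"); };

        for (int i = 0; i < 3; i++) {
            try { pool.runLeaky(failing); } catch (IllegalStateException expected) { }
        }
        System.out.println("after leaky runs: numActive = " + pool.numActive());

        for (int i = 0; i < 3; i++) {
            try { pool.runSafely(failing); } catch (IllegalStateException expected) { }
        }
        System.out.println("after safe runs:  numActive = " + pool.numActive());
    }
}
```

In the leaky variant every failed call leaves one more handle active, which is the kind of steady growth in the active count that a missing finally block would produce.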

