memory leaks stopping batch process

memory leaks stopping batch process

Alan Bowsher-2
I have a Grails batch process (run via Quartz) that is dying because of GC thrashing.  I profiled it with JVisualVM (see the attached picture memory.png).  Notice that once you get to java.util.Vector the Generations count drops off, and everything at that level and below stays pretty flat, while everything above continues to grow and grow.  Since everything else is a primitive, and none of my application classes appear to be hanging around, I decided to focus on the 3 classes that are at least somewhat higher-level.

Attached are the stack traces for those 3 classes.  Two appear to be related to a flush() and one to a Hibernate query.  In all 3 cases there appear to be Grails classes involved, so I'm guessing these are issues in the Grails codebase.

I don't have a code sample to attach because it's all proprietary and I'd need to cut it down to some sort of example, so I do apologize for that.  Time permitting, I'll see what I can come up with, but if I'm doing this right, then I would think others would see similar issues.

The code involved is not that complicated.  It does a find based on a date, does a couple of other simple table look-ups, changes a single value in the original item being looked up, and saves it out to the database.  It all works correctly; it just runs out of memory.

We do flush and clear the session every 100 entities processed, and we are doing the DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear() (note that before the clear() in the debugger, this map never seemed to return any items anyway, so I don't think it's involved).
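For anyone following along, the flush-and-clear pattern described above looks roughly like the sketch below. It is only an illustration, not the proprietary code: `processInBatches` and `FakeSession` are made-up names, and the fake session stands in for `sessionFactory.currentSession` so the snippet runs outside Grails. In a real Grails 1.3 job the `DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()` call would sit right next to `session.clear()`.

```groovy
// Sketch only: `session` is anything exposing flush() and clear(), e.g.
// sessionFactory.currentSession in a Grails app. The names processInBatches
// and FakeSession are hypothetical and exist only for this illustration.
def processInBatches(Iterable items, session, int batchSize, Closure work) {
    int count = 0
    for (item in items) {
        work(item)
        count++
        if (count % batchSize == 0) {
            session.flush()   // push pending SQL to the database
            session.clear()   // evict entities from the first-level cache
        }
    }
    // Flush whatever is left in the final, partial batch.
    if (count % batchSize != 0) {
        session.flush()
        session.clear()
    }
    return count
}

// Stand-in for a Hibernate session so the sketch runs without Grails.
class FakeSession {
    int flushes = 0
    int clears = 0
    void flush() { flushes++ }
    void clear() { clears++ }
}

def session = new FakeSession()
def processed = processInBatches(1..250, session, 100) { id ->
    // load the entity, change one value, save it
}
println "processed=${processed}, flushes=${session.flushes}, clears=${session.clears}"
```

Passing the session in as a parameter keeps the loop independent of how the session is obtained, which also makes it easy to check the flush count in a unit test.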

Also, since this is in the context of a web app, I tried setting the errors collection of the object being saved to null after the save().

Also, I tried upgrading to 1.3.7 and it didn't help.

Would anyone have any insight?

I will try to dig deeper but (as we all are) I'm in a hurry and wanted to get the question out there to see if anyone else had hit this.  The only fallback I can see right now is to rewrite this entirely outside of GORM (since both reads and writes seem to be causing problems).

In fact, if this turns out to be an unsolvable problem, I may have to consider moving away from Grails entirely, because for a high-traffic website, this problem would eventually make the web application fail as well.  Right now I'm hoping there's an easy solution (or that I'm wrong).

Thanks for any help,
Alan
 

Attachments:
leak1.png (126K)
leak2.png (123K)
leak3.png (74K)
memory.png (33K)
Re: memory leaks stopping batch process

Alan Bowsher-2
Well, turning off validation on the save seemed to take care of most of the problem. I haven't run the full job yet, and I still have some code to straighten out after trying lots of different things, so I can't yet be sure that's the only issue.
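As a rough illustration of the workaround, GORM's `save()` accepts a `validate: false` argument that skips validation for that call. The sketch below is not the real job code: `Shipment` is a hypothetical stand-in class that mimics only this one behaviour, so that the snippet runs outside Grails. Real GORM adds `save()` and `validate()` to domain classes dynamically.

```groovy
// Hypothetical stand-in for a GORM domain class so the sketch runs outside
// Grails. It mimics only one thing: save() skips validate() when called
// with validate: false.
class Shipment {
    String status
    int validations = 0   // counts how often validate() ran

    boolean validate() {
        validations++
        return status != null
    }

    def save(Map args = [:]) {
        if (args.validate != false && !validate()) {
            return null   // like GORM, return null when validation fails
        }
        return this
    }
}

def shipment = new Shipment(status: 'PENDING')
shipment.status = 'PROCESSED'

shipment.save(validate: false)   // the workaround: no validation on this save
assert shipment.validations == 0

shipment.save()                  // default behaviour validates first
assert shipment.validations == 1
```

The trade-off is that constraints are no longer checked on that save, so any data that really needs checking would have to be validated explicitly before saving.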

So, this works in this case, but there definitely is some sort of leak in the validation code.  This doesn't appear to be the exact same case as Burt's famous one (http://burtbeckwith.com/blog/?p=73), although I could be wrong because I don't understand all the Grails internals.  In my case it seems to happen underneath AbstractDynamicPersistentMethod.setupErrorsProperty (see the screenshots).

I'll post back if we are able to figure out anything else.

On Mon, Jun 6, 2011 at 8:20 AM, Alan Bowsher <[hidden email]> wrote:
[...]

Re: memory leaks stopping batch process

basejump (Josh)
We are processing millions of records with GORM now, getting fast enough (sub-10-minute) run times, and memory never seems to rise above an acceptable and remarkably low level.
I cannot publish those results, but the gist of what we did to get there can be found at http://github.com/basejump/grails-gpars-batch-load-benchmark
I hope to get some good real-world, non-confidential benchmarks for both update and insert. That is actually a little more work than it sounds when millions of records are in play with 20+ fields for the domain.


On Jun 6, 2011, at 12:35 PM, Alan Bowsher wrote:

[...]

Re: memory leaks stopping batch process

Mohamed Seifeddine
Have you tried

sessionFactory.currentSession.flush()
sessionFactory.currentSession.clear()
org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()

When you are ready to flush?


On Tue, Jun 7, 2011 at 4:58 AM, basejump (Josh) <[hidden email]> wrote:
[...]

Re: memory leaks stopping batch process

Alan Bowsher-2
Thanks Mohamed... yes, we are doing those things.

Good news is, things seem to be fine as long as I turn off the validation on the save.  Bad news is, there's a leak there.

On Tue, Jun 7, 2011 at 8:05 AM, Mohamed Seifeddine <[hidden email]> wrote:
[...]
