
Memory leak


Memory leak

burtbeckwith
I'm building a large database for load testing, so I'm creating millions of
domain instances. I'm trying to use the standard bulk insert approach in
Hibernate, i.e. flush() and clear() the session periodically, with the whole
operation running in a transaction.
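
Roughly, the loop looks like the sketch below (simplified - Thing stands in for
the real domain classes, sessionFactory is the injected Hibernate SessionFactory,
and the batch size is arbitrary):

   Thing.withTransaction {
      def session = sessionFactory.currentSession
      for (int i = 0; i < 1000000; i++) {
         new Thing(name: "thing_${i}").save()
         if (i % 100 == 0) {
            session.flush()   // push the pending inserts to the database
            session.clear()   // detach the instances so they can be GC'd
         }
      }
   }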

The problem is the instances are still around - I run out of memory after a
few hours with a heap set at 1G. I've turned off 2nd-level caching and looked
at the Session in a debugger and clear() definitely works - it's empty
afterwards. There are no mapped collections, and I'm not keeping any explicit
references to the instances.

But running in a profiler I can see the instance count steadily increase and
never decrease. Running gc() has no effect.

Any thoughts on what might be keeping a reference to these instances?

Burt




Re: Memory leak

Erick Erickson
When you say "the whole operation running in a transaction" are you
talking about a transaction around the whole database load or around
an iteration of adding a bunch of records, doing a flush and clear?

Because I can imagine that Hibernate is holding a bunch of memory for
everything done in a transaction and if you're loading millions of instances
in a transaction....

Best
Erick@AShotInTheDark....





Re: Memory leak

Les Hazlewood
In reply to this post by burtbeckwith
The only thing I can think of is that it might be related to the transaction manager - for a long running transaction, the TM is probably retaining log entries to accommodate a rollback.  For such a long running transaction, that could be a lot of overhead.  What TM implementation are you using?

I myself just finished an application which required ridiculous amounts of data to be loaded on a daily basis in batch mode (gigabytes, millions of records across many tables).  I basically ensured that the insert operations happened in 1000 count 'mini' batches for 2 reasons:

1) to prevent a TM timeout (default was 5 minutes)
2) to prevent the TM overhead from getting too large (memory)

Here's some code I used to do that (Java + Spring TransactionTemplate).  Now I don't know if you are experiencing the same thing I did, but you might find it useful.

- note that the Iterator passed in represented a custom implementation that was
  reading records from a file stream - the file was over 10 Gigs, and the
  Iterator represented each 'chunked' record in the stream.
- The 'flushCount' was a config variable from a .properties file (in our case,
  it was equal to 1000).
- the flush() method just called a DAO which internally called
  hibernateTemplate.flush() and then immediately hibernateTemplate.clear()
  (sketched below)
- you could try to extend the transaction timeout beyond the default (e.g. 5
  minutes) by configuring the transaction manager, but odds are high you'd get
  into a TM housekeeping memory problem
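
For reference, the DAO method behind that flush() call is essentially just this
(a sketch - the hibernateTemplate field name is ours):

public void flush() {
    hibernateTemplate.flush();  // execute the pending inserts
    hibernateTemplate.clear();  // then evict everything from the Session
}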

protected int importBatch(final Iterator iterator, final int flushCount) {
        //Batch inserts need to be done as small chunks within their own transaction to avoid
        //transaction timeouts (and rollbacks) due to large data sets, so
        //specify PROPAGATION_REQUIRES_NEW.
        //
        //If we did NOT do this, and tried to import hundreds of thousands or millions of records,
        //odds are high that the transaction would time out
        //(e.g. after 5 minutes) and everything would be rolled back.  This is not desired.
        TransactionDefinition txnDef =
                new DefaultTransactionDefinition(TransactionDefinition.PROPAGATION_REQUIRES_NEW);

        TransactionTemplate txnTemplate = new TransactionTemplate(getTransactionManager(), txnDef);
        Integer importCount = (Integer) txnTemplate.execute(new TransactionCallback() {
            public Object doInTransaction(TransactionStatus status) {
                int batchRecordCount = 0;

                while (iterator.hasNext() ) {
                    Record record = (Record) iterator.next();
                    if (record != null) { //can be null if any records are skipped
                        importRecord(record);
                        batchRecordCount++;
                        notifyRecordImported(record);
                        if (batchRecordCount % flushCount == 0) {
                            flush();
                        }
                    }
                }
                if (batchRecordCount > 0) {
                    flush();
                }

                if (log.isInfoEnabled()) {
                    log.info("Batch imported " + batchRecordCount + " records in a single transaction.");
                }
                return batchRecordCount;
            }
        });
        return importCount != null ? importCount : 0;
    }

Note that if any of those 'mini' transactions fail, you have to do a manual rollback of all of the records that already went in.  We accounted for this by storing an id with each record as it was inserted, and if any error occurred, performed a bulk delete ('delete from blah where manual_tx_id = foo').

HTH,

Les





Re: Memory leak

Les Hazlewood
Oops, I didn't give you all the code - the method I posted earlier performed one and only one 'mini' batch insert.  It was called from another method that coordinated calling that method as necessary:

public void importProductBatched(Product product) {
        if (log.isInfoEnabled()) {
            log.info("Performing a batching import for product of type [" + product.getClass().getName() + "]");
        }
        preBatch();
        Iterator i = product.iterator();
        //keep calling importBatch() until the iterator is exhausted; each call
        //processes at most getBatchCount() records in its own transaction
        while (i.hasNext()) {
            int batchImported = importBatch(i, getBatchCount(), getFlushCount());
            //track how far into the product stream we have committed so far
            product.setCommittedBytesRead(product.getBytesRead());
            product.setCommittedRecordsRead(((RecordIterator) i).getIndex());
            log.info("Batched imported Count" + batchImported + " bytes read =" + product.getBytesRead() + "|" + ((RecordIterator) i).getIndex());
        }
        if (log.isInfoEnabled()) {
            log.info("Successfully imported batch for product of type [" + product.getClass().getName() + "]");
        }
        postBatch();
    }

And then the importBatch method:

protected int importBatch(final Iterator iterator, final int maxRecords, final int flushCount) {
        //Batch inserts need to be done as small chunks within their own transaction to avoid
        //transaction timeouts (and rollbacks) due to large data sets, so
        //specify PROPAGATION_REQUIRES_NEW.
        //
        //If we did NOT do this, and imported in one transaction, and the Product contained
        //thousands of records (as does happen during batch mode), odds are high that the transaction would time out
        //(e.g. after 5 minutes) and everything would be rolled back.  This is not desired.
        TransactionDefinition txnDef =
                new DefaultTransactionDefinition(TransactionDefinition.PROPAGATION_REQUIRES_NEW);

        TransactionTemplate txnTemplate = new TransactionTemplate(getTransactionManager(), txnDef);
        Integer importCount = (Integer) txnTemplate.execute(new TransactionCallback() {
            public Object doInTransaction(TransactionStatus status) {
                int batchRecordCount = 0;

                while (iterator.hasNext() && batchRecordCount < maxRecords) {
                    Record record = (Record) iterator.next();
                    if (record != null) { //can be null if any records are skipped
                        importRecord(record);
                        batchRecordCount++;
                        notifyRecordImported(record);
                        if (batchRecordCount % flushCount == 0) {
                            flush();
                        }
                    }
                }
                if (batchRecordCount > 0) {
                    flush();
                }

                if (log.isInfoEnabled()) {
                    log.info("Batch imported " + batchRecordCount + " records in a single transaction.");
                }
                return batchRecordCount;
            }
        });
        return importCount != null ? importCount : 0;
    }

Regards,

Les






Re: Memory leak

burtbeckwith
I spent quite a while on this today and think the issue has to do with Grails
and has nothing to do with transactions. Btw - it's not a memory leak in the
classic sense, but a GC issue - domain instances were somehow referenced and
couldn't be GC'd. The profiler's instance count just kept growing and growing.

I kept stripping down the code to try to isolate what was going on, and it
even happened when I just created a single, 1-field dummy domain class and
didn't call a single method on it. The code was something like

   for (int i = 0; i < 100000; i++) {
      new Thing(name: "thing_${i}")
      Thread.sleep(50); // to allow time to watch things in the profiler
   }

While the loop was running I'd run gc() periodically, and it had no effect on
the domain class instance count, but clearly it was working since other class
counts dropped and the overall heap size dropped. Once the method exited,
finally the gc() call dropped the instance count down to near zero.

Burt




Re: Memory leak

Les Hazlewood
Hrm - that sounds kinda serious - and may even be a Groovy thing and not a Grails thing.  Anyone from G2One wanna comment?





Re: Memory leak

Graeme-Rocher
I believe the issue is that we override the default constructor to
perform data binding. The binding errors object is bound to either the
current thread or the current request.

Since these errors are scoped to the request, they won't be GC'ed until
the request completes. You could regard this as a bug. One workaround
may be to do:

  for (int i = 0; i < 100000; i++) {
     def t = new Thing(name: "thing_${i}")
     t.errors = null
     Thread.sleep(50); // to allow time to watch things in the profiler
  }

I'm not sure what a better solution is for now. Maybe we could use a
ReferenceQueue or something to check if it's a candidate for GC'ing and
then automatically clear the errors... hmmm, needs some thought.
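
Roughly the shape of that idea, purely as an illustration (this is not existing
Grails code and the names are made up):

  import java.lang.ref.ReferenceQueue
  import java.lang.ref.WeakReference

  // Illustration only: hold each instance's errors against a weak reference to
  // the instance, and drop the entry once the instance itself has been GC'ed.
  class WeaklyHeldErrors {
     private final ReferenceQueue queue = new ReferenceQueue()
     private final Map errorsByRef = [:]

     void put(Object instance, Object errors) {
        expunge()
        errorsByRef[new WeakReference(instance, queue)] = errors
     }

     private void expunge() {
        def ref = queue.poll()
        while (ref != null) {
           errorsByRef.remove(ref)   // the instance is gone - release its errors
           ref = queue.poll()
        }
     }
  }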

Cheers

--
Graeme Rocher
Grails Project Lead
G2One, Inc. Chief Technology Officer
http://www.g2one.com




Re: Memory leak

Marc Palmer Local

Can't we introduce some new notion where we allow such GORM
operations to take effect "out of a request/thread context"?

i.e.:

AnyDomainClass.batchOperation {
    ... do anything in here and errors will be discarded automatically etc ...
}
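
Something like this, perhaps (a rough sketch only - batchOperation is not an
existing GORM method; it just runs the closure and then discards the
thread/request-scoped errors holder, which in the current Grails code is
DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP):

   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin

   // Hypothetical helper, not an existing API: run the batch work, then clear
   // the thread-scoped errors map so the instances created inside the closure
   // can be garbage collected.
   static void batchOperation(Closure work) {
      try {
         work.call()
      }
      finally {
         DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
      }
   }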


Marc





Re: Memory leak

Graeme-Rocher

Yes that is a possibility

Cheers


--
Graeme Rocher
Grails Project Lead
G2One, Inc. Chief Technology Officer
http://www.g2one.com




Re: Memory leak

burtbeckwith
In reply to this post by Graeme-Rocher
This worked, but was a little more work than setting errors to null. I was running this in the console, so there's no request, and the PROPERTY_INSTANCE_MAP configured in DomainClassGrailsPlugin was holding the errors. When I cleared that map after each loop iteration, everything worked fine:

   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin

   for (int i = 0; i < 100000; i++) {
      new Thing(name: "thing_${i}")
      DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
      Thread.sleep(50) // to allow time to watch things in the profiler
   }
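
For the actual data load, combining this with the periodic flush()/clear() gives
a loop roughly like the following (a sketch - Thing, the batch size, and the
sessionFactory wiring are placeholders):

   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin

   Thing.withTransaction {
      def session = sessionFactory.currentSession
      for (int i = 0; i < 1000000; i++) {
         new Thing(name: "thing_${i}").save()
         if (i % 100 == 0) {
            session.flush()   // push the pending inserts to the database
            session.clear()   // detach the instances from the session
            DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()   // drop the thread-scoped errors
         }
      }
   }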

Burt

Graeme Rocher-2 wrote
I believe the issue is that we override the default constructor to
perform data binding. The binding errors object is bound to either the
current thread or the current request

Since these errors are scoped to the request they won't be GC'ed until
the request completes. You could regard this as a bug. One work around
may be do do:

  for (int i = 0; i < 100000; i++) {
     def t = new Thing(name: "thing_${i}")
     t.errors = null
     Thread.sleep(50); // to allow time to watch things in the profiler
  }

I'm not sure what a better solution is for now. Maybe we could use a
ReferenceQueue or something to check if its a candidate for GC'ing and
then automatically clear the errors... hmmm needs some thought

Cheers

On Fri, Sep 5, 2008 at 3:24 AM, Les Hazlewood <les@hazlewood.com> wrote:
> Hrm - that sounds kinda serious - and may even be a Groovy thing and not a
> Grails thing.  Anyone from G2One wanna comment?
>
> On Thu, Sep 4, 2008 at 8:15 PM, Burt Beckwith <burt@burtbeckwith.com> wrote:
>>
>> I spent quite a while on this today and think the issue has to do with
>> Grails
>> and has nothing to do with transactions. Btw - it's not a memory leak in
>> the
>> classic sense, but a GC issue - domain instances were somehow referenced
>> and
>> couldn't be GC'd. The profile instance count just kept growing and
>> growing.
>>
>> I kept stripping down the code to try to isolate what was going on, and it
>> even happened when I just created a single, 1-field dummy domain class and
>> didn't call a single method on it. The code was something like
>>
>>   for (int i = 0; i < 100000; i++) {
>>      new Thing(name: "thing_${i}")
>>      Thread.sleep(50); // to allow time to watch things in the profiler
>>   }
>>
>> While the loop was running I'd run gc() periodically, and it had no effect
>> on
>> the domain class instance count, but clearly it was working since other
>> class
>> counts dropped and the overall heap size dropped. Once the method exited,
>> finally the gc() call dropped the instance count down to near zero.
>>
>> Burt
>>


--
Graeme Rocher
Grails Project Lead
G2One, Inc. Chief Technology Officer
http://www.g2one.com



Re: Memory leak

Graeme-Rocher
Well at least you have a workaround :-)

Cheers

On Fri, Sep 5, 2008 at 4:17 PM, burtbeckwith <[hidden email]> wrote:

>
> This worked, but was a little more work than setting errors to null. I was
> running this in the console so there's no request, and the
> PROPERTY_INSTANCE_MAP configured in DomainClassGrailsPlugin was holding the
> errors. When I cleared that map after each loop iteration everything worked
> fine:
>
>   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin
>
>   for (int i = 0; i < 100000; i++) {
>      new Thing(name: "thing_${i}")
>      DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
>      Thread.sleep(50) // to allow time to watch things in the profiler
>   }
>
> Burt
>

--
Graeme Rocher
Grails Project Lead
G2One, Inc. Chief Technology Officer
http://www.g2one.com




Re: Memory leak

Daniel Honig
Is there a JIRA issue on this? I'd like to go vote for it :)

On Fri, Sep 5, 2008 at 12:16 PM, Graeme Rocher <[hidden email]> wrote:

> Well at least you have a workaround :-)
>
> Cheers
>




Re: Memory leak

burtbeckwith
I'm not sure it's a bug, more of a candidate for the FAQ (or RAQ, since this
is not a typical scenario).

Ordinarily, domain instances not being candidates for garbage collection
within a Grails method isn't an issue, since most or all of the instances
will be needed anyway. It only becomes a problem in the rare case where you're
loading or creating a bazillion instances.

I'm not sure what Grails could do to fix this automatically - how would it be
able to detect that an instance is no longer needed? You might put the
instances in the model for use by the view, the errors might be needed to
display validation messages, or you might just be using them temporarily.

I think it's best to leave the disassociation (setting errors to null) to the
user.
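
For the archives, here's a rough, untested sketch of what I'd expect the
combined pattern to look like. It assumes sessionFactory is available (e.g.
injected into a service, or looked up from the application context when
running in the console), that Thing is the one-field dummy domain class from
earlier in the thread, and that the batch size of 100 is just an arbitrary
example value:

   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin

   int batchSize = 100 // arbitrary example value

   Thing.withTransaction { status ->
       for (int i = 0; i < 1000000; i++) {
           new Thing(name: "thing_${i}").save()
           if (i % batchSize == 0) {
               // flush pending inserts and detach everything from the session
               def session = sessionFactory.currentSession
               session.flush()
               session.clear()
               // also drop the per-thread validation errors that keep instances reachable
               DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
           }
       }
   }

If you know you don't need the errors at all, setting errors to null on each
instance (as Graeme suggested) should work just as well.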

Burt

On Friday 05 September 2008 12:28:45 pm Daniel Honig wrote:
> Is there a JIRA issue on this? I'd like to go vote for it :)
>






Re: Memory leak

Daniel Honig
Got it, thanks for the clarification.

On Sat, Sep 6, 2008 at 10:41 PM, Burt Beckwith <[hidden email]> wrote:

> I'm not sure it's a bug, more of a candidate for the FAQ (or RAQ, since this
> is not a typical scenario).
>
> Ordinarily a problem keeping domain instances from being candidates for
> garbage collection in a Grails method isn't an issue since most or all of the
> instances will be needed. It's only a problem in the rare case where you're
> loading or creating a bazillion instances.
>
> I'm not sure what Grails could do to fix this automatically - how would it be
> able to detect that an instance is no longer needed? You might put them in
> the model for use by the view, and errors might be needed to display
> validation messages, or you might just be using them temporarily.
>
> I think it's best to leave the disassociation (setting errors to null) to the
> user.
>
> Burt
>
> On Friday 05 September 2008 12:28:45 pm Daniel Honig wrote:
>> Is there a JIRA issue on this, I'd like to go vote for it :)
>>
>> On Fri, Sep 5, 2008 at 12:16 PM, Graeme Rocher <[hidden email]> wrote:
>> > Well at least you have a workaround :-)
>> >
>> > Cheers
>> >
>> > On Fri, Sep 5, 2008 at 4:17 PM, burtbeckwith <[hidden email]>
> wrote:
>> >> This worked, but was a little more work than setting errors to null. I
>> >> was running this in the console so there's no request, and the
>> >> PROPERTY_INSTANCE_MAP configured in DomainClassGrailsPlugin was holding
>> >> the errors. When I cleared that map after each loop interation
>> >> everything worked fine:
>> >>
>> >>   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin
>> >>
>> >>   for (int i = 0; i < 100000; i++) {
>> >>      new Thing(name: "thing_${i}")
>> >>      DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
>> >>      Thread.sleep(50) // to allow time to watch things in the profiler
>> >>   }
>> >>
>> >> Burt
>> >>
>> >> Graeme Rocher-2 wrote:
>> >>> I believe the issue is that we override the default constructor to
>> >>> perform data binding. The binding errors object is bound to either the
>> >>> current thread or the current request
>> >>>
>> >>> Since these errors are scoped to the request they won't be GC'ed until
>> >>> the request completes. You could regard this as a bug. One work around
>> >>> may be do do:
>> >>>
>> >>>   for (int i = 0; i < 100000; i++) {
>> >>>      def t = new Thing(name: "thing_${i}")
>> >>>      t.errors = null
>> >>>      Thread.sleep(50); // to allow time to watch things in the profiler
>> >>>   }
>> >>>
>> >>> I'm not sure what a better solution is for now. Maybe we could use a
>> >>> ReferenceQueue or something to check if its a candidate for GC'ing and
>> >>> then automatically clear the errors... hmmm needs some thought
>> >>>
>> >>> Cheers
>> >>>
>> >>> On Fri, Sep 5, 2008 at 3:24 AM, Les Hazlewood <[hidden email]> wrote:
>> >>>> Hrm - that sounds kinda serious - and may even be a Groovy thing and
>> >>>> not a
>> >>>> Grails thing.  Anyone from G2One wanna comment?
>> >>>>
>> >>>> On Thu, Sep 4, 2008 at 8:15 PM, Burt Beckwith <[hidden email]>
>> >>>>
>> >>>> wrote:
>> >>>>> I spent quite a while on this today and think the issue has to do
>> >>>>> with Grails
>> >>>>> and has nothing to do with transactions. Btw - it's not a memory leak
>> >>>>> in the
>> >>>>> classic sense, but a GC issue - domain instances were somehow
>> >>>>> referenced and
>> >>>>> couldn't be GC'd. The profile instance count just kept growing and
>> >>>>> growing.
>> >>>>>
>> >>>>> I kept stripping down the code to try to isolate what was going on,
>> >>>>> and it
>> >>>>> even happened when I just created a single, 1-field dummy domain
>> >>>>> class and
>> >>>>> didn't call a single method on it. The code was something like
>> >>>>>
>> >>>>>   for (int i = 0; i < 100000; i++) {
>> >>>>>      new Thing(name: "thing_${i}")
>> >>>>>      Thread.sleep(50); // to allow time to watch things in the
>> >>>>> profiler }
>> >>>>>
>> >>>>> While the loop was running I'd run gc() periodically, and it had no
>> >>>>> effect
>> >>>>> on
>> >>>>> the domain class instance count, but clearly it was working since
>> >>>>> other class
>> >>>>> counts dropped and the overall heap size dropped. Once the method
>> >>>>> exited,
>> >>>>> finally the gc() call dropped the instance count down to near zero.
>> >>>>>
>> >>>>> Burt
>> >>>>>
>> >>>>> On Thursday 04 September 2008 5:34:39 pm Les Hazlewood wrote:
>> >>>>> > Oops, I didn't give you all the code - the method I posted earlier
>> >>>>> > performed one and only one 'mini' batch insert.  It was called from
>> >>>>> > another
>> >>>>> > method that coordinated calling that method as necessary:
>> >>>>> > public void importProductBatched(Product product) {
>> >>>>> >         if (log.isInfoEnabled()) {
>> >>>>> >             log.info("Performing a batching import for product of
>> >>>>> > type [" +
>> >>>>> > product.getClass().getName() + "]");
>> >>>>> >         }
>> >>>>> >         preBatch();
>> >>>>> >         Iterator i = product.iterator();
>> >>>>> >         while (i.hasNext()) {
>> >>>>> >             int batchImported = importBatch(i, getBatchCount(),
>> >>>>> > getFlushCount());
>> >>>>> >             product.setCommittedBytesRead(product.getBytesRead());
>> >>>>> >             product.setCommittedRecordsRead(((RecordIterator)
>> >>>>> > i).getIndex());
>> >>>>> >             log.info("Batched imported Count" + batchImported + "
>> >>>>>
>> >>>>> bytes
>> >>>>>
>> >>>>> > read =" + product.getBytesRead() + "|" + ((RecordIterator)
>> >>>>> > i).getIndex());
>> >>>>> > }
>> >>>>> >         if (log.isInfoEnabled()) {
>> >>>>> >             log.info("Successfully imported batch for product of
>> >>>>> > type
>> >>>>>
>> >>>>> ["
>> >>>>>
>> >>>>> > +
>> >>>>> > product.getClass().getName() + "]");
>> >>>>> >         }
>> >>>>> >         postBatch();
>> >>>>> >     }
>> >>>>> >
>> >>>>> > And then the importBatch method:
>> >>>>> >
>> >>>>> > protected int importBatch(final Iterator iterator, final int maxRecords,
>> >>>>> >         final int flushCount) {
>> >>>>> >         //Batch inserts need to be done as small chunks within their own
>> >>>>> >         //transaction to avoid transaction timeouts (and rollbacks) due to large data sets, so
>> >>>>> >         //specify PROPAGATION_REQUIRES_NEW.
>> >>>>> >         //
>> >>>>> >         //If we did NOT do this, and imported in one transaction, and the Product contained
>> >>>>> >         //thousands of records (as does happen during batch mode), odds are high
>> >>>>> >         //that the transaction would time out (e.g. after 5 minutes) and everything
>> >>>>> >         //would be rolled back. This is not desired.
>> >>>>> >         TransactionDefinition txnDef = new
>> >>>>> >                 DefaultTransactionDefinition(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
>> >>>>> >
>> >>>>> >         TransactionTemplate txnTemplate = new
>> >>>>> > TransactionTemplate(getTransactionManager(), txnDef);
>> >>>>> >         Integer importCount = (Integer) txnTemplate.execute(new
>> >>>>> > TransactionCallback() {
>> >>>>> >             public Object doInTransaction(TransactionStatus status) {
>> >>>>> >                 int batchRecordCount = 0;
>> >>>>> >
>> >>>>> >                 while (iterator.hasNext() && batchRecordCount < maxRecords) {
>> >>>>> >                     Record record = (Record) iterator.next();
>> >>>>> >                     if (record != null) { //can be null if any records are skipped
>> >>>>> >                         importRecord(record);
>> >>>>> >                         batchRecordCount++;
>> >>>>> >                         notifyRecordImported(record);
>> >>>>> >                         if (batchRecordCount % flushCount == 0) {
>> >>>>> >                             flush();
>> >>>>> >                         }
>> >>>>> >                     }
>> >>>>> >                 }
>> >>>>> >                 if (batchRecordCount > 0) {
>> >>>>> >                     flush();
>> >>>>> >                 }
>> >>>>> >
>> >>>>> >                 if (log.isInfoEnabled()) {
>> >>>>> >                     log.info("Batch imported " + batchRecordCount +
>> >>>>> > " records in a single transaction.");
>> >>>>> >                 }
>> >>>>> >                 return batchRecordCount;
>> >>>>> >             }
>> >>>>> >         });
>> >>>>> >         return importCount != null ? importCount : 0;
>> >>>>> >     }
>> >>>>> >
>> >>>>> > Regards,
>> >>>>> >
>> >>>>> > Les
>> >>>>> >
>> >>>>> > On Thu, Sep 4, 2008 at 5:30 PM, Les Hazlewood <[hidden email]>
>> >>>>>
>> >>>>> wrote:
>> >>>>> > > The only thing I can think of is that it might be related to the
>> >>>>> > > transaction manager - for a long running transaction, the TM is
>> >>>>> > > probably
>> >>>>> > > retaining log entries to accommodate a rollback.  For such a long
>> >>>>> > > running
>> >>>>> > > transaction, that could be a lot of overhead.  What TM
>> >>>>>
>> >>>>> implementation
>> >>>>>
>> >>>>> > > are
>> >>>>> > > you using?
>> >>>>> > > I myself just finished an application which required ridiculous
>> >>>>> > > amounts
>> >>>>> > > of data to be loaded on a daily basis in batch mode (gigabytes,
>> >>>>> > > millions
>> >>>>> > > of records across many tables).  I basically ensured that the
>> >>>>> > > insert operations happened in 1000 count 'mini' batches for 2
>> >>>>> > > reasons:
>> >>>>> > >
>> >>>>> > > 1) to prevent a TM timeout (default was 5 minutes)
>> >>>>> > > 2) to prevent the TM overhead from getting too large (memory)
>> >>>>> > >
>> >>>>> > > Here's some code I used to do that (Java + Spring
>> >>>>> > > TransactionTemplate).
>> >>>>> > >  Now I don't know if you are experiencing the same thing I did,
>> >>>>> > > but you
>> >>>>> > > might find it useful.
>> >>>>> > >
>> >>>>> > > - note that the Iterator passed in represented a custom
>> >>>>>
>> >>>>> implementation
>> >>>>>
>> >>>>> > > that was reading records from a file stream - the file was over
>> >>>>> > > 10 Gigs,
>> >>>>> > > and the Iterator represented each 'chunked' record in the stream.
>> >>>>> > > - The 'flushCount' was a config variable from a .properties file
>> >>>>> > > -
>> >>>>>
>> >>>>> in
>> >>>>>
>> >>>>> > > our
>> >>>>> > > case, it was equal to 1000).
>> >>>>> > > - the flush() method just called a DAO which internally called
>> >>>>> > > hibernateTemplate.flush() and then immediately
>> >>>>> > > hibernateTemplate.clear();
>> >>>>> > > - you could try to extend the transaction timeout beyond the
>> >>>>> > > default (e.g. 5 minutes) by configuring the transaction manager,
>> >>>>> > > but odds are high you'd get into a TM housekeeping memory problem
>> >>>>> > >
>> >>>>> > > protected int importBatch(final Iterator iterator, final int
>> >>>>> > > flushCount)
>> >>>>> > > { //Batch inserts need to be done as small chunks within their
>> >>>>> > > own transaction to avoid
>> >>>>> > >         //transaction timeouts (and rollbacks) due to large data sets, so
>> >>>>> > >         //specify PROPAGATION_REQUIRES_NEW.
>> >>>>> > >         //
>> >>>>> > >         //If we did NOT do this, and tried to import hundreds of thousands or millions of records,
>> >>>>> > >         //odds are high that the transaction would time out
>> >>>>> > >         //(e.g. after 5 minutes) and everything would be rolled back. This is not desired.
>> >>>>> > >         TransactionDefinition txnDef = new
>> >>>>> > >                 DefaultTransactionDefinition(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
>> >>>>> > >
>> >>>>> > >         TransactionTemplate txnTemplate = new
>> >>>>> > > TransactionTemplate(getTransactionManager(), txnDef);
>> >>>>> > >         Integer importCount = (Integer) txnTemplate.execute(new
>> >>>>> > > TransactionCallback() {
>> >>>>> > >             public Object doInTransaction(TransactionStatus status) {
>> >>>>> > >                 int batchRecordCount = 0;
>> >>>>> > >
>> >>>>> > >                 while (iterator.hasNext() ) {
>> >>>>> > >                     Record record = (Record) iterator.next();
>> >>>>> > >                     if (record != null) { //can be null if any records are skipped
>> >>>>> > >                         importRecord(record);
>> >>>>> > >                         batchRecordCount++;
>> >>>>> > >                         notifyRecordImported(record);
>> >>>>> > >                         if (batchRecordCount % flushCount == 0) {
>> >>>>> > >                             flush();
>> >>>>> > >                         }
>> >>>>> > >                     }
>> >>>>> > >                 }
>> >>>>> > >                 if (batchRecordCount > 0) {
>> >>>>> > >                     flush();
>> >>>>> > >                 }
>> >>>>> > >
>> >>>>> > >                 if (log.isInfoEnabled()) {
>> >>>>> > >                     log.info("Batch imported " + batchRecordCount
>> >>>>> > > +
>> >>>>>
>> >>>>> "
>> >>>>>
>> >>>>> > > records in a single transaction.");
>> >>>>> > >                 }
>> >>>>> > >                 return batchRecordCount;
>> >>>>> > >             }
>> >>>>> > >         });
>> >>>>> > >         return importCount != null ? importCount : 0;
>> >>>>> > >     }
>> >>>>> > >
>> >>>>> > > Note that if any of those 'mini' transactions fail, you have to
>> >>>>> > > do a manual rollback of all of the records that went in.  We
>> >>>>> > > accounted
>> >>>>>
>> >>>>> for
>> >>>>>
>> >>>>> > > this by storing an id in the table where these fields were
>> >>>>> > > inserted, and
>> >>>>> > > if any error occurred, performed a bulk delete ('delete from blah
>> >>>>> > > where
>> >>>>> > > manual_tx_id = foo').
>> >>>>> > >
>> >>>>> > > HTH,
>> >>>>> > >
>> >>>>> > > Les
>> >>>>> > >
>> >>>>> > > On Thu, Sep 4, 2008 at 10:53 AM, Burt Beckwith
>> >>>>>
>> >>>>> <[hidden email]>wrote:
>> >>>>> > >> I'm building a large database for load testing, so I'm creating
>> >>>>> > >> millions
>> >>>>> > >> of
>> >>>>> > >> domain instances. I'm trying to use the standard bulk insert
>> >>>>>
>> >>>>> approach
>> >>>>>
>> >>>>> > >> in
>> >>>>> > >> Hibernate, i.e. flush() and clear() the session periodically,
>> >>>>> > >> with the
>> >>>>> > >> whole
>> >>>>> > >> operation running in a transaction.
>> >>>>> > >>
>> >>>>> > >> The problem is the instances are still around - I run out of
>> >>>>> > >> memory after a
>> >>>>> > >> few hours with a heap set at 1G. I've turned off 2nd-level
>> >>>>> > >> caching and
>> >>>>> > >> looked
>> >>>>> > >> at the Session in a debugger and clear() definitely works - it's
>> >>>>> > >> empty
>> >>>>> > >> afterwards. There are no mapped collections, and I'm not keeping
>> >>>>>
>> >>>>> any
>> >>>>>
>> >>>>> > >> explicit
>> >>>>> > >> references to the instances.
>> >>>>> > >>
>> >>>>> > >> But running in a profiler I can see the instance count steadily
>> >>>>> > >> increase
>> >>>>> > >> and
>> >>>>> > >> never decrease. Running gc() has no effect.
>> >>>>> > >>
>> >>>>> > >> Any thoughts on what might be keeping a reference to these
>> >>>>>
>> >>>>> instances?
>> >>>>>
>> >>>>> > >> Burt
>> >>>>>
>> >>>>> ---------------------------------------------------------------------
>> >>>>>
>> >>>>> > >> To unsubscribe from this list, please visit:
>> >>>>> > >>
>> >>>>> > >>    http://xircles.codehaus.org/manage_email
>> >>>>>
>> >>>>> ---------------------------------------------------------------------
>> >>>>> To unsubscribe from this list, please visit:
>> >>>>>
>> >>>>>    http://xircles.codehaus.org/manage_email
>> >>>
>> >>> --
>> >>> Graeme Rocher
>> >>> Grails Project Lead
>> >>> G2One, Inc. Chief Technology Officer
>> >>> http://www.g2one.com
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe from this list, please visit:
>> >>>
>> >>>     http://xircles.codehaus.org/manage_email
>> >>
>> >> --
>> >> View this message in context:
>> >> http://www.nabble.com/Memory-leak-tp19320007p19333509.html Sent from the
>> >> grails - user mailing list archive at Nabble.com.
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe from this list, please visit:
>> >>
>> >>    http://xircles.codehaus.org/manage_email
>> >
>> > --
>> > Graeme Rocher
>> > Grails Project Lead
>> > G2One, Inc. Chief Technology Officer
>> > http://www.g2one.com
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe from this list, please visit:
>> >
>> >    http://xircles.codehaus.org/manage_email
>>
>> ---------------------------------------------------------------------
>> To unsubscribe from this list, please visit:
>>
>>     http://xircles.codehaus.org/manage_email
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe from this list, please visit:
>
>    http://xircles.codehaus.org/manage_email
>
>
>

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email
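
A minimal Groovy sketch of the pattern quoted above - one 'mini' batch per
REQUIRES_NEW transaction, flushing and clearing the Hibernate session every
flushCount records so instances can be garbage collected. This is only an
illustration: transactionManager and sessionFactory are assumed to be injected
Spring beans, importRecord() is a hypothetical placeholder for your own
persistence call, and the manual cleanup of already-committed batches that Les
mentions is not shown.

   import org.springframework.transaction.TransactionDefinition
   import org.springframework.transaction.support.DefaultTransactionDefinition
   import org.springframework.transaction.support.TransactionCallback
   import org.springframework.transaction.support.TransactionTemplate

   class BulkImportService {

      def transactionManager  // injected PlatformTransactionManager
      def sessionFactory      // injected Hibernate SessionFactory

      // Imports at most maxRecords records in one REQUIRES_NEW transaction,
      // flushing and clearing the session every flushCount records.
      int importBatch(Iterator records, int maxRecords, int flushCount) {
         def txnDef = new DefaultTransactionDefinition(
               TransactionDefinition.PROPAGATION_REQUIRES_NEW)
         def txnTemplate = new TransactionTemplate(transactionManager, txnDef)

         def work = { status ->
            def session = sessionFactory.currentSession
            int count = 0
            while (records.hasNext() && count < maxRecords) {
               def record = records.next()
               if (record != null) {       // skipped records may come back as null
                  importRecord(record)     // hypothetical per-record persistence call
                  count++
                  if (count % flushCount == 0) {
                     session.flush()       // push pending inserts to the database
                     session.clear()       // drop references so instances can be GC'd
                  }
               }
            }
            session.flush()
            session.clear()
            return count
         }

         (txnTemplate.execute(work as TransactionCallback) ?: 0) as int
      }

      private void importRecord(record) {
         // placeholder - e.g. build and save() a domain instance for the record
      }
   }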



Re: Memory leak

Miguel Ping
A RAQ page with this info would be great ;)

On Sun, Sep 7, 2008 at 4:45 AM, Daniel Honig <[hidden email]> wrote:
Got it, thanks for the clarification.

On Sat, Sep 6, 2008 at 10:41 PM, Burt Beckwith <[hidden email]> wrote:
> I'm not sure it's a bug, more of a candidate for the FAQ (or RAQ, since this
> is not a typical scenario).
>
> Ordinarily, keeping domain instances from being candidates for garbage
> collection within a Grails method isn't an issue, since most or all of the
> instances will be needed. It only becomes a problem in the rare case where
> you're loading or creating a bazillion instances.
>
> I'm not sure what Grails could do to fix this automatically - how would it be
> able to detect that an instance is no longer needed? You might put them in
> the model for use by the view, and errors might be needed to display
> validation messages, or you might just be using them temporarily.
>
> I think it's best to leave the disassociation (setting errors to null) to the
> user.
>
> Burt
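
A minimal sketch of what that explicit disassociation can look like in the
original bulk-creation scenario, combining Graeme's errors = null workaround
with the usual periodic session flush/clear (assuming a Grails service with
sessionFactory injected; Thing is the throwaway domain class from the examples
above, and the batch size of 500 is arbitrary):

   class LoadTestDataService {

      def sessionFactory   // injected Hibernate SessionFactory

      void createThings(int total, int batchSize = 500) {
         def session = sessionFactory.currentSession
         for (int i = 0; i < total; i++) {
            def t = new Thing(name: "thing_${i}")
            t.save()
            t.errors = null         // disassociate the request/thread-scoped errors object
            if (i % batchSize == 0) {
               session.flush()      // write the pending inserts
               session.clear()      // release references held by the persistence context
            }
         }
         session.flush()
         session.clear()
      }
   }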

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

   http://xircles.codehaus.org/manage_email




Re: Memory leak

honiewelle
In reply to this post by burtbeckwith
This might be a very old thread, but I am curious whether this was addressed in newer versions of Grails (in code or documentation)?

burtbeckwith wrote
This worked, but was a little more work than setting errors to null. I was running this in the console so there's no request, and the PROPERTY_INSTANCE_MAP configured in DomainClassGrailsPlugin was holding the errors. When I cleared that map after each loop iteration everything worked fine:

   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin

   for (int i = 0; i < 100000; i++) {
      new Thing(name: "thing_${i}")
      DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
      Thread.sleep(50) // to allow time to watch things in the profiler
   }

Burt
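
One way to apply that is to pull the cleanup into a small helper and call it
every N iterations while saving - this is the map-clearing variant Burt used
from the console. The following is a sketch based only on the workaround quoted
above: DomainClassGrailsPlugin and PROPERTY_INSTANCE_MAP are the names used in
this thread and may have moved or changed in later Grails versions, so check
them against the version you are running.

   import org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin

   class BulkCreateService {

      def sessionFactory   // injected Hibernate SessionFactory

      private void cleanUpGorm() {
         def session = sessionFactory.currentSession
         session.flush()   // write pending changes
         session.clear()   // detach persistent instances so they can be GC'd
         DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()  // drop per-thread errors objects
      }

      void createThings(int total) {
         for (int i = 0; i < total; i++) {
            new Thing(name: "thing_${i}").save()
            if (i % 1000 == 0) {
               cleanUpGorm()
            }
         }
         cleanUpGorm()
      }
   }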