Encoding problems


Encoding problems

kusako
Hi-

I've just started playing around with Grails, and must say it looks awesome.
However, I stumbled across a problem with character encoding:

When I try to enter any non-ASCII (non-7-bit) character in a form, it gets displayed as a '?' character.
From how it looks, it gets stored with Latin-1 encoding in the database, without being converted to Unicode first.
This happens with the built-in Jetty server and the HSQLDB that comes with Grails, as well as with MySQL.

It works when the resulting WAR is deployed on Tomcat.

On a side note, I haven't managed to change the page encoding to UTF-8 (which would be my preferred setup anyway).

Are these known problems?

Keep up the great work,

-markus

--
Markus Strickler

http://www.braindump.ms/markus/

RE: Encoding problems

Dierk König
Hi Markus,

welcome to Grails.

> -----Original Message-----
> From: Markus Strickler [mailto:[hidden email]]
> Sent: Tuesday, March 7, 2006 21:17
> To: [hidden email]
> Subject: [grails-user] Encoding problems
..
> On a side note, I haven't managed to change the page encoding to
> UTF-8 (which would be my preferred setup anyway).

What exactly did you try?

Setting <!doctype ...> ?

Mittie


Re: Encoding problems

kusako
Hi Dierk-

Dierk Koenig wrote:

> Hi Markus,
>
> welcome to Grails.
>
>> -----Original Message-----
>> From: Markus Strickler [mailto:[hidden email]]
>> Sent: Tuesday, March 7, 2006 21:17
>> To: [hidden email]
>> Subject: [grails-user] Encoding problems
> ..
>> On a side note, I haven't managed to change the page encoding to
>> UTF-8 (which would be my preferred setup anyway).
>
> What exactly did you try?
>
> Setting <!doctype ...> ?
>
Hm, setting the Doctype shouldn't do anything about the encoding.
I tried setting <%@ page contentType="text/html;charset=UTF-8" %>
and <meta http-equiv="content-type" content="text/html;charset=UTF-8"/>
in the GSPs that get auto-generated.
Also tried the same on the SiteMesh decorator but I guess it isn't used for the GSPs?

I've filed a bug on Jira about this <http://jira.codehaus.org/browse/GRAILS-51> with some pointers to where the problem comes from.
 
> Mittie
>

-markus
--
Markus Strickler

http://www.braindump.ms/markus/

RE: Encoding problems

Dierk König
> I've filed a bug on Jira about this
> <http://jira.codehaus.org/browse/GRAILS-51> with some pointers
> to where the problem comes from.

Yes, I've seen it. Thanks.

Not sure whether it will solve the issue, though.

When does your problem occur:
- when entering special chars in a text form field
- when putting special chars directly in the gsp sources
?

If the first, then the 'form' tag needs adaption.

cheers
Mittie

RE: Encoding problems

kusako
Hi Dierk-

> -----Original Message-----
> From: Dierk Koenig [mailto:[hidden email]]
> Sent: Tuesday, March 07, 2006 11:54 PM
> To: [hidden email]
> Subject: RE: [grails-user] Encoding problems
>
> > I've filed a bug on Jira about this
> > <http://jira.codehaus.org/browse/GRAILS-51> with some pointers
> > to where the problem comes from.
>
> Yes, I've seen it. Thanks.
>
> Not sure whether it will solve the issue, though.
>
OK, I've done some more investigation here. It does solve the issue on Jetty, but breaks the form handling on Tomcat.


> When does your problem occur:
> - when entering special chars in a text form field
When entering the chars in the form field

> - when putting special chars directly in the gsp sources
> ?
>
> If the first, then the 'form' tag needs adaption.
>
Not really.
You could set accept-charset="UTF-8" on the form if you use Jetty, and leave it out or set it to ISO-8859-1 for Tomcat.
Alternatively you could set the form's enctype to multipart/form-data, as in this case Jetty seems to default to ISO-8859-1.

Let me explain a bit more on what's happening:

Currently, when a GSP is rendered, the response content type is set to text/html by the GroovyPagesTemplateEngine. Implicitly this also sets
the character encoding of the response to ISO-8859-1.
Because the content type (and the response encoding) can only be set once, subsequent attempts to set it are ignored.
When the form is submitted, the browser encodes the data with the encoding given in the server response, i.e. ISO-8859-1, and sends
the request to the server. As current browsers choose to ignore the possibility of sending a charset with the request, the server uses
its default request encoding. For some reason Jetty defaults to UTF-8 for URL-encoded form data.
This results in the data not being converted to Unicode correctly before it is saved to the database. If you read it back you get
non-displayable characters, hence the ? in the output.
For multipart data Jetty seems to default to ISO-8859-1, so this doesn't lead to garbled data.
Tomcat consistently defaults to ISO-8859-1, so you won't see the problem there.
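
To make the mechanism above concrete, here is a minimal Java sketch. It does not use Jetty, Tomcat or Grails at all; it only simulates the three steps with the JDK's own URLEncoder and URLDecoder (the Charset overloads used here need Java 10 or later). The class and method names are made up for illustration, and the assumption that the container decodes URL-encoded data as UTF-8 is taken from the description above, not checked against Jetty's source.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Simulates the browser: percent-encodes a form value using the
    // charset of the page it was served, here ISO-8859-1.
    static String browserEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.ISO_8859_1);
    }

    // Simulates a container that assumes UTF-8 for URL-encoded form data
    // (an assumption based on the behaviour described for Jetty).
    static String serverDecode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    // Simulates writing the stored value back out in an ISO-8859-1 response:
    // characters that cannot be mapped are replaced by '?'.
    static String renderLatin1(String stored) {
        byte[] bytes = stored.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String typed = "M\u00fcller";            // "Mueller" with u-umlaut
        String onTheWire = browserEncode(typed);
        String stored = serverDecode(onTheWire); // u-umlaut becomes U+FFFD
        String shown = renderLatin1(stored);

        System.out.println(onTheWire); // prints M%FCller
        System.out.println(shown);     // prints M?ller
    }
}
```

The single byte 0xFC is not a valid UTF-8 sequence, so the decoder substitutes the replacement character U+FFFD, which in turn has no ISO-8859-1 representation and is written out as '?'.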

In the end the only way to get consistent behaviour is to set the response and request encoding to a fixed, configurable value.
So if you send out text/html; charset=UTF-8 and then call request.setCharacterEncoding("UTF-8") before you start reading from the
request stream, you should be all set.
That way you don't have to rely on default settings.
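
As a rough illustration of why one fixed charset on both sides is enough, here is a small self-contained Java sketch (Java 10 or later). It is not servlet code: parseForm is a hypothetical stand-in for what a container does once request.setCharacterEncoding has been called, and APP_CHARSET stands in for the configurable value; neither name comes from Grails or the servlet API.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FixedEncodingDemo {

    // Hypothetical single configuration value used for response and request.
    static final Charset APP_CHARSET = StandardCharsets.UTF_8;

    // The Content-Type header the response would carry.
    static String contentTypeHeader() {
        return "text/html; charset=" + APP_CHARSET.name();
    }

    // Stand-in for the container's parameter parsing: decodes a
    // URL-encoded form body with an explicitly chosen charset.
    static Map<String, String> parseForm(String body, Charset charset) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, charset),
                       URLDecoder.decode(value, charset));
        }
        return params;
    }

    public static void main(String[] args) {
        // The browser encodes with the charset announced in the response...
        String body = "name=" + URLEncoder.encode("M\u00fcller", APP_CHARSET);
        // ...and the server decodes with the same one, so nothing is lost.
        Map<String, String> params = parseForm(body, APP_CHARSET);

        System.out.println(contentTypeHeader()); // prints text/html; charset=UTF-8
        System.out.println(body);                // prints name=M%C3%BCller
        System.out.println("M\u00fcller".equals(params.get("name"))); // prints true
    }
}
```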

Unfortunately I don't know enough about Grails to suggest the best place to set the request encoding or where to place a config
parameter for the encoding.
From a user POV a Groovy script in grails-app/conf would probably be most intuitive.

> cheers
> Mittie
>

Hope this made some sense...

greetings,

-markus


RE: Encoding problems

Dierk König
Hi Markus,

very good explanation. Thanks.

Meanwhile, Graeme has resolved the issue.
http://jira.codehaus.org/browse/GRAILS-51?page=all 

If it now works for you, please close it.

cheers
Mittie


RE: Encoding problems

kusako
Hi-

Yes, now I can set the content type for a GSP just like for a JSP. So we can now also generate groovy XML :)

This leaves open the issue of the request encoding. As Grails is based on Spring, Spring's CharacterEncodingFilter should do the
trick. See http://mrj.woo.dk/squareroot/2006/02/16/character-encoding-in-submitted-forms/ for some explanation.
I'll give it a try later on.

cheers,

-markus



Re: Encoding problems

graemer
On 08/03/06, Markus Strickler <[hidden email]> wrote:
> Hi-
>
> Yes, now I can set the content type for a GSP just like for a JSP. So we can now also generate groovy XML :)
Indeed :)

>
> This leaves open the issue of the request encoding. As Grails is based on Spring, Spring's CharacterEncodingFilter should do the
> trick. See http://mrj.woo.dk/squareroot/2006/02/16/character-encoding-in-submitted-forms/ for some explanation.
Thanks for the pointer. Can you please raise a separate issue for this,
as it's really a different problem that requires some thought as to how
we configure this?

> I'll give it a try later on.
>
> cheers,
>
> -markus