{"id":349,"date":"2009-03-16T21:14:36","date_gmt":"2009-03-16T18:14:36","guid":{"rendered":"http:\/\/www.void.gr\/kargig\/blog\/?p=349"},"modified":"2009-05-03T13:13:10","modified_gmt":"2009-05-03T10:13:10","slug":"convert-greek-characters-from-latin1-mysql-database-fields-to-pure-utf8","status":"publish","type":"post","link":"https:\/\/www.void.gr\/kargig\/blog\/2009\/03\/16\/convert-greek-characters-from-latin1-mysql-database-fields-to-pure-utf8\/","title":{"rendered":"Convert greek characters from latin1 mysql database fields to pure utf8"},"content":{"rendered":"<p><strong>The Problem<\/strong><br \/>\nTo sum it up, the case is this: many many many web applications were programmed so that they used latin1 collation for their fields inside mysql databases. But most users now use utf8 from within their browsers. What happens is that utf8 characters are getting stored inside latin1 fields, which in return produces chaos! A huge web application that used that kind of madness was <a href=\"http:\/\/wordpress.org\/\">WordPress<\/a>. Luckily (or not) WordPress now uses utf8 everywhere. I&#8217;ve known many many many people that got so frustrated when they tried to move from their old WordPress installation to a newer one because all their greek posts couldn&#8217;t be exported &#8220;easily&#8221;, I won&#8217;t say &#8220;properly&#8221; because there are always solutions to problems like this, but all the solutions were not straightforward at all, that they finally dumped the idea of moving the posts and started a new blog. <\/p>\n<p>This is a HUGE problem for many greek (and not only) users and I hope I now have an elegant(?) solution to it.<br \/>\n<!--more--><br \/>\nThe solution that I provide does not require any use of the mysqldump utility at all. Most solutions to the problem I&#8217;ve seen so far were more or less using the mysqldump utility like this:<br \/>\n<code>$ mysqldump --default-character-set=latin1 --opt -u user -p dbname> latin1-dbname.sql<\/code><br \/>\nSince many people have their blogs on shared hosting or have very limited shell access, the previous solution is a no-go for them because it requires that they contact their hosting support, explain to them what they want and wait for their reply. If they are lucky they might get the .sql file&#8230;else&#8230;they are back to where they started.<\/p>\n<p><strong>My solution:<\/strong><br \/>\nFirst of all it is based purely on this post: <a href=\"http:\/\/combatwombat.7doves.com\/2008\/10\/26\/mysql-latin1-to-utf8-issues\">http:\/\/combatwombat.7doves.com\/2008\/10\/26\/mysql-latin1-to-utf8-issues<\/a> .While that post does not mention greek characters at all, it gave me an idea of how it should be done.<br \/>\nIn order to solve the problem using my solution you need a Linux or a MacosX host. This is because the solution is based on a bash script that needs the sed utility for character conversion. Both bash and sed are of course not included on a default windows installation. So if you are a windows only user you can either install those tools through <a href=\"http:\/\/www.cygwin.com\/\">cygwin<\/a> and try if it works (never tested), ask a friend of yours that uses linux or macosX to help you, boot a linux live cd, or install linux \ud83d\ude00<\/p>\n<p>What every hosting solution definitely has is access to mysql databases through <a href=\"http:\/\/www.phpmyadmin.net\/\">phpmyadmin<\/a>. Even if your hosting provider or control panel does not provide it for you, you can always install it manually. One of the easiest things to do on phpmyadmin is export a database. Just open phpmyadmin, select your database, click on export on the right, select some or all of the database tables you want, select &#8220;Save as File&#8221; and click on zipped. Click on &#8220;Go&#8221; and after a few seconds you will have your .sql.zip file sent to you. If you find that hard to do please ask a friend. Please don&#8217;t blame me for blowing up your mysql database if you don&#8217;t know how to handle these simple directions..<br \/>\nLet&#8217;s say that the db name was sample-db, then you should have gotten a file named: sample-db.sql.zip<br \/>\nUnzip it:<br \/>\n<code>$ unzip sample-db.sql.zip<\/code><br \/>\nand then edit it with a text editor,vim for example:<br \/>\n<code>$ vim sample-db.sql<\/code><br \/>\nIf you are suffering from the problem mentioned before you will probably see things like:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-utf8-inside-latin1.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-utf8-inside-latin1-300x185.jpg\" alt=\"greek-utf8-inside-latin1\" title=\"greek-utf8-inside-latin1\" width=\"300\" height=\"185\" class=\"alignnone size-medium wp-image-352\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/greek-utf8-inside-latin1-300x185.jpg 300w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/greek-utf8-inside-latin1-1024x633.jpg 1024w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/greek-utf8-inside-latin1.jpg 1226w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><br \/>\nTo start the conversion you need to <strong>download<\/strong> the following script: <a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-convert-latin1-to-utf8.sh\"><strong>greek-convert-latin1-to-utf8.sh<\/strong><\/a> or <a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-convert-latin1-to-utf8.sh.gz\"><strong>greek-convert-latin1-to-utf8.sh.gz<\/strong><\/a> (you need to extract the .gz)<br \/>\nThe make it an executable: <code>$ chmod +x greek-convert-latin1-to-utf8.sh<\/code><br \/>\nAnd then execute the script with the database as an input:<br \/>\n<code>$ .\/greek-convert-latin1-to-utf8.sh sample-db.sql<br \/>\nI'll work on sample-db.sql ...<br \/>\nsample-db.sql...done<\/code><br \/>\nThen you will have a new file named sample-db.sql.clean as output on the same dir you ran the script.<br \/>\nOpen it and you should now see every post in pure utf8 greek like this:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-utf8-inside-latin1-converted.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-utf8-inside-latin1-converted-300x185.jpg\" alt=\"greek-utf8-inside-latin1-converted\" title=\"greek-utf8-inside-latin1-converted\" width=\"300\" height=\"185\" class=\"alignnone size-medium wp-image-356\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/greek-utf8-inside-latin1-converted-300x185.jpg 300w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/greek-utf8-inside-latin1-converted-1024x633.jpg 1024w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/greek-utf8-inside-latin1-converted.jpg 1226w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><br \/>\nAs you can see 99% percent of the characters were correctly converted to proper greek utf8 ones. I don&#8217;t currently have the time to investigate why a few characters don&#8217;t get properly converted, but I&#8217;ll soon find a solution for that too \ud83d\ude42<\/p>\n<p>What&#8217;s now left to do is to import the sample-db.sql.clean file to your new hosting&#8230;you can do that of course through phpmyadmin&#8230;<\/p>\n<p>The conversion table that I used is here:<a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-replacement-latin1-to-utf8.ods\">greek-replacement-latin1-to-utf8.ods<\/a><\/p>\n<p>This might be a late solution, since the problem is quite old, but I am sure that there are many people still having headaches over issues like this. Enjoy \ud83d\ude42<\/p>\n<p>Downloads:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-convert-latin1-to-utf8.sh\"><strong>greek-convert-latin1-to-utf8.sh<\/strong><\/a><br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/greek-convert-latin1-to-utf8\/greek-convert-latin1-to-utf8.sh.gz\"><strong>greek-convert-latin1-to-utf8.sh.gz<\/strong><\/a> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Problem To sum it up, the case is this: many many many web applications were programmed so that they used latin1 collation for their fields inside mysql databases. But most users now use utf8 from within their browsers. What happens is that utf8 characters are getting stored inside latin1 fields, which in return produces [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ep_exclude_from_search":false,"footnotes":""},"categories":[11,5,3],"tags":[25,24,600,22,19,26,23,16],"class_list":["post-349","post","type-post","status-publish","format-standard","hentry","category-greek","category-internet","category-linux","tag-bash","tag-convert","tag-greek","tag-latin1","tag-mysql","tag-sed","tag-utf8","tag-wordpress"],"aioseo_notices":[],"views":40873,"_links":{"self":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts\/349","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/comments?post=349"}],"version-history":[{"count":24,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts\/349\/revisions"}],"predecessor-version":[{"id":489,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts\/349\/revisions\/489"}],"wp:attachment":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/media?parent=349"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/categories?post=349"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/tags?post=349"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}