|Posted: 2011-02-14 16:41|
I have some accented characters in my CSV file which I am importing into Workbooks. However when they arrive in Workbooks they don't look the same. What am I doing wrong?
|Posted: Mon, 14.02.2011 - 16:48|
Comma-Separated Values files - "CSVs" - can be encoded in a variety of ways. For example the way each line in the CSV is normally terminated varies between Windows machines and Macintosh computers, and the way that accented or foreign characters are represented can also vary.
In particular, Microsoft Excel will normally save CSVs using an encoding which depends on the "locale" of the user who is running it. Hence a CSV generated by someone in Eastern Europe or Russia will have a different encoding to one generated in Western Europe, or (different again!) in the US. The technical term for this encoding is the "character set" of the file. Some versions of Excel let you choose which encoding should be used in the created CSV (see comment below).
Workbooks is consistent in that it always looks to receive data in a CSV encoded using 'Unicode' which can represent a huge variety of characters - specifically an encoding called 'UTF 8'. If you take care and ensure your CSV is encoded in 'UTF 8' then your characters will be imported correctly.
|Posted: Mon, 14.02.2011 - 21:24|
Given the restrictions with Excel that I've outlined above we need a workaround to get accented characters into Workbooks reliably. If you have Excel 2010 you can use this; another way is to use Google Docs which provides excellent import and export tools.
Here's a spreadsheet in MS Excel containing some foreign characters:
Using Excel 2010...
Choose Save As, then Other Formats.
Next, choose Tools and then in the Encoding tab, Unicode (UTF-8). Click on the image below to enlarge:
Using Google Docs...
Use the Upload... button on the Google Docs home screen to upload your spreadsheet into Google Docs:
Now use the 'File > Download as > CSV (current sheet)' option to export the data back out to your computer. Note that Google will encode the characters properly, using the UTF 8 character set we need.
Import into Workbooks
Finally import that into Workbooks - make sure you choose the Unix/Linux file format (click on the image to enlarge):
|Posted: Thu, 11.10.2012 - 18:38|
Another option you can use to get around the problem of the character set of your data changing when you save an Excel file into CSV is to use Open Office. (Open Office is an open source office software suite which stores data in an international open standard format - you can learn more here.)
So, to retain your character set, open your Excel file using Open Office's spreadsheet tool. Then use 'Save As' and choose to save the file as type CSV. You'll then be prompted to specify the character set you want to use and here you should select Unicode (UTF-8) as shown below. (Click to enlarge.)
You can now use the resulting CSV file to import your data into Workbooks successfully.