export sql in php UTF-8 [closed]

php,sql,database,utf-8,export

Start your query after this: $db = new MySQLi("host", "username", "password", "db"); if ($db->connect_error) { die("your_error_msg" . $db->connect_error); } $db->set_charset("utf8"); EDIT <?php //ENTER THE RELEVANT INFO BELOW $mysqlDatabaseName ='xxxxxx'; $mysqlUserName ='xxxxxxx'; $mysqlPassword ='myPassword'; $mysqlHostName ='xxxxxxxx.net'; $mysqlExportPath ='chooseFilenameForBackup.sql'; //DO NOT EDIT BELOW THIS LINE //Export the database and output the status to...

Why are my pictures corrupted after downloading and writing them in python?

python,facebook,utf-8

Answering the question of why @Alastair's solution worked: f = open(path, 'wb') From https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files: On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line...
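A minimal sketch of the binary-mode point above, assuming Python 3. The payload (the 8-byte PNG signature) and the file name are hypothetical stand-ins for a downloaded picture; they are not from the original answer.

```python
import os
import tempfile

# Hypothetical payload: the PNG signature, which contains \r\n and \n bytes
# that a text-mode write on Windows could translate.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

path = os.path.join(tempfile.mkdtemp(), "picture.png")  # hypothetical file name

# 'b' opens the file in binary mode, so the bytes are written unchanged.
with open(path, "wb") as f:
    f.write(image_bytes)

# Read back in binary mode to compare byte for byte.
with open(path, "rb") as f:
    round_tripped = f.read()
```

If the round trip returns exactly the bytes that were written, no end-of-line translation took place.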

How to use UTF-8 in C code?

c,utf-8

This is more of a corollary to the other answers, but I'll try to explain this from a slightly different angle. Here is Jonathan Leffler's version of your code, with three slight changes: (1) I made explicit the actual individual bytes in the UTF-8 strings; and (2) I modified the...

Encoding problems in Python - 'ascii' codec can't encode character '\xe3' when using UTF-8

python,encoding,utf-8

Thanks a lot, guys. Here is how I solved the encoding problem with Python 3.4.1: First I inserted this line in the code to check the output encoding: print(sys.stdout.encoding) And I saw that the output encoding was ANSI_X3.4-1968, which stands for ASCII and doesn't support characters like...

Writing byte array to an UTF8-encoded file

java,utf-8,java-io

Don't use a Writer. Just use the OutputStream. A complete solution using try-with-resource looks as follows: try (FileOutputStream fos = new FileOutputStream(tmpFile)) { fos.write(buffer); } Or even better, as Jon points out below: Files.write(Paths.get(tmpFile), buffer); ...

Send Utf8(persian) to Server By HttpURLConnection

android,utf-8,httpurlconnection

try to use UTF-8 encoding, and this is a sample how to use encoding with POST request: try { URL url = new URL("your url"); HttpURLConnection conn = (HttpURLConnection) url.openConnection(); conn.setDoOutput(true); OutputStreamWriter writer = new OutputStreamWriter( conn.getOutputStream(), "UTF-8"); String request = "test data"; writer.write(request); writer.flush(); System.out.println("Code:" + conn.getResponseCode()); System.out.println("mess:" +...

How can I use special characters in angular directives attributes?

javascript,angularjs,utf-8,special-characters,directive

Usually a good editor lets you change the encoding the document is saved in. Try setting it to iso-8859-1/utf-8 and save/upload again. The next bet would be to change the encoding of the HTML output with <meta http-equiv="Content-Type" content="text/html;charset=utf-8"> Umlauts are often trial & error......

The File Encoding Is utf8 but is in Windows-1256 readable

encoding,utf-8

Java does not attempt to detect the encoding of a file. getEncoding returns the encoding that was selected in the InputStreamReader constructor. If you don't use one of the constructors that take a character set parameter, you get the 'platform default charset', according to Oracle's documentation. This question discusses what...

UTF-8 for URL, Java

java,utf-8

Summary from comment: The test code works fine. import java.io.UnsupportedEncodingException; import static java.net.URLEncoder.encode; public class MainApp { public static void main(String[] args) throws UnsupportedEncodingException { String url = "http://www.teanglann.ie/en/gram/"+ encode("fág", "UTF-8"); System.out.println(url); } } It emits the following: http://www.teanglann.ie/en/gram/f%EF%BF%BDg which goes to the correct page. The correct steps are: Ensure that source...
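For comparison, a small Python sketch of UTF-8 percent-encoding for the same word. It is not the Java code from the answer; it only shows which bytes each character produces. Note that %EF%BF%BD is the UTF-8 encoding of U+FFFD, the replacement character, while á itself encodes as %C3%A1.

```python
from urllib.parse import quote

# quote() percent-encodes using UTF-8 by default.
encoded_word = quote("f\u00e1g")          # "fág" with a real á (U+00E1)
encoded_replacement = quote("f\ufffdg")   # same word with U+FFFD in place of á
```

Seeing %EF%BF%BD in a URL is often a hint that the character was already lost before encoding, for example because the source file was read with the wrong charset.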

SQL*Loader does not recognize delimiter “¥”

oracle,utf-8,delimiter,sql-loader

Since you're using a UTF8 character and a UTF8 file format (I think), for the session that runs SQL*Loader set your NLS_LANG environment variable to "SPANISH_SPAIN.UTF8".

Handle windows-1252 and unicode in java [closed]

java,unicode,utf-8,character-encoding,bytearray

It seems that your server confuses the ISO-Latin-1 encoding with the proprietary Windows-1252 code page and the encoded data are the result of this. The Windows-1252 code page differs only at a few places from ISO-Latin-1. You can fix the data by converting them back to the bytes the server...

ctrl+G in erl doesn't work

unicode,encoding,utf-8,erlang,docker

Fixed the problem, needed export TERM=linux.

Wrong output when str_replace with acute ( ´ ) in utf-8 website [duplicate]

php,html,utf-8

try to utf8_decode('') <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>notitle</title> </head> <body> <form action="?" method="post"> <input type="text" name="string" value="<?= str_replace("'",utf8_decode('`'),$_POST['name']) ?>" /> </form> </body>...

Replace special quotes with normal

vb.net,replace,utf-8

Unfortunately VB.NET doesn't support escape sequences but you can use ChrW() to specify code point: s = s.Replace(ChrW(&H201C), """") That's for “, code for ” is &H201D. Note that using code points you're free to search & replace any Unicode character (not just what VB.NET has an escape for -...

PHP - Changing charset for arabic characters using file_get_contents

php,utf-8,file-get-contents,arabic

Replace the last line with: echo iconv('WINDOWS-1256', 'UTF-8', $page); I think it is because you're using the wrong encoding; if you check the charset meta returned by the page you'll see that it is windows-1256....

Python: difficulty converting ascii to unicode

python,unicode,encoding,utf-8

I'll assume your remote "source page" contains more than just ASCII; otherwise your comparison would already work as is (ASCII is a subset of UTF-8, i.e. A in ASCII is 0x41, which is the same byte in UTF-8). You may find the Python Requests library easier as it will automatically decode...
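The subset relationship mentioned above can be checked directly. This is a small illustrative sketch, assuming Python 3; the sample strings are made up.

```python
# Pure-ASCII text yields identical bytes under both encodings.
ascii_bytes = "A".encode("ascii")
utf8_bytes = "A".encode("utf-8")

# A non-ASCII character has no ASCII encoding, but UTF-8 handles it.
accented_utf8 = "\u00e9".encode("utf-8")  # é
try:
    "\u00e9".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```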

Encoding error in Rails 4 when querying mysql DB

mysql,ruby,ruby-on-rails-4,encoding,utf-8

There was a non UTF8 character in one of the DB entries. Kudos to @muistoshort for pointing me in the right direction.

How to display Arabic unicode text in page that retrieved from database

java,unicode,utf-8,xhtml,arabic

This line: String bankName = "\u0627\u0644\u0628\u0646\u0643 \u0627\u0644\u0645\u062a\u062d\u062f"; is completely equivalent to this: String bankName = "البنك المتحد"; Escaping (think, for example, about \n) isn't a mechanism built into Java strings. It's the Java compiler that performs these replacements for you. Imagine you have a text file with these two characters: \...

PHP strings have same encoding (UTF8) and appear as identical in browser but are not equal

php,string,curl,utf-8,comparison

If you look at the HTML source code of what you copy-pasted here, they are not the same. The second string has an additional entity &#8203; (a zero-width space; check the second `‌​') http://pasteboard.co/wBd2Ea4.png...

unicode converting in RestTemplate in Spring

java,spring,unicode,utf-8

You can use the Apache Commons Lang. There's a method called StringEscapeUtils.unescapeJava(String s) That can do it. (From http://stackoverflow.com/a/14368185/1176061)...

Why does Python 3 output \xe3, an extra char?

python,python-3.x,unicode,utf-8

You need to use a console or terminal that supports all of the characters that you want to print. When printing in the interactive console, the characters are encoded to the correct codec for your console, with any character that is not supported using the backslashreplace error handler to keep...

Extracting Double Byte Characters/substring from a UTF-8 formatted String

java,string,encoding,utf-8

Thanks to John Kugelman for the help. the solution looks like this now: for(int codePoint : codePoints(string)) { char[] chars = Character.toChars(codePoint); System.out.println(codePoint + " : " + String.copyValueOf(chars)); } With the codePoints(String string)-method looking like this: private static Iterable<Integer> codePoints(final String string) { return new Iterable<Integer>() { public Iterator<Integer>...

Utf8 accents in nodejs

javascript,node.js,utf-8

This is Quoted Printable. You can install mimelib and use it like below: var mimelib = require("mimelib"); json = { "name": "=4A=61=76=69=65=72=20=4C=75=6A=C3=A1=6E", "tel": "2814682382" }; name = mimelib.decodeQuotedPrintable(json.name); console.log(name); // This will print Javier Luján ...
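The same decoding can be sketched in Python with the standard-library quopri module, as an alternative to the Node.js package named above. The input is the name field from the answer; the step that treats the decoded bytes as UTF-8 is an assumption based on the C3 A1 pair.

```python
import quopri

encoded_name = b"=4A=61=76=69=65=72=20=4C=75=6A=C3=A1=6E"

# decodestring() turns each =XX escape back into a raw byte;
# the resulting bytes are then interpreted as UTF-8 text.
name = quopri.decodestring(encoded_name).decode("utf-8")
```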

Converting Unicode codepoints to UTF-8 in C using iconv

c,unicode,encoding,utf-8,iconv

Two problems: Since you’re using UTF-32, you need to specify 4 bytes. The “lower-case lambda is a 16-bit character (0x3BB = 955)” comment isn’t true for a 4-byte fixed-width encoding; it’s 0x000003bb. Set size_t in_size = 4;. iconv doesn’t add null terminators for you; it adjusts the pointers it’s given....
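To illustrate the 4-byte point, here is a short Python sketch (not the C/iconv code from the question) showing lower-case lambda, U+03BB, in UTF-32 and in UTF-8. Little-endian byte order is chosen here only for the example.

```python
lam = "\u03bb"  # lower-case lambda, code point 0x3BB = 955

utf32_le = lam.encode("utf-32-le")  # fixed width: always 4 bytes per code point
utf8 = lam.encode("utf-8")          # variable width: 2 bytes for this code point
```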

Ruby on Rails - UTF8 encoding problems in MySQL from ActiveRecord

mysql,ruby-on-rails,utf-8,character-encoding

The problem was linked to my terminal. I imported the language list through it and I think this was the source of my encoding problem. I downloaded Sequel Pro to manage my database and now none of my data is corrupted....

Erroneous encoding on form to Spring MVC

java,spring,spring-mvc,utf-8,character-encoding

You need to add an encoding filter to your web.xml to make it encode the chars correctly: <filter> <filter-name>encoding-filter</filter-name> <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class> <init-param> <param-name>encoding</param-name> <param-value>UTF-8</param-value> </init-param> <init-param> <param-name>forceEncoding</param-name> <param-value>true</param-value> </init-param>...

How to make the output from Text::CSV utf8?

perl,csv,encoding,utf-8

You have set the encoding of the input file handle (which, by the way, should be <:encoding(utf8) -- note the colon) but you haven't specified the encoding of the output channel, so Perl will send unencoded character values to the output. The Unicode values for characters that will fit in...

NSJSONSerialization not reading UTF8 correctly [duplicate]

ios,objective-c,uitableview,utf-8

Your data is using the HTML-way to store special characters. It is different from UTF-8 and is a way to add special characters using ASCII-codepoints. See http://www.w3.org/TR/html4/charset.html#h-5.3 for how they work. A way to decode them is answered in HTML character decoding in Objective-C / Cocoa Touch....

cut off last rune in UTF string

utf-8,go

Almost, utf8 package has a function to decode the last rune in a string which also returns its length. Cut that number of bytes off the end and you are golden: str := "你好" _, lastSize := utf8.DecodeLastRuneInString(str) withoutLastRune := str[:len(str)-lastSize] fmt.Println(withoutLastRune) playground...

ANSI vs UTF-8 in web Browser

javascript,html,utf-8,character-encoding,ansi

Let's distinguish between two things here: characters the user can type and the encoding used to send this data to the server. These are two separate issues. A user can type anything they want into a form in their browser. For all intents and purposes these characters have no encoding...

MySQL utf8_czech_ci vs utf8_general_ci

mysql,utf-8

The reason is that š and s are two different letters in the Czech alphabet, so that's why it's not found when using the utf8_czech_ci collation. See also http://collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html and http://collation-charts.org/mysql60/mysql604.utf8_czech_ci.html...

Php mysql query charset

php,mysql,utf-8

This is how you should do it. It will not solve the character encoding thing, but it will help with the rest. (And perhaps even solve your problem, because it might just be related to not escaping the values). $query = "select id, name, typeid from MainObjects where used =...

Working with characters based on their UTF-8 hex codes

javascript,jquery,unicode,utf-8

I suggest preprocessing the data as you grab it from the webpage instead of extracting it from the string afterwards. You can then use decodeURIComponent() to decode the percent-encoded string: decodeURIComponent('%F0%9F%98%92') Combine that with jQuery to access the data-textvalue-attribute: decodeURIComponent($(element).data('textvalue')) I created a simple example on JSFiddle. For some reason...
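For readers outside the browser, a rough Python counterpart of the decodeURIComponent() call above, using urllib.parse.unquote, which decodes percent-escapes as UTF-8 by default. Only the percent-encoded string comes from the answer.

```python
from urllib.parse import unquote

decoded = unquote("%F0%9F%98%92")  # four UTF-8 bytes -> one code point
code_point = ord(decoded)
```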

json_encode() throwing an error: “Invalid UTF-8 sequence in argument”

php,json,utf-8

json_encode only allows UTF-8 characters to be encoded. Looks like the data you are trying to encode may have non UTF-8 characters. So, you should first convert the string/data into UTF-8 and then do the encoding. $string = mb_convert_encoding($string,'UTF-8','UTF-8'); json_encode($string); ...

How to convert euro (€) symbol from Windows-1252 to UTF-8?

php,encoding,utf-8,windows-1252

As shown in this Stack Overflow question the Euro symbol is converted to the latin-1 supplement euro character, and not the "proper" UTF-8 codepoint. A workaround for it is to utf8_decode and then "re-encode" again: $node = iconv('Windows-1252', 'UTF-8', utf8_decode($node)); So some sample code that works: <?php $xml = '<?xml...

Replace utf8 literals in string

php,utf-8

Using the tip from the comments I got a slightly different algorithm working for me: function decode_code($code){ return preg_replace_callback( "@\\\(x)?([0-9a-f]{2,3})@", function($m){ return utf8_encode(chr($m[1]?hexdec($m[2]):octdec($m[2]))); }, $code ); } 

R encoding UTF-8: U+0080-U+009F

r,utf-8,character-encoding

Try converting from UTF-8 to latin1: df <- read.table("http://dl.dropboxusercontent.com/u/94114397/example.txt", sep = "\t", row.names = 1, stringsAsFactors = FALSE, encoding="UTF-8") iconv(df[, 1], from = "UTF-8", to = "latin1") # [1] "Trichocentrum<->longifolium<-><->(Lindl.) R.Jiménez, Acta Bot. Mex. 97: 54 (2011)." # [2] "Salvia<->× hegelmaieri<->nothosubsp. accidentalis<->(Sánchez-Gómez & R.Morales)." # [3] "Edraianthus<->tarae<-><->Lakušic, Bilten Drustva Ekologa...

Read utf-8 character from byte stream

python-3.x,utf-8,utf8-decode

Wrap the stream in a TextIOWrapper with encoding='utf8', then call .read(1) on it. This is assuming you started with a BufferedIOBase or something duck-type compatible with it (i.e. has a read() method). If you have a generator or iterator, you may need to adapt the interface. Example: from io import...
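A minimal sketch of the TextIOWrapper approach described above. The io.BytesIO source and the sample text are assumptions standing in for whatever byte stream you actually have.

```python
import io

# Hypothetical byte stream: 'é' occupies two bytes in UTF-8.
raw = io.BytesIO("h\u00e9llo".encode("utf-8"))

reader = io.TextIOWrapper(raw, encoding="utf8")

first = reader.read(1)   # reads one character, not one byte
second = reader.read(1)  # consumes both bytes of 'é'
```

The wrapper buffers the underlying stream, so after wrapping it is safer to read through the wrapper only.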

How to find condition that start char in UTF-8 file is read, using FileStream and StreamReader?

c#-4.0,utf-8,io,stream

bool first = true; while((i = reader.Read()) > -1) { if (first) { first = false; // Do first character things } Note that the concept of first character is complex: what happens if the first glyph is è, that occupies two bytes in the file? The stream position will...

How to set the filename encoding globally for a python interpreter?

python,encoding,utf-8,filenames

Assuming this is really what you meant (and not the encoding of the file's contents), this would do it: open = lambda fname, *a, **kw: __builtins__.open(fname.encode('utf-8'), *a, **kw) This will only affect modules that include (or import) the redefinition, so it's reasonably safe. But it might be less confusing, and...

How to make sure a XDocument is saved with utf-8 file encoding?

c#,xml,unicode,encoding,utf-8

If you want fine-grained encoding control, you probably want to control the TextWriter; for example, in the example below I'm using UTF-8 sans-BOM. However, if possible, you could also write directly to a file via a FileStream... using System; using System.IO; using System.Text; using System.Xml.Linq; class Program { static void...

Does HTML Encoding have any cons?

asp.net-mvc,razor,encoding,utf-8,xss

I found a solution: using the AntiXSS library as the Razor encoderType. This answer describes it well: Special characters in html output. The default Razor encoder encodes accented chars whereas the AntiXSS library does not encode them. So, accented chars are rendered as they are....

Required to convert a String to UTF8 string

c++,c,utf-8,iconv,wchar-t

For your first question (which I am interpreting as "why is all the output not what I expect"): Where does the '?????' come from? In the call mbstowcs(WcharString, CharString, strlen(CharString)), the last argument (strlen(CharString)) is the length of the output buffer, not the length of the input string. mbstowcs will...

How to remove last utf8 char of a python string

python,python-2.7,utf-8

The simplest way is to decode your UTF-8 bytes to Unicode text: without_last = msg.decode('utf8')[:-1] You can always encode it again. The alternative would be for you to search for a UTF-8 start byte; UTF-8 byte sequences always start with a byte with the most significant bit set to 0,...
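The byte-level alternative mentioned above could look roughly like this in Python 3. It is a sketch that assumes the input is valid UTF-8, and the helper name strip_last_char is hypothetical. Continuation bytes have the bit pattern 10xxxxxx, so the code walks back over them to reach the start byte of the last character.

```python
def strip_last_char(data: bytes) -> bytes:
    """Drop the last UTF-8 encoded character from valid UTF-8 bytes."""
    if not data:
        return data
    i = len(data) - 1
    # Skip continuation bytes (10xxxxxx) until the start byte is reached.
    while i > 0 and (data[i] & 0xC0) == 0x80:
        i -= 1
    return data[:i]


msg = "h\u00e9\u00e9".encode("utf-8")   # b'h\xc3\xa9\xc3\xa9'
without_last = strip_last_char(msg)
```

Decoding, slicing, and re-encoding is simpler; the byte scan only pays off when decoding the whole string is undesirable.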

store Hebrew in database with utf-8 encoding using php and mysql

html,mysql,internet-explorer,encoding,utf-8

Shown in the first 3 GETs, in order, I found that Chrome, Firefox, and IE all produce different results. But I guess theoretically, Firefox does it right. In any case, they all give the same result (next 3 GETs in order), if we encodeURIComponent() the input.value. Like...

Force UTF-8 encoding in inline CSS

css,encoding,utf-8

.close::before{ content:"\u00D6"; } The CSS syntax for Unicode characters does not include the u prefix. The correct syntax is: content:"\00D6"; or simply content:"\D6";. This character notation always refers to a Unicode Codepoint regardless of the character encoding of the (HTML) document that contains it. However, the character "\00D6" refers...

ZXing 2D barcode decoding: UTF-8 characters not decoded properly

java,utf-8,barcode,zxing

It looks like I was creating my bitmap the wrong way. This worked: MultiFormatReader reader = new MultiFormatReader(); FileInputStream fis = new FileInputStream(filePath); BinaryBitmap bitmap = new BinaryBitmap(new HybridBinarizer( new BufferedImageLuminanceSource( ImageIO.read(fis)))); Result result = reader.decode(bitmap); String originalText = result.getText(); byte[] bytes = originalText.getBytes("ISO-8859-1"); String outputText = new String(bytes, "UTF-8");...
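The last two lines of that Java snippet re-interpret text that was decoded as ISO-8859-1 as UTF-8. The same round trip, sketched in Python with made-up sample text:

```python
original = "h\u00e9llo"

# Simulate the problem: UTF-8 bytes wrongly decoded as ISO-8859-1.
mojibake = original.encode("utf-8").decode("iso-8859-1")

# The repair: recover the raw bytes, then decode them as UTF-8.
repaired = mojibake.encode("iso-8859-1").decode("utf-8")
```

This works because ISO-8859-1 maps every byte value to exactly one character, so the original bytes can be recovered without loss.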

Python encoding issue through sockets

python,sockets,encoding,utf-8

You are using two different versions of Python: Python 2 vs Python 3. Python2: In [1]: bytes("Unknown command", 'UTF-8') --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-1-ddc681192da0> in <module>() ----> 1 bytes("Unknown command", 'UTF-8') TypeError: str() takes at most 1 argument (2 given) Python3: In [1]: bytes("Unknown command", 'UTF-8') Out[1]: b'Unknown command'...

How to remove any utf8mb4 characters in string

c#,.net,utf-8,utf8mb4

This should replace surrogate characters with a replacementCharacter (that could even be string.Empty). This is a MySQL problem, given the utf8mb4. Here is the difference between utf8 and utf8mb4 in MySQL: utf8 doesn't support 4-byte UTF-8 sequences. By looking at the wiki, 4 byte...

send and retrieve arabic data from mysql database

php,android,mysql,utf-8,arabic

Though you can just use set names utf8 instead of all those creepy queries, there is nothing wrong here with Arabic, cheers :) You are mixing mysql_* and mysqli_*, which is not allowed. Even if it were allowed to use mysql & mysqli connections interchangeably, you are passing parameters to query...

Modelica encoding problems

utf-8,character-encoding,modelica,dymola,openmodelica

Testing UTF-8 in latest Dymola (Linux), it seems as if Dymola does not change the encoding to ISO-8859-1 any longer (to my memory it used to change this). It does however, not look good since it is displayed as if the UTF-8 text is ISO. The easiest way to get...

Which encoding replaces “í” with “\303 \255”?

c#,utf-8,character-encoding,data-conversion

It is UTF8, and 303 255 octal is 195 173 decimal, these numbers probably look more familiar. See the dec and oct headers in the table you linked. There is no built-in type that's going to produce octal output for some characters - you'll have to decide which characters to...
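The octal/decimal correspondence above is easy to verify. A short Python sketch:

```python
encoded = "\u00ed".encode("utf-8")  # í

decimal_values = list(encoded)
octal_values = [format(b, "o") for b in encoded]
hex_values = [format(b, "02x") for b in encoded]
```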

showing umlauts in html with utf8 charset

html,utf-8

If you specified "charset=utf-8", you have to upload/use a "File" that is encoded with UTF-8. To do this on Windows: Open your html/php.. file in Notepad. go to "File" and choose "Save As" Set the "Encoding" field to "UTF-8" -> Profit...

Convert byte array from utf-16 to utf-8

c++,utf-8,utf-16

Byte order must be specified for UTF-16 input. Since you are passing a utf16-be (big-endian) encoded buffer, you should prefix it with the appropriate byte-order-mark: uint8_t array[] = { 0xfe, 0xff, 0x00, 0x72, 0x00, 0x6f, 0x00, 0x6f, 0x00, 0x74 }; But this will produce UTF-8 output with a byte order...
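As a cross-check in Python (not the C++ conversion from the question), the BOM-prefixed buffer above decodes cleanly. Python's generic "utf-16" codec reads the byte-order mark, picks big-endian here, and drops the mark from the result.

```python
data = bytes([0xfe, 0xff, 0x00, 0x72, 0x00, 0x6f, 0x00, 0x6f, 0x00, 0x74])

text = data.decode("utf-16")       # BOM FE FF selects big-endian and is consumed
utf8_bytes = text.encode("utf-8")  # plain UTF-8, no BOM added
```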

Is the “UTF8” data in my database really encoded correctly?

php,mysql,utf-8

One way to see what is actually stored is to use the HEX function. (That's the closest MySQL gets to the Oracle-style DUMP() function.) Here's a demonstration that shows the use of the HEX function to return what's stored... CREATE TABLE foo ( foo_lat VARCHAR(10) CHARSET latin1 , foo_utf VARCHAR(10)...

What happens under the hood when bytes converted to String in Java?

java,string,unicode,utf-8,byte

Not all sequences of bytes are valid in UTF-8. UTF-8 is a smart scheme with a variable number of bytes per code point, the form of every byte indicating how many other bytes follow for the same code point. Refer to this table: Bytes 1 (hex 0x01, binary 00000001) and...
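A small Python sketch of the point that not every byte sequence is valid UTF-8. Python is used for illustration, whereas the question is about Java; the helper name is_valid_utf8 is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


complete = is_valid_utf8(b"\xc3\xa9")   # lead byte plus its continuation byte
truncated = is_valid_utf8(b"\xc3")      # lead byte with nothing following
stray = is_valid_utf8(b"\x80")          # continuation byte with no lead byte

# A lenient decoder substitutes U+FFFD for the undecodable byte.
lenient = b"\xff".decode("utf-8", errors="replace")
```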

PHP / MySQL: Certain characters not being encoded properly and appearing as question marks

php,mysql,encoding,utf-8,character-encoding

Please specify the character set by adding $conn->set_charset("utf8") ...

Centos System config language not working

linux,utf-8,centos5

Try running the source command on the file: source /etc/sysconfig/i18n...

Detecting corrupt characters in UTF-8 encoded text file

regex,encoding,awk,utf-8,scripting

This finds all chars outside of the ASCII range: $ awk '/[^\x00-\x7F]/{ print NR ":", $0 }' file 1: Interruptor EC não está em DESLOCAR 4: 辅助驾驶室门关闭 5: Porte cab. aux. fermée 7: Дверь аппаратной камеры закрыта 13: é«˜åŽ‹ä¿æŠ¤æ‰‹æŸ„å‘下 14: Barrière descendue 16: Огранич. Планка ВВК опущ. 19: Barra de...
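A rough Python equivalent of the awk one-liner above, using the same character class. The function name and sample lines are made up for the example. Like the awk version, it flags any non-ASCII line, so legitimate accented or CJK text is reported alongside corrupted text.

```python
import re

NON_ASCII = re.compile(r"[^\x00-\x7F]")


def non_ascii_lines(lines):
    """Return (line_number, line) pairs for lines with any non-ASCII character."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if NON_ASCII.search(line)
    ]


sample = ["plain ascii", "Barri\u00e8re descendue", "still ascii"]
hits = non_ascii_lines(sample)
```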

What -C flag number in perl makes UTF-8 “just work”?

perl,utf-8,utf8-decode

Why there is no -C flag number ... which makes the example work at least once? Because using UTF-8 literals in your Perl source requires use utf8;. for (( i=0; $i < 512; i=$((i+1)) )); do echo -n 'привет мир' | perl -C$i -le 'use utf8; $a=<>; print sprintf("%...

Delphi - converting string back from UTF-8

osx,delphi,utf-8

value = '散汤湡獤杀潯汧浥楡⹬潣m䌴䅓㜭䙇ⵊ䵙㑗㈭呖ⵆ䥉儵䈭呎́'#4 This is indicative of you interpreting 8 bit text, presumably UTF-8 encoded, as if it were UTF-16 encoded. As a broad rule, when you see a UTF-16 string with Chinese characters, either it is a correctly interpreted Chinese text, or it is mis-interpreted 8 bit text....

Remove UTF-8 substring in sqlite

sqlite,utf-8

x'202B' is not a single, invisible Unicode character; it is a blob containing the two ASCII characters space (0x20) and + (0x2B). All SQLite strings are encoded in UTF-8. When you are constructing strings from bytes manually, you have to use the same encoding: x'E280AB' ...
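The two byte sequences above can be compared in a few lines of Python. This is only an illustration of the encoding arithmetic, not SQLite code.

```python
# The blob x'202B' is just two bytes: 0x20 (space) and 0x2B (plus sign).
blob_202b = bytes.fromhex("202B")

# The character U+202B, encoded as UTF-8, is three bytes.
utf8_of_u202b = "\u202b".encode("utf-8")
hex_literal = utf8_of_u202b.hex().upper()
```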

Perl XML::Twig character encoding

xml,perl,encoding,utf-8

First thing: in your case keep_encoding should not be used. It's an old option, dating back to ancient times, when latin1 was a commonly used encoding and perl wasn't so good with unicode. I am talking pre-5.8 here. The option provided a way for people living in an all-latin1 world...

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)

python,api,utf-8

I think what you are looking for is: # ... text = articleFile.read().decode('utf-8') keyphrases = extractKeyphrases(text) # ... Basically you want to decode the contents of the file to a unicode string as soon as you read it. Then the rest of your program is safe from conversion problems. Please...

Send Arabic Text to Web Service

excel,vba,excel-vba,utf-8

From your answer I get a suspicion. What you are doing is the following: You write the pure unencoded Unicode bytes in the binary ADODB.Stream. Then with concatenating "" & fsT.Read you are creating a Unicode String. As in https://msdn.microsoft.com/en-us/library/ms763706%28v=vs.85%29.aspx mentioned: "If the input type is a BSTR, the response...

How to use ctype_alpha with UTF-8

php,utf-8

Use regex for latin characters, change this condition: if (ctype_alpha($_POST['first_name']) === false) to: if (!preg_match('/^[\p{Latin}]+$/u', $_POST['first_name'])) Or - to allow whitespace: if (!preg_match('/^[\p{Latin}\s]+$/u', $_POST['first_name'])) 

How to print bit representation of unicode character

c++,windows,unicode,utf-8

Two problems: reverse bit pattern (binary reads left to right bit 7 to 0). sign extension std::string contobin(std::string str) { std::string result; for(int i=0; i<str.size(); ++i) for(int j=8; j--;) { result.append((1<<j) & uint8_t(str[i]) ? "1" : "0"); } return result; } ...
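For comparison, the same idea sketched in Python, where bytes are already unsigned and format() prints the most significant bit first, so neither of the two problems arises. The function name to_bits is hypothetical.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated 8-bit binary strings."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


bits_ascii = to_bits("A")
bits_accent = to_bits("\u00e9")  # é is two bytes in UTF-8: C3 A9
```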

How to read utf-8 encoded XML file in PHP?

php,xml,utf-8

Your XML declaration line is invalid: <?xml version="1.0" standalone="yes" encoding="UTF-8"?> ^^^^^^^^^^^^^^^^^ Move the underlined portion to the back and it will parse properly: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> ...

Can you skip non utf-8 data in python csv?

python,csv,utf-8

You could use a filter that reads a line as raw bytes, tries to convert it to unicode as UTF8 and then : if successful, passes it down to the csv reader if not, stores it for later analyzing Assuming that you are using Python2, you could use something like...
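A minimal sketch of such a filter. It assumes Python 3 rather than the Python 2 the answer targets; the function name and sample data are made up. Lines that fail to decode are set aside for later analysis and the rest go to csv.reader.

```python
import csv


def split_valid_lines(raw_lines):
    """Separate raw byte lines into decoded UTF-8 text and undecodable leftovers."""
    good, bad = [], []
    for raw in raw_lines:
        try:
            good.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            bad.append(raw)  # keep the raw bytes for later inspection
    return good, bad


# Hypothetical input: the third line contains a Latin-1 byte (0xE9), invalid here.
raw_lines = [b"name,city\n", b"Jos\xc3\xa9,Lima\n", b"Ren\xe9,Paris\n"]

good, bad = split_valid_lines(raw_lines)
rows = list(csv.reader(good))
```

Filtering line by line assumes no quoted field spans several lines; a CSV file with embedded newlines would need a different split.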

Using UTF-8 identifier

java,android,encoding,utf-8

Your problem is not in the code you have posted but in code that gets data from HTTP request. You are passing String data to writeToFile method. String in Java is UTF-16 encoded. If you have UTF-8 encoded data in that string, then no amount of further encoding-decoding is going...

C# - Create .txt file as UTF-8 instead ANSI from plaintext

c#,utf-8

Use UnicodePlainText instead of PlainText: richTextBox1.SaveFile("notes.txt", RichTextBoxStreamType.UnicodePlainText); And do the same when using LoadFile....

Laravel 5 charset not working correctly on the views. But it working well when I dump it from controller

php,laravel,utf-8,character-encoding,laravel-5

I solved this issue using this answer. I just went into my AppServiceProvider and put this into the boot method: Blade::setEchoFormat('e(utf8_encode(%s))'); I don't know if this is the correct way to do it, but it works for me....

Is the String Constructor from UTF-8 Broken?

java,android,utf-8

The UnsupportedEncodingException is thrown if the charset itself is unsupported (that is, you specify a charset and the system doesn't recognize the name) -- not if the bytes don't encode correctly. Note that the corresponding constructor that takes a java.nio.charset.Charset does not throw that exception (since there's no name to...

download xml file from url using winhttp in excel - CHARSET=UTF-8

php,excel-vba,utf-8,winhttp

See the Content-Encoding: gzip? You need to decompress the body using gzip, or use a client library that will do it for you. Just as a side note, though, if the server sent you a gzipped response when you didn't send Accept-Encoding: gzip, there's something wrong with it....

Delete weird ANSI character and convert accented ones using Python

python,encoding,utf-8,ansi

You are mixing apples and oranges. b'reuni\xc3\xb3n' is the UTF-8 encoding of u'reuni\u00f3n' which of course is reunión in human-readable format. >>> print b'reuni\xc3\xb3n'.decode('utf-8') reunión >>> repr(b'reuni\xc3\xb3n'.decode('utf-8')) "u'reuni\\xf3n'" There is no "ANSI" here (it's a misnomer anyway; commonly it is used to refer to Windows character encodings, but not necessarily...

Can't get UTF-8 Special Chars to Correctly Write to MySQL (PHP)

php,mysql,encoding,utf-8

As shown in your second edit, the path column has the latin1 charset, even though the table defaults to utf8. Maybe you've got in this state by altering an existing table? Try ALTER TABLE files MODIFY path VARCHAR(510) CHARACTER SET utf8;...

Inserting UTF-8 data into SQL Server 2012

javascript,sql-server,utf-8

So after our little chat in the comment section, I understood that the collation of your table needs to be changed. This is a potential query to do that: ALTER TABLE tam_hist ALTER COLUMN bookname NVARCHAR(500) COLLATE Indic_General_90_BIN; I'm using this collation as it's suggested on this site. If you need/want to...

Loss of quotes when encoding into ascii

python,regex,utf-8,ascii

Instead of brute-force destroying everything, use Unidecode to transliterate the text into ASCII. >>> unidecode.unidecode(u'“…”') '"..."' ...

How to find UTF-8 reference of a composite unicode character

unicode,encoding,utf-8,character-encoding

UTF-8 is a byte encoding for a sequence of individual Unicode codepoints. There is no single Unicode codepoint defined for n̂, not even when a Unicode string is normalized in NFC or NFKC formats. As you have noted, n̂ consists of codepoint U+006E LATIN SMALL LETTER N followed by codepoint...
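The normalization claim above can be checked with Python's unicodedata module. The comparison with ê is added here only as a contrast and is not part of the original answer.

```python
import unicodedata

n_circumflex = "n\u0302"   # U+006E followed by U+0302 COMBINING CIRCUMFLEX ACCENT
e_circumflex = "e\u0302"   # same combining mark on 'e'

nfc_n = unicodedata.normalize("NFC", n_circumflex)
nfc_e = unicodedata.normalize("NFC", e_circumflex)

utf8_n = nfc_n.encode("utf-8")
```

Because no precomposed code point exists for n with circumflex, NFC leaves it as two code points, and its UTF-8 form is simply the two encodings concatenated.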

How to process a file in correct encoding in powershell?

mysql,powershell,encoding,utf-8,cmd

PowerShell ISE outputs in iso-8859-1 and the console in us-ascii, at least on my PC. You can check this yourself by checking the variable $OutputEncoding. You can change this variable: $OutputEncoding = New-Object -typename System.Text.UTF8Encoding Then it should work in both PowerShell ISE and the PowerShell console...

Haskell: quoteFile fails on text file with “invalid byte sequence” on unicode characters

linux,haskell,unicode,encoding,utf-8

Finally, I've found that my virtual locale was not properly set, e.g. the locale command showed me that all LANG variables were set to POSIX. Exporting the LANG variable to the command is the quickest workaround (bash example): export LANG=en_US.utf8 cabal build However, likely you need to have en_US locale installed, Debian manual...

Remove all non utf-8 characters from file with no output in terminal

ubuntu,utf-8,output

Output the conversion to a new file using shell redirection: iconv -f utf-8 -t utf-8 -c file.txt > new-file.txt Then check the end of new file: tail new-file.txt Check the top: head new-file.txt ...
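A rough Python analogue of iconv -f utf-8 -t utf-8 -c, assuming the whole file fits in memory. Bytes that do not form valid UTF-8 are silently dropped. The sample bytes are made up.

```python
# Hypothetical input with one invalid byte (0xFF) in the middle.
dirty = b"caf\xc3\xa9 \xff ok"

# errors="ignore" discards undecodable bytes, similar in spirit to iconv's -c flag.
clean_text = dirty.decode("utf-8", errors="ignore")
clean_bytes = clean_text.encode("utf-8")
```

As with the shell redirection above, the cleaned bytes would go to a new file rather than overwrite the input.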

How to understand text language in utf8 encoded text?

node.js,utf-8,character-encoding,redis,language-detection

Do a google search for "language detect node". This turned up https://github.com/FGRibreau/node-language-detect and https://github.com/dachev/node-cld.

Return a csv encoded in UTF-8 with BOM from django

django,csv,utf-8

Add the UTF-8 BOM to the response object before you write your data: def render_to_csv(self, request, qs): response = HttpResponse(content_type='text/csv') response['Content-Disposition'] = 'attachment; filename="test.csv"' # BOM response.write("\xEF\xBB\xBF") writer = csv.writer(response, delimiter=',') … ...

iOS SQLite SELECT with UTF 8 characters

ios,sqlite,utf-8

Well, the rmaddy's solution worked perfectly for me. What I've done is: PerformQuery is almost the same, it receives a new argument: - (NSArray *)performQuery:(NSString *)query withArgument:(NSString *)arg Just after the sqlite3_prepare_v2 I've made the bind: int codigo = sqlite3_prepare_v2(db,[query UTF8String],[query length],&resultado,NULL); if (codigo == SQLITE_OK) { sqlite3_bind_text(resultado, 1, [arg...

Include Unicode Signature (BOM) in HTML files or not?

html,utf-8,byte-order-mark

The BOM is entirely optional for UTF-8. The Unicode consortium points out that it can create problems while offering no real advantage; the W3C says that it can be a substitute for other forms of declaring the encodings and should work on all modern browsers. The BOM is only there...

BeautifulSoup gives garbage for html conversion

python,html,pdf,utf-8,beautifulsoup

That's because the URL points to a document in PDF format, so interpreting it as HTML won't make any sense at all.

AngularJS non-ascii property name support

angularjs,utf-8,non-ascii-chars

While this is an issue with Angular, as per https://github.com/angular/angular.js/issues/2174, it can be worked around by specifying your own predicate function on the scope in order to access the property to sort by: $scope.predicateProperty = 'name'; $scope.predicate = function(a) { return a[$scope.predicateProperty]; }; And the HTML can be almost identical,...

Turkish characters are not shown on TextView

android,utf-8,textview

I solved my problem. Here is solution: in build.gradle(module:app) added this code: compileOptions.encoding = 'windows-1254' here is build gradle file apply plugin: 'com.android.application' android { compileSdkVersion 22 buildToolsVersion "21.1.2" compileOptions.encoding = 'windows-1254' defaultConfig { applicationId "yazlm.beyaz.keyazarlar" minSdkVersion 14 targetSdkVersion 22 versionCode 1 versionName "1.0" } buildTypes { release { minifyEnabled...

Export (Android/Java) string data in with extended characters for import into Excel

java,android,excel,unicode,utf-8

Excel assumes csv files are in an 8-bit code page. To get Excel to parse your csv as UTF-8, you need to add a UTF-8 Byte Order Mark to the start of the file. Edit: If you're in Western Europe or US, Excel will likely use Windows-1252 character set for...
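One way to produce such a file, sketched in Python rather than the Java/Android code the question uses. The "utf-8-sig" codec writes the UTF-8 byte order mark before the data. The CSV content is a made-up sample.

```python
import codecs

csv_text = "name,city\r\nJos\u00e9,Lima\r\n"

# "utf-8-sig" prepends the BOM (EF BB BF) when encoding.
payload = csv_text.encode("utf-8-sig")

has_bom = payload.startswith(codecs.BOM_UTF8)
round_trip = payload.decode("utf-8-sig")  # the BOM is stripped again on decode
```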

Attributes for UTF-8 characters

c++,utf-8,ncurses

There are actually four character types which can be used with ncurses: char (for waddstr) chtype (for waddchstr) wchar_t (for waddnwstr) cchar_t (for wadd_wchstr) The char and chtype data came first, for 8-bit encodings. wchar_t and cchar_t came later for wide-characters. The latter of each pair is essentially the former...