- I create a file and execute the following command:
:set fileencoding
- The result is:
fileencoding=cp936
- I edit the file and close it. Then I reopen it and execute the same command:
:set fileencoding
- The result is:
fileencoding=utf-8
The contents of .vimrc are:
...
set fencs=ucs-bom,utf-8,gbk,gb18030,utf-16,big5
set fenc=cp936
set encoding=utf-8
Also, I am using a remote connection in VS Code. Why does the value of fileencoding change? What is the reason for this, and how can I solve it? Thanks!
Here are the results of my attempts:
- When the content contains only English (shown below), I save the file and reopen it. Then I execute :set fileencoding, and the result is fileencoding=utf-8. I also execute file test1.c, and the result is test1.c: ASCII text.
//file: test1.c
abc
- When the content contains Chinese (shown below), I save the file and reopen it. Then I execute :set fileencoding, and the result is fileencoding=cp936. I also execute file test2.c, and the result is test2.c: ISO-8859 text.
//file: test2.c
你好abc
The .vimrc content is:
...
set fencs=ucs-bom,utf-8,gbk,gb18030,utf-16,big5
set fenc=cp936
set encoding=utf-8
My question is: why is fileencoding utf-8 and not cp936 when the content is in English only?
It's actually not a Vim issue but an encoding issue. Vim does what you ask for; for ASCII-only content it simply does not make a difference.
There are two pieces of information that explain the behavior. The first one is that text files contain no meta information about their encoding. A text file is just a bunch of bytes. How they are interpreted is up to the application, so applications have to guess. Judging from the bulk of related questions on a popular programming Q&A site, this is hard.
The second piece of the puzzle is that the first 128 characters of both UTF-8 and CP936 are identical to the ASCII character set. Take a look at the code page file for CP936 and compare it with ASCII.
This is by design. So for bytes 0x00 to 0x7f, it's just plain ASCII, no matter what encoding you specify.
Using Vim, let's create a simple text file containing "hello world" and take a look at it.
After set fileencoding=cp936 and saving again, you will get identical results.
Note the complete absence of any encoding meta information. The whole file is just "hello world" and a newline.
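The same point can be checked outside Vim. The following is a minimal Python sketch, not part of the original investigation; it assumes Python 3.8 or later and relies on Python's cp936 codec, which is an alias for its gbk codec. It encodes the ASCII-only text both ways and compares the raw bytes.

```python
# Encode the same ASCII-only text as UTF-8 and as CP936.
# Note: Python's "cp936" is an alias for its "gbk" codec.
text = "hello world\n"
utf8_bytes = text.encode("utf-8")
cp936_bytes = text.encode("cp936")

print(utf8_bytes.hex(" "))
print(cp936_bytes.hex(" "))
print(utf8_bytes == cp936_bytes)  # True: the bytes on disk are the same

# Every byte from 0x00 to 0x7f decodes to the same character
# in ASCII, UTF-8 and CP936.
ascii_range = bytes(range(128))
same = (
    ascii_range.decode("ascii")
    == ascii_range.decode("utf-8")
    == ascii_range.decode("cp936")
)
print(same)  # True
```

Since the two byte strings are equal, nothing in the saved file can tell a later reader which of the two encodings was used to write it.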
Everything changes once you introduce non-ASCII characters. The first non-ASCII character in CP936 is the Euro sign, encoded as 0x80. So let's say "h€llo world" and re-run the file investigations.
Note that € is encoded as e2 82 ac, as one would expect in UTF-8, while it's encoded as 80 in CP936.
Also note that file cannot correctly guess the CP936 encoding.
As we have seen, there's no difference in file contents as long as only ASCII characters are used. So the bottom line is that Vim saves your ASCII files as CP936 but it doesn't make a difference.
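For the non-ASCII case, here is a comparable Python sketch. It uses the Chinese text from test2.c in the question rather than the Euro sign, because as far as I know Python's cp936 codec (an alias for gbk) does not map € to 0x80 the way the Windows code page does. Treat it as an illustration of the byte-level difference, not as a reproduction of Vim's output.

```python
# The content of test2.c from the question, encoded both ways.
text = "你好abc\n"
utf8_bytes = text.encode("utf-8")
cp936_bytes = text.encode("cp936")  # Python alias for its gbk codec

print(utf8_bytes.hex(" "))   # e4 bd a0 e5 a5 bd 61 62 63 0a
print(cp936_bytes.hex(" "))  # c4 e3 ba c3 61 62 63 0a

# The CP936 bytes are not valid UTF-8, so a reader that tries
# UTF-8 first must reject that guess and move on.
try:
    cp936_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)  # False
```

Only the two Chinese characters differ between the encodings; the trailing "abc" and the newline are the same bytes in both.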
To help Vim get the encoding right when opening files, you would add cp936 near the start of 'fileencodings'.
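To see why the order of 'fileencodings' matters, the following Python sketch models the detection as "the first candidate that decodes without error wins". This is a simplification and an assumption on my part: guess_encoding is a hypothetical helper, not Vim's actual implementation, and Vim's real conversion logic differs in its details.

```python
import codecs

# Hypothetical helper: a simplified model of walking 'fileencodings'.
# It returns the first candidate that accepts the bytes, or None.
def guess_encoding(data, candidates):
    boms = (
        codecs.BOM_UTF8,
        codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE,
        codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE,
    )
    for name in candidates:
        if name == "ucs-bom":
            # Modelled as: matches only if the file starts with a BOM.
            if data.startswith(boms):
                return name
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

# The list from the question's .vimrc.
fencs = ["ucs-bom", "utf-8", "gbk", "gb18030", "utf-16", "big5"]

ascii_file = "abc\n".encode("cp936")
chinese_file = "你好abc\n".encode("cp936")

print(guess_encoding(ascii_file, fencs))    # utf-8
print(guess_encoding(chinese_file, fencs))  # gbk

# With cp936 ahead of utf-8, the ASCII-only file is reported as cp936.
print(guess_encoding(ascii_file, ["ucs-bom", "cp936", "utf-8"]))  # cp936
```

Under this model, the ASCII-only file is reported as utf-8 simply because utf-8 is the first entry in the list that accepts its bytes, while the file with Chinese text fails the utf-8 check and falls through to the next candidate.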