National character encoding

Files on a computer do not actually consist of letters and digits, but only of numbers (bytes). A code page specifies which number corresponds to which character. For example, the decimal number 65 stands for the capital letter A, 66 stands for B, and so on. However, the older single-byte code pages have room for only 256 characters, which is not enough to cover the national characters of every language. Therefore, each language (or group of languages) has its own code page. As a result, the byte that one code page (ISO-8859-2) displays as the letter ź is displayed by another code page as Ľ (in Windows-1250) or as ¼ (in ISO-8859-1).
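The effect is easy to reproduce outside Spider. The following Python sketch decodes the same byte values under two code pages using Python's standard codecs; the helper name decode_byte is hypothetical and is not part of the program:

```python
def decode_byte(value: int, codec: str) -> str:
    """Decode a single byte value using the named code page."""
    return bytes([value]).decode(codec)


# Byte 65 lies in the ASCII range, which both code pages share.
shared = {codec: decode_byte(65, codec) for codec in ("iso-8859-2", "cp1250")}

# Byte 188 (0xBC) lies in the upper half, where the code pages diverge.
diverging = {codec: decode_byte(0xBC, codec) for codec in ("iso-8859-2", "cp1250")}
```

Here shared maps both code pages to "A", while diverging maps ISO-8859-2 to "ź" and Windows-1250 (cp1250) to "Ľ".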

To display national characters correctly, the program must know which code page the document uses. If each language had only one code page, there would be no problem. For Polish alone, however, there are several ways of encoding characters, two of which are most often used on the Internet:

ISO-8859-2 - ISO-compliant code page, used in most systems,
WIN-CP-1250 - code page used by MS Windows.

The recommended code page is ISO-8859-2, which complies with the Polish Standard. It is the default encoding for Polish characters in Spider. Other languages are handled similarly.

Using different code pages while working

Spider allows you to create documents in virtually any language. The program can also automatically recognize the code page used in a given document from the entries in its META section. For the available options, see Program Settings - Documents and the Spelling menu.
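How Spider implements this recognition is not documented here. As a rough illustration of the idea only, a charset declaration can be read from the META section with a regular expression. The function name detect_meta_charset, the 4096-byte search window, and the fallback value are assumptions made for this sketch:

```python
import re

# Matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def detect_meta_charset(html: bytes, default: str = "iso-8859-2") -> str:
    """Return the charset named in a META tag, or the default if none is found."""
    # Assumption: the declaration appears near the top of the file.
    match = _META_CHARSET.search(html[:4096])
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()
```

The search runs on raw bytes because the encoding is not yet known at this point; the charset name itself is always plain ASCII.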

When working with a website, you can specify in the Project Properties window the encoding to be used by default for all documents of that website. This makes the project independent of the global program settings.

Alternatively, you can quickly open or save a document in the encoding of your choice with the Read in encoding ... and Save in encoding ... commands on the Spelling menu. This lets you use an encoding that is independent of both the program and the project settings.

The most common conversion is from WIN-CP-1250 (the code page used in MS Windows) to ISO-8859-2 (the ISO-compliant code page used in most systems).
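In general terms, such a conversion means decoding the bytes with the source code page and encoding the resulting text with the target one. The Python sketch below illustrates this with the standard codecs; the function name convert_encoding and its parameters are hypothetical and do not describe Spider's internals:

```python
def convert_encoding(
    data: bytes,
    source: str = "cp1250",
    target: str = "iso-8859-2",
    errors: str = "strict",
) -> bytes:
    """Re-encode bytes from the source code page to the target code page."""
    text = data.decode(source)
    # Characters missing from the target code page raise UnicodeEncodeError
    # unless a more lenient error handler such as "replace" is requested.
    return text.encode(target, errors)


windows_bytes = "zażółć gęślą jaźń".encode("cp1250")
iso_bytes = convert_encoding(windows_bytes)
```

Of the Polish letters, only ą, ś, ź and their capitals have different byte values in the two code pages, which is why a file opened with the wrong one looks mostly correct but shows a few wrong characters. Windows-1250 also contains characters that ISO-8859-2 lacks (for example the euro sign), so a strict conversion can fail on them.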

The encoding selected for a document has the highest priority, followed by the project encoding and finally the general encoding. If you change the reading encoding for a document, the document is read again and converted according to the selected encoding. If you change the saving encoding, the document is saved in the selected encoding.
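The priority order can be summarized as a simple fallback chain. This is only a sketch of the rule described above; the function and parameter names are hypothetical, and an unset level is represented by None:

```python
from typing import Optional


def effective_encoding(
    document: Optional[str],
    project: Optional[str],
    general: str,
) -> str:
    """Pick the encoding to use: document first, then project, then general."""
    for candidate in (document, project):
        if candidate:
            return candidate
    return general
```

For example, effective_encoding(None, "cp1250", "iso-8859-2") returns "cp1250", because no document-level encoding is set and the project setting overrides the general one.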

Support for Unicode

To make working with Unicode encoding more comfortable, a mini Unicode editor has been created. This tool lets you copy and paste text containing Unicode entities into a document; the entities are automatically converted into UTF characters.
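The kind of conversion involved can be illustrated with Python's standard html module, which turns numeric and named entities into the characters they stand for. The function name entities_to_utf8 is hypothetical, and the choice of UTF-8 as the output encoding is an assumption of this sketch:

```python
import html


def entities_to_utf8(text: str) -> bytes:
    """Replace HTML/Unicode entities with real characters and encode as UTF-8."""
    return html.unescape(text).encode("utf-8")


# Decimal, named and hexadecimal entities are all handled.
city = entities_to_utf8("&#321;&oacute;d&#x17A;").decode("utf-8")
```

After this runs, city holds the string "Łódź".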

Conversion of national characters

Spider can convert character encodings between many different code pages. You can do this in several ways:

using the Extended Search and Replace tool, which can convert the encoding in the current document, in all open documents, in a selected document, or in all documents of a specific folder or project. You can open its window with the Character Encoding Converter command on the Spelling menu,
quickly converting characters in the current document with the National Characters in the current document command on the Spelling menu,
quickly deleting all Polish characters in the current document or in all open documents with the Delete Polish Characters in Current and Remove Polish Characters commands on the Spelling menu.
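The text does not spell out what deleting Polish characters produces. A common interpretation, assumed here, is that each Polish letter is replaced by its unaccented Latin counterpart. A minimal Python sketch of that interpretation, with a hypothetical function name:

```python
# Translation table mapping each Polish letter to its plain Latin counterpart.
_POLISH_TO_ASCII = str.maketrans(
    "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ",
    "acelnoszzACELNOSZZ",
)


def strip_polish_diacritics(text: str) -> str:
    """Replace Polish letters with unaccented equivalents; leave the rest as is."""
    return text.translate(_POLISH_TO_ASCII)
```

For example, strip_polish_diacritics("Zażółć gęślą jaźń") returns "Zazolc gesla jazn". The result contains only ASCII characters, so it reads the same under any of the code pages discussed here.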

When creating a new document, you can select a code page in the META Section Editor. ISO-8859-2 is set by default.
