PPRS Character Sets
PPRS uses the coded character set named ISO-8859-1. This character set supports special characters such as accented characters used in French and other European languages. The HTML coded character set is also based on ISO-8859-1.
Common word processing applications such as Microsoft Word may use additional characters that are not part of the ISO-8859-1 character set. Instead, these are characters within a set that is alternatively referred to as Windows ANSI, Windows-1252 or CP1252. When used in an HTML browser environment or when used by systems other than Microsoft Windows, these Windows ANSI / CP1252 characters can be misinterpreted, misrepresented or cause other more severe problems.
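The mismatch described above can be seen at the byte level. The following Python sketch is illustrative only and is not part of PPRS; the helper name `fits_iso_8859_1` is hypothetical. It shows a right single quotation mark encoded as Windows-1252 and then read back as ISO-8859-1, which yields an unprintable control character rather than a quotation mark.

```python
# Illustrative only: why a Windows-1252 character is misread as ISO-8859-1.
text = "\u2019"  # right single quotation mark, as produced by "smart quotes"

# Windows-1252 stores this character as the single byte 0x92.
cp1252_bytes = text.encode("cp1252")

# Read as ISO-8859-1, byte 0x92 is a C1 control character, not a quotation mark.
misread = cp1252_bytes.decode("iso-8859-1")


def fits_iso_8859_1(value: str) -> bool:
    """Return True if every character in value can be encoded as ISO-8859-1."""
    try:
        value.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False
```

Accented characters such as "é" pass this check, while the curly quotation mark does not.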
For additional information on the relevant coded character sets, please see the following web sites:
In English: http://en.wikipedia.org/wiki/ISO/IEC_8859-1
In French: https://fr.wikipedia.org/wiki/ISO_8859-1
Special Handling within General Collateral Description and Additional Information fields
To support PPRS users who copy text from common word processing applications into PPRS’ large, free-text fields, PPRS replaces commonly occurring invalid characters with one or more similar ISO-8859-1 characters. This replacement is applied to the General Collateral Description and Additional Information fields only. The following table lists the most common of these characters and their replacement values.
Windows ANSI / CP1252 character | Name | Replacement ISO-8859-1 character(s) | Name |
… | horizontal ellipsis | ... | three periods |
‘ | left single quotation | ' | apostrophe |
’ | right single quotation | ' | apostrophe |
“ | left double quotation | " | quotation |
” | right double quotation | " | quotation |
• | bullet | · | middle dot |
– | en dash | - | hyphen |
— | em dash | - | hyphen |
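The table above can be read as a character-for-character mapping. The following Python sketch shows one way such a replacement could be expressed. It is not the PPRS implementation; the names `REPLACEMENTS` and `replace_known_characters` are hypothetical, and only the mapping itself comes from the table.

```python
# Mapping copied from the replacement table; all names here are hypothetical.
REPLACEMENTS = {
    "\u2026": "...",     # horizontal ellipsis   -> three periods
    "\u2018": "'",       # left single quotation  -> apostrophe
    "\u2019": "'",       # right single quotation -> apostrophe
    "\u201c": '"',       # left double quotation  -> quotation
    "\u201d": '"',       # right double quotation -> quotation
    "\u2022": "\u00b7",  # bullet                 -> middle dot
    "\u2013": "-",       # en dash                -> hyphen
    "\u2014": "-",       # em dash                -> hyphen
}

_TRANSLATION = str.maketrans(REPLACEMENTS)


def replace_known_characters(text: str) -> tuple[str, bool]:
    """Return the cleaned text and a flag saying whether anything was replaced."""
    cleaned = text.translate(_TRANSLATION)
    return cleaned, cleaned != text
```

The returned flag is there because the registrant is notified when a replacement takes place; how that notification is delivered is outside the scope of this sketch.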
When PPRS replaces invalid characters, the registrant is notified that characters have been or will be replaced, so that the registrant can revise or cancel the registration request if desired. Any other character that is not included in ISO-8859-1 and is not listed in the above table is rejected as invalid. If such a character is detected, the registration request fails and an error message is returned to the registrant stating that the registration contains invalid characters.
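The two outcomes described above, replacement with notification or rejection, can be sketched as a simple classification of the characters in a field. This Python example is an assumption about how such a check might look, not the PPRS implementation; `REPLACEABLE` and `classify_characters` are hypothetical names. It also assumes that Python's `iso-8859-1` codec is an acceptable stand-in for the PPRS definition of a valid character, although that codec accepts control characters that PPRS may treat differently.

```python
# Hypothetical sketch: sort a field's characters into "replaceable" and "rejected".
# The replaceable set is the left-hand column of the replacement table.
REPLACEABLE = set("\u2026\u2018\u2019\u201c\u201d\u2022\u2013\u2014")


def classify_characters(text: str) -> tuple[list[str], list[str]]:
    """Return (replaceable, rejected) characters found in text, each sorted."""
    replaceable = set()
    rejected = set()
    for ch in text:
        if ch in REPLACEABLE:
            replaceable.add(ch)  # would be replaced, with a notification
            continue
        try:
            ch.encode("iso-8859-1")  # assumption: codec approximates validity
        except UnicodeEncodeError:
            rejected.add(ch)  # would cause the registration request to fail
    return sorted(replaceable), sorted(rejected)
```

For example, a field containing a trademark symbol and an en dash would report the en dash as replaceable and the trademark symbol as rejected, so the request as a whole would fail.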
Changing your practices to prevent entry of invalid characters
Non-ISO-8859-1 characters typically occur within PPR databases as a result of users copying and pasting data from word processing software (e.g. Microsoft Word). Word processing software introduces non-ISO-8859-1 characters primarily through its "auto correction" and "auto formatting" features. For example, "(tm)" is automatically replaced with a trademark symbol, which is not part of ISO-8859-1. If that text is subsequently copied to a PPR registration screen, it introduces an invalid character. If you wish to disable one or more of these features, please consult the online user guides or the help system available for your word processing software.