OCLC Support

Diacritics or other characters do not come through correctly when importing records using the CONTENTdm Project Client

Symptom
  • When records are imported using the Project Client, diacritics and other characters that come from a different character set display as a question mark or replacement character. For example, if a record contains the pound currency symbol in the text £400, it appears correctly as £400 in the tab-delimited text file, but renders as �400 after the data is uploaded to the Project.
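The symptom can be reproduced outside CONTENTdm. The following is a minimal sketch in Python, not the Project Client's actual code: it assumes the file was saved as ISO-8859-1 and is then read as if it were UTF-8. The function name `simulate_import` is hypothetical and exists only for this illustration.

```python
def simulate_import(text: str, file_encoding: str, assumed_encoding: str) -> str:
    """Encode text the way the file was saved, then decode it the way
    the importing tool assumes. Undecodable bytes become U+FFFD."""
    raw = text.encode(file_encoding)
    return raw.decode(assumed_encoding, errors="replace")


# File saved as ISO-8859-1, but read as UTF-8: the pound sign is lost.
mismatched = simulate_import("£400", "iso-8859-1", "utf-8")
print(mismatched)

# File saved as UTF-8 and read as UTF-8: the pound sign survives.
matched = simulate_import("£400", "utf-8", "utf-8")
print(matched)
```

In ISO-8859-1 the pound sign is the single byte 0xA3, which is not a valid start byte in UTF-8, so a UTF-8 decoder substitutes the replacement character for it.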
     
Applies to
  • CONTENTdm 
Resolution

Ensure you follow these suggestions:

  • Make sure the encoding of the text you are adding matches the encoding setting of the tool you are using. The CONTENTdm Administration site assumes UTF-8 throughout. If you copy and paste text from a source document that is not UTF-8, the characters will be misencoded, so make sure your source document is also in UTF-8.
  • The Project Client offers more flexibility when you add text using a tab-delimited file. The wizards that create records include an option on the field mapping dialogue to choose the character encoding of the source document. The options are "ANSI" (which is ISO-8859-1 or ASCII) or "UTF-8". Your input file must use one of those encodings, and you must choose the corresponding option before proceeding in the Project Client wizard.
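Before running the wizard, it can help to check which encoding the tab-delimited file actually uses. The sketch below is one possible approach in Python and is not part of CONTENTdm. The function names are hypothetical, and it assumes that any file that is not valid UTF-8 was saved as ISO-8859-1, which the check itself cannot prove.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def suggest_option(data: bytes) -> str:
    """Suggest which wizard encoding option matches the file.
    Assumption: anything that is not valid UTF-8 is ISO-8859-1."""
    return "UTF-8" if is_valid_utf8(data) else "ANSI"


def to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes from the assumed source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Sample tab-delimited content, saved two different ways.
sample = "Title\tPrice\nLedger\t£400\n"
ansi_bytes = sample.encode("iso-8859-1")
utf8_bytes = sample.encode("utf-8")

print(suggest_option(ansi_bytes))
print(suggest_option(utf8_bytes))

# Converting the ISO-8859-1 bytes yields the same bytes as saving in UTF-8.
converted = to_utf8(ansi_bytes)
```

To check a real file, read it in binary mode (for example with `pathlib.Path(...).read_bytes()`) and pass the result to `suggest_option`. A file that contains only plain ASCII is valid under both options, so either choice works for it.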
Additional information
  • If records are already loaded and display characters incorrectly, they must be corrected manually. Open each affected record in the CONTENTdm Administration edit window and re-enter or paste the correct characters, making sure the text you paste is encoded as UTF-8.
Page ID
16200