PDF and Image Data Preparation


PDFs work well when you want to analyze visual data within a document rather than text alone. If you are instead analyzing text, such as an interview transcript, a Word document or plain text is the best format for upload.

If your documents are PDFs and you need a converter, we recommend Adobe's free converter:

Reminder: If you are working with text-based documents, it is best to upload them in .doc or .docx (Word document) format. PDFs are a more difficult file format for applications and software to decode. Unless you are interested in visual distinctions within your documents (e.g., pictures, drawings in margins, handwriting, historical records), opt for Word document format instead.

If you are analyzing PDFs and the files are large (e.g., more than 10 pages each), we recommend saving them in smaller chunks to speed up uploads. You can still analyze these documents as if they were one by attaching a Descriptor.
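
If you prefer to split large PDFs with a script rather than by hand, the following is a minimal sketch of one way to do it. It assumes the third-party pypdf package is installed; the function names, the 10-page default, and the "_partNN" file naming are illustrative choices, not requirements of the application.

```python
from pathlib import Path


def chunk_ranges(total_pages, chunk_size=10):
    """Return zero-based (start, stop) page ranges; stop is exclusive."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def split_pdf(source, chunk_size=10):
    """Write each chunk next to the source file and return the new paths."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader, PdfWriter

    source = Path(source)
    reader = PdfReader(source)
    written = []
    ranges = chunk_ranges(len(reader.pages), chunk_size)
    for number, (start, stop) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for index in range(start, stop):
            writer.add_page(reader.pages[index])
        # Hypothetical naming scheme: report.pdf -> report_part01.pdf, ...
        target = source.with_name(f"{source.stem}_part{number:02d}.pdf")
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

For example, calling split_pdf("interview_scans.pdf") on a 25-page file would write three files of 10, 10, and 5 pages. Keeping a shared name stem makes it easier to see which chunks belong together when you attach the same Descriptor to each.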



Supported image formats include .jpeg and .png.

Some general tips include:

  • Use the highest resolution your analysis requires
  • Crop out unwanted areas of the image before upload
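
If you have many images to prepare, the tips above can be scripted. The sketch below assumes the third-party Pillow package is installed; the function names are hypothetical, and the extension check covers only the two formats listed above, since other extensions are not confirmed here.

```python
from pathlib import Path

# Only the formats named in this guide; other extensions are not confirmed.
SUPPORTED_SUFFIXES = {".jpeg", ".png"}


def is_supported_image(path):
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def crop_image(source, box, target):
    """Crop source to box = (left, upper, right, lower) pixels and save it."""
    # Assumption: Pillow is installed (pip install Pillow).
    from PIL import Image

    if not is_supported_image(target):
        raise ValueError(f"unsupported output format: {target}")
    with Image.open(source) as image:
        # Cropping keeps the original pixel density of the region you keep.
        image.crop(box).save(target)
```

For example, crop_image("page_scan.png", (100, 50, 900, 1200), "page_scan_cropped.png") would keep only the region between those pixel coordinates. Cropping rather than downscaling removes unwanted areas without lowering the resolution of the part you analyze.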