FAQs❓ | DocuPanda Help Center

What are credits? How much will this cost me?

In DocuPanda, all operations cost credits at a fixed rate. What you pay per credit depends on your subscription tier: the cheaper tiers cost more per credit than the more expensive tiers.

Most users fall into one of the two usage patterns below. There are other services in DocuPanda, but these two are the bread and butter:

Parsing Only: if all you want is the plain-text representation of your document, which is what you get from uploading a document to DocuPanda, then the cost is 1 credit per page.
Parsing and Standardization: this is the most popular option, where you upload your document and then standardize it according to your schema. Standardization costs an extra 2 credits per page on top of parsing, bringing the total to 3 credits per page.
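
To make the arithmetic concrete, here is a minimal sketch that estimates credits from a page count using the two per-page rates above. The function and constant names are made up for illustration and are not part of DocuPanda; the sketch also ignores any other services you might use.

```python
# Per-page credit rates taken from the two usage patterns described above.
PARSING_CREDITS_PER_PAGE = 1
STANDARDIZATION_CREDITS_PER_PAGE = 2  # charged on top of parsing


def estimate_credits(pages: int, standardize: bool = True) -> int:
    """Return the credits needed to process `pages` pages.

    Hypothetical helper for budgeting only; it does not call DocuPanda.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    per_page = PARSING_CREDITS_PER_PAGE
    if standardize:
        per_page += STANDARDIZATION_CREDITS_PER_PAGE
    return pages * per_page


print(estimate_credits(40, standardize=False))  # parsing only: 40
print(estimate_credits(40))                     # parsing + standardization: 120
```

So a 40-page document costs 40 credits to parse, or 120 credits to parse and standardize.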

Why do I need to Standardize?

You might be asking: wait, what is this standardization thing? Why do I need it at all? Isn't the parsed text output I get from DocuPanda after uploading enough? The answer is: sometimes, but usually not. Why is that?

DocuPanda's parsing uses traditional machine learning methods, such as transformer models for OCR and computer vision models for table extraction and for detecting objects like checkmarks, to deliver a semi-structured output that organizes your documents into pages and sections. Each section can be text, table, or image, and we give you an ordered representation so that if you read the whole thing top to bottom, it will make sense.

However, this is not the same as the totally structured output you receive from standardization. For that, you define a schema with exactly the fields you want to extract, what they mean, and what types they are (date, integer, number, string, etc.). Standardization will give you the same output format every time, regardless of the input document. It is that consistency and predictability that makes it much more useful than just parsing on its own. Parsing by itself is mainly useful if you have a downstream use for that representation, for instance: putting it into an AI system yourself for further business needs.
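
As a rough illustration of what "same output format every time" means, the sketch below models a schema as a mapping from field names to types and checks that a record matches it. This is not DocuPanda's actual schema format or API; the field names and the `conforms` helper are assumptions chosen for the example.

```python
from datetime import date

# Hypothetical schema: field name -> expected Python type.
# DocuPanda's real schema format may look quite different.
INVOICE_SCHEMA = {
    "invoice_number": str,
    "issue_date": date,
    "line_item_count": int,
    "total_amount": float,
}


def conforms(record: dict, schema: dict) -> bool:
    """Return True if `record` has exactly the schema's fields, each with the expected type."""
    if record.keys() != schema.keys():
        return False
    return all(isinstance(record[name], expected) for name, expected in schema.items())


# Two invoices with very different layouts would still yield the same shape.
record_a = {
    "invoice_number": "A-1001",
    "issue_date": date(2024, 3, 1),
    "line_item_count": 3,
    "total_amount": 149.90,
}
record_b = {
    "invoice_number": "2024/77",
    "issue_date": date(2024, 5, 17),
    "line_item_count": 12,
    "total_amount": 2310.00,
}

print(conforms(record_a, INVOICE_SCHEMA))  # True
print(conforms(record_b, INVOICE_SCHEMA))  # True
```

Because every record has the same fields and types, downstream code can consume the results without special-casing each document layout.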

What is the difference between Standardization and Analysis?

Standardization is the workhorse of DocuPanda, covering most use cases. It is the appropriate service when you have many documents that all adhere to a common schema, meaning that you want to extract the same fields from every one of them.

Analysis is more suitable when each document is its own unique snowflake, and you don't know in advance which questions you want to ask of it. For instance, imagine you have your own SaaS in which your users forward you a free-text question about a document (you don't know their question in advance, and it's not a repeated question that you always ask of such documents). In that case, analysis is a better solution, as it does not require a schema and lets you ask ad-hoc questions.

What is Classification? How does that fit in?

Classification is a triaging feature intended for users who do not necessarily know in advance what type of document they are uploading. For instance, they may be routing an incoming stream of faxes directly to DocuPanda, which can include multiple types of documents (e.g., Invoices, Utility Bills, Bank Statements). For such cases, the classification feature can determine which class the document belongs to, usually so that it can be standardized with the appropriate schema.

Since classification is priced much lower than standardization, it is economical to run it first to determine whether standardization is even needed (maybe this document does not fit the intended schema at all), or which schema to use. But for most users, who do know in advance what they are uploading, classification is unnecessary, because you already know the class of the document you uploaded.
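
The classify-first pattern can be sketched as a simple routing step. The class labels, schema names, and `choose_schema` helper below are hypothetical, and the sketch assumes you already have a predicted class label for the document; it does not show how to call DocuPanda itself.

```python
from typing import Optional

# Hypothetical mapping from a predicted class label to the schema to standardize with.
SCHEMA_BY_CLASS = {
    "invoice": "invoice_schema",
    "utility_bill": "utility_bill_schema",
    "bank_statement": "bank_statement_schema",
}


def choose_schema(predicted_class: str) -> Optional[str]:
    """Return the schema name for a class, or None if standardization should be skipped."""
    return SCHEMA_BY_CLASS.get(predicted_class)


for label in ["invoice", "bank_statement", "junk_fax"]:
    schema = choose_schema(label)
    if schema is None:
        # No matching schema: skip standardization and save its credits.
        print(f"{label}: skip standardization")
    else:
        print(f"{label}: standardize with {schema}")
```

Documents whose class has no matching schema never reach standardization, which is where the savings come from.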