Nota Rest API Reference

Introduction

API Endpoint

https://nota-api.herokuapp.com

The Nota API is a RESTful API. Our API has predictable, resource-oriented URLs, and uses HTTP response codes to indicate API errors. We use built-in HTTP features, like HTTP authentication and HTTP verbs, which are understood by off-the-shelf HTTP clients. We support cross-origin resource sharing, allowing you to interact securely with our API from a client-side web application. JSON is returned by all API responses, including errors.

This API is meant to be used in conjunction with the Nota Viewer JavaScript library, which provides a component to load, render, and annotate PDF documents within a web page.

Topics

Authentication

To authorize with HTTP Basic Auth, use this code:

curl https://nota-api.herokuapp.com/projects \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

Authenticate your account when using the API by including your secret API key in the request. Do not share your secret API keys in publicly accessible areas such as GitHub, client-side code, and so forth.

Authentication to the API is performed via HTTP Basic Auth. Provide your API key as the basic auth username value. You do not need to provide a password.
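For clients that build the header themselves, here is a minimal Python sketch of the Basic Auth scheme described above. Python and the helper name `basic_auth_header` are assumptions made for illustration; the key shown is a placeholder, not a real credential.

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build the Authorization header for HTTP Basic Auth.

    The API key is the username and the password is left empty, so the
    credentials string is the key followed by a single colon.
    """
    credentials = f"{api_key}:".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder key for illustration only; substitute your own secret key.
headers = basic_auth_header("nota_test_example")
```

The trailing colon after the key in the curl examples serves the same purpose: it tells curl that the password is empty, so it does not prompt for one.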

To authorize with HTTP Bearer Auth, use this code:

curl https://nota-api.herokuapp.com/hello \
  -H "Authorization: Bearer nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw=="

If you need to support cross-origin requests, authentication to the API can be performed via HTTP Bearer Auth. The Nota API expects the API key to be included in all API requests to the server in a header that looks like the following:

Authorization: Bearer nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==

All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests without authentication will also fail.
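The Bearer scheme and the HTTPS requirement can be combined in a small request builder. This is a sketch using Python's standard library; `bearer_request` and `API_ROOT` are hypothetical names, and the key is a placeholder.

```python
import urllib.request
from urllib.parse import urlsplit

API_ROOT = "https://nota-api.herokuapp.com"


def bearer_request(path: str, api_key: str, root: str = API_ROOT) -> urllib.request.Request:
    """Prepare (but do not send) a GET request that uses Bearer auth."""
    url = root.rstrip("/") + "/" + path.lstrip("/")
    # Refuse plain HTTP up front, since the API rejects it anyway.
    if urlsplit(url).scheme != "https":
        raise ValueError("Nota API requests must use HTTPS")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


# Placeholder key for illustration only.
request = bearer_request("/hello", "nota_test_example")
# To send it: urllib.request.urlopen(request)
```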

Errors

The Nota API uses conventional HTTP response codes to indicate the success or failure of an API request. In general, codes in the 2xx range indicate success, codes in the 4xx range indicate a request that failed given the information provided (e.g., a required parameter was omitted, a charge failed, etc.), and codes in the 5xx range indicate an error with Nota’s servers (these are rare).

Not all errors map cleanly onto HTTP response codes, however. When a request is valid but does not complete successfully (e.g., the document has not completed OCR), we return a 404 error code.

The Nota API uses the following error codes:

Error Code   Meaning
400          Bad Request – Your request is invalid
401          Unauthorized – Your API key is wrong
403          Forbidden – The resource requested is restricted to administrators
404          Not Found – The specified resource could not be found
405          Method Not Allowed – You tried to access the Nota API with an invalid method
406          Not Acceptable – You requested a format that isn’t JSON
410          Gone – The resource requested has been removed from our servers
429          Too Many Requests – You’re requesting too many resources! Slow down!
500          Internal Server Error – We had a problem with our server. Try again later.
503          Service Unavailable – We’re temporarily offline for maintenance. Please try again later.
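A client will usually fold these codes into a few handling decisions. The sketch below is one possible grouping, written in Python. The `Outcome` names and the choice of which codes to retry are assumptions drawn from the "try again later" wording in the table, not behavior specified by the API.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    CLIENT_ERROR = "client_error"   # fix the request before retrying
    RETRY_LATER = "retry_later"     # transient; back off and try again
    SERVER_ERROR = "server_error"


# Assumption: these are the codes worth retrying automatically.
RETRYABLE = {429, 500, 503}


def classify(status: int) -> Outcome:
    """Map an HTTP status code onto a coarse handling decision."""
    if 200 <= status < 300:
        return Outcome.SUCCESS
    if status in RETRYABLE:
        return Outcome.RETRY_LATER
    if 400 <= status < 500:
        return Outcome.CLIENT_ERROR
    if 500 <= status < 600:
        return Outcome.SERVER_ERROR
    raise ValueError(f"unexpected status code: {status}")
```

Because a 404 can also mean that a valid request could not complete yet (for example, OCR has not finished), a caller may want to treat a 404 on a document it just uploaded as temporary rather than final.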

Troubleshooting

Each API request has an associated request identifier. You can find this value in the response headers, under Request-Id. If you need to contact us about a specific request, providing the request identifier will ensure the fastest possible resolution.
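A short sketch of capturing the identifier for logging follows. The function names are hypothetical; the only detail taken from this reference is the Request-Id header name.

```python
from typing import Mapping, Optional


def request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Request-Id header value, or None if it is absent.

    HTTP header names are case-insensitive, so compare in lower case.
    """
    for name, value in headers.items():
        if name.lower() == "request-id":
            return value
    return None


def describe_failure(status: int, headers: Mapping[str, str]) -> str:
    """Format a log line that includes the identifier to quote to support."""
    return f"Nota API request failed: status={status} request_id={request_id(headers)}"
```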

Versioning

When we make backwards-incompatible changes to the API, we release new, dated versions. To use a specific version of the Nota API, pass an HTTP header called “Nota-Version” with the date stamp of the version you would like to use. This HTTP header is optional; if you do not pass it, you will access the latest version of the Nota API.

Example request:

curl https://nota-api.herokuapp.com/hello \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -H "Nota-Version: 2016-05-03"

We will make details about specific breaking changes along with their Nota-Version datestamps available in a change log.
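Pinning a version in client code can be sketched as below. The helper name `version_headers` is hypothetical, and the date check assumes that version stamps use the YYYY-MM-DD form shown in the example request.

```python
import datetime
from typing import Dict, Optional


def version_headers(version: Optional[str] = None) -> Dict[str, str]:
    """Return the Nota-Version header for a dated version, if one is given.

    Passing None omits the header, which selects the latest API version.
    """
    if version is None:
        return {}
    # Raises ValueError if the string is not a valid calendar date.
    parsed = datetime.date.fromisoformat(version)
    return {"Nota-Version": parsed.isoformat()}
```

Pinning the version in one place like this keeps a later breaking change from reaching production code unnoticed.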

Project

The project object

Example response:

{
  "-KGJR9NDL3Nr-t5fz3DS": {
    "group": "demo", 
    "documents": {
        ...
    }, 
    "ocr_page_credit": 10000, 
    "schema": {
        ...
    }
  }
}

The project is a collection of documents along with an annotation schema that specifies how they should be annotated.

Attribute    Type     Description
schema       array    List of annotation types and their properties
documents    array    List of documents that have been uploaded and possibly annotated

List all projects

You can query for all the projects currently associated with your account.

Example request:

curl https://nota-api.herokuapp.com/projects \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

The above command will return JSON structured like this:

{
  "-KGJR9NDL3Nr-t5fz3DS": {
    "group": "demo", 
    "documents": {
        ...
    }, 
    "ocr_page_credit": 10000, 
    "schema": {
        ...
    }
  }
}

Create project

Example request:

curl -X POST https://nota-api.herokuapp.com/projects \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -F "ocrFlags=[0,0]" \
  -F "name=Test" \
  -F "file[]=@1.pdf" \
  -F "file[]=@2.pdf"

The above command will return JSON structured like this:

{
  "documents": {
    "93c52282-7567-41ef-bbea-466c6cee2f59": {
      "file-name": "2.pdf",
      "group": "demo",
      "projects": {
        "-KH7dgaIRl6e-yIemmXb": true
      },
      "size": 135226,
      "status": "ready",
      "textConversionStatus": "pending",
      "total_pages": 1
    },
    "fa2768de-e90e-4980-99dc-dce67ba19a2f": {
      "file-name": "1.pdf",
      "group": "demo",
      "projects": {
        "-KH7dgaIRl6e-yIemmXb": true
      },
      "size": 91208,
      "status": "ready",
      "textConversionStatus": "pending",
      "total_pages": 1
    }
  },
  "name": "-KH7dgaIRl6e-yIemmXb"
}

Create a new project with new documents and a unique schema.

To upload a file to Nota, you’ll need to send a request of type multipart/form-data with two pieces of information. The first is a list of files using the parameter name “file[]”. The second is a list called “ocrFlags”. The following table shows the possible ocrFlag codes:

Code   Description
0      Do not perform OCR on the file
1      Perform OCR on the file
2      Perform OCR on the file only if it contains no text (images only)

The following table summarizes the common error conditions:

Error Message                        Description
Exceeded storage capacity            Total size of all documents cannot exceed 1 GB
File exceeds max file size           Each file cannot exceed 10 MB
Invalid or password-protected PDF    Invalid PDF file

The ocrFlags list is optional and defaults to 0 (do not perform OCR). If the ocrFlags list is present and does not match the file[] list, the request will fail with an HTTP 400 error.

If a file is not a valid PDF document, does not have the .pdf extension, or exceeds one of the above limits, the status of the document will be set to “error” and the PDF content will not be viewable within Nota Viewer.

If the “name” attribute is not supplied or exceeds 300 characters, the request will fail with an HTTP 400 error. If the “description” attribute is supplied and exceeds 3000 characters, the request will fail with an HTTP 400 error. Requests for more than 10,000 pages incur additional charges.
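The rules above can be checked on the client before uploading. The following Python sketch is illustrative only: `check_create_project` is a hypothetical helper, and it assumes that an ocrFlags list "matches" file[] when it has one flag per file.

```python
from typing import List, Optional, Sequence

MAX_NAME_CHARS = 300
MAX_DESCRIPTION_CHARS = 3000
VALID_OCR_FLAGS = {0, 1, 2}


def check_create_project(
    name: Optional[str],
    file_names: Sequence[str],
    ocr_flags: Optional[Sequence[int]] = None,
    description: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not name:
        problems.append("name is required")
    elif len(name) > MAX_NAME_CHARS:
        problems.append("name exceeds 300 characters")
    if description is not None and len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description exceeds 3000 characters")
    if ocr_flags is not None:
        # Assumption: "match" means one flag per file, in the same order.
        if len(ocr_flags) != len(file_names):
            problems.append("ocrFlags length differs from file[] length")
        if any(flag not in VALID_OCR_FLAGS for flag in ocr_flags):
            problems.append("ocrFlags may only contain 0, 1, or 2")
    for file_name in file_names:
        if not file_name.lower().endswith(".pdf"):
            problems.append(f"{file_name} does not have the .pdf extension")
    return problems
```

A missing .pdf extension does not fail the whole request; it only puts that document into the “error” status. The check still reports it so the problem is caught before the upload.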

Update a project

Example request:

curl -X PATCH https://nota-api.herokuapp.com/projects/-KH7dgaIRl6e-yIemmXb \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -F "ocrFlags=[0,0]" \
  -F "file[]=@1.pdf" \
  -F "file[]=@2.pdf"

The above command will return JSON structured like this:

{
 "d3644694-0ba2-4786-84e6-036305cb26a6": {
   "-KGJR9Kml5l17QnRp1E5": {
     "status": "pending"
   }
 },
 "documents": {
   "d3644694-0ba2-4786-84e6-036305cb26a6": {
     "created_at": 1462911500,
     "created_by": "-KGJR9Kml5l17QnRp1E5",
     "file-name": "scan1page.pdf",
     "group": "watchtower",
     "modified_at": 1462911500,
     "modified_by": "-KGJR9Kml5l17QnRp1E5",
     "projects": {
       "-KHR6OdwxsMgzP_5Du5F": true
     },
     "size": 715900,
     "status": "ocrPending",
     "textConversionStatus": "ocrPending",
     "total_pages": 1
   }
 }
}

Use this to add documents to an existing project.

To upload a file to Nota, you’ll need to send a request of type multipart/form-data with two pieces of information. The first is a list of files using the parameter name “file[]”. The second is a list called “ocrFlags”. The following table shows the possible ocrFlag codes:

Code   Description
0      Do not perform OCR on the file
1      Perform OCR on the file
2      Perform OCR on the file only if it contains no text (images only)

The following table summarizes the common error conditions:

Error Message                        Description
Exceeded storage capacity            Total size of all documents cannot exceed 1 GB
File exceeds max file size           Each file cannot exceed 10 MB
Invalid or password-protected PDF    Invalid PDF file

The ocrFlags list is optional and defaults to 0 (do not perform OCR). If the ocrFlags list is present and does not match the file[] list, the request will fail with an HTTP 400 error.

If a file is not a valid PDF document, does not have the .pdf extension, or exceeds one of the above limits, the status of the document will be set to “error” and the PDF content will not be viewable within Nota Viewer.

If the “name” attribute is supplied and exceeds 300 characters, the request will fail with an HTTP 400 error. If the “description” attribute is supplied and exceeds 3000 characters, the request will fail with an HTTP 400 error.

Requests for more than 10,000 pages incur additional charges.
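For clients without a multipart helper, the request body can be assembled by hand. The sketch below uses only the Python standard library and mirrors the fields in the curl example (one "ocrFlags" field and repeated "file[]" parts). The function name `encode_upload` and the per-file Content-Type of application/pdf are assumptions.

```python
import json
import uuid
from typing import List, Sequence, Tuple


def encode_upload(files: Sequence[Tuple[str, bytes]], ocr_flags: List[int]) -> Tuple[bytes, str]:
    """Encode PDF files and OCR flags as a multipart/form-data body.

    Returns the body and the Content-Type header value to send with it.
    """
    boundary = uuid.uuid4().hex
    # ocrFlags is sent as a single field holding a list such as [0,0].
    flags = json.dumps(ocr_flags, separators=(",", ":"))
    parts = [
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="ocrFlags"\r\n\r\n'
            f"{flags}\r\n"
        ).encode("utf-8")
    ]
    for file_name, content in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file[]"; filename="{file_name}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    # The closing delimiter carries two extra trailing hyphens.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body would be sent with a PATCH request to the project URL, using the returned string as the Content-Type header.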

Document

The document object

Example response:

{
  "created_at": 1462657066,
  "created_by": "-KH78v0iAp1GujLrX81l",
  "file-name": "2.pdf",
  "group": "demo",
  "modified_at": 1462657066,
  "modified_by": "-KH78v0iAp1GujLrX81l",
  "projects": {
    "-KHB4EnmQzjImVsqm6Nl": true
  },
  "size": 135226,
  "status": "ready",
  "textConversionStatus": "pending",
  "total_pages": 1
}

The document object represents a PDF document that has been uploaded into the Nota system along with all the metadata associated with it.

Attribute               Type       Description
id                      guid       Document id
file-name               string     Name of the file
group                   string     Group identifier
size                    integer    Size of the file in bytes
status                  string     (ready, ocrPending, error)
textConversionStatus    string     (ready, pending)
total_pages             integer    Total number of pages
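On the client side, a decoded document object can be wrapped in a small typed record. This is a sketch; the `Document` class and its Python-style field names are hypothetical, while the JSON keys are the ones shown in the example response.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Document:
    file_name: str
    group: str
    size: int
    status: str
    text_conversion_status: str
    total_pages: int

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "Document":
        """Build a Document from one decoded document object."""
        return cls(
            file_name=data["file-name"],
            group=data["group"],
            size=data["size"],
            status=data["status"],
            text_conversion_status=data["textConversionStatus"],
            total_pages=data["total_pages"],
        )

    @property
    def has_error(self) -> bool:
        # Documents in the "error" status cannot be shown in Nota Viewer.
        return self.status == "error"
```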

List document ids in a project

Example request:

curl https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/documents/ids \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

The above command will return JSON structured like this:

{
  "documents": [
    "b0334302-2c68-4cf9-8fe6-13b06604a820",
    "63de98d1-8787-41de-b772-cd015d0a77e9",
    "95c29d8x-251f-43f6-9171-f1851ff47d74"
  ]
}

This method returns the list of all document ids that have been uploaded to a project.

List documents in a project

Example request:

curl https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/documents \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

The above command will return JSON structured like this:

{
  "cedaa582-fe4e-4e91-b02d-f52e071b9842": {
    "created_at": 1462657066,
    "created_by": "-KH78v0iAp1GujLrX81l",
    "file-name": "1.pdf",
    "group": "demo",
    "modified_at": 1462657066,
    "modified_by": "-KH78v0iAp1GujLrX81l",
    "projects": {
      "-KHB4EnmQzjImVsqm6Nl": true
    },
    "size": 91208,
    "status": "ready",
    "textConversionStatus": "pending",
    "total_pages": 1
  },
  "d174a3b6-1550-4d1e-a10b-d000a0d0a91e": {
    "created_at": 1462657066,
    "created_by": "-KH78v0iAp1GujLrX81l",
    "file-name": "2.pdf",
    "group": "demo",
    "modified_at": 1462657066,
    "modified_by": "-KH78v0iAp1GujLrX81l",
    "projects": {
      "-KHB4EnmQzjImVsqm6Nl": true
    },
    "size": 135226,
    "status": "ready",
    "textConversionStatus": "pending",
    "total_pages": 1
  }
}

This method returns the list of all documents that have been uploaded to a project. Use list document ids for large projects with more than 400 documents to avoid timeouts.
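For a large project, one approach is to list the ids first and then fetch documents one at a time. The sketch below only builds the per-document URLs from the id listing; `document_urls` and `API_ROOT` are hypothetical names.

```python
from typing import Dict, List
from urllib.parse import quote

API_ROOT = "https://nota-api.herokuapp.com"


def document_urls(project_id: str, id_listing: Dict[str, List[str]]) -> List[str]:
    """Turn the "list document ids" response into one fetch URL per document."""
    base = f"{API_ROOT}/projects/{quote(project_id, safe='')}/documents"
    return [f"{base}/{quote(doc_id, safe='')}" for doc_id in id_listing["documents"]]
```

Each URL can then be requested with "Fetch a document by id", which keeps every response small.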

Fetch a document by id

Example request:

curl https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/documents/d174a3b6-1550-4d1e-a10b-d000a0d0a91e \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

The above command will return JSON structured like this:

{
  "created_at": 1462657066,
  "created_by": "-KH78v0iAp1GujLrX81l",
  "file-name": "2.pdf",
  "group": "demo",
  "modified_at": 1462657066,
  "modified_by": "-KH78v0iAp1GujLrX81l",
  "projects": {
    "-KHB4EnmQzjImVsqm6Nl": true
  },
  "size": 135226,
  "status": "ready",
  "textConversionStatus": "pending",
  "total_pages": 1
}

Returns metadata associated with a document.

Retrieve pdf content

Use this method to retrieve the original PDF document that was uploaded to Nota. If the PDF document was OCR’d, this method will return the OCR’d version of the document.

Example request:

curl https://nota-api.herokuapp.com/documents/2c4382fb-2fd9-4953-8b90-57cac2bb7bf2/pdf \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

Retrieve text content

Use this method to retrieve the formatted text from an uploaded PDF document. If the document needs to be OCR’d and is in status=ocrPending this method will return an error: “unable to fetch text content for document status ocrPending”.
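One way to avoid the ocrPending error is to poll the document's status before requesting the text. The sketch below takes the status lookup as a caller-supplied function, so it makes no network calls itself. The function name, attempt count, and delay are assumptions.

```python
import time
from typing import Callable


def wait_for_text(
    get_status: Callable[[], str],
    attempts: int = 10,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a document's status until it leaves "ocrPending".

    get_status is a caller-supplied function that returns the current
    "status" field of the document. Returns True once the status is
    "ready", and False on "error" or when the attempts run out.
    """
    for attempt in range(attempts):
        status = get_status()
        if status == "ready":
            return True
        if status == "error":
            return False
        # Wait between polls, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

In practice `get_status` would wrap "Fetch a document by id" and return the "status" field of the response.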

Example request:

curl https://nota-api.herokuapp.com/documents/2c4382fb-2fd9-4953-8b90-57cac2bb7bf2/text \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

Delete document

Use this method to delete a document. Send a DELETE request to the document URL, which is /projects/ followed by the project id, then /documents/ followed by the document id.

Example request:

curl -X "DELETE" https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/documents/d174a3b6-1550-4d1e-a10b-d000a0d0a91e \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

Schema

The schema object

Example response:

{
  "0b605e67-ce35-49ff-a017-b5228b81cd1b": {
    "color": [
      135,
      206,
      250,
      0.5
    ],
    "created_at": 1462659116,
    "description": "Long Term Disability description",
    "name": "Long Term Disability"
  }
}

The schema defines a collection of annotation types which each have a name, description and a color. When users highlight text snippets they will select an annotation type to store with the annotation and the color will be used as the highlight color of the annotation within the PDF document.

Attribute      Type      Description
name           string    Name of annotation type
description    string    Description of purpose of the annotation type
color          array     [r, g, b, opacity]
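If you render highlights yourself, the color array maps directly onto a CSS rgba() value. The sketch below assumes that the channels range from 0 to 255 and the opacity from 0 to 1, as the example values suggest. `to_css_rgba` is a hypothetical helper.

```python
from typing import Sequence


def to_css_rgba(color: Sequence[float]) -> str:
    """Convert a schema color [r, g, b, opacity] to a CSS rgba() string."""
    if len(color) != 4:
        raise ValueError("color must have four components: r, g, b, opacity")
    r, g, b, opacity = color
    # Assumption: channels are 0-255 and opacity is 0-1, as in the example.
    if not all(0 <= channel <= 255 for channel in (r, g, b)):
        raise ValueError("r, g, b must be between 0 and 255")
    if not 0 <= opacity <= 1:
        raise ValueError("opacity must be between 0 and 1")
    return f"rgba({int(r)}, {int(g)}, {int(b)}, {opacity})"
```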

Retrieve current schema

Example request:

curl https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/schema \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

The above command will return JSON structured like this:

{
  "0b605e67-ce35-49ff-a017-b5228b81cd1b": {
    "color": [
      135,
      206,
      250,
      0.5
    ],
    "created_at": 1462659116,
    "description": "Long Term Disability description",
    "name": "Long Term Disability"
  }
}

This query returns the existing schema which is a collection of annotation types, each containing a name, description and color. The color will be used as the highlight color in Nota Viewer.

Import schema

Example request:

curl -X PUT https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/schema \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -H "Content-Type: application/json" \
  -d '[{"annotation_color" : [ 135, 206, 250, 0.5 ],
        "annotation_description" : "Long Term Disability description",
        "annotation_name" : "Long Term Disability"
      }]'

The above command will return JSON structured like this:

{
  "0b605e67-ce35-49ff-a017-b5228b81cd1b": {
    "color": [
      135,
      206,
      250,
      0.5
    ],
    "created_at": 1462659116,
    "description": "Long Term Disability description",
    "name": "Long Term Disability"
  }
}

This method allows you to import a JSON schema object and overwrite the existing schema.
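Note that the request body uses the keys annotation_name, annotation_description and annotation_color, while responses use name, description and color. A sketch of converting a retrieved schema into an import body, for example to copy a schema from one project to another, follows. `to_import_payload` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List


def to_import_payload(schema: Dict[str, Dict[str, Any]]) -> str:
    """Convert a schema response into a JSON body for importing a schema."""
    entries: List[Dict[str, Any]] = [
        {
            "annotation_name": annotation_type["name"],
            "annotation_description": annotation_type["description"],
            "annotation_color": annotation_type["color"],
        }
        for annotation_type in schema.values()
    ]
    return json.dumps(entries)
```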

Extend schema

Example request:

curl -X PATCH https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/schema \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -H "Content-Type: application/json" \
  -d '[{"annotation_color" : [ 135, 206, 250, 0.5 ],
        "annotation_description" : "Long Term Disability description",
        "annotation_name" : "Long Term Disability"
      }]'

The above command will return JSON structured like this:

{
  "0b605e67-ce35-49ff-a017-b5228b81cd1b": {
    "color": [
      135,
      206,
      250,
      0.5
    ],
    "created_at": 1462659116,
    "description": "Long Term Disability description",
    "name": "Long Term Disability"
  }
}

This method allows you to extend the existing schema with additional annotation types.

Update existing schema element

Example request:

curl -X PUT https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/schema/annotations/0b605e67-ce35-49ff-a017-b5228b81cd1b \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -H "Content-Type: application/json" \
  -d '{"annotation_color" : [ 135, 206, 250, 0.5 ],
       "annotation_description" : "Short Term Disability description",
       "annotation_name" : "Short Term Disability"}'

The above command will return JSON structured like this:

{
  "color": [
    135,
    206,
    250,
    0.5
  ],
  "description": "Short Term Disability description",
  "name": "Short Term Disability"
}

This method allows you to update the name, description or color of an existing annotation type within the schema.

Annotation

The annotation object

Example response:

{
  "3e0b3775-9d55-46ce-bf42-3cdc756abfdd": {
    "annotation_type_id": "ffa87bdf-6cce-42e3-a280-0105bb836e91",
    "created_at": 1462919484,
    "end": 19,
    "metadata": "Serialized json metadata of your choice",
    "start": 0,
    "text": "Long Term"
  }
}

The annotation object represents a single highlighted text snippet tagged with an annotation type. Each annotation also has a metadata field for storing additional user-specified information, which provides an extension point for this object.

Attribute             Type       Description
annotation_type_id    string     Annotation type id (from the schema)
text                  string     The highlighted text
metadata              string     Additional data about the annotation serialized into a string
start                 integer    Starting offset of a text annotation (available only for imported annotations)
end                   integer    Ending offset of a text annotation (available only for imported annotations)

List annotations in a document

Example request:

curl https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/2c4382fb-2fd9-4953-8b90-57cac2bb7bf2/annotations \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==:

The above command will return JSON structured like this:

{
  "11b11717-63c6-4c64-b297-fc486a0b9402": {
    "annotation_type_id": "ffa87bdf-6cce-42e3-a280-0105bb836e91",
    "created_at": 1462919484,
    "end": 19,
    "metadata": "Serialized json metadata of your choice",
    "start": 0,
    "text": "Long Term"
  }
}

Import annotations

Example request:

curl -X PATCH https://nota-api.herokuapp.com/projects/-KHB4EnmQzjImVsqm6Nl/2c4382fb-2fd9-4953-8b90-57cac2bb7bf2/annotations \
  -u nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==: \
  -H "Content-Type: application/json" \
  -d '[{"start" : 23, "end" : 76, "annotation_type_id" : "0b605e67-ce35-49ff-a017-b5228b81cd1b", "text" : "i was highlighted", "metadata" : "hello world"}]'

The above command will return JSON structured like this:

{
  "3e0b3775-9d55-46ce-bf42-3cdc756abfdd": {
    "annotation_type_id": "0b605e67-ce35-49ff-a017-b5228b81cd1b",
    "created_at": 1462673015,
    "end": 76,
    "metadata": "hello world",
    "start": 23,
    "text": "i was highlighted"
  }
}

Use this method to add annotations to Nota. It accepts an array of annotations whose start and end character offsets index into the formatted text version of the PDF.

Attribute             Type       Description
annotation_type_id    string     Should exist in the project schema
text                  string     The highlighted text (optional)
metadata              string     Additional data about the annotation serialized into a string, with a limit of 2000 characters
start                 integer    The begin index of the character offset, inclusive
end                   integer    The end index of the character offset, exclusive
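An inclusive start and exclusive end is the same convention as a Python slice, so offsets can be derived directly from the formatted text. The sketch below builds one import entry. `make_annotation` is a hypothetical helper, and it assumes the API counts offsets the way Python counts characters in a string.

```python
from typing import Any, Dict


def make_annotation(document_text: str, snippet: str, annotation_type_id: str,
                    metadata: str = "") -> Dict[str, Any]:
    """Build one import entry for the first occurrence of snippet."""
    start = document_text.find(snippet)
    if start == -1:
        raise ValueError("snippet not found in document text")
    if len(metadata) > 2000:
        raise ValueError("metadata is limited to 2000 characters")
    end = start + len(snippet)  # exclusive, so document_text[start:end] == snippet
    return {
        "start": start,
        "end": end,
        "annotation_type_id": annotation_type_id,
        "text": snippet,
        "metadata": metadata,
    }
```

The `document_text` argument would come from "Retrieve text content", since the offsets refer to that formatted text rather than to the raw PDF.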

Export annotations

Example request:

curl -X GET -H "Authorization: Bearer nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==" \
  -H "Content-Type: application/json" \
  "https://nota-api.herokuapp.com/projects/-KIELPASHUY7OcJqH2Ed/89f29ff1-1498-4d7d-ad38-becf06b85b72/annotations/offsets"

The above command will return JSON structured like this:

{
  "0d32509e-2a9a-4abf-918d-31b2344b3653": {
    "annotation_type_id": "30359a24-f128-4314-a459-66f9e9b68548",
    "created_at": 1464119863473,
    "created_by": "-KH78v0iAp1GujLrX81l",
    "document_sort_order_number": 16001437.478337755,
    "end": 55865,
    "group": "demo",
    "highlights": {
      "p15": [
        [
          0.2083965407477485,
          0.14166386836803224,
          0.3310060501098633,
          0.15509570181329752
        ]
      ]
    },
    "metadata": "Create any inputs for collecting annotation metadata and pass the information to the API as a json block.",
    "start": 55848,
    "text": "Guardian business"
  }
}

Use this method to export annotation offsets from Nota. The URL is projects/ followed by the project id, the document id, and then annotations/offsets. The method computes the character offsets of each annotation in the formatted text version of the document. The response is a map whose keys are annotation ids and whose values are the attributes of each annotation, including the “start” and “end” character offsets. Increase your request timeouts for long documents with many annotations, as computing the offsets can take some time.
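Because the export is keyed by annotation id, a consumer will often want the annotations in document order instead. The following sketch flattens and sorts a decoded response. The function name is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def spans_in_reading_order(export: Dict[str, Dict[str, Any]]) -> List[Tuple[str, int, int, str]]:
    """Flatten an offsets export into (id, start, end, text) sorted by start."""
    spans = [
        (annotation_id, attrs["start"], attrs["end"], attrs["text"])
        for annotation_id, attrs in export.items()
    ]
    return sorted(spans, key=lambda span: (span[1], span[2]))
```

When fetching the export with Python's urllib, a longer timeout can be set with the `timeout` argument of `urllib.request.urlopen`.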

Miscellaneous

Usage

Example request:

curl -X GET -H "Authorization: Bearer nota_test_LUtINzh2MGlBcDFHdWpMclg4MWwsZGVtbw==" \
 "https://nota-api.herokuapp.com/usage"

The above command will return JSON structured like this:

{
  "content_length_used": 1181685,
  "max_content_length": 50000000,
  "members": {
    "-KH78v0iAp1GujLrX81l": true
  },
  "name": "demo",
  "ocr_page_credit": 10000,
  "ocr_pages_used": 0
}

Use this method to check your content size / OCR page limits and current usage.
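A small sketch of turning the usage response into remaining headroom follows. It assumes that ocr_page_credit is the total allowance and ocr_pages_used counts against it, which this reference does not state explicitly. `remaining_quota` is a hypothetical helper.

```python
from typing import Any, Dict


def remaining_quota(usage: Dict[str, Any]) -> Dict[str, int]:
    """Compute unused storage bytes and OCR pages from a usage response."""
    return {
        "bytes_remaining": usage["max_content_length"] - usage["content_length_used"],
        # Assumption: the credit is a total allowance, not a running balance.
        "ocr_pages_remaining": usage["ocr_page_credit"] - usage["ocr_pages_used"],
    }
```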