REST API

AddSearch’s REST API provides programmatic access to use and manipulate data in your search index. The current version of the API is v1. We’re expanding the API’s functionality based on your feedback, so feel free to contact us with any ideas.

Base URL

All API URLs start with the following base URL:
https://api.addsearch.com/v1
Access is always over HTTPS. All calls made over plain HTTP return 405 Method Not Allowed.

Content type

All API endpoints consume and produce JSON:
application/json
Calls with a JSON payload must include the Content-Type header, which can be added in curl with the following switch:

curl -H 'Content-Type:application/json' https://api...

Authentication

Authentication is done with HTTP Basic Auth. Your index’s SITEKEY is the username and your secret API key is the password. You’ll find your SITEKEY and secret API key on the Dashboard’s Installation page. HTTP authentication in curl is done with the user switch:

curl --user 'sitekey:secret-api-key' https://api...
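
For clients that do not use curl, the two switches above map to two request headers. The following Python sketch builds them by hand. The function name build_headers is an assumption made for this example and is not part of the AddSearch API.

```python
import base64


def build_headers(sitekey: str, secret_api_key: str) -> dict:
    """Return Content-Type and HTTP Basic Auth headers for an API call."""
    # Basic Auth is base64("username:password"); here the SITEKEY is the
    # username and the secret API key is the password.
    credentials = f"{sitekey}:{secret_api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }
```

Calling build_headers with the SITEKEY and secret API key from your Dashboard gives a dictionary that most HTTP client libraries accept as request headers.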

Date Format

The AddSearch API uses the ISO 8601 standard as its date format. An example of an accepted timestamp is:
2015-01-30T11:17:22-02:00
Read more about ISO 8601 at w3.org.
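
As an illustration, timestamps in this format can be read with Python's standard library. This is a minimal sketch. The helper name parse_timestamp is hypothetical, and the handling of a trailing "Z" is only there because Python versions before 3.11 do not accept it in fromisoformat().

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware datetime."""
    # Rewrite a trailing "Z" (UTC) as an explicit offset so that the
    # call also works on Python versions older than 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


parsed = parse_timestamp("2015-01-30T11:17:22-02:00")
# Normalise to UTC, for example before comparing or storing the value.
as_utc = parsed.astimezone(timezone.utc)
```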

Rate limits

By default, rate limits are monitored over a 15-minute period. Every API call returns rate limit information in the following headers:

X-Rate-Limit-Limit: The limit for a given request

X-Rate-Limit-Remaining: Requests left for the current 15-minute window

X-Rate-Limit-Reset: The time when the current usage count resets (seconds since Unix epoch)

Example headers returned by an API call:

X-Rate-Limit-Limit: 100

X-Rate-Limit-Remaining: 97

X-Rate-Limit-Reset: 1422615270
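
A client can use these headers to decide when to pause. The sketch below is one possible approach, not an official client. The function names are hypothetical, and it assumes the headers are available as a plain dictionary with exactly the capitalisation shown above (real HTTP libraries usually treat header names case-insensitively).

```python
from datetime import datetime, timezone


def parse_rate_limit(headers: dict) -> dict:
    """Extract the rate limit fields from a response's headers."""
    limit = int(headers["X-Rate-Limit-Limit"])
    remaining = int(headers["X-Rate-Limit-Remaining"])
    # The reset value is given in seconds since the Unix epoch.
    reset_at = datetime.fromtimestamp(
        int(headers["X-Rate-Limit-Reset"]), tz=timezone.utc
    )
    return {"limit": limit, "remaining": remaining, "reset_at": reset_at}


def seconds_to_wait(info: dict, now: datetime) -> float:
    """Return how long to pause before the next call (0 if calls remain)."""
    if info["remaining"] > 0:
        return 0.0
    return max(0.0, (info["reset_at"] - now).total_seconds())
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the calculation easy to test.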

API Endpoints

Get document’s status

You can get the status of a document with the following request. The doc id is the MD5 hash of the full URL, including the protocol and any query parameters. For example, the doc id of http://www.addsearch.com/ is 3b1d053e2fdf65f178dc5d1b5bd00f75

GET /indices/{index public key}/documents/{doc id}
The API call returns the following information:
{
"indexPublicKey": "index public key",
"docId": "md5 of url",
"status": "INDEXED|EXCLUDED|PENDING|ERROR|UNKNOWN",
"statusInfo": "Duplicate of another-doc-id",
"lastFetched": "2015-01-13T13:43:01.000Z",
"duplicateOf": {
"href": "https://api.addsearch.com/v1/indices/{index public key}/documents/{doc id}"
},
"content": {
"href": "https://api.addsearch.com/v1/indices/{index public key}/documents/{doc id}/content"
}
}
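
The doc id needed for this request can be computed locally. The following sketch shows one way to do it in Python. The function names doc_id and status_url are hypothetical, and the code assumes the URL is hashed as UTF-8 bytes exactly as written.

```python
import hashlib

BASE_URL = "https://api.addsearch.com/v1"


def doc_id(url: str) -> str:
    """Return the MD5 hex digest of the full URL, used as the doc id."""
    # Assumption: the URL is hashed as UTF-8 bytes, exactly as written
    # (protocol, path and any query parameters included).
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def status_url(index_public_key: str, url: str) -> str:
    """Build the document status endpoint URL for a page."""
    return f"{BASE_URL}/indices/{index_public_key}/documents/{doc_id(url)}"
```

Because the hash covers the whole string, URLs that differ only in a trailing slash or in their query parameters produce different doc ids.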

Get document’s contents

Following the link in the “content” property of the document’s status response returns the indexed content of the document.

GET /indices/{index public key}/documents/{doc id}/content

The response is of the following form:

{
"title": "An example page",
"h1": "The heading on an example page",
"h2": "",
"mainContent": "The indexed content on an example page",
"documentDate": "2015-02-10T14:11:13.000Z",
"language": "en",
"hiddenKeywords": null
}

The field “documentDate” is the ISO 8601 date on which the document was created, if that information is available in the source document’s metadata. Otherwise, it is the date the document was initially indexed.

Hidden keywords is a space-delimited list of manually defined keywords. When used as search keywords, they match this document even though they are not present in the document’s indexed content.
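
Since “hiddenKeywords” may be null, client code usually has to handle both cases. A small sketch, using a hypothetical helper named hidden_keywords and a shortened copy of the response above:

```python
import json


def hidden_keywords(content: dict) -> list:
    """Return the document's hidden keywords as a list of strings."""
    # "hiddenKeywords" is null (None in Python) when no keywords are set.
    raw = content.get("hiddenKeywords") or ""
    # split() with no argument also tolerates repeated spaces.
    return raw.split()


body = '{"title": "An example page", "language": "en", "hiddenKeywords": null}'
content = json.loads(body)
keywords = hidden_keywords(content)
```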

Modify document’s hidden keywords

You can modify the hidden keywords of a document by POSTing the new value to this endpoint.

POST /indices/{index public key}/documents/{doc id}/content/hiddenKeywords

payload:

{
"hiddenKeywords": "list of space delimited hidden keywords"
}

The endpoint returns HTTP 200 OK if successful.
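
The request described above could be assembled as follows. This is a sketch only: the function name is hypothetical, it assumes the index public key in the path is the same value as the SITEKEY used for authentication, and it builds the request without sending it.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.addsearch.com/v1"


def hidden_keywords_request(sitekey, secret_api_key, doc_id, keywords):
    """Build (but do not send) the POST request that sets hidden keywords."""
    # Assumption: the index public key in the path equals the SITEKEY.
    url = f"{BASE_URL}/indices/{sitekey}/documents/{doc_id}/content/hiddenKeywords"
    # The API expects a single space-delimited string.
    payload = json.dumps({"hiddenKeywords": " ".join(keywords)}).encode("utf-8")
    credentials = f"{sitekey}:{secret_api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen() would send the request.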

Add a page to the index or re-crawl a URL

You can add new pages to your index or re-crawl existing documents with the following endpoint. Re-indexing is executed within a minute or two.

POST /crawler
payload:
{
"action": "FETCH",
"indexPublicKey": "SITEKEY",
"url": "http://foo.com/bar.html"
}

Returns HTTP 202 Accepted with a payload such as:

{
"message": "Scheduled",
"docId": "doc id"
}
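
A crawl request like the one above can be prepared in Python as sketched below. The function names fetch_request and scheduled_doc_id are hypothetical. The request is built but not sent, and the Authorization header is left out for brevity, so it must be added before use.

```python
import json
import urllib.request

CRAWLER_URL = "https://api.addsearch.com/v1/crawler"


def fetch_request(index_public_key: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) a FETCH request for the crawler endpoint."""
    payload = {
        "action": "FETCH",
        "indexPublicKey": index_public_key,
        "url": page_url,
    }
    return urllib.request.Request(
        CRAWLER_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        # An HTTP Basic Auth header must still be added before sending.
        headers={"Content-Type": "application/json"},
    )


def scheduled_doc_id(response_body: str) -> str:
    """Return the doc id from a 202 response body like the one shown above."""
    return json.loads(response_body)["docId"]
```

The doc id returned in the response can then be used with the document status endpoint to check when the page has been indexed.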