AddSearch REST API

AddSearch’s REST API provides programmatic access to query and manipulate the data in your search index. The current version of the API is v1. We are expanding the API’s functionality based on your feedback, so feel free to contact us with any ideas.

Base URL

All API URLs start with the following base URL:
https://api.addsearch.com/v1
Access is always over HTTPS. All calls over plain HTTP return 405 Method Not Allowed.

Content type

API endpoints consume and produce JSON:
application/json
Calls with a JSON payload must include the Content-Type header, which can be added in curl with the following switch:

curl -H 'Content-Type:application/json' https://api...

Authentication

Authentication is done with HTTP Basic Auth. Your index’s SITEKEY is the username and your secret API key is the password. You’ll find your SITEKEY and secret API key on the Dashboard’s Installation page. HTTP authentication in curl is done with the user switch:

curl --user 'sitekey:secret-api-key' https://api...

Please note: the Search API does not require authentication.
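
For clients that do not use curl, the same Basic Auth header can be built by hand. The following is a minimal Python sketch using only the standard library; the helper names are illustrative and not part of the AddSearch API, and the request is built but not sent.

```python
import base64
import urllib.request

BASE_URL = "https://api.addsearch.com/v1"


def basic_auth_header(sitekey: str, secret_api_key: str) -> str:
    """Return the value of an HTTP Basic Auth 'Authorization' header."""
    credentials = f"{sitekey}:{secret_api_key}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def authenticated_request(path: str, sitekey: str, secret_api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for an API path."""
    request = urllib.request.Request(BASE_URL + path)
    request.add_header("Authorization", basic_auth_header(sitekey, secret_api_key))
    return request
```

Passing the returned object to urllib.request.urlopen would send the request.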

Date Format

The AddSearch API uses the ISO 8601 standard as its date format. An example of an accepted timestamp is:
2015-01-30T11:17:22-02:00
Read more about ISO 8601 at w3.org
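
As an illustration, timestamps in this format can be parsed and produced with Python's standard datetime module. This is a sketch only; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a UTC offset, e.g. 2015-01-30T11:17:22-02:00."""
    return datetime.fromisoformat(value)


def format_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string with second precision."""
    return moment.replace(microsecond=0).isoformat()


timestamp = parse_timestamp("2015-01-30T11:17:22-02:00")
offset = timestamp.utcoffset()  # timedelta of minus two hours
```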

Rate limits

By default, rate limits are monitored over a 15-minute period. Every API call that is rate limited returns rate limit information in the following headers:

X-Rate-Limit-Limit: The limit for a given request
X-Rate-Limit-Remaining: Requests left for the current 15 minute window
X-Rate-Limit-Reset: The time when the current usage count resets (seconds since Unix epoch)

Example headers returned by an API call:

X-Rate-Limit-Limit: 100
X-Rate-Limit-Remaining: 97
X-Rate-Limit-Reset: 1422615270
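
A client might use these headers to decide how long to pause before sending the next request. The sketch below assumes the headers are available as a plain mapping with exactly the names shown above (a plain dict is case-sensitive; HTTP libraries usually provide a case-insensitive mapping). The function name and the "now" parameter are illustrative.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying; 0 while requests remain."""
    # Endpoints without limits may omit the headers, so assume quota is left.
    remaining = int(headers.get("X-Rate-Limit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = int(headers["X-Rate-Limit-Reset"])  # seconds since Unix epoch
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```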

Query limits

The value of the term parameter has the following limits, which are sufficient for most search queries:
Maximum length: 150 characters
Maximum number of words: 10 words
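
A client can check these limits before sending a query. The sketch below assumes that words are separated by whitespace; how the API itself counts words, and how it responds when a limit is exceeded, is not specified here.

```python
MAX_TERM_LENGTH = 150  # characters
MAX_TERM_WORDS = 10


def term_within_limits(term: str) -> bool:
    """Return True if the search term respects the documented limits."""
    # Assumption: words are counted by splitting on whitespace.
    return len(term) <= MAX_TERM_LENGTH and len(term.split()) <= MAX_TERM_WORDS
```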

API Endpoints

Search


The official Search API Client for JavaScript is available on npm:
npmjs.com/package/addsearch-js-client.

Make search queries to your AddSearch index with the following endpoint:
GET /search/{index public key}

Mandatory query parameters are:

  • term: Search term (aka keyword)

Optional query parameters are:

  • limit: Number of results to return per page (default: 10; must be between 1 and 300)
  • page: Page to return (default: 1)
  • jsonp: JavaScript function call wrapped around the response JSON
  • lang: Return results only with this language (e.g. “en” or “de”)
  • categories: Limit search to certain categories (domain or URL path). E.g. “0xdomain.com” would return results only from domain.com, and “1xnews” would return results only from the “domain.com/news/*” path
  • sort: relevance (default) or date
  • order: desc (default) or asc. Only applicable if “sort=date”
  • fuzzy: false (default) or true. When true, also match words that are close to the given keywords. The suggested way to use fuzzy search is to search first with it off and then, if there are no exact results, to send another request with it turned on.
  • dateFrom: Return only results that are newer than the given date, in yyyy-MM-dd format (example: 2018-12-15)
  • dateTo: Return only results that are older than the given date, in yyyy-MM-dd format
  • customField: return only results containing the given custom field and value pair, in “key=value” URL encoded format (so key%3Dvalue). Multiple custom field pairs can be defined by adding additional customField parameters to the query. If the same custom field name is given with different values, results with any one value will be returned (i.e. OR match). If multiple custom field names are defined, results must match each criterion. Example: &customField=city%3Dlondon&customField=genre%3Drock&customField=genre%3Dpop (city=London AND (genre=rock OR genre=pop))
  • resultType: all (default) or organic (results without Pinned results or Promotions)
  • userToken: user token for search personalization
  • facet: Return the categories that the search results belong to. Pass one or more custom fields to receive facets of those fields. For example, facet=genre&facet=artist would return search result aggregations by the genre and artist custom fields.
  • numFacets: Limit the maximum number of aggregations returned for each field defined with the facet parameter. Default is 10, maximum value is 100.

Please note: the Search API uses your public SITEKEY, not your secret API key.
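
The parameters listed above can be assembled into a request URL along the following lines. This is a minimal sketch using Python's standard library; build_search_url and its arguments are illustrative names rather than part of an official client. Custom field pairs are passed as (key, value) tuples and encoded as repeated customField parameters.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.addsearch.com/v1"


def build_search_url(sitekey, term, custom_fields=(), **options):
    """Build a Search API URL.

    custom_fields: iterable of (key, value) pairs, sent as customField=key%3Dvalue.
    options: any other query parameter, e.g. limit=20 or facet=["genre", "artist"].
    """
    params = [("term", term)]
    params.extend(options.items())
    params.extend(("customField", f"{key}={value}") for key, value in custom_fields)
    # doseq=True expands list values into repeated parameters (facet=a&facet=b).
    return f"{BASE_URL}/search/{sitekey}?{urlencode(params, doseq=True)}"
```

For instance, passing custom_fields=[("city", "london"), ("genre", "rock"), ("genre", "pop")] produces the customField query string shown in the customField description.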

For example:
https://api.addsearch.com/v1/search/1bed1ffde465fddba2a53ad3ce69e6c2?term=rest+api
Returns:

{
  "page": 1,
  "total_hits": 1,
  "hits": [
    {
      "id": "54f5b92d4e4766f4bc0ce2b05f80f58d",
      "url": "https://www.addsearch.com/developers/api/",
      "title": "AddSearch REST API",
      "meta_description": "Documentation of our REST API",
      "meta_categories": ["features", "api"], // <meta name="addsearch-category" content="features/api" />
      "custom_fields": {
        "location": "London",
        "genre": ["Rock", "Pop"]
      },
      "highlight": "AddSearch’s <em>REST API</em> provides programmatic access to your search index",
      "ts": "2015-01-22T11:56:10",
      "categories": [
        "0xwww.addsearch.com",
        "1xdevelopers",
        "2xapi"
      ],
      "images": {
        "main": "https://d20vwa69zln1wj.cloudfront.net/1bed1ffde465fddba...",
        "main_b64": "/9j/4AAQSkZJRgABAQIAHAAcAAD/2wBDACgcHiMeGSgjISMtKygwPG...",
        "capture": "https://d20vwa69zln1wj.cloudfront.net/1bed1ffde465fddba..."
      },
      "score": 0.790107
    }
  ],
  "facets": null
}

Fields in the returned JSON are:

  • page: Page number passed as a query parameter
  • total_hits: Total number of documents matching the search term
  • facets: Aggregations for the custom fields requested with the facet parameter: the values found in the search results, each with its count of matching documents. In the example above, where no facet parameter was passed, the value is null

Elements in the hits array:

  • id: Document’s ID (MD5 of the URL)
  • url: Document’s URL
  • title: Document’s title
  • meta_description: Document’s meta description
  • meta_categories: Categories defined in the page’s addsearch-category meta tag (e.g. content="features/api" yields ["features", "api"])
  • highlight: Part of the document’s content
  • ts: Document’s publishing date. If unknown, the time when the document was initially indexed
  • categories: Categories that the page belongs to. These can be used to narrow the search query down to a specific domain or path part
  • images.main: URL of the main image (e.g. og:image). Null if missing
  • images.main_b64: Low-resolution version of the main image as a base64-encoded string
  • images.capture: URL of the screen capture. Null if missing
  • score: How well the search term matches the document
  • custom_fields: Custom fields defined for the document. Each value is either a string or a string array (if multiple values are defined)
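
The response fields can be consumed as in the following sketch, which also illustrates the suggested fuzzy fallback: search with fuzzy matching off first, and retry with it on only if there are no hits. The fetch argument is a hypothetical callable that takes a dictionary of query parameters and returns the parsed JSON response, so any HTTP library can be plugged in.

```python
from typing import Callable, Dict, List


def hit_summaries(response: dict) -> List[str]:
    """Return one 'title <url>' line per element of the hits array."""
    return [f"{hit['title']} <{hit['url']}>" for hit in response.get("hits", [])]


def search_with_fuzzy_fallback(fetch: Callable[[Dict[str, str]], dict], term: str) -> dict:
    """Search with exact matching first; retry with fuzzy=true if nothing matched."""
    response = fetch({"term": term})
    if response.get("total_hits", 0) == 0:
        response = fetch({"term": term, "fuzzy": "true"})
    return response
```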

Please note: the Search API does not require authentication.

The rate limit for the Search API is 5 requests/sec from a single IP address. There are no limits based on the total search volume or the number of requests coming from different IP addresses. If you implement search-as-you-type functionality, we recommend throttling with a delay of about 200 ms between requests.
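
One way to apply such a delay on the client side is a simple throttle that refuses to send a request if the previous one was sent less than 200 ms ago. The class below is a sketch under that assumption, not a prescribed implementation; the clock parameter exists only to make the behaviour testable.

```python
import time


class Throttle:
    """Allow at most one request per min_interval seconds."""

    def __init__(self, min_interval: float = 0.2, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._last_sent = None  # time of the last allowed request

    def allow(self) -> bool:
        """Return True if a request may be sent now, and record it as sent."""
        now = self.clock()
        if self._last_sent is not None and now - self._last_sent < self.min_interval:
            return False
        self._last_sent = now
        return True
```

A throttle of this kind drops keystrokes that arrive too quickly, so the caller may also want to send one final request once typing has stopped.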

Search suggestions


The official Search API Client for JavaScript is available on npm:
npmjs.com/package/addsearch-js-client.

Search suggestions can be queried from the following API endpoint:
GET /suggest/{index public key}

Mandatory query parameters are:

  • term: Search suggestion prefix

Optional query parameters are:

  • size: Number of search suggestions to return. Default value is 10

For example:
https://api.addsearch.com/v1/suggest/1bed1ffde465fddba2a53ad3ce69e6c2?term=api
Returns:

{
  "suggestions": [
    {"value":"api"},
    {"value":"api reference"},
    {"value":"rest api"}
  ]
}

Please note: the Suggestions API does not require authentication.

The default rate limit for the Suggestions API is 5 requests/sec from a single IP address. A higher rate limit can be requested from support.
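
A suggestion request and its response could be handled as sketched below. The function names are illustrative; only the endpoint, the term and size parameters, and the response shape come from the description above.

```python
from typing import List
from urllib.parse import urlencode

BASE_URL = "https://api.addsearch.com/v1"


def build_suggest_url(sitekey: str, prefix: str, size: int = 10) -> str:
    """Build a Suggestions API URL for the given prefix."""
    query = urlencode({"term": prefix, "size": size})
    return f"{BASE_URL}/suggest/{sitekey}?{query}"


def suggestion_values(response: dict) -> List[str]:
    """Extract the plain suggestion strings from a parsed response."""
    return [item["value"] for item in response.get("suggestions", [])]
```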

Get document’s status

The status of a document can be queried with the following request. The doc id is the MD5 hash of the full URL, including the protocol and any query parameters. For example, the doc id of https://www.addsearch.com/ is 3b1d053e2fdf65f178dc5d1b5bd00f75.

GET /indices/{index public key}/documents/{doc id}
The API call returns the following information:

{
  "indexPublicKey": "index public key",
  "docId": "md5 of url",
  "status": "INDEXED|EXCLUDED|PENDING|ERROR|UNKNOWN",
  "statusInfo": "Duplicate of another-doc-id",
  "lastFetched": "2015-01-13T13:43:01.000Z",
  "duplicateOf": {
    "href": "https://api.addsearch.com/v1/indices/{index public key}/documents/{doc id}"
  },
  "content": {
    "href": "https://api.addsearch.com/v1/indices/{index public key}/documents/{doc id}/content"
  }
}
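
The doc id described above can be computed locally before calling the endpoint. The sketch below assumes the URL is hashed as UTF-8 text exactly as given, with no normalisation; the function names are illustrative.

```python
import hashlib


def doc_id(url: str) -> str:
    """Return the MD5 hex digest of a full URL (protocol and query string included)."""
    # Assumption: the URL is encoded as UTF-8 and hashed without normalisation.
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def document_status_path(sitekey: str, url: str) -> str:
    """Return the API path of the status endpoint for the given document URL."""
    return f"/indices/{sitekey}/documents/{doc_id(url)}"
```

Because the hash covers the whole string, URLs that differ only in a trailing slash or a query parameter produce different doc ids.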

Get document’s contents

Following the link in the “content” property of the document’s status response returns the indexed content of the document.

GET /indices/{index public key}/documents/{doc id}/content

The response is of the following form:

{
  "title": "An example page",
  "h1": "The heading on an example page",
  "h2": "",
  "mainContent": "The indexed content on an example page",
  "documentDate": "2015-02-10T14:11:13.000Z",
  "language": "en",
  "hiddenKeywords": null
}

The field “documentDate” is the ISO 8601 date on which the document was created, if that information is available in the source document’s metadata; otherwise, it is the date on which the document was initially indexed.

Hidden keywords is a space-delimited list of manually defined keywords. When used as search keywords, they match this document even though they are not present in the document’s indexed content.

Modify document’s hidden keywords

Hidden keywords can be modified by POSTing the keyword value to the following endpoint.

POST /indices/{index public key}/documents/{doc id}/content/hiddenKeywords

payload:

{
  "hiddenKeywords": "list of space delimited hidden keywords"
}

The endpoint returns HTTP 200 OK if successful.
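
Such a request could be prepared as follows. This sketch builds the request with Python's standard library without sending it; the function name is hypothetical, and it assumes this endpoint uses the HTTP Basic Auth and JSON content type described earlier.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.addsearch.com/v1"


def hidden_keywords_request(sitekey, secret_api_key, doc_id, keywords):
    """Build (but do not send) a POST request that sets a document's hidden keywords."""
    url = f"{BASE_URL}/indices/{sitekey}/documents/{doc_id}/content/hiddenKeywords"
    # The API expects a single space-delimited string of keywords.
    body = json.dumps({"hiddenKeywords": " ".join(keywords)}).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    token = base64.b64encode(f"{sitekey}:{secret_api_key}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    return request
```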

Add a page to the index or re-crawl a URL

Add new pages to your index, or re-crawl existing documents, with the following endpoint.

POST /crawler

payload:

{
  "action": "FETCH",
  "indexPublicKey": "SITEKEY",
  "url": "http://foo.com/bar.html"
}

Returns HTTP 202 ACCEPTED with a payload such as:

{
  "message": "Scheduled",
  "docId": "doc id"
}

Indexing is executed within a minute of the API call.
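
The crawl request body and the handling of its response might look like the sketch below. The helper names are illustrative, and treating any status other than 202 as "not scheduled" is an assumption about error handling that the description above does not spell out.

```python
import json
from typing import Optional


def fetch_payload(sitekey: str, url: str) -> str:
    """Return the JSON body for a POST /crawler request that (re-)fetches a URL."""
    return json.dumps({"action": "FETCH", "indexPublicKey": sitekey, "url": url})


def scheduled_doc_id(status_code: int, body: str) -> Optional[str]:
    """Return the docId from a 202 response, or None for any other status."""
    if status_code != 202:
        return None
    return json.loads(body).get("docId")
```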