Digitize Document for Archiving

Learn how to digitize a physical document for digital archiving and search.

This guide walks you through digitizing a physical document from a photo of it. This is a common requirement for businesses that want to move to a paperless office, archive their documents, and make them easily searchable.

See in Postman

This guide's API calls are available as a Postman collection. You can use it to quickly test the API and see how it works.

Business Problem

Imagine you work for a law firm that has thousands of physical case files. Finding a specific document can be a time-consuming process. The firm wants to create a digital archive of all its case files to make them easily searchable and accessible.

Solution

We can solve this problem by using the Scan Documents API to digitize the physical documents and prepare them for archiving. The process involves:

  1. Uploading an Image: Taking a picture of the document and uploading it to the API.
  2. Detecting the Document: Automatically identifying the document within the image.
  3. Warping the Document: Correcting the perspective of the document to make it look flat.
  4. Enhancing the Colors: Improving the readability of the document by applying a scanner-like effect.

This process will result in a high-quality digital version of the physical document, which can then be stored in a document management system and indexed for full-text search.

Digitized Document Example

Step 1: Upload the Image

First, you need to upload the image containing the document to the Scan Documents API. You can do this by sending a POST request to the /v1/files endpoint.

Upload a File

Creates a new file

curl -X POST "https://api.scan-documents.com/v1/files" \
  -H "x-api-key: YOUR_API_KEY" \
  -F name="Case File p. 1" \
  -F file="@/path/to/your/document.jpg"
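If you are calling the API from code rather than the shell, the same upload can be sketched in Python using only the standard library. The endpoint, header, and form field names come from the curl command above; the helper names (encode_multipart, build_upload_request) are illustrative and not part of the API. The request is only built here, not sent.

```python
import mimetypes
import urllib.request
import uuid

UPLOAD_URL = "https://api.scan-documents.com/v1/files"


def encode_multipart(name, filename, file_bytes):
    """Encode the `name` and `file` form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    # Guess the MIME type from the extension; fall back to a generic binary type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    name_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="name"\r\n'
        "\r\n"
        f"{name}\r\n"
    )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n"
        "\r\n"
    )
    closing = f"\r\n--{boundary}--\r\n"
    body = (
        name_part.encode("utf-8")
        + file_header.encode("utf-8")
        + file_bytes
        + closing.encode("utf-8")
    )
    return body, f"multipart/form-data; boundary={boundary}"


def build_upload_request(api_key, name, filename, file_bytes):
    """Build (but do not send) the POST request for /v1/files."""
    body, content_type = encode_multipart(name, filename, file_bytes)
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": content_type},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen and decoding the JSON body of the response should then give you the file object described next.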

The API will respond with a file object, which includes an ID for the uploaded file. You will use this ID in the next steps.

{
    "id": "file_glh4pbl2lbu59s07",
    "name": "Case File p. 1",
    "type": "image/webp",
    "properties": {
        "size": 265174,
        "width": 2448,
        "height": 3264
    },
    "task_id": null,
    "created_at": "2025-08-20T20:00:05.000Z"
}

Upload Image

Step 2: Detect the Document (or skip to Scan Endpoint Alternative)

Next, you need to detect the document in the uploaded image. You can do this by creating a detect-documents task.

Detect Documents

Creates a task to detect document boundaries within an image

curl -X POST "https://api.scan-documents.com/v1/image-operations/detect-documents" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "file_glh4pbl2lbu59s07"
  }'

The API will respond with a task object. Once the task is completed, its result lists each detected document with its four corner vertices and its bounding box.

{
    "id": "task_vxjynjez5nw8didz",
    "operation": "detect-documents",
    "status": "completed",
    "parameters": {
        "input": "file_glh4pbl2lbu59s07",
        "scan_mode": "standard"
    },
    "result": {
        "documents": [
            {
                "vertices": [
                    {
                        "x": 82,
                        "y": 646
                    },
                    {
                        "x": 1539,
                        "y": 394
                    },
                    {
                        "x": 2359,
                        "y": 2105
                    },
                    {
                        "x": 807,
                        "y": 2788
                    }
                ],
                "bounding_box": {
                    "top": 394,
                    "left": 82,
                    "width": 2277,
                    "height": 2394
                },
                "file_id": "file_glh4pbl2lbu59s07"
            }
        ]
    },
    "callback_url": null,
    "created_at": "2025-08-22T19:58:00.000Z",
    "updated_at": "2025-08-22T19:58:10.000Z"
}
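Because the result is only available once the task status is completed, client code usually has to wait before reading it. The Python sketch below shows one way to poll, under stated assumptions: fetch_task is a hypothetical callable that you supply and that returns the current task object as a dict (this guide does not show the endpoint for retrieving a task), and any status other than completed is treated as "not finished yet".

```python
import time


def wait_for_task(fetch_task, task_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll until the task reports status "completed" or the timeout elapses.

    `fetch_task` is a caller-supplied function (hypothetical, not part of the
    API) that takes a task ID and returns the task object as a dict.
    """
    waited = 0.0
    while True:
        task = fetch_task(task_id)
        if task.get("status") == "completed":
            return task
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} not completed after {timeout} seconds")
        # Wait before asking again, and keep track of the total time spent waiting.
        sleep(interval)
        waited += interval
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.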

Detected Document

Step 3: Warp the Document

Now that you have the coordinates of the document, you can warp it to correct its perspective. This is done by creating a warp task.

Warp Image

Creates a task to warp an image using the specified vertices

curl -X POST "https://api.scan-documents.com/v1/image-operations/warp" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "file_glh4pbl2lbu59s07",
    "name": "Warped Document",
    "vertices": [
      { "x": 82, "y": 646 },
      { "x": 1539, "y": 394 },
      { "x": 2359, "y": 2105 },
      { "x": 807, "y": 2788 }
    ]
  }'
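The vertices in this request body are copied from the detect-documents result in Step 2. When you chain the two calls in code, that copy can be automated. The following Python sketch shows one way to do it, assuming the task object has the shape shown in Step 2; the function name build_warp_payload is illustrative and not part of the API.

```python
def build_warp_payload(detect_task, name, index=0):
    """Build the JSON body for the warp endpoint from a completed detect-documents task."""
    documents = detect_task["result"]["documents"]
    if not documents:
        raise ValueError("no document was detected in the image")
    document = documents[index]
    return {
        # Warp the same file that was passed to detect-documents.
        "input": detect_task["parameters"]["input"],
        "name": name,
        # Copy only the x/y coordinates of each detected corner.
        "vertices": [{"x": v["x"], "y": v["y"]} for v in document["vertices"]],
    }
```

Serializing the returned dict with json.dumps gives the request body used in the curl command above.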

The result of this task will be a new image file with the warped document.

{
    "id": "task_jbm922sf4it82itr",
    "operation": "warp",
    "status": "completed",
    "parameters": {
        "input": "file_glh4pbl2lbu59s07",
        "name": "Warped Document",
        "vertices": [
            {
                "x": 82,
                "y": 646
            },
            {
                "x": 1539,
                "y": 394
            },
            {
                "x": 2359,
                "y": 2105
            },
            {
                "x": 807,
                "y": 2788
            }
        ]
    },
    "result": {
        "generated_files": [
            {
                "id": "file_hvm5unqnr1d3xw8k",
                "name": "Warped Document",
                "type": "image/webp",
                "properties": {
                    "size": 207270,
                    "width": 1695,
                    "height": 2261
                },
                "task_id": "task_jbm922sf4it82itr",
                "created_at": "2025-08-22T20:02:00.000Z"
            }
        ]
    },
    "callback_url": null,
    "created_at": "2025-08-22T20:01:52.000Z",
    "updated_at": "2025-08-22T20:02:01.000Z"
}

Warped Document

Step 4: Enhance the Colors

Finally, you can enhance the colors of the warped document to make it look like a scanned document. This is done by creating an apply-effect task with the scanner effect.

Apply Effect

Creates a task to apply a predefined effect to an image

curl -X POST "https://api.scan-documents.com/v1/image-operations/apply-effect" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "file_hvm5unqnr1d3xw8k",
    "name": "Digitized Document",
    "effect": "scanner"
  }'
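The input in this request is the ID of the file generated by the warp task in Step 3, found under result.generated_files in that task's response. A minimal Python sketch of that hand-off follows, assuming the task shape shown in Step 3; both function names are illustrative and not part of the API.

```python
def first_generated_file_id(task):
    """Return the ID of the first file listed under result.generated_files."""
    files = (task.get("result") or {}).get("generated_files") or []
    if not files:
        raise ValueError("task has no generated files (is it completed?)")
    return files[0]["id"]


def build_apply_effect_payload(warp_task, name, effect="scanner"):
    """Build the JSON body for the apply-effect endpoint from a completed warp task."""
    return {
        "input": first_generated_file_id(warp_task),
        "name": name,
        "effect": effect,
    }
```

The same first_generated_file_id helper can be applied to the apply-effect task to find the ID of the final file.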

The result of this task will be the final digitized document.

{
    "id": "task_b9qw2rd4vcvug29k",
    "operation": "apply-effect",
    "status": "completed",
    "parameters": {
        "input": "file_hvm5unqnr1d3xw8k",
        "name": "Digitized Document",
        "effect": "scanner"
    },
    "result": {
        "generated_files": [
            {
                "id": "file_jmjje3ut90btw1r9",
                "name": "Digitized Document",
                "type": "image/webp",
                "properties": {
                    "size": 283188,
                    "width": 1695,
                    "height": 2261
                },
                "task_id": "task_b9qw2rd4vcvug29k",
                "created_at": "2025-08-22T20:03:39.000Z"
            }
        ]
    },
    "callback_url": null,
    "created_at": "2025-08-22T20:03:31.000Z",
    "updated_at": "2025-08-22T20:03:39.000Z"
}

You can now download the final image from the /v1/files/{id}/download endpoint, where {id} is the ID of the generated file (file_jmjje3ut90btw1r9 in this example). The digitized document is then ready to be archived and indexed for search.
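As a rough illustration, the download URL can be assembled from the generated file's ID as in the Python sketch below. The path comes from this guide; the use of GET and of the same x-api-key header as the other calls are assumptions, and the function names are illustrative.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.scan-documents.com"


def download_url(file_id):
    """Return the download URL for a file ID, percent-encoding the ID for safety."""
    return f"{BASE_URL}/v1/files/{quote(file_id, safe='')}/download"


def build_download_request(api_key, file_id):
    """Build (but do not send) the download request.

    Assumption: the endpoint accepts GET with the x-api-key header.
    """
    return urllib.request.Request(
        download_url(file_id),
        headers={"x-api-key": api_key},
        method="GET",
    )
```

For the example file, download_url("file_jmjje3ut90btw1r9") returns https://api.scan-documents.com/v1/files/file_jmjje3ut90btw1r9/download.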

Digitized Document

Scan Endpoint Alternative

The process described above is the most flexible, because it lets you customize each step. If you want a simpler approach, you can use the scan endpoint, which combines detection, warping, and effect application into a single API call. You still need to upload the image first, as in Step 1.

Scan Document

Creates a task to scan an image file. This operation is equivalent to detect-documents and warp combined; it can additionally apply effects to the scanned image.

curl -X POST "https://api.scan-documents.com/v1/image-operations/scan-image" \
    -H "x-api-key: YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "input": "file_glh4pbl2lbu59s07",
        "name": "Scanned Document",
        "scan_mode": "standard",
        "effect": "scanner"
    }'
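For comparison, the same scan request can be built in Python with the standard library. This is a sketch rather than an official client: the URL, header names, and body fields mirror the curl command above, while the function name build_scan_request and its default arguments are illustrative.

```python
import json
import urllib.request

SCAN_URL = "https://api.scan-documents.com/v1/image-operations/scan-image"


def build_scan_request(api_key, file_id, name, scan_mode="standard", effect="scanner"):
    """Build (but do not send) the POST request that creates a scan task."""
    payload = {
        "input": file_id,
        "name": name,
        "scan_mode": scan_mode,
        "effect": effect,
    }
    return urllib.request.Request(
        SCAN_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen and decoding the JSON response would give you the task object for the scan.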

The scan endpoint will automatically detect the document, warp it, and apply the scanner effect in one go. The result will be a digitized document ready for archiving.

{
    "id": "task_b9qw2rd4vcvug29k",
    "operation": "scan-image",
    "status": "completed",
    "parameters": {
        "input": "file_glh4pbl2lbu59s07",
        "name": "Scanned Document",
        "scan_mode": "standard",
        "effect": "scanner"
    },
    "result": {
        "generated_files": [
            {
                "id": "file_jmjje3ut90btw1r9",
                "name": "Scanned Document",
                "type": "image/webp",
                "properties": {
                    "size": 283188,
                    "width": 1695,
                    "height": 2261
                },
                "task_id": "task_b9qw2rd4vcvug29k",
                "created_at": "2025-08-22T20:03:39.000Z"
            }
        ]
    },
    "callback_url": null,
    "created_at": "2025-08-22T20:03:31.000Z",
    "updated_at": "2025-08-22T20:03:39.000Z"
}

You can then download the final digitized document using the /v1/files/{id}/download endpoint.