Digitize Document for Archiving
Learn how to digitize a physical document for digital archiving and search.
This guide will walk you through the process of digitizing a physical document from an image. This is a common requirement for businesses that want to move to a paperless office, archive their documents, and make them easily searchable.
See in Postman
This guide's API calls are available as a Postman collection. You can use it to quickly test the API and see how it works.
Business Problem
Imagine you work for a law firm that has thousands of physical case files. Finding a specific document can be a time-consuming process. The firm wants to create a digital archive of all its case files to make them easily searchable and accessible.
Solution
We can solve this problem by using the Scan Documents API to digitize the physical documents and prepare them for archiving. The process involves:
- Uploading an Image: Taking a picture of the document and uploading it to the API.
- Detecting the Document: Automatically identifying the document within the image.
- Warping the Document: Correcting the perspective of the document to make it look flat.
- Enhancing the Colors: Improving the readability of the document by applying a scanner-like effect.
This process will result in a high-quality digital version of the physical document, which can then be stored in a document management system and indexed for full-text search.
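As a quick orientation, the four stages can be written down as an ordered list of endpoints. This is only an illustrative sketch: the paths and base URL are the ones used in the steps of this guide, while the `Step` type and the `describe` helper are names invented for this example.

```python
from typing import NamedTuple

BASE_URL = "https://api.scan-documents.com"


class Step(NamedTuple):
    title: str
    method: str
    path: str


# The four stages of the workflow, in the order this guide runs them.
PIPELINE = [
    Step("Upload the image", "POST", "/v1/files"),
    Step("Detect the document", "POST", "/v1/image-operations/detect-documents"),
    Step("Warp the document", "POST", "/v1/image-operations/warp"),
    Step("Enhance the colors", "POST", "/v1/image-operations/apply-effect"),
]


def describe(pipeline: list[Step]) -> list[str]:
    """Render each step as '<n>. <title>: <METHOD> <full URL>'."""
    return [
        f"{i}. {step.title}: {step.method} {BASE_URL}{step.path}"
        for i, step in enumerate(pipeline, start=1)
    ]


print("\n".join(describe(PIPELINE)))
```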
Step 1: Upload the Image
First, upload the image containing the document to the Scan Documents API by sending a POST request to the /v1/files endpoint.
Upload a File
Creates a new file
curl -X POST "https://api.scan-documents.com/v1/files" \
  -H "x-api-key: YOUR_API_KEY" \
  -F name="Case File p. 1" \
  -F file="@/path/to/your/document.jpg"
The API will respond with a file object, which includes an ID for the uploaded file. You will use this ID in the next steps.
{
  "id": "file_glh4pbl2lbu59s07",
  "name": "Case File p. 1",
  "type": "image/webp",
  "properties": {
    "size": 265174,
    "width": 2448,
    "height": 3264
  },
  "task_id": null,
  "created_at": "2025-08-20T20:00:05.000Z"
}
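If you are scripting the workflow, the file object can be parsed once and its ID carried forward. The sketch below assumes the response body has exactly the shape shown above; `UploadedFile` and `parse_file_object` are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadedFile:
    id: str
    name: str
    media_type: str
    width: int
    height: int


def parse_file_object(body: str) -> UploadedFile:
    """Parse the JSON body returned by POST /v1/files into a small record."""
    data = json.loads(body)
    props = data["properties"]
    return UploadedFile(
        id=data["id"],
        name=data["name"],
        media_type=data["type"],
        width=props["width"],
        height=props["height"],
    )


# The sample response from this step, used here in place of a live call.
SAMPLE_BODY = """
{
  "id": "file_glh4pbl2lbu59s07",
  "name": "Case File p. 1",
  "type": "image/webp",
  "properties": {"size": 265174, "width": 2448, "height": 3264},
  "task_id": null,
  "created_at": "2025-08-20T20:00:05.000Z"
}
"""

uploaded = parse_file_object(SAMPLE_BODY)
print(uploaded.id)  # the ID to pass as "input" in the next steps
```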
Step 2: Detect the Document (or skip to Scan Endpoint Alternative)
Next, detect the document in the uploaded image by creating a detect-documents task.
Detect Documents
Creates a task to detect document boundaries within an image
curl -X POST "https://api.scan-documents.com/v1/image-operations/detect-documents" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "file_glh4pbl2lbu59s07"
  }'
The API will respond with a task object. Once the task is completed, its result will contain the coordinates of the detected document.
{
  "id": "task_vxjynjez5nw8didz",
  "operation": "detect-documents",
  "status": "completed",
  "parameters": {
    "input": "file_glh4pbl2lbu59s07",
    "scan_mode": "standard"
  },
  "result": {
    "documents": [
      {
        "vertices": [
          { "x": 82, "y": 646 },
          { "x": 1539, "y": 394 },
          { "x": 2359, "y": 2105 },
          { "x": 807, "y": 2788 }
        ],
        "bounding_box": {
          "top": 394,
          "left": 82,
          "width": 2277,
          "height": 2394
        },
        "file_id": "file_glh4pbl2lbu59s07"
      }
    ]
  },
  "callback_url": null,
  "created_at": "2025-08-22T19:58:00.000Z",
  "updated_at": "2025-08-22T19:58:10.000Z"
}
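Because the result is only present once the task is completed, a script may need to wait for that status. This guide does not show how a task is looked up, so the sketch below takes the lookup as a hypothetical `fetch_task` callable that you would implement yourself; the "pending" status in the stand-in is a placeholder, not a documented value. Only the "completed" status comes from the response above.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_task(task_id) until the task reports status "completed".

    fetch_task is supplied by the caller (hypothetical here) and must return
    the task object as a dict. Raises TimeoutError if the task is still not
    completed after max_attempts lookups.
    """
    for attempt in range(max_attempts):
        task = fetch_task(task_id)
        if task.get("status") == "completed":
            return task
        if attempt < max_attempts - 1:
            sleep(interval)  # pause before the next lookup
    raise TimeoutError(f"task {task_id} not completed after {max_attempts} attempts")


# Stand-in for a real lookup: reports a placeholder status twice, then "completed".
_statuses = iter(["pending", "pending", "completed"])


def fake_fetch(task_id: str) -> dict:
    return {"id": task_id, "status": next(_statuses)}


task = wait_for_task(fake_fetch, "task_vxjynjez5nw8didz", sleep=lambda _: None)
print(task["status"])
```

Passing the lookup and the sleep function in as arguments keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.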
Step 3: Warp the Document
Now that you have the coordinates of the document, you can warp it to correct its perspective. This is done by creating a warp task.
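When the steps are chained in a script, the warp request body can be derived from the detect-documents result instead of being copied by hand. The following is a minimal sketch that assumes the task object has the shape shown in Step 2; `build_warp_payload` and its `index` parameter are illustrative names.

```python
import json


def build_warp_payload(detect_task: dict, name: str, index: int = 0) -> dict:
    """Build a warp request body from a completed detect-documents task."""
    documents = detect_task["result"]["documents"]
    if not documents:
        raise ValueError("no document was detected in the image")
    document = documents[index]
    return {
        "input": document["file_id"],
        "name": name,
        # Copy only x and y so the body matches the shape used in this guide.
        "vertices": [{"x": v["x"], "y": v["y"]} for v in document["vertices"]],
    }


# Trimmed copy of the Step 2 response, used here in place of a live call.
detect_task = {
    "status": "completed",
    "result": {
        "documents": [
            {
                "vertices": [
                    {"x": 82, "y": 646},
                    {"x": 1539, "y": 394},
                    {"x": 2359, "y": 2105},
                    {"x": 807, "y": 2788},
                ],
                "file_id": "file_glh4pbl2lbu59s07",
            }
        ]
    },
}

payload = build_warp_payload(detect_task, "Warped Document")
print(json.dumps(payload, indent=2))
```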
Warp Image
Creates a task to warp an image using the specified vertices
curl -X POST "https://api.scan-documents.com/v1/image-operations/warp" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "input": "file_glh4pbl2lbu59s07", "name": "Warped Document", "vertices": [ { "x": 82, "y": 646 }, { "x": 1539, "y": 394 }, { "x": 2359, "y": 2105 }, { "x": 807, "y": 2788 } ] }'
The result of this task will be a new image file with the warped document.
{
  "id": "task_jbm922sf4it82itr",
  "operation": "warp",
  "status": "completed",
  "parameters": {
    "input": "file_glh4pbl2lbu59s07",
    "name": "Warped Document",
    "vertices": [
      { "x": 82, "y": 646 },
      { "x": 1539, "y": 394 },
      { "x": 2359, "y": 2105 },
      { "x": 807, "y": 2788 }
    ]
  },
  "result": {
    "generated_files": [
      {
        "id": "file_hvm5unqnr1d3xw8k",
        "name": "Warped Document",
        "type": "image/webp",
        "properties": {
          "size": 207270,
          "width": 1695,
          "height": 2261
        },
        "task_id": "task_jbm922sf4it82itr",
        "created_at": "2025-08-22T20:02:00.000Z"
      }
    ]
  },
  "callback_url": null,
  "created_at": "2025-08-22T20:01:52.000Z",
  "updated_at": "2025-08-22T20:02:01.000Z"
}
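The ID under result.generated_files is what the next step takes as its input. As a rough sketch, assuming the task object looks like the one above, a small helper can turn a completed warp task into the body for an apply-effect request; `build_effect_payload` is an illustrative name.

```python
def build_effect_payload(task: dict, name: str, effect: str = "scanner") -> dict:
    """Build an apply-effect request body from a task that generated a file."""
    generated = task["result"]["generated_files"]
    if not generated:
        raise ValueError("the task did not generate any files")
    # Use the first generated file as the input of the next operation.
    return {"input": generated[0]["id"], "name": name, "effect": effect}


# Trimmed copy of the warp response above, used here in place of a live call.
warp_task = {
    "id": "task_jbm922sf4it82itr",
    "operation": "warp",
    "status": "completed",
    "result": {
        "generated_files": [
            {"id": "file_hvm5unqnr1d3xw8k", "name": "Warped Document"}
        ]
    },
}

effect_payload = build_effect_payload(warp_task, "Digitized Document")
print(effect_payload)
```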
Step 4: Enhance the Colors
Finally, you can enhance the colors of the warped document to make it look like a scanned document. This is done by creating an apply-effect task with the scanner effect.
Apply Effect
Creates a task to apply a predefined effect to an image
curl -X POST "https://api.scan-documents.com/v1/image-operations/apply-effect" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "input": "file_hvm5unqnr1d3xw8k", "name": "Digitized Document", "effect": "scanner" }'
The result of this task will be the final digitized document.
{
  "id": "task_b9qw2rd4vcvug29k",
  "operation": "apply-effect",
  "status": "completed",
  "parameters": {
    "input": "file_hvm5unqnr1d3xw8k",
    "name": "Digitized Document",
    "effect": "scanner"
  },
  "result": {
    "generated_files": [
      {
        "id": "file_jmjje3ut90btw1r9",
        "name": "Digitized Document",
        "type": "image/webp",
        "properties": {
          "size": 283188,
          "width": 1695,
          "height": 2261
        },
        "task_id": "task_b9qw2rd4vcvug29k",
        "created_at": "2025-08-22T20:03:39.000Z"
      }
    ]
  },
  "callback_url": null,
  "created_at": "2025-08-22T20:03:31.000Z",
  "updated_at": "2025-08-22T20:03:39.000Z"
}
You can now download the final image using the /v1/files/{id}/download endpoint. The digitized document is ready to be archived and indexed for search.
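For a scripted download, the request can be prepared as in the sketch below. It rests on assumptions this guide does not confirm: that the download endpoint is called with GET, that it accepts the same x-api-key header as the other calls, and that it returns the raw image bytes. `build_download_request` is an illustrative name.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.scan-documents.com"


def build_download_request(file_id: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a request for /v1/files/{id}/download.

    Assumption: the endpoint is a GET authenticated with the x-api-key header.
    """
    url = f"{BASE_URL}/v1/files/{quote(file_id, safe='')}/download"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


request = build_download_request("file_jmjje3ut90btw1r9", "YOUR_API_KEY")
print(request.get_method(), request.full_url)

# Sending it would look roughly like this (not executed here):
# with urllib.request.urlopen(request) as response:
#     image_bytes = response.read()
```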
Scan Endpoint Alternative
The process described above is the most flexible, because you can customize each step. If you want a simpler approach, use the scan endpoint, which combines detection, warping, and the effect into a single API call on an uploaded file.
Scan Document
Creates a task to scan an image file. This operation is equivalent to detect-documents and warp combined, and it can also apply effects to the scanned image.
curl -X POST "https://api.scan-documents.com/v1/image-operations/scan-image" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "file_glh4pbl2lbu59s07",
    "name": "Scanned Document",
    "scan_mode": "standard",
    "effect": "scanner"
  }'
The scan endpoint will automatically detect the document, warp it, and apply the scanner effect in one go. The result will be a digitized document ready for archiving.
{
  "id": "task_b9qw2rd4vcvug29k",
  "operation": "scan-image",
  "status": "completed",
  "parameters": {
    "input": "file_glh4pbl2lbu59s07",
    "name": "Scanned Document",
    "scan_mode": "standard",
    "effect": "scanner"
  },
  "result": {
    "generated_files": [
      {
        "id": "file_jmjje3ut90btw1r9",
        "name": "Scanned Document",
        "type": "image/webp",
        "properties": {
          "size": 283188,
          "width": 1695,
          "height": 2261
        },
        "task_id": "task_b9qw2rd4vcvug29k",
        "created_at": "2025-08-22T20:03:39.000Z"
      }
    ]
  },
  "callback_url": null,
  "created_at": "2025-08-22T20:03:31.000Z",
  "updated_at": "2025-08-22T20:03:39.000Z"
}
You can then download the final digitized document using the /v1/files/{id}/download endpoint.
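The single-call variant can be prepared from a script in much the same way as the curl command. The sketch below builds the request without sending it; the URL, headers, and body fields are the ones from the curl example, while `build_scan_request` is an illustrative name and leaving out "effect" when it is None is an assumption about the API, not documented behavior.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.scan-documents.com"


def build_scan_request(
    file_id: str,
    name: str,
    api_key: str,
    scan_mode: str = "standard",
    effect: Optional[str] = "scanner",
) -> urllib.request.Request:
    """Prepare (but do not send) the scan request for an uploaded file."""
    payload = {"input": file_id, "name": name, "scan_mode": scan_mode}
    if effect is not None:
        # Assumption: the effect is optional and can simply be omitted.
        payload["effect"] = effect
    return urllib.request.Request(
        f"{BASE_URL}/v1/image-operations/scan-image",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


scan_request = build_scan_request(
    "file_glh4pbl2lbu59s07", "Scanned Document", "YOUR_API_KEY"
)
print(scan_request.get_method(), scan_request.full_url)
print(scan_request.data.decode("utf-8"))
```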