Doc Engine is the system for document search and retrieval. It currently handles the following data types: documents and batch records.
Doc Engine applies strict access control and records all requests for documents in the database.
The search functionality of Doc Engine is exposed via its public API:
/api/documents for Document search
/api/batchrecords for Batch Record search
GET /documents
Resource URL: /api/documents
Parameter:
Response: A JSON string containing the records matching the query, or an empty [] if nothing matches
Example Request:
GET /api/documents?query={"sn_title":"he","document_level":"1","labels":["2"]}&page_number=1
(The query is shown unencoded for readability. It must be URL-encoded before sending.)
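As a rough illustration of the encoding step, the following Python sketch builds the request path with the standard library. The helper name build_search_url is hypothetical, and the sketch assumes the server accepts compact JSON in the query parameter exactly as in the example above.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(resource, query, page_number=1):
    """Return the resource path plus a URL-encoded query string.

    `resource` is an endpoint such as "/api/documents"; `query` is a dict
    of search criteria (hypothetical helper, not part of Doc Engine).
    """
    params = {
        # Serialize the criteria to compact JSON; urlencode() then
        # percent-encodes the braces, quotes, and brackets.
        "query": json.dumps(query, separators=(",", ":")),
        "page_number": page_number,
    }
    return f"{resource}?{urlencode(params)}"


url = build_search_url(
    "/api/documents",
    {"sn_title": "he", "document_level": "1", "labels": ["2"]},
)
print(url)

# Round trip: decoding the query string gives back the original criteria.
decoded = json.loads(parse_qs(urlsplit(url).query)["query"][0])
print(decoded)
```

The same helper would work for /api/batchrecords, since both endpoints take a JSON query plus a page_number.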
Example Response:
{
    "page_number": 1,
    "num_pages": 1,
    "data": [
        {
            "title": "Test 124",
            "labels": [
                {
                    "content": "TEST"
                }
            ],
            "file_url": "/doc_engine/access/4/",
            "version": "1.0",
            "location": "",
            "serial_number": "HE 124"
        },
        {
            "title": "Home, sweet home",
            "labels": [
                {
                    "content": "TEST"
                }
            ],
            "file_url": "/doc_engine/access/10/",
            "version": "1.0",
            "location": "",
            "serial_number": "HE 123121212"
        }
    ]
}
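The page_number and num_pages fields suggest that results are paginated. A client could walk through all pages along the lines of the sketch below. It assumes page numbers start at 1 and that num_pages is the total page count; iter_records and the fetch_page callable are hypothetical, and a dictionary of canned pages stands in for real HTTP calls.

```python
from typing import Callable, Iterator


def iter_records(fetch_page: Callable[[int], object]) -> Iterator[dict]:
    """Yield every record across all result pages, starting at page 1.

    `fetch_page(n)` is expected to return the decoded JSON response
    for page n (hypothetical callable supplied by the client).
    """
    page_number = 1
    while True:
        page = fetch_page(page_number)
        if not page:
            # An empty response ([] or {}) means there is nothing to yield.
            return
        yield from page.get("data", [])
        if page_number >= page.get("num_pages", 1):
            return
        page_number += 1


# Canned two-page result used in place of a real server.
FAKE_PAGES = {
    1: {"page_number": 1, "num_pages": 2, "data": [{"serial_number": "HE 124"}]},
    2: {"page_number": 2, "num_pages": 2, "data": [{"serial_number": "HE 123121212"}]},
}

serials = [record["serial_number"] for record in iter_records(FAKE_PAGES.__getitem__)]
print(serials)
```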
There are two models for storing documents in Doc Engine: Document and FileObject. Document stores the metadata about each document, i.e. the serial number, title, and group permission. It also contains a version number that corresponds to a FileObject, which stores the file path to the actual file on the disk. The user can change the version number for a given Document instance and have it serve a specific version of the file.
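The relationship between the two models can be sketched with plain Python dataclasses. This is not the actual Django model code: the field names (permitted_groups, files, file_path) and the methods are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FileObject:
    """One stored version of a file (field names are assumed)."""
    version: str
    file_path: str  # path to the actual file on disk


@dataclass
class Document:
    """Metadata for a document plus the version it currently serves."""
    serial_number: str
    title: str
    permitted_groups: Set[str]
    version: str  # selects which FileObject is served
    files: Dict[str, FileObject] = field(default_factory=dict)

    def add_file(self, file_object: FileObject) -> None:
        self.files[file_object.version] = file_object

    def current_file(self) -> FileObject:
        # Look up the FileObject matching the Document's version number.
        return self.files[self.version]


doc = Document("HE 124", "Test 124", {"qa"}, version="1.0")
doc.add_file(FileObject("1.0", "/srv/docs/he124_v1.pdf"))
doc.add_file(FileObject("2.0", "/srv/docs/he124_v2.pdf"))

path_before = doc.current_file().file_path
doc.version = "2.0"  # switch the served version
path_after = doc.current_file().file_path
print(path_before, path_after)
```

Changing only the version field is enough to make the Document serve a different file, which is the behavior the paragraph above describes.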
Records are created, updated, and deleted through the Django admin interface.
Each search result provides a link to the actual file at /doc_engine/access/<primary key>/. Access control ensures that only authorized users can access the document.
Security is ensured through the following mechanisms:
1. Group-based permission model: Each Document has a set of permitted groups. When a user tries to access the document, the user's group membership is checked against the Document's permitted groups. Only a user who passes the check is given access to the file. Others see “Access Denied.”
2. Access recording: Each time a user attempts to access a document, a record is written in the database, regardless of the outcome (success or access denial).
3. Watermarking of PDF documents: If the document to be accessed is a PDF file, an access watermark is added to the bottom of every page, specifying the user who downloaded the file and the time of access.
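The first two mechanisms, together with the text of the watermark, might look roughly like the sketch below. It assumes a user passes the check when they belong to at least one permitted group, uses an in-memory list in place of the database table, and only builds the watermark string rather than stamping a PDF. All names here are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AccessRecord:
    username: str
    document_id: int
    granted: bool
    accessed_at: datetime


ACCESS_LOG = []  # stand-in for the access-record table in the database


def request_document(username, user_groups, document_id, permitted_groups, now=None):
    """Check group membership and record the attempt either way."""
    now = now or datetime.now(timezone.utc)
    # Assumption: sharing at least one group with the Document grants access.
    granted = bool(set(user_groups) & set(permitted_groups))
    # The record is written regardless of the outcome.
    ACCESS_LOG.append(AccessRecord(username, document_id, granted, now))
    return granted


def watermark_text(username, accessed_at):
    """Build the line that would be stamped at the bottom of each PDF page."""
    return f"Downloaded by {username} at {accessed_at:%Y-%m-%d %H:%M:%S %Z}"


when = datetime(2011, 9, 6, 8, 30, tzinfo=timezone.utc)
alice_ok = request_document("alice", {"qa"}, 4, {"qa", "rd"}, now=when)
bob_ok = request_document("bob", {"sales"}, 4, {"qa", "rd"}, now=when)
print(alice_ok, bob_ok, len(ACCESS_LOG))
print(watermark_text("alice", when))
```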
GET /batchrecords
Resource URL: /api/batchrecords
Parameters:
All the parameters below are enclosed in a JSON string passed as the query parameter.
name optional | Product name for the batch record |
batch_number optional | Batch number of the batch record |
date_manufactured_from optional, date_manufactured_to optional | Date range to search: [date_manufactured_from, date_manufactured_to] |
At least one parameter must be non-empty; otherwise the server returns a 400 error.
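The non-empty rule could be enforced along the following lines. This is a sketch rather than the server's actual validation: the function name is hypothetical, and treating malformed JSON as a 400 is an assumption that goes beyond what is stated above.

```python
import json

# Parameter names taken from the list above.
ALLOWED_PARAMETERS = (
    "name",
    "batch_number",
    "date_manufactured_from",
    "date_manufactured_to",
)


def validate_batch_query(raw_query):
    """Return (status_code, criteria) for a raw JSON query string."""
    try:
        query = json.loads(raw_query)
    except (TypeError, ValueError):
        # Assumption: a missing or malformed query is also rejected with 400.
        return 400, {}
    if not isinstance(query, dict):
        return 400, {}
    # Keep only known parameters whose values are non-empty.
    criteria = {key: query[key] for key in ALLOWED_PARAMETERS if query.get(key)}
    if not criteria:
        return 400, {}
    return 200, criteria


print(validate_batch_query('{"name": "am", "batch_number": ""}'))
print(validate_batch_query('{"name": "", "batch_number": ""}'))
```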
Response: A JSON string containing the records matching the query, or an empty [] if nothing matches
Example Request:
GET /api/batchrecords?query={"name":"am","batch_number":"AMP12","date_manufactured_from":"2011-04-01","date_manufactured_to":"2011-09-06"}&page_number=1
(The query is shown unencoded for readability. It must be URL-encoded before sending.)
Example Response:
{
    "page_number": 1,
    "num_pages": 1,
    "data": [
        {
            "date_manufactured": "2011-08-19",
            "batch_number": "AMP12",
            "name": "Ampicillin",
            "date_manufactured_minguo": "0100-08-19",
            "location": "AB 123"
        }
    ]
}
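In the sample response, date_manufactured_minguo appears to be the same date in the Minguo (Republic of China) calendar, whose year is the Gregorian year minus 1911, zero-padded to four digits. How the server actually computes the field is not documented here, so the following conversion is an inference from the sample, and to_minguo is a hypothetical name.

```python
from datetime import date


def to_minguo(gregorian: date) -> str:
    """Format a Gregorian date as a Minguo-calendar string (YYYY-MM-DD)."""
    minguo_year = gregorian.year - 1911  # Minguo year 1 is 1912
    if minguo_year < 1:
        raise ValueError("date is before the start of the Minguo calendar")
    return f"{minguo_year:04d}-{gregorian.month:02d}-{gregorian.day:02d}"


print(to_minguo(date(2011, 8, 19)))  # prints 0100-08-19, matching the sample
```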
Records are created, updated, and deleted through the Django admin interface.