Automate dataset submission to the Protein Diffraction Image Server
The REST API lets you upload crystallographic raw datasets and track their processing status programmatically – without using the web interface. All requests go to:
https://proteindiffraction.org/api/v1/
Supported archive formats:
- tar.gz / tgz
- tar.bz2 / tbz2
- tar
- zip
- *.tar.zst, *.tar.gz.zst, etc. (the server decompresses transparently)
Every request (except POST /api/v1/auth/token/) requires an
API token in the Authorization header:
Authorization: Token your40chartoken...
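As a minimal sketch of the header rule above, the following Python helper attaches the token to a request object without sending it. The helper name `build_request` and the `BASE_URL` constant are illustrative assumptions, not part of the API; only the base URL and the `Authorization: Token ...` format come from this page.

```python
import urllib.request

# Base URL as documented on this page.
BASE_URL = "https://proteindiffraction.org/api/v1/"


def build_request(path, api_token, method="GET"):
    """Build (but do not send) an authenticated request.

    `build_request` is a hypothetical helper for illustration only.
    """
    token = api_token.strip()
    if not token:
        raise ValueError("API token must not be empty")
    url = BASE_URL + path.lstrip("/")
    # Every endpoint except auth/token/ expects this header.
    headers = {"Authorization": f"Token {token}"}
    return urllib.request.Request(url, headers=headers, method=method)
```

The returned object can be passed to `urllib.request.urlopen` when you are ready to make the call.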
Exchange your email + password for a token once and store it safely.
Request (form-data or JSON):
curl -X POST https://proteindiffraction.org/api/v1/auth/token/ \
-d "username=you@lab.edu&password=secret"
Response 200:
{
"token": "643df60b0f9156255757c5f641d5ea95bb043198",
"user_id": 42,
"email": "you@lab.edu"
}
The token does not expire automatically. Regenerate it from your Profile page if needed.
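The same token exchange can be sketched in Python with only the standard library. The function names below are assumptions made for illustration; the endpoint, the `username` and `password` form fields, and the `token` response key are taken from the example above.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://proteindiffraction.org/api/v1/auth/token/"


def build_token_request(email, password):
    """Build the form-encoded POST request for the token endpoint."""
    body = urllib.parse.urlencode(
        {"username": email, "password": password}
    ).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=body, method="POST")


def parse_token_response(raw):
    """Extract the API token from the JSON response body."""
    return json.loads(raw)["token"]


def fetch_token(email, password):
    """Perform the exchange over the network (not exercised here)."""
    with urllib.request.urlopen(build_token_request(email, password)) as resp:
        return parse_token_response(resp.read())
```

Since the token does not expire automatically, a script would typically call `fetch_token` once and keep the result in a protected location rather than sending the password on every run.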
A complete submission requires two API calls:
1. POST /api/v1/upload/
2. POST /api/v1/submit/

# ① Get a token once
TOKEN=$(curl -s -X POST https://proteindiffraction.org/api/v1/auth/token/ \
  -d "username=you@lab.edu&password=secret" \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")

# ② Compress raw data with zstd (fast level-3)
tar -I zstd -cf dataset_1ABC.tar.zst ./raw_diffraction_data/

# ③ Upload – server decompresses, validates and detects PDB ID
UPLOAD=$(curl -s -X POST https://proteindiffraction.org/api/v1/upload/ \
  -H "Authorization: Token $TOKEN" \
  -F "dataset_file=@dataset_1ABC.tar.zst")
echo "$UPLOAD"
# {"token":"abc123…","pdbid":"1ABC","md5hash":"…","size":12345678}
FILE_TOKEN=$(echo "$UPLOAD" | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")

# ④ Submit with metadata
curl -s -X POST https://proteindiffraction.org/api/v1/submit/ \
  -H "Authorization: Token $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "token": "'"$FILE_TOKEN"'",
    "pdb_code": "1ABC",
    "project_title": "High-resolution structure of protein X",
    "project_authors": "Smith J., Doe A.",
    "equipment_other": "APS 24-ID-C beamline"
  }'
# {"status":"submitted","token":"abc123…"}
# "project_title", "project_authors" and "equipment_other" are optional when "pdb_code" is given;
# in that case they are fetched from RCSB.

# ⑤ Poll for processing status
curl -H "Authorization: Token $TOKEN" https://proteindiffraction.org/api/v1/submissions/$FILE_TOKEN/
We recommend compressing uploads with zstd rather than gzip: it achieves a similar compression ratio at much higher speed. tar.gz, tar.bz2 and zip archives are also supported.
| Method | Create archive | Decompress locally |
|---|---|---|
| zstd (recommended) | tar -I zstd -cf data.tar.zst ./data/ | tar -I zstd -xf data.tar.zst |
| gzip | tar -czf data.tar.gz ./data/ | tar -xzf data.tar.gz |
| bzip2 | tar -cjf data.tar.bz2 ./data/ | tar -xjf data.tar.bz2 |
Install zstd: apt install zstd / brew install zstd
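Where the tar command line is not available (for example inside a Python pipeline), an archive in one of the supported formats can be built with the standard library. This sketch produces a tar.gz file, because gzip support ships with Python, whereas zstd usually needs an extra package. The function name `make_tar_gz` is an assumption for illustration.

```python
import tarfile
from pathlib import Path


def make_tar_gz(source_dir, archive_path):
    """Pack a directory into a gzip-compressed tar archive.

    Roughly equivalent to: tar -czf data.tar.gz ./data/
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store entries under the directory's own name, not its full path.
        tar.add(source, arcname=source.name)
    return Path(archive_path)
```

For large datasets the zstd route recommended above is likely to be faster; this fallback trades speed for having no extra dependencies.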
| Field | Type | Description |
|---|---|---|
| username (required) | string | Your registered e-mail address |
| password (required) | string | Account password |

Returns 200: { token, user_id, email }
| Field | Type | Description |
|---|---|---|
| dataset_file (required) | file | Archive file: .tar, .tar.gz, .tar.bz2, .zip, or any of those compressed with zstd (.tar.zst, .tar.gz.zst, etc.). The server decompresses zstd transparently; the inner archive is stored. |
| dataset_type (optional) | string | "pdb" (default) or "other" for non-PDB depositions |
Returns 201:
{
"token": "<40-char upload token>", // save this!
"md5hash": "d41d8cd98f00b204e9800998ecf8427e",
"filename": "dataset.tar",
"size": 1234567,
"pdbid": "1ABC", // null if not detected
"is_pdb_exists": false
}
The upload token is required for the POST /submit/ call.
If is_pdb_exists is true, this PDB ID already has
a dataset in the repository.
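A client usually needs three things from the upload response: the upload token, whether a PDB ID was detected, and whether that ID already has a dataset. The sketch below pulls those out of the JSON body; `summarize_upload` and the keys of the returned dictionary are hypothetical names chosen for this example.

```python
import json


def summarize_upload(raw):
    """Reduce an /upload/ response body to the fields a client acts on."""
    data = json.loads(raw)
    pdb_id = data.get("pdbid")  # null in the response if not detected
    return {
        "upload_token": data["token"],
        "pdb_id": pdb_id,
        "needs_manual_pdb_code": pdb_id is None,
        "already_in_repository": bool(data.get("is_pdb_exists")),
    }
```

A script might stop and ask for confirmation when `already_in_repository` is true, instead of continuing straight to the submit step.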
| Field | Type | Description |
|---|---|---|
| token (required) | string | Upload token returned by /upload/ |
| pdb_code | string | 4-character PDB ID (required unless dataset_type=other) |
| dataset_type (optional) | string | "pdb" (default) or "other" |
| project_title | string | Descriptive title (min. 10 chars). Required only when pdb_code is absent; optional when a PDB code is given (the title is fetched from RCSB). |
| project_authors | string | Author list, e.g. "Smith J., Doe A." Required only when pdb_code is absent. |
| synchrotron_id (optional) | integer | Synchrotron ID from GET /equipment/ |
| beamline_id (optional) | integer | Beamline ID (pair with synchrotron_id) |
| equipment_other | string | Free-text equipment description. Either this or synchrotron_id + beamline_id is required, but only when pdb_code is absent. |
Returns 200:
{ "status": "submitted", "token": "…" }
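The conditional requirements for /submit/ are easy to get wrong, so a client may want to check a payload before sending it. The following is a sketch of such a pre-flight check based only on the field rules listed above; `check_submit_payload` is a hypothetical helper, and the server remains the authority on what is actually accepted.

```python
def check_submit_payload(payload):
    """Return a list of problems found in a /submit/ payload (empty if none)."""
    problems = []
    dataset_type = payload.get("dataset_type", "pdb")
    pdb_code = payload.get("pdb_code")
    title = payload.get("project_title")

    if not payload.get("token"):
        problems.append("token is required")
    if dataset_type not in ("pdb", "other"):
        problems.append('dataset_type must be "pdb" or "other"')
    if dataset_type != "other" and not pdb_code:
        problems.append("pdb_code is required unless dataset_type is other")
    if pdb_code and len(pdb_code) != 4:
        problems.append("pdb_code must be 4 characters long")
    if title is not None and len(title) < 10:
        problems.append("project_title must be at least 10 characters")

    # Without a PDB code nothing can be fetched from RCSB,
    # so the descriptive fields have to be supplied by the caller.
    if not pdb_code:
        if title is None:
            problems.append("project_title is required without pdb_code")
        if not payload.get("project_authors"):
            problems.append("project_authors is required without pdb_code")
        has_ids = (payload.get("synchrotron_id") is not None
                   and payload.get("beamline_id") is not None)
        if not (payload.get("equipment_other") or has_ids):
            problems.append(
                "equipment_other or synchrotron_id + beamline_id "
                "is required without pdb_code"
            )
    return problems
```

For example, `check_submit_payload({"token": "abc123", "pdb_code": "1ABC"})` returns an empty list, because the remaining metadata is optional when a PDB code is given.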
Returns an array of all submissions belonging to the authenticated user, ordered newest-first.
Returns 200 – array of submission objects (see status fields below).
Returns 200 – full submission object:
{
"token": "abc123…",
"original_filename":"dataset.tar",
"md5hash": "d41d8…",
"pdbid": "1ABC",
"dataset_type": "pdb",
"project_title": "High-resolution structure…",
"project_authors": "Smith J., Doe A.",
"synchrotron": { "id": 25, "short": "APS" },
"beamline": { "id": 42, "short": "24-ID-C" },
"equipment_other": null,
"status": "validated",
"status_details": null,
"inserted": "2026-06-02T10:15:30Z",
"updated": "2026-06-02T10:18:45Z"
}
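The `inserted` and `updated` fields in the object above can be used to see how long a submission has been in processing. This sketch assumes the timestamps are always ISO 8601 strings in UTC with a trailing `Z`, as in the sample; the function names are illustrative.

```python
from datetime import datetime


def parse_api_timestamp(value):
    """Parse a timestamp such as "2026-06-02T10:15:30Z" into an aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_between_insert_and_update(submission):
    """Seconds elapsed between the 'inserted' and 'updated' timestamps."""
    inserted = parse_api_timestamp(submission["inserted"])
    updated = parse_api_timestamp(submission["updated"])
    return (updated - inserted).total_seconds()
```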
Delete a submission. Only allowed when the current status is deletable
(e.g. uploaded, validated, error_no_data).
Returns 204 No Content on success.
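Before issuing a delete, a client can check the submission's current status locally. The sketch below uses only the three statuses given as examples above, so it is deliberately conservative: the server may accept deletion in other states that this page does not list. `EXAMPLE_DELETABLE` and `looks_deletable` are hypothetical names.

```python
# Only the statuses named as examples on this page; the real set may be larger.
EXAMPLE_DELETABLE = frozenset({"uploaded", "validated", "error_no_data"})


def looks_deletable(submission):
    """Return True if the submission's status is one of the documented examples."""
    return submission.get("status") in EXAMPLE_DELETABLE
```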
Returns the list of synchrotrons and their beamlines.
Use id values with the synchrotron_id and beamline_id
parameters of POST /submit/.
[
{
"id": 25, "short": "APS", "fullname": "Advanced Photon Source",
"url": "http://www.aps.anl.gov",
"beamlines": [
{ "id": 42, "short": "24-ID-C", "fullname": null, "url": null },
…
]
}, …
]
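Since /submit/ expects numeric IDs, a client typically has to translate human-readable names into `synchrotron_id` and `beamline_id`. A minimal sketch, assuming the response has exactly the shape shown above and that matching on the `short` fields is sufficient; `find_equipment_ids` is a hypothetical helper.

```python
def find_equipment_ids(equipment, synchrotron_short, beamline_short):
    """Look up IDs by short name in a parsed equipment response.

    Returns a dict ready to merge into a /submit/ payload, or None if
    no matching synchrotron/beamline pair is found.
    """
    for synchrotron in equipment:
        if synchrotron["short"] != synchrotron_short:
            continue
        for beamline in synchrotron.get("beamlines", []):
            if beamline["short"] == beamline_short:
                return {
                    "synchrotron_id": synchrotron["id"],
                    "beamline_id": beamline["id"],
                }
    return None
```

When the lookup returns `None`, falling back to the free-text `equipment_other` field is one reasonable option.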
| Status | Meaning | Next action |
|---|---|---|
| uploaded | File received, waiting for metadata confirmation | Call POST /submit/ |
| submitted | Metadata confirmed, queued for validation | Wait |
| pending | Validation in progress | Wait |
| validated | Archive structure OK, metadata harvesting in progress | Wait |
| published | Data publicly accessible | Done ✓ |
| error_no_data | No diffraction images found in the archive | Delete and re-upload a corrected archive |
| inconsistent_datasets | Multiple non-matching datasets detected | Contact support or resubmit |
| header_not_readable | Image headers could not be parsed | Check image format; contact support |
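The status values translate naturally into a polling loop: keep waiting while the status is one of the "Wait" states and stop on anything else. The sketch below takes the status lookup as a caller-supplied function, so it makes no assumption about how the HTTP call is made. `poll_until_settled`, `fetch_status`, the 30-second interval and the poll limit are all illustrative assumptions.

```python
import time

# Statuses whose documented next action is "Wait".
WAITING = frozenset({"submitted", "pending", "validated"})

NEXT_ACTION = {
    "uploaded": "Call POST /submit/",
    "published": "Done",
    "error_no_data": "Delete and re-upload a corrected archive",
    "inconsistent_datasets": "Contact support or resubmit",
    "header_not_readable": "Check image format; contact support",
}


def next_action(status):
    """Map a status to the next action listed in the status table."""
    if status in WAITING:
        return "Wait"
    return NEXT_ACTION.get(status, "Unknown status; contact support")


def poll_until_settled(fetch_status, interval_s=30.0, max_polls=120,
                       sleep=time.sleep):
    """Call fetch_status() until it returns a non-waiting status.

    fetch_status is a hypothetical callable returning the current status
    string, e.g. a wrapper around a GET on the submission.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status not in WAITING:
            return status
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"still waiting after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays.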
The api_upload_client.py script, available at https://github.com/prubach/proteindiffraction_api, wraps the full workflow into a single command.
pip install requests zstandard
python api_upload_client.py \
--server https://proteindiffraction.org/ap \
--email you@lab.edu --password secret \
--path ./raw_data/ \
--pdb 1ABC \
--title "High-resolution structure of protein X" \
--authors "Smith J., Doe A." \
--synchrotron-id 25 --beamline-id 42 \
submit
# List available synchrotrons + beamlines
python api_upload_client.py --server … --email … --password … equipment

# List your submissions
python api_upload_client.py --server … --email … --password … list

# Check / poll a specific submission
python api_upload_client.py --token abc123… --poll \
  --server … --email … --password … status

# Delete a pending submission
python api_upload_client.py --token abc123… \
  --server … --email … --password … delete
from api_upload_client import ProteinDiffractionClient, compress_directory_zstd

client = ProteinDiffractionClient("https://proteindiffraction.org/ap")
client.login("you@lab.edu", "secret")

# compress and upload
archive = compress_directory_zstd("./raw_data")
upload = client.upload(archive)

# submit with metadata
client.submit(
    token=upload["token"],
    pdb_code="1ABC",
    project_title="High-resolution structure of protein X",
    project_authors="Smith J., Doe A.",
    equipment_other="APS 24-ID-C beamline",
)

# wait for processing
final = client.poll_until_done(upload["token"])
print(final["status"])
All errors return JSON with an "error" field:
{ "error": "Description of what went wrong." }
| Status | Meaning |
|---|---|
| 400 | Bad request – missing/invalid field or wrong file format |
| 401 | Authentication credentials missing or invalid |
| 403 | Forbidden – submission belongs to another user |
| 404 | Submission token not found |
| 409 | Conflict – duplicate file hash, or submission already processed |
| 500 | Internal server error – contact support |
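Because every error carries the same JSON shape, a client can funnel all failures through one function. This is a sketch under the assumption that the caller already has the numeric HTTP status and the raw response body; `ApiError` and `raise_for_api_error` are hypothetical names, not part of any published client.

```python
import json


class ApiError(Exception):
    """Raised for any HTTP status of 400 or above."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def raise_for_api_error(status, body):
    """Raise ApiError if the response indicates failure; otherwise return None."""
    if status < 400:
        return
    try:
        message = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Body was empty, not JSON, or not a JSON object.
        message = ""
    raise ApiError(status, message or "no error message in response body")
```

Callers can then branch on `exc.status`, for instance treating 409 as "already uploaded" rather than as a fatal failure.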