Introduction

populse_db is a small Python module whose aim is to store and query JSON data efficiently, with the following constraints:

populse_db offers a simple API on top of a SQLite backend that makes it possible to meet all of these requirements.

Dependencies

Installation tools take care of all the required dependencies for you, but here is a summary of what populse_db needs:

  • Python >= 3.9. populse_db uses some features introduced in Python 3.9 (such as the ability to subscript the built-in list type, as in list[str]). Therefore it cannot work with Python 3.8 or earlier releases.

  • dateutil

  • Lark-parser >= 0.7.0.

  • [optional] Sphinx, to build the documentation from source code.
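
As a small illustration of the Python 3.9 feature mentioned in the first item, the sketch below subscripts the built-in list type directly in annotations. The function name file_names is hypothetical and is not part of populse_db.

```python
# Python >= 3.9: built-in containers can be subscripted directly,
# without importing List from the typing module.
def file_names(paths: list[str]) -> list[str]:
    """Return the last component of each slash-separated path."""
    return [path.rsplit('/', 1)[-1] for path in paths]


names = file_names(['/somewhere/t1/acq0001.dcm', '/elsewhere/sub-rbndt001.nii'])
print(names)
```

On Python 3.8 or earlier, evaluating the list[str] annotation raises a TypeError when the function is defined.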

Get started

Installation

populse_db can be installed with standard Python tools. For instance:

pip install populse_db

The documentation is embedded in the project. However, if you need to rebuild it, install populse_db with its documentation dependencies:

pip install populse_db[doc]

Basic usage

populse_db is organized in collections and documents. A document is a JSON object, represented in Python by a dictionary. A collection is a named container in which documents can be stored. Internally, collections correspond to database tables and documents are stored in table rows (one document per row), but populse_db hides these database details. The only thing the user must do is declare the collections when starting from an empty database.
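
To make the idea of one table per collection and one row per document more concrete, here is a minimal sketch that uses only the standard sqlite3 and json modules. The table layout and the store and load helpers are assumptions made for illustration; this is not the actual populse_db schema or API.

```python
import json
import sqlite3

# Hypothetical layout: one table per collection, one row per document.
# This is NOT the real populse_db schema, only an illustration.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE subject (subject_id TEXT PRIMARY KEY, document TEXT)')


def store(key, document):
    """Serialize a document (a dict) to JSON and store it in one row."""
    conn.execute(
        'INSERT OR REPLACE INTO subject (subject_id, document) VALUES (?, ?)',
        (key, json.dumps(document)),
    )


def load(key):
    """Read one row back and decode it into a dict, or return None."""
    row = conn.execute(
        'SELECT document FROM subject WHERE subject_id = ?', (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


store('rbndt001', {'name': 'Eléa', 'sex': 'f'})
print(load('rbndt001'))
```

populse_db hides this kind of plumbing behind its own API, as the example below shows.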

from datetime import date, datetime
from populse_db import Storage
from pprint import pprint

# Create a storage using a file name
storage = Storage('/tmp/populse_db.sqlite')
# Create a read/write session
with storage.session(write=True) as db:
   db.last_modified = datetime.now()

   # Store documents in the database
   db['subject'] = Storage.Collection('subject_id')
   db['subject']['rbndt001'] = {
      'name': 'Eléa',
      'sex': 'f',
      'birth_date': date(1968, 3, 3),
   }
   db['subject']['rbndt002'] = {
      'name': 'Païkan',
      'sex': 'm',
      'birth_date': date(1963, 12, 7),
   }
   db['acquisition'] = Storage.Collection()
   db['acquisition']['rbndt001_t1'] = {
      'subject_id': 'rbndt001',
      'type': ['image', 'mri', 'T1'],
      'format': 'DICOM',
      'files': [
            '/somewhere/t1/acq0001.dcm',
            '/somewhere/t1/acq0002.dcm',
            '/somewhere/t1/acq0003.dcm',
            '/somewhere/t1/acq0004.dcm',
      ],
      'date': date(2022, 3, 28),
   }
   db['acquisition']['rbndt001_t2'] = {
      'subject_id': 'rbndt001',
      'type': ['image', 'mri', 'T2'],
      'format': 'DICOM',
      'files': [
            '/somewhere/t2/acq0001.dcm',
            '/somewhere/t2/acq0002.dcm',
            '/somewhere/t2/acq0003.dcm',
            '/somewhere/t2/acq0004.dcm',
      ],
      'date': date(2022, 3, 28),
   }
   db['acquisition']['rbndt002_t1'] = {
      'subject_id': 'rbndt002',
      'type': ['image', 'mri', 'T1'],
      'format': 'NIFTI',
      'files': [
            '/elsewhere/sub-rbndt001.nii',
      ],
      'date': date(2022, 3, 29),
   }

   # Retrieve a single value from storage
   print('Last modified:', db.last_modified.get())

   # Retrieve a single document from collection 'subject'
   pprint(db['subject']['rbndt001'].get())

   # Retrieve all documents from collection 'acquisition' satisfying the
   # following conditions:
   #   - the "subject_id" field equals "rbndt001"
   #   - the "type" field is a list containing the value "T1"
   for doc in db['acquisition'].search('subject_id=="rbndt001" and "T1" in type'):
      pprint(doc)

   # Retrieve all distinct subject identifiers
   print('Subjects:', ', '.join(db.subject.distinct_values('subject_id')))

The example above illustrates how to use populse_db to store and retrieve JSON objects without worrying about the underlying SQL database engine. This script produces the following result:

{'birth_date': datetime.date(1968, 3, 3),
'name': 'Eléa',
'primary_key': 'rbndt001',
'sex': 'f'}
{'date': datetime.date(2022, 3, 28),
'files': ['/somewhere/t1/acq0001.dcm',
         '/somewhere/t1/acq0002.dcm',
         '/somewhere/t1/acq0003.dcm',
         '/somewhere/t1/acq0004.dcm'],
'format': 'DICOM',
'primary_key': 'rbndt001_t1',
'subject_id': 'rbndt001',
'type': ['image', 'mri', 'T1']}
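
To clarify what the search condition in the example selects, here is a plain-Python sketch of the same filter applied to in-memory dictionaries. It does not use populse_db; the matches helper is hypothetical, and the documents are abridged copies of the ones stored in the 'acquisition' collection.

```python
# Abridged copies of the documents stored in the 'acquisition' collection.
acquisitions = {
    'rbndt001_t1': {'subject_id': 'rbndt001', 'type': ['image', 'mri', 'T1']},
    'rbndt001_t2': {'subject_id': 'rbndt001', 'type': ['image', 'mri', 'T2']},
    'rbndt002_t1': {'subject_id': 'rbndt002', 'type': ['image', 'mri', 'T1']},
}


def matches(doc):
    """Hypothetical helper: the subject is rbndt001 and 'T1' is in the type list."""
    return doc['subject_id'] == 'rbndt001' and 'T1' in doc['type']


selected = [key for key, doc in acquisitions.items() if matches(doc)]
print(selected)  # prints ['rbndt001_t1']
```

Only one of the three documents satisfies both conditions, which is why the search loop in the example prints a single acquisition.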
