PHAS Client API Reference

The PHAS client API allows you to connect to a running PHAS server and query data. It is a convenient wrapper around the RESTful API.

Usage Examples

Connecting to a Server

Obtain an API key through the PHAS web interface and save it as a JSON file. Pass the path to Client, or set the PHAS_AUTH_KEY environment variable.

from phas.client.api import Client, Task, AnnotationTask, DLTrainingTask, SamplingROITask
import pandas as pd

client = Client('https://phas.myserver.org:8888', '/home/user/private/myserver_api_key.json')

Browsing Projects and Tasks

Client provides methods to list the projects and tasks accessible to the current user. Results are dicts compatible with a pandas.DataFrame constructor.

# List all accessible projects
pd.DataFrame(client.project_listing()).set_index('id')

# List tasks within a project
pd.DataFrame(client.task_listing('my_project')).set_index('id')

# Control anonymization of private specimen names
client.anonymize = True

Working with Tasks and Slides

Task provides access to the slides in a task. Use the task-type-specific subclasses (AnnotationTask, DLTrainingTask, SamplingROITask) for additional functionality. The Slide class provides image access and metadata.

# List all slides in task 5, filtered to a specific stain
task = Task(client, 5)
pd.DataFrame(task.slide_manifest(stain='Nissl')).set_index('id')

# Access a slide and read its properties
from phas.client.api import Slide
slide = Slide(task, slide_id=42)
print(slide.specimen_public, slide.block_name, slide.section)
print(slide.dimensions)   # full-resolution pixel dimensions
print(slide.spacing)      # pixel spacing in mm

# Download a region of interest as a PIL image
patch = slide.get_patch(center=(10000, 8000), level=2, size=(512, 512))

# Download a thumbnail as a NIfTI image
slide.thumbnail_nifti_image(filename='thumb.nii.gz', max_dim=1000)

Annotations

Use AnnotationTask to read, write and export slide annotations.

annot_task = AnnotationTask(client, task_id=3)

# Get the annotation for a slide as a paper.js JSON dict
data = annot_task.get_slide_annot_json(slide_id=42)

# Export the annotation as an SVG file
annot_task.get_slide_annot_svg(slide_id=42, filename='annot.svg')

# Export / import all annotations in the task to a JSON file
annot_task.export_task_annots('all_annots.json')
annot_task.import_task_annots('all_annots.json')

Deep Learning Training Samples

Use DLTrainingTask to access classifier training samples and their patch images.

dl_task = DLTrainingTask(client, task_id=7)

# List all training samples on a slide
samples = dl_task.slide_training_samples(slide_id=42)

# Download the patch image for a sample
img = dl_task.get_sample_image(sample_id=samples[0]['id'])  # PIL Image

Sampling ROIs

Use SamplingROITask to read and manage sampling regions of interest.

roi_task = SamplingROITask(client, task_id=8)

# List slides that have at least two sampling ROIs
pd.DataFrame(roi_task.slide_manifest(min_sroi=2)).set_index('id')

# Get all sampling ROIs on a slide
rois = roi_task.slide_sampling_rois(slide_id=42)

# Download a NIfTI mask image of the sampling ROIs
roi_task.slide_sampling_roi_nifti_image(slide_id=42, filename='rois.nii.gz')

Label Sets

Tasks that involve labeling (training samples, sampling ROIs) have an associated Labelset. Access it through the task’s labelset property.

labels = dl_task.labelset.label_listing()
pd.DataFrame(labels).set_index('id')

API Reference

class phas.client.api.AnnotationTask(client: Client, task_id: int)

A representation of an annotation task on the remote server.

Parameters:

client (Client) – Connection to the PHAS server
task_id (int) – Numerical id of the task

export_task_annots(filename: str)

Export all annotations in the task to a JSON file.

Each entry in the file contains the slide name, slide id, and the paper.js annotation JSON. Slides with no annotation are omitted. The format is compatible with import_task_annots.

Parameters: filename (str) – Path to the output JSON file.
Returns: Number of annotations exported.

get_slide_annot_json(slide_id: int)

Get the annotation for a slide in raw paper.js JSON format.

Parameters: slide_id (int) – Slide ID
Returns: A dict containing the paper.js annotation, or None if the slide has no annotation.

get_slide_annot_svg(slide_id: int, downsample: int = 0, stroke_width: int = 480, strip_width: int = 0, font_size: str = '2000px', font_color: str = 'black', filename: str = None)

Get the annotation for a slide rendered as an SVG.

Parameters:

slide_id (int) – Slide ID
downsample (int, optional) – Maximum image dimension; 0 means full resolution (default: 0).
stroke_width (int, optional) – Stroke width for exported paths (default: 480).
strip_width (int, optional) – Strip width (default: 0).
font_size (str, optional) – Font size for markers (default: ‘2000px’).
font_color (str, optional) – Font color for markers (default: ‘black’).
filename (str, optional) – If provided, save SVG bytes to this file; otherwise return bytes.

Returns:

Raw SVG bytes, or None if filename was provided.

get_slide_annot_timestamp(slide_id: int)

Get the timestamp of the last edit to an annotation.

Parameters: slide_id (int) – Slide ID
Returns: Timestamp string (ISO format) of the last edit, or None if the slide has no annotation.
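
One possible use of these timestamps is detecting which annotations changed since a previous run. The sketch below is not part of the PHAS API: changed_slides is a hypothetical helper that only assumes an object exposing get_slide_annot_timestamp(slide_id) as documented above. It compares timestamp strings for equality rather than parsing them.

```python
def changed_slides(annot_task, slide_ids, previous):
    """Return (changed_ids, snapshot) for the given slides.

    `previous` maps slide id -> timestamp string (or None) from an
    earlier run; `snapshot` is the same mapping for the current run.
    """
    snapshot = {}
    changed = []
    for slide_id in slide_ids:
        # Documented to return an ISO timestamp string, or None
        # when the slide has no annotation.
        stamp = annot_task.get_slide_annot_timestamp(slide_id)
        snapshot[slide_id] = stamp
        if stamp != previous.get(slide_id):
            changed.append(slide_id)
    return changed, snapshot
```

Persisting the returned snapshot (for example as JSON) and passing it back as previous on the next run would give a simple incremental check.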

import_task_annots(filename: str)

Import annotations from a JSON file created by export_task_annots.

Slides are matched by slide_name. Records whose slide name is not found in the task manifest are skipped.

Parameters: filename (str) – Path to the input JSON file.
Returns: Number of annotations successfully imported.
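
Because the export format is compatible with import_task_annots, the two methods can be chained to copy annotations between tasks. The following is a minimal sketch under that assumption: copy_task_annotations is a hypothetical helper, not a PHAS function, and it relies only on the two documented methods and their documented return values.

```python
import os
import tempfile


def copy_task_annotations(src_task, dst_task):
    """Copy annotations from src_task to dst_task via a temporary file.

    Returns (n_exported, n_imported). The counts can differ because
    records whose slide name is not found in the destination task
    are skipped on import.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'annots.json')
        n_exported = src_task.export_task_annots(path)
        n_imported = dst_task.import_task_annots(path)
    return n_exported, n_imported
```

Comparing the two counts is a quick way to notice slides that exist in the source task but not in the destination.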

set_slide_annot_json(slide_id: int, data: dict)

Upload an annotation for a slide in raw paper.js JSON format.

Parameters:

slide_id (int) – Slide ID
data (dict) – Annotation in paper.js JSON format (raw slide coordinate space)

Returns:

True if the update was successful.

class phas.client.api.Client(url, api_key: str | None = None, verify=True)

A connection to a remote PHAS server.

The connection is established by supplying a server address and an API key. The API key is a JSON file that can be generated through the web interface; it can also be supplied by setting the environment variable PHAS_AUTH_KEY.

Parameters:

url (str) – URL of the remote server, e.g., http://histo.itksnap.org:8888. The URL must include the scheme (http: or https:); the port number is optional.
api_key (str, optional) – Path to the JSON file storing the user’s API key. The API key can be downloaded by connecting to the PHAS server via the web interface.
verify (bool, optional) – Whether to perform SSL verification (see requests package)
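
The URL requirement above can be checked before a Client is constructed. This is only an illustrative pre-flight check using the standard library; check_server_url is a hypothetical helper, and Client may perform its own validation.

```python
from urllib.parse import urlsplit


def check_server_url(url):
    """Validate a server URL and return (scheme, hostname, port).

    The scheme must be http or https; the port is optional and is
    returned as None when absent.
    """
    parts = urlsplit(url)
    if parts.scheme not in ('http', 'https'):
        raise ValueError(f'URL must start with http:// or https://: {url!r}')
    if not parts.hostname:
        raise ValueError(f'URL has no host name: {url!r}')
    return parts.scheme, parts.hostname, parts.port


print(check_server_url('https://phas.myserver.org:8888'))
```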

property anonymize: Whether the client is connected in anonymized mode (bool). In anonymized mode, all personally identifying information, such as private specimen names, is hidden. Even if anonymized mode is off, such information remains hidden unless the user has been granted explicit access to private information on a project.

get_project_private_access(project: str)

Check whether the user has access to private information on a project. If not, such information will be hidden even if anonymized mode is off.

Parameters: project (str) – Id of the project
Returns: True if the user has access to private information on the project, False otherwise.
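
The two conditions that control visibility of private information can be combined into a single check. This sketch restates the rule described above; private_names_visible is a hypothetical helper that assumes only the documented anonymize property and get_project_private_access method.

```python
def private_names_visible(client, project):
    """Return True only if private specimen names should be visible.

    Private information is hidden when anonymized mode is on, and
    also when the user lacks private access on the project.
    """
    if client.anonymize:
        return False
    return bool(client.get_project_private_access(project))
```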

project_listing()

Listing of projects available on the server.

Returns: A dict with project details that can be passed to a pandas DataFrame constructor. Only the projects to which the user has access are returned.

task_listing(project)

Listing of tasks available for a project.

Parameters: project (str) – Id of the project
Returns: A dict with task details that can be passed to a pandas DataFrame constructor. Only the tasks to which the user has access are returned.

class phas.client.api.DLTrainingTask(client: Client, task_id: int)

A representation of a classifier training task on the remote server.

Parameters:

client (Client) – Connection to the PHAS server
task_id (int) – Numerical id of the task

get_sample_image(sample_id: int)

Download a PNG for a sample.

Parameters: sample_id (int) – ID of the sample
Returns: PIL Image containing the requested region

slide_training_samples(slide_id: int)

Get all the training samples available on a slide.

Parameters: slide_id (int) – Slide ID
Returns: A dict containing the training samples
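
The two methods above can be combined to save every sample patch on a slide to disk. The sketch below is illustrative only: save_sample_images and the file naming scheme are hypothetical, and it assumes the samples can be iterated as dicts with an 'id' field, as the usage example (samples[0]['id']) suggests.

```python
import os


def save_sample_images(dl_task, slide_id, out_dir):
    """Save the patch image of each training sample on a slide as PNG.

    Returns the list of file paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for sample in dl_task.slide_training_samples(slide_id=slide_id):
        # get_sample_image is documented to return a PIL Image.
        image = dl_task.get_sample_image(sample_id=sample['id'])
        path = os.path.join(out_dir, f"sample_{sample['id']}.png")
        image.save(path)
        paths.append(path)
    return paths
```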

class phas.client.api.Labelset(client: Client, project: str, labelset_id: int)

A representation of a labelset in PHAS.

Parameters:

client (Client) – Connection to the PHAS server
project (str) – Project associated with the labelset
labelset_id (int) – Numeric id of the labelset

label_listing()

List all labels in this labelset.

Returns: A list of dicts, each describing a label with fields such as id, name, and color.
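
Label ids are often needed by name, for example when creating sampling ROIs. A small lookup can be built from the listing. This is a sketch, not part of the API: label_ids_by_name is a hypothetical helper that assumes each label dict carries the id and name fields mentioned above.

```python
def label_ids_by_name(labels):
    """Build a {name: id} lookup from the output of label_listing()."""
    lookup = {}
    for label in labels:
        name = label['name']
        if name in lookup:
            # Refuse ambiguous names rather than silently overwriting.
            raise ValueError(f'duplicate label name: {name!r}')
        lookup[name] = label['id']
    return lookup
```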

class phas.client.api.SamplingROIPatchExtractor(slide, geom, tile_size=1024, padding=32, sub_tiles_per_tile=16)

SamplingROIPatchExtractor is a helper class that makes it easy to download patches that overlap a sampling ROI. This is particularly useful for large sampling ROIs, where we do not want to process the whole ROI with quantitative tools but rather extract a few random measurements from it.

The area containing the sampling ROI is divided into equal-size tiles (default size 1024x1024), and the list of tiles that overlap the sampling ROI can be obtained using the tiles() method. Then, for each tile, get_tile_patch_and_mask extracts a patch containing that tile plus some padding, along with a mask that shows the portion of the tile that overlaps the sampling ROI.

Parameters:

slide (Slide) – The slide from which the patches will be sampled
geom (dict) – The geometry of the sampling ROI that will be used to guide the sampling.
tile_size (int, optional) – The size of the tile in pixels (default: 1024). Should be divisible by sub_tiles_per_tile.
padding (int, optional) – Padding added to the tiles. Default: 32 pixels.
sub_tiles_per_tile (int, optional) – Subtiling is used to render the polygon before determining which tiles overlap the ROI. The default of 16 should be good for most applications.
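
The relationship between these parameters can be worked out with a little arithmetic. The sketch below rests on two assumptions that the text does not confirm: that sub_tiles_per_tile counts subtiles along each side of a tile, and that padding is added on every side of the tile. tile_layout is a hypothetical helper, not part of the API.

```python
def tile_layout(tile_size=1024, padding=32, sub_tiles_per_tile=16):
    """Return derived sizes (in pixels) for a tiling configuration."""
    if tile_size % sub_tiles_per_tile != 0:
        raise ValueError('tile_size should be divisible by sub_tiles_per_tile')
    return {
        # Assumption: subtiles are counted per side of the tile.
        'subtile_size': tile_size // sub_tiles_per_tile,
        # Assumption: padding is applied on all four sides.
        'padded_size': tile_size + 2 * padding,
    }


print(tile_layout())
```

Under these assumptions the defaults give 64-pixel subtiles and 1088-pixel padded patches.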

get_tile_patch_and_mask(tile_index)

Download a padded tile and ROI mask for the tile.

Parameters: tile_index (tuple) – Index of the tile; must be one of the rows of the array returned by tiles().
Returns: Tuple (image, mask), where image is a PIL image containing the padded tile and mask is a PIL image in which all pixels in the non-padded part of the tile that overlap the sampling ROI are labeled 255.

subtile_mask_density()

Returns: A 2D array where each element represents a subtile and the value of each element is the percentage of the area of that subtile that overlaps the sampling ROI. For visualization/debugging use.

tile_mask_density()

Returns: A 2D array where each element represents a tile and the value of each element is the percentage of the area of that tile that overlaps the sampling ROI. For visualization/debugging use.

tile_patch_origin(tile_index)

Compute the origin (per ITK) of the patch returned by get_tile_patch_and_mask.

Parameters: tile_index (tuple) – Index of the tile; must be one of the rows of the array returned by tiles().
Returns: np.array of size 2 containing the coordinates of the center of the (0,0) pixel in the patch returned by get_tile_patch_and_mask. Together with Slide.spacing, this can be used to set the header of the patch image relative to the overall slide image.

tiles()

Return the indices of the tiles that overlap the sampling ROI.

Returns: N x 2 array of tile indices. Each tile corresponds to a downloadable patch that overlaps the sampling ROI.
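
The intended workflow of extracting a few random measurements from a large ROI might look like the sketch below. sample_tile_patches is a hypothetical helper that relies only on the documented tiles() and get_tile_patch_and_mask() methods; the seeded random selection is an illustrative choice, not something the library prescribes.

```python
import random


def sample_tile_patches(extractor, n_tiles, seed=0):
    """Yield (tile_index, image, mask) for a random subset of tiles."""
    # Convert each row of the N x 2 index array to a plain tuple of ints.
    indices = [tuple(int(v) for v in row) for row in extractor.tiles()]
    rng = random.Random(seed)
    chosen = rng.sample(indices, min(n_tiles, len(indices)))
    for tile_index in chosen:
        image, mask = extractor.get_tile_patch_and_mask(tile_index)
        yield tile_index, image, mask
```

Fixing the seed makes the selection reproducible, which helps when the same tiles need to be revisited later.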

class phas.client.api.SamplingROITask(client: Client, task_id: int)

A representation of a sampling ROI placement task on the remote server.

Parameters:

client (Client) – Connection to the PHAS server
task_id (int) – Numerical id of the task

create_sampling_roi(slide_id: int, label_id: int, geom_data: dict)

Create a new sampling ROI on a slide.

Parameters:

slide_id (int) – Slide ID
label_id (int) – Label to assign to the new sampling ROI
geom_data (dict) – A dict describing the sampling ROI geometry, see dltrain.sampling_roi_schema

Returns:

id of the newly created ROI

delete_sampling_rois_on_slide(slide_id: int)

Delete all the sampling ROIs on a slide.

Parameters: slide_id (int) – Slide ID
Returns: True if the request was successful.

slide_sampling_roi_nifti_image(slide_id: int, filename: str = None, max_dim: int = 1000)

Generate a NIfTI image of the sampling ROIs on a slide.

Parameters:

slide_id (int) – Slide ID
filename (str, optional) – File where to save the image. If not specified, bytes containing the image are returned.
max_dim (int, optional) – Maximum image dimension in pixels, defaults to 1000.

Returns:

Raw image data as bytes, or None if filename was provided.

slide_sampling_rois(slide_id: int)

Get all the sampling ROIs on a slide.

Parameters: slide_id (int) – Slide ID
Returns: A dict containing the sampling ROIs

class phas.client.api.Slide(task: Task, slide_id: int)

A representation of a slide on the remote server.

This class represents a slide and provides access to methods that are slide-specific.

Parameters:

task (Task) – Task under which to access the slide. The task is used to check access privileges.
slide_id (int) – Numerical id of the slide

property block_name: Name of the slide’s block (str)

property dimensions: Slide dimensions in pixels (tuple of int).

property fullpath: Full path or URL of the slide on the server.

get_patch(center, level, size, tile_size=1024)

Read a region from the slide.

Parameters:

center – Tuple of int indicating the center of the patch in full-resolution pixel units
level – Downsample level from which to retrieve the region.
size – Size of the image to retrieve.

Returns:

PIL Image containing the requested region

property level_dimensions: Pixel dimensions at each OpenSlide downsample level (list of (width, height) tuples).

property level_downsamples: Downsample factor for each OpenSlide level relative to level 0 (list of float).
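
When choosing the level argument for get_patch, it can help to pick the level whose downsample factor best fits a target. The sketch below mirrors the idea behind OpenSlide's best-level lookup but is a hypothetical local helper, not part of the PHAS API. It takes the list returned by level_downsamples and returns the level with the largest factor that does not exceed the target.

```python
def best_level_for_downsample(level_downsamples, target):
    """Return the level with the largest downsample factor <= target.

    Falls back to level 0 when every factor exceeds the target.
    """
    best_level, best_factor = 0, None
    for level, factor in enumerate(level_downsamples):
        if factor <= target and (best_factor is None or factor > best_factor):
            best_level, best_factor = level, factor
    return best_level


print(best_level_for_downsample([1.0, 4.0, 16.0], 8))
```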

property properties: OpenSlide properties dict for the slide (e.g. scanner metadata, MPP values).

property section: Number of the slide’s section (int)

property slide_number: Number of the slide within the section (int)

property spacing: Slide pixel spacing in millimeters.
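
Combined with dimensions, the spacing gives the physical extent of a slide. The text does not state whether spacing is a single number or a per-axis pair, so the hypothetical helper below accepts either. Treat it as a sketch under that assumption.

```python
def physical_size_mm(dimensions, spacing):
    """Return (width_mm, height_mm) from pixel dimensions and spacing.

    `spacing` may be a single number or an (x, y) pair, in millimeters.
    """
    width_px, height_px = dimensions
    try:
        spacing_x, spacing_y = spacing
    except TypeError:
        # A scalar spacing applies to both axes.
        spacing_x = spacing_y = spacing
    return width_px * spacing_x, height_px * spacing_y
```

For example, a 100000 x 80000 pixel slide at 0.0005 mm per pixel is roughly 50 mm x 40 mm.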

property specimen_private: Name of the slide’s specimen, not anonymized (str)

property specimen_public: Name of the slide’s specimen, anonymized (str)

property stain: Name of the slide’s stain (str)

property tags: List of tags associated with the slide. Tags are arbitrary strings that can be used to label slides and filter them when listing slides in a task.

thumbnail_nifti_image(filename: str = None, max_dim: int = 1000)

Download a thumbnail of the slide as a NIfTI image.

Parameters:

filename (str, optional) – File where to save the image. If not specified, raw bytes containing the image are returned.
max_dim (int, optional) – Maximum image dimension in pixels, defaults to 1000.

Returns:

Raw image data as bytes, or None if filename was provided.

class phas.client.api.Task(client: Client, task_id: int)

A representation of a task on the remote server.

This class represents a task and provides access to methods that are task-specific.

Parameters:

client (Client) – Connection to the PHAS server
task_id (int) – Numerical id of the task

property labelset: Labelset associated with this task (or None if task has no labelset)

slide_manifest(specimen=None, block=None, section=None, slide=None, stain=None, min_paths=None, min_markers=None, min_sroi=None, tags_any=None, tags_all=None, tags_none=None)

A detailed listing of the slides in the task.

Parameters:

specimen (str, optional) – Only list slides for the given specimen
block (str, optional) – Only list slides for the given block
section (str, optional) – Only list slides for the given section
slide (str, optional) – Only list slides for the given slide number
stain (str, optional) – Only list slides with the given stain
min_paths (int, optional) – Only list slides with at least this many annotation paths
min_markers (int, optional) – Only list slides with at least this many annotation markers
min_sroi (int, optional) – Only list slides with at least this many sampling ROIs
tags_any (list of str, optional) – Only list slides that have at least one of the tags in this list
tags_all (list of str, optional) – Only list slides that have all the tags in this list
tags_none (list of str, optional) – Only list slides that have none of the tags in this list

Returns:

A dict with slide details that can be passed to a pandas DataFrame constructor
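
The three tag filters can be read as set operations. The sketch below restates their described meaning as a local function. It is not the server's implementation, and matches_tag_filters is a hypothetical name. It could also be applied client-side to the tags property of a Slide.

```python
def matches_tag_filters(tags, tags_any=None, tags_all=None, tags_none=None):
    """Return True if a slide's tags satisfy the given filters."""
    tag_set = set(tags)
    # tags_any: at least one of the listed tags must be present.
    if tags_any is not None and not tag_set.intersection(tags_any):
        return False
    # tags_all: every listed tag must be present.
    if tags_all is not None and not set(tags_all).issubset(tag_set):
        return False
    # tags_none: none of the listed tags may be present.
    if tags_none is not None and tag_set.intersection(tags_none):
        return False
    return True
```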