PHAS Client API Reference

The PHAS client API allows you to connect to a running PHAS server and query data. It is a convenient wrapper around the RESTful API.

Usage Examples

Connecting to a Server

Obtain an API key through the PHAS web interface and save it as a JSON file. Pass the path to Client, or set the PHAS_AUTH_KEY environment variable.

from phas.client.api import Client, Task, AnnotationTask, DLTrainingTask, SamplingROITask
import pandas as pd

client = Client('https://phas.myserver.org:8888', '/home/user/private/myserver_api_key.json')

Browsing Projects and Tasks

Client provides methods to list the projects and tasks accessible to the current user. Results are dicts compatible with a pandas.DataFrame constructor.

# List all accessible projects
pd.DataFrame(client.project_listing()).set_index('id')

# List tasks within a project
pd.DataFrame(client.task_listing('my_project')).set_index('id')

# Control anonymization of private specimen names
client.anonymize = True

Working with Tasks and Slides

Task provides access to the slides in a task. Use the task-type-specific subclasses (AnnotationTask, DLTrainingTask, SamplingROITask) for additional functionality. The Slide class provides image access and metadata.

# List all slides in task 5, filtered to a specific stain
task = Task(client, 5)
pd.DataFrame(task.slide_manifest(stain='Nissl')).set_index('id')

# Access a slide and read its properties
from phas.client.api import Slide
slide = Slide(task, slide_id=42)
print(slide.specimen_public, slide.block_name, slide.section)
print(slide.dimensions)   # full-resolution pixel dimensions
print(slide.spacing)      # pixel spacing in mm

# Download a region of interest as a PIL image
patch = slide.get_patch(center=(10000, 8000), level=2, size=(512, 512))

# Download a thumbnail as a NIfTI image
slide.thumbnail_nifti_image(filename='thumb.nii.gz', max_dim=1000)

Annotations

Use AnnotationTask to read, write and export slide annotations.

annot_task = AnnotationTask(client, task_id=3)

# Get the annotation for a slide as a paper.js JSON dict
data = annot_task.get_slide_annot_json(slide_id=42)

# Export the annotation as an SVG file
annot_task.get_slide_annot_svg(slide_id=42, filename='annot.svg')

# Export / import all annotations in the task to a JSON file
annot_task.export_task_annots('all_annots.json')
annot_task.import_task_annots('all_annots.json')

Deep Learning Training Samples

Use DLTrainingTask to access classifier training samples and their patch images.

dl_task = DLTrainingTask(client, task_id=7)

# List all training samples on a slide
samples = dl_task.slide_training_samples(slide_id=42)

# Download the patch image for a sample
img = dl_task.get_sample_image(sample_id=samples[0]['id'])  # PIL Image

Sampling ROIs

Use SamplingROITask to read and manage sampling regions of interest.

roi_task = SamplingROITask(client, task_id=8)

# List slides that have at least two sampling ROIs
pd.DataFrame(roi_task.slide_manifest(min_sroi=2)).set_index('id')

# Get all sampling ROIs on a slide
rois = roi_task.slide_sampling_rois(slide_id=42)

# Download a NIfTI mask image of the sampling ROIs
roi_task.slide_sampling_roi_nifti_image(slide_id=42, filename='rois.nii.gz')

Label Sets

Tasks that involve labeling (training samples, sampling ROIs) have an associated Labelset. Access it through the task’s labelset property.

labels = dl_task.labelset.label_listing()
pd.DataFrame(labels).set_index('id')

API Reference

class phas.client.api.AnnotationTask(client: Client, task_id: int)

A representation of an annotation task on the remote server.

Parameters:
  • client (Client) – Connection to the PHAS server

  • task_id (int) – Numerical id of the task

export_task_annots(filename: str)

Export all annotations in the task to a JSON file.

Each entry in the file contains the slide name, slide id, and the paper.js annotation JSON. Slides with no annotation are omitted. The format is compatible with import_task_annots.

Parameters:

filename (str) – Path to the output JSON file.

Returns:

Number of annotations exported.

get_slide_annot_json(slide_id: int)

Get the annotation for a slide in raw paper.js JSON format.

Parameters:

slide_id (int) – Slide ID

Returns:

A dict containing the paper.js annotation, or None if the slide has no annotation.
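
Because get_slide_annot_json returns None for slides without an annotation, code that loops over several slides usually needs to skip those. The following is a minimal sketch of that pattern; collect_slide_annotations is a hypothetical helper name, not part of phas.client.api, and it relies only on the get_slide_annot_json method documented above.

```python
def collect_slide_annotations(annot_task, slide_ids):
    """Return {slide_id: annotation dict} for slides that have an annotation.

    Hypothetical helper (not part of phas.client.api). `annot_task` is
    expected to provide the documented `get_slide_annot_json` method.
    """
    annotations = {}
    for slide_id in slide_ids:
        data = annot_task.get_slide_annot_json(slide_id=slide_id)
        if data is not None:  # None means the slide has no annotation
            annotations[slide_id] = data
    return annotations
```

With a real task this might be called as collect_slide_annotations(annot_task, [42, 43, 44]), where the slide ids are placeholders.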

get_slide_annot_svg(slide_id: int, downsample: int = 0, stroke_width: int = 480, strip_width: int = 0, font_size: str = '2000px', font_color: str = 'black', filename: str = None)

Get the annotation for a slide rendered as an SVG.

Parameters:
  • slide_id (int) – Slide ID

  • downsample (int, optional) – Maximum image dimension; 0 means full resolution (default: 0).

  • stroke_width (int, optional) – Stroke width for exported paths (default: 480).

  • strip_width (int, optional) – Strip width (default: 0).

  • font_size (str, optional) – Font size for markers (default: ‘2000px’).

  • font_color (str, optional) – Font color for markers (default: ‘black’).

  • filename (str, optional) – If provided, save SVG bytes to this file; otherwise return bytes.

Returns:

Raw SVG bytes, or None if filename was provided.

get_slide_annot_timestamp(slide_id: int)

Get the timestamp of the last edit to an annotation.

Parameters:

slide_id (int) – Slide ID

Returns:

Timestamp string (ISO format) of the last edit, or None if the slide has no annotation.
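
One possible use of the timestamp is to detect annotations edited after a given point in time. The sketch below assumes the returned string can be parsed by Python's datetime.fromisoformat and that the datetime passed as since has the same timezone awareness as the parsed value; neither assumption is confirmed by this reference. annot_modified_since is a hypothetical helper name.

```python
from datetime import datetime

def annot_modified_since(annot_task, slide_id, since):
    """Return True if the slide's annotation was edited after `since`.

    Hypothetical helper (not part of phas.client.api). Assumes the
    timestamp string is parseable by datetime.fromisoformat.
    """
    stamp = annot_task.get_slide_annot_timestamp(slide_id)
    if stamp is None:
        return False  # the slide has no annotation
    return datetime.fromisoformat(stamp) > since
```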

import_task_annots(filename: str)

Import annotations from a JSON file created by export_task_annots.

Slides are matched by slide_name. Records whose slide name is not found in the task manifest are skipped.

Parameters:

filename (str) – Path to the input JSON file.

Returns:

Number of annotations successfully imported.

set_slide_annot_json(slide_id: int, data: dict)

Upload an annotation for a slide in raw paper.js JSON format.

Parameters:
  • slide_id (int) – Slide ID

  • data (dict) – Annotation in paper.js JSON format (raw slide coordinate space)

Returns:

True if the update was successful.

class phas.client.api.Client(url, api_key: str | None = None, verify=True)

A connection to a remote PHAS server.

The connection is established by supplying a server address and an API key. The API key is a JSON file that can be generated through the web interface; it can also be supplied by setting the environment variable PHAS_AUTH_KEY.

Parameters:
  • url (str) – URL of the remote server, e.g., http://histo.itksnap.org:8888. The URL must include the scheme (http:// or https://); the port number is optional.

  • api_key (str, optional) – Path to the JSON file storing the user’s API key. The API key can be downloaded by connecting to the PHAS server via the web interface.

  • verify (bool, optional) – Whether to perform SSL verification (see requests package)

property anonymize

Whether the client is connected in anonymized mode (bool). In anonymized mode, all personally identifying information, such as private specimen names, is hidden. Even when anonymized mode is off, such information remains hidden unless the user has been granted explicit access to private information on the project.

get_project_private_access(project: str)

Check whether the user has access to private information on a project. If not, such information will be hidden even if anonymized mode is off.

Parameters:

project (str) – Id of the project

Returns:

True if the user has access to private information on the project, False otherwise.
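
The interaction between anonymized mode and per-project private access described above can be summarized in a small helper. This is a sketch only: private_names_visible is a hypothetical name, not part of phas.client.api, and it uses nothing beyond the anonymize property and get_project_private_access.

```python
def private_names_visible(client, project):
    """Return True only if private specimen names would be shown.

    Hypothetical helper (not part of phas.client.api). `client` is expected
    to expose the documented `anonymize` property and the
    `get_project_private_access` method.
    """
    # Anonymized mode hides private information regardless of access rights.
    if client.anonymize:
        return False
    # With anonymized mode off, the per-project permission still applies.
    return bool(client.get_project_private_access(project))
```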

project_listing()

Listing of projects available on the server.

Returns:

A dict with project details that can be passed to a pandas DataFrame constructor. Only the projects to which the user has access are returned.

task_listing(project)

Listing of tasks available for a project.

Parameters:

project (str) – Id of the project

Returns:

A dict with task details that can be passed to a pandas DataFrame constructor. Only the tasks to which the user has access are returned.

class phas.client.api.DLTrainingTask(client: Client, task_id: int)

A representation of a classifier training task on the remote server.

Parameters:
  • client (Client) – Connection to the PHAS server

  • task_id (int) – Numerical id of the task

get_sample_image(sample_id: int)

Download a PNG for a sample.

Parameters:

sample_id (int) – ID of the sample

Returns:

PIL Image containing the patch for the requested sample

slide_training_samples(slide_id: int)

Get all the training samples available on a slide.

Parameters:

slide_id (int) – Slide ID

Returns:

A dict containing the training samples

class phas.client.api.Labelset(client: Client, project: str, labelset_id: int)

A representation of a labelset in PHAS.

Parameters:
  • client (Client) – Connection to the PHAS server

  • project (str) – Project associated with the labelset

  • labelset_id (int) – Numeric id of the labelset

label_listing()

List all labels in this labelset.

Returns:

A list of dicts, each describing a label with fields such as id, name, and color.
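
Since each label is described by a dict with id and name fields, a name-to-id lookup is often handy when creating or filtering labeled objects. The sketch below builds one; label_ids_by_name is a hypothetical helper, and the example records are made up to match the documented shape rather than taken from a real server.

```python
def label_ids_by_name(labels):
    """Map label name -> label id for a list of label dicts."""
    return {label['name']: label['id'] for label in labels}

# Made-up records in the documented shape (id, name, color)
example_labels = [
    {'id': 1, 'name': 'tangle', 'color': '#ff0000'},
    {'id': 2, 'name': 'background', 'color': '#0000ff'},
]
lookup = label_ids_by_name(example_labels)
```

With a real task, the input would come from a call such as dl_task.labelset.label_listing().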

class phas.client.api.SamplingROIPatchExtractor(slide, geom, tile_size=1024, padding=32, sub_tiles_per_tile=16)

SamplingROIPatchExtractor is a helper class for downloading patches that overlap a sampling ROI. It is particularly useful for large sampling ROIs, where the goal is to extract a few random measurements from the ROI rather than process the whole ROI with quantitative tools.

The area containing the sampling ROI is divided into equal-size tiles (default size 1024x1024). The tiles that overlap the sampling ROI can be listed with the tiles() method. For each tile, get_tile_patch_and_mask returns a patch containing that tile plus some padding, along with a mask showing the portion of the tile that overlaps the sampling ROI.

Parameters:
  • slide (Slide) – The slide from which the patches will be sampled

  • geom (dict) – The geometry of the sampling ROI that will be used to guide the sampling.

  • tile_size (int, optional) – Size of the tile in pixels (default: 1024). Should be divisible by sub_tiles_per_tile.

  • padding (int, optional) – Padding added to the tiles. Default: 32 pixels.

  • sub_tiles_per_tile (int, optional) – Subtiling is used to render the polygon before determining which tiles overlap the ROI. The default of 16 should suit most applications.
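
The arithmetic implied by these parameters can be sketched as follows. This is an illustration under stated assumptions, not the class's actual implementation: it assumes the tile grid covers the ROI's bounding box starting at one corner, and that padding is added on every side of a tile, so a padded patch measures tile_size + 2 * padding along each edge. All three function names are hypothetical.

```python
import math

def tile_grid_shape(bbox_width, bbox_height, tile_size=1024):
    """Number of tiles (columns, rows) needed to cover a bounding box."""
    return (math.ceil(bbox_width / tile_size),
            math.ceil(bbox_height / tile_size))

def subtile_size(tile_size=1024, sub_tiles_per_tile=16):
    """Edge length of one sub-tile; tile_size must divide evenly."""
    if tile_size % sub_tiles_per_tile != 0:
        raise ValueError('tile_size should be divisible by sub_tiles_per_tile')
    return tile_size // sub_tiles_per_tile

def padded_patch_size(tile_size=1024, padding=32):
    """Edge length of a tile padded on both sides (an assumption)."""
    return tile_size + 2 * padding
```

With the defaults, each sub-tile is 64 pixels wide, which is why tile_size should be divisible by sub_tiles_per_tile.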

get_tile_patch_and_mask(tile_index)

Download a padded tile and ROI mask for the tile.

Parameters:

tile_index (tuple) – Index of the tile, must be one of the rows of the array returned by tiles()

Returns:

Tuple (image, mask), where image is a PIL image containing the padded tile, and mask is a PIL image in which all the pixels in the non-padded part of the tile that overlap the sampling ROI are labeled 255.

subtile_mask_density()

Returns: a 2D array where each element represents a subtile and the value of each element is the percentage of the area of that subtile that overlaps the sampling ROI. For visualization/debugging use.

tile_mask_density()

Returns: a 2D array where each element represents a tile and the value of each element is the percentage of the area of that tile that overlaps the sampling ROI. For visualization/debugging use.

tile_patch_origin(tile_index)

Compute the origin (per ITK) of the patch returned by get_tile_patch_and_mask

Parameters:

tile_index (tuple) – Index of the tile, must be one of the rows of the array returned by tiles()

Returns:

np.array of size 2 containing the coordinate of the center of the (0,0) pixel in the patch returned by get_tile_patch_and_mask. Together with Slide.spacing, this can be used to set the header of the patch image relative to the overall slide image.
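
To illustrate how an origin and a pixel spacing combine, the sketch below computes the physical coordinate of the center of an arbitrary pixel in a patch. It assumes an axis-aligned image, a patch sampled at the same resolution that the spacing describes, and a spacing given as a 2-element sequence; the type of Slide.spacing is not specified in this reference, so a scalar spacing would need to be repeated for both axes. patch_pixel_to_physical is a hypothetical helper.

```python
def patch_pixel_to_physical(origin, spacing, pixel):
    """Physical coordinate of the center of pixel (col, row) in a patch.

    origin  : coordinate of the center of pixel (0, 0), e.g. the value
              returned by tile_patch_origin
    spacing : per-axis pixel spacing (assumed to be a 2-element sequence)
    pixel   : (col, row) index within the patch
    """
    return tuple(o + s * p for o, s, p in zip(origin, spacing, pixel))
```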

tiles()

Return the indices of the tiles that overlap the sampling ROI

Returns:

N x 2 array of tile indices. Each tile corresponds to a downloadable patch that overlaps the sampling ROI.
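
A typical use of tiles() is to pick a few tiles at random and download only those. The following sketch does that using only the tiles() and get_tile_patch_and_mask methods documented here; sample_tile_patches is a hypothetical helper, and converting each row to a tuple of ints is an assumption about what get_tile_patch_and_mask accepts as a tile index.

```python
import random

def sample_tile_patches(extractor, n_tiles, seed=None):
    """Download patches and masks for up to n_tiles randomly chosen tiles.

    Hypothetical helper (not part of phas.client.api). `extractor` is
    expected to provide `tiles()` and `get_tile_patch_and_mask()`.
    """
    # Convert each row of the N x 2 index array into a plain tuple of ints
    indices = [tuple(int(v) for v in row) for row in extractor.tiles()]
    rng = random.Random(seed)
    chosen = rng.sample(indices, min(n_tiles, len(indices)))
    # Each result pairs the tile index with its (image, mask) tuple
    return [(idx, extractor.get_tile_patch_and_mask(idx)) for idx in chosen]
```

Passing a fixed seed makes the selection reproducible across runs.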

class phas.client.api.SamplingROITask(client: Client, task_id: int)

A representation of a sampling ROI placement task on the remote server.

Parameters:
  • client (Client) – Connection to the PHAS server

  • task_id (int) – Numerical id of the task

create_sampling_roi(slide_id: int, label_id: int, geom_data: dict)

Create a new sampling ROI on a slide.

Parameters:
  • slide_id (int) – Slide ID

  • label_id (int) – Label to assign to the new sampling ROI

  • geom_data (dict) – A dict describing the sampling ROI geometry, see dltrain.sampling_roi_schema

Returns:

id of the newly created ROI

delete_sampling_rois_on_slide(slide_id: int)

Delete all the sampling ROIs on a slide.

Parameters:

slide_id (int) – Slide ID

Returns:

True if request was successful

slide_sampling_roi_nifti_image(slide_id: int, filename: str = None, max_dim: int = 1000)

Generate a NIfTI image of the sampling ROIs on a slide.

Parameters:
  • slide_id (int) – Slide ID

  • filename (str, optional) – File where to save the image. If not specified, raw bytes containing the image are returned.

  • max_dim (int, optional) – Maximum image dimension, defaults to 1000.

Returns:

Raw image data as bytes, or None if filename was provided.

slide_sampling_rois(slide_id: int)

Get all the sampling ROIs on a slide.

Parameters:

slide_id (int) – Slide ID

Returns:

A dict containing the sampling ROIs

class phas.client.api.Slide(task: Task, slide_id: int)

A representation of a slide on the remote server.

This class represents a slide and provides access to methods that are slide-specific.

Parameters:
  • task (Task) – Task under which to access the slide. The task is used to check access privileges.

  • slide_id (int) – Numerical id of the slide

property block_name

Name of the slide’s block (str)

property dimensions

Slide dimensions in pixels (tuple of int).

property fullpath

Full path or URL of the slide on the server.

get_patch(center, level, size, tile_size=1024)

Read a region from the slide.

Parameters:
  • center – Tuple of int indicating the center of the patch in full-resolution pixel units

  • level – Downsample level from which to retrieve the region.

  • size – Size of the image to retrieve.

Returns:

PIL Image containing the requested region

property level_dimensions

Pixel dimensions at each OpenSlide downsample level (list of (width, height) tuples).

property level_downsamples

Downsample factor for each OpenSlide level relative to level 0 (list of float).
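
When calling get_patch, the level argument can be chosen from level_downsamples. The sketch below picks the coarsest level whose downsample factor does not exceed a target, falling back to level 0 when no level qualifies. This mirrors the usual OpenSlide convention but is written here as a standalone hypothetical helper, not a method of Slide.

```python
def best_level_for_downsample(level_downsamples, target):
    """Index of the coarsest level whose downsample factor is <= target.

    Falls back to level 0 when every level is coarser than the target.
    """
    candidates = [(factor, level)
                  for level, factor in enumerate(level_downsamples)
                  if factor <= target]
    return max(candidates)[1] if candidates else 0
```

For a slide object, this might be called as best_level_for_downsample(slide.level_downsamples, 8).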

property properties

OpenSlide properties dict for the slide (e.g. scanner metadata, MPP values).

property section

Number of the slide’s section (int)

property slide_number

Number of the slide within the section (int)

property spacing

Slide pixel spacing in millimeters.

property specimen_private

Name of the slide’s specimen, not anonymized (str)

property specimen_public

Name of the slide’s specimen, anonymized (str)

property stain

Name of the slide’s stain (str)

property tags

List of tags associated with the slide. Tags are arbitrary strings that can be used to label slides and filter them when listing slides in a task.

thumbnail_nifti_image(filename: str = None, max_dim: int = 1000)

Download a thumbnail of the slide as a NIfTI image.

Parameters:
  • filename (str, optional) – File where to save the image. If not specified, raw bytes containing the image are returned.

  • max_dim (int, optional) – Maximum image dimension in pixels, defaults to 1000.

Returns:

Raw image data as bytes, or None if filename was provided.

class phas.client.api.Task(client: Client, task_id: int)

A representation of a task on the remote server.

This class represents a task and provides access to methods that are task-specific.

Parameters:
  • client (Client) – Connection to the PHAS server

  • task_id (int) – Numerical id of the task

property labelset

Labelset associated with this task (or None if task has no labelset)

slide_manifest(specimen=None, block=None, section=None, slide=None, stain=None, min_paths=None, min_markers=None, min_sroi=None, tags_any=None, tags_all=None, tags_none=None)

A detailed listing of the slides in the task.

Parameters:
  • specimen (str, optional) – Only list slides for the given specimen

  • block (str, optional) – Only list slides for the given block

  • section (str, optional) – Only list slides for the given section

  • slide (str, optional) – Only list slides for the given slide number

  • stain (str, optional) – Only list slides with the given stain

  • min_paths (int, optional) – Only list slides with at least this many annotation paths

  • min_markers (int, optional) – Only list slides with at least this many annotation markers

  • min_sroi (int, optional) – Only list slides with at least this many sampling ROIs

  • tags_any (list of str, optional) – Only list slides that have at least one of the tags in this list

  • tags_all (list of str, optional) – Only list slides that have all the tags in this list

  • tags_none (list of str, optional) – Only list slides that have none of the tags in this list

Returns:

A dict with slide details that can be passed to a pandas DataFrame constructor
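
The semantics of the three tag filters can be illustrated with a local predicate that applies the same rules to a single slide's tag list. This is a sketch for clarity, not the server's implementation: matches_tag_filters is a hypothetical function, and how the server treats an empty filter list is not specified in this reference.

```python
def matches_tag_filters(tags, tags_any=None, tags_all=None, tags_none=None):
    """Apply the tags_any / tags_all / tags_none rules to one tag list."""
    tags = set(tags)
    # tags_any: the slide must carry at least one of the listed tags
    if tags_any is not None and not tags & set(tags_any):
        return False
    # tags_all: the slide must carry every listed tag
    if tags_all is not None and not set(tags_all) <= tags:
        return False
    # tags_none: the slide must carry none of the listed tags
    if tags_none is not None and tags & set(tags_none):
        return False
    return True
```

The filters combine conjunctively in this sketch, so a slide must satisfy every filter that is supplied.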