Sync

Authors
Affiliations
The Eric and Wendy Schmidt Center for Data Science & Environment
University of California, Berkeley

The output parameter accepts a dict with a uri (or url) field that enables syncing the local output file to a remote location (S3, GCS, etc.). Syncing can be triggered in two ways:

  • Sync button — a button in the widget UI that uploads on click

  • ba.sync() — programmatic upload from the notebook

Under the hood, the sync methods rely on jupyter_bioacoustic.audio.io.write(), using the app configuration to set the source and destination locations.

Author: Brookie Guzder-Williams (bguzder-williams@berkeley.edu)
Affiliation: The Eric and Wendy Schmidt Center for Data Science & Environment
Website: https://dse.berkeley.edu/
import pandas as pd
from jupyter_bioacoustic import BioacousticAnnotator
from jupyter_bioacoustic.audio import io

DESCRIBE = False

1. Sync Button

When a uri is configured, the widget adds a sync button to the bottom-right of the form panel. Clicking it uploads the current output file, overwriting the remote copy. The button is disabled during the upload and re-enabled when it completes.

The config below uses sync_button: 'Sync to S3' for a custom label. Set sync_button: false to hide the button while still allowing programmatic sync via ba.sync().

ba = BioacousticAnnotator('annotator_config/projects/sync_example.yaml')
if DESCRIBE: ba.describe()
ba.open()
ba.output()
io.read(ba.sync_uri, 'tmp.csv')
pd.read_csv('tmp.csv')

2. Programmatic Sync

ba.sync() uploads the output file to the configured uri. This is useful for scripted workflows, scheduled uploads, or syncing after a batch of annotations.

# sync to the configured uri (here s3://dse-soundhub/dev/annotations/sync-example.csv)
ba.sync()
's3://dse-soundhub/dev/annotations/sync-example.csv'
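
For the "sync after a batch of annotations" case, one possible pattern is a small counter that triggers an upload every N saved annotations. The sketch below is only an illustration: BatchSyncer, record() and flush() are hypothetical names that are not part of jupyter_bioacoustic, and the only thing assumed about the sync callable is that it takes no arguments and returns the destination URI, as ba.sync() does above.

```python
from typing import Callable, Optional


class BatchSyncer:
    """Call a sync function once every `every` recorded annotations.

    Hypothetical helper, not part of jupyter_bioacoustic.
    """

    def __init__(self, sync: Callable[[], str], every: int = 10) -> None:
        if every < 1:
            raise ValueError("every must be >= 1")
        self._sync = sync      # any zero-argument callable, e.g. ba.sync
        self._every = every
        self._pending = 0      # annotations recorded since the last upload
        self.last_dest: Optional[str] = None

    def record(self) -> bool:
        """Register one saved annotation; return True if an upload was triggered."""
        self._pending += 1
        if self._pending >= self._every:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Upload now if anything is pending (useful at the end of a session)."""
        if self._pending:
            self.last_dest = self._sync()
            self._pending = 0
```

In a notebook this might be wired up as BatchSyncer(ba.sync, every=25), calling record() after each saved annotation and flush() before closing the session. How annotation-saved events are detected is left out on purpose, since the source does not describe a hook for that.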

You can override the destination or pass additional auth kwargs:

# override destination
ba.sync(dest='s3://dse-soundhub/dev/annotations/sync-example-2.csv')
's3://dse-soundhub/dev/annotations/sync-example-2.csv'
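
Because a sync overwrites the remote copy, overriding the destination is one way to keep dated snapshots. The helper below is a minimal sketch of building such a destination from the configured URI. timestamped_dest is a hypothetical name, and the timestamp format is an arbitrary choice rather than anything the library prescribes.

```python
import posixpath
from datetime import datetime, timezone
from typing import Optional


def timestamped_dest(uri: str, now: Optional[datetime] = None) -> str:
    """Insert a UTC timestamp before the file extension of a URI.

    Hypothetical helper, not part of jupyter_bioacoustic.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # splitext only inspects the last path component, so dots in the
    # bucket name or in parent folders are left alone.
    root, ext = posixpath.splitext(uri)
    return f"{root}-{stamp}{ext}"
```

The result could then be passed as ba.sync(dest=timestamped_dest(ba.sync_uri)), leaving the file at the configured uri untouched.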

ADVANCED USAGE

Pass keyword arguments to override authorization settings or other fields:

ba.sync(profile='prod', region_name='us-east-1')

For more control, use io.write() directly; ba.sync() is a convenience wrapper around it:

from jupyter_bioacoustic.audio import io
help(io.write)
Help on function write in module jupyter_bioacoustic.audio.io:

write(
    src: str,
    dest: str,
    recursive: bool = False,
    overwrite: bool = True,
    **kwargs: Any
) -> str
    Write a file or directory to any destination.

    Returns:
        Destination path/URI (str).

io.write(
    'outputs/sync-example.csv',
    's3://dse-soundhub/dev/annotations/sync-example-3.csv',
)
's3://dse-soundhub/dev/annotations/sync-example-3.csv'
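
To picture how ba.sync() relates to io.write(), here is a rough sketch of what a wrapper of this kind could look like. It is not the library's actual implementation. sync_sketch, local_path and configured_uri are hypothetical names, and the write argument stands in for io.write so the example runs without any remote access.

```python
from typing import Any, Callable, Optional


def sync_sketch(
    write: Callable[..., str],
    local_path: str,
    configured_uri: str,
    dest: Optional[str] = None,
    **kwargs: Any,
) -> str:
    """Upload local_path to dest (or the configured URI) and return the destination.

    Illustrative only; the real ba.sync() may differ.
    """
    # An explicit dest takes priority over the URI from the configuration.
    target = dest or configured_uri
    # Extra keyword arguments (e.g. profile, region_name) pass through unchanged.
    return write(local_path, target, **kwargs)
```

The point of the sketch is that the wrapper only chooses source and destination from the configuration, so anything io.write() accepts as a keyword argument can in principle be forwarded through it.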
io.read(ba.sync_uri, 'tmp.csv')
pd.read_csv('tmp.csv')
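
After reading the remote file back, it can be useful to confirm that it matches the local output byte for byte rather than comparing the tables by eye. The following is a minimal standard-library sketch of that check. sha256_of and same_content are hypothetical helper names, not part of jupyter_bioacoustic.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True if both files have identical bytes (compared via SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

With the files used above, the check would be same_content('outputs/sync-example.csv', 'tmp.csv'), assuming the local output has not changed since the last sync.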