SDK Reference

Log parameter

Logs an individual input parameter when it is called. Logged parameters are accessible programmatically or through the GUI as soon as this function is called within your job.

Python

foundations.log_param(key, value)

Arguments

  • key (str): The name of the input parameter.

  • value (number, str, bool, array of [number|str|bool], array of array of [number|str|bool]): the value associated with the given input parameter.

Returns

  • This function doesn't return a value.

Raises

  • TypeError: When a value of a non-supported type is provided as the parameter value.

Note

Multiple calls with the same key during the same job will overwrite the previously logged value.

Example

import foundations
foundations.log_param("learning rate", 0.001)
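
The supported value types listed above can be checked before logging. The following is a minimal sketch, assuming the type list in the Arguments section is exhaustive; is_supported_value is a hypothetical helper and not part of the Foundations SDK.

```python
# Hypothetical helper, not part of the Foundations SDK.
def is_supported_value(value, depth=0):
    """Return True for a number, str, or bool, or for a list of such
    values nested at most two levels deep."""
    if isinstance(value, (int, float, str, bool)):
        return True
    # Lists are allowed at the top level and one level below it.
    if isinstance(value, list) and depth < 2:
        return all(is_supported_value(item, depth + 1) for item in value)
    return False
```

Calling this check first lets a job fail early with a clear message instead of hitting the TypeError inside the logging call.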

Log parameter dictionary

Similar to log_param, but accepts a dictionary of key-value pairs.

Python

foundations.log_params({})

Arguments

  • dict: Dictionary of parameters to log. Each key-value pair must satisfy the same constraints as the arguments of log_param.

Raises

  • TypeError: When a value of a non-supported type is provided as a parameter value.

Returns

  • This function doesn't return a value.

Example

import foundations
foundations.log_params({"learning_rate": 0.001,
                        "batch_size": 32,
                        "epochs": 75})

Log metric

Logs a metric when it is called. Logged metrics are accessible programmatically or through the GUI as soon as this function is called within your job. For example, calling it at the end of every epoch gives you live, updated metrics.

Note

Currently logging numpy types is not supported.

foundations.log_metric(key, value)

Arguments

  • key (str): the name of the output metric.

  • value (number, str, bool, array of [number|str|bool], array of array of [number|str|bool]): the value associated with the given output metric.

Returns

  • This function doesn't return a value.

Raises

  • TypeError: When a value of a non-supported type is provided as the metric value.

Note

Multiple calls with the same key during the same job will create and append to a list containing the previously logged values.
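
The difference between the overwrite behaviour of log_param and the append behaviour of log_metric can be illustrated with a toy in-memory model. This is only a sketch of the semantics described in the notes; the names toy_log_param and toy_log_metric are hypothetical, and the real SDK stores values in its own backend.

```python
# Toy in-memory model; not the Foundations implementation.
params = {}
metrics = {}

def toy_log_param(key, value):
    params[key] = value  # same key: the last value wins

def toy_log_metric(key, value):
    metrics.setdefault(key, []).append(value)  # same key: values accumulate

toy_log_param("learning_rate", 0.01)
toy_log_param("learning_rate", 0.001)
toy_log_metric("accuracy", 0.90)
toy_log_metric("accuracy", 0.93)

print(params)   # {'learning_rate': 0.001}
print(metrics)  # {'accuracy': [0.9, 0.93]}
```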

Example

import foundations
foundations.log_metric("accuracy", 0.90)
foundations.log_metric("accuracy", 0.93)
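
Because numpy types are not supported, values produced by numpy usually need to be converted to built-in Python types before they are logged. A minimal sketch follows; to_native is a hypothetical helper, not part of the Foundations SDK, and it relies on the tolist() method that numpy scalars and arrays provide.

```python
# Hypothetical helper, not part of the Foundations SDK.
def to_native(value):
    """Convert numpy scalars and arrays to built-in Python types."""
    # numpy scalars (np.float32, np.int64, ...) and ndarrays both expose
    # tolist(), which returns plain Python numbers or nested lists.
    if hasattr(value, "tolist"):
        return value.tolist()
    return value

# Intended usage inside a job (assumes numpy is imported as np):
# foundations.log_metric("accuracy", to_native(np.float32(0.93)))
```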

Set tag

Sets a tag when it is called. Tags are accessible programmatically or through the GUI as soon as this line runs within your job. Job tags can also be modified within the GUI.

foundations.set_tag(key)

Arguments

  • key ([number|str]): the name of the tag, displayed on the GUI

Returns

  • This function doesn't return a value.

Raises

  • TypeError: When a value of a non-supported type is provided as the tag value.

Example

import foundations
foundations.set_tag("CNN")

Save artifact

Logs an artifact to a job when called. Artifacts can be images, audio clips, text files, or serialized Python objects. The artifact must be saved to disk first.

foundations.save_artifact(filepath, key)

Arguments

  • filepath ([str]): path of the artifact saved to disk that needs to be logged

  • key ([number|str]): friendly name associated with the artifact

Returns

  • This function doesn't return a value.

Notes

Artifacts must be saved to disk before logging.

Example

import foundations
foundations.save_artifact("train_val_loss.png", "Loss_Curve")
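
The save-then-log order can be wrapped in a small helper. The sketch below is an assumption-laden illustration: save_json_artifact is a hypothetical function, and its save_artifact argument is expected to be a callable with the (filepath, key) signature shown above, such as foundations.save_artifact.

```python
import json

# Hypothetical helper, not part of the Foundations SDK.
def save_json_artifact(save_artifact, data, filepath, key):
    """Write data to disk as JSON, then log the resulting file."""
    with open(filepath, "w") as f:
        json.dump(data, f)
    # The file now exists on disk, so it can be logged as an artifact.
    save_artifact(filepath, key)
    return filepath

# Intended usage inside a job:
# save_json_artifact(foundations.save_artifact,
#                    {"loss": [0.5, 0.4]}, "history.json", "Training_History")
```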

Job submission

Submits a job to the Atlas Scheduler.

Arguments

  • scheduler_config ([str]): Name of the scheduler. Should always be "scheduler" for Atlas.

  • job_directory ([str]): Optional. Path to the job directory. Defaults to the current working directory.

  • project_name ([str]): Optional. The project name. Defaults to the current working directory. Takes precedence over the value in job.config.yaml.

  • entrypoint ([str]): Optional argument to override the Docker entrypoint of the worker container

  • command ([list of str]): List of commands to pass to worker. Typically ['main.py', 'arg1', 'arg2']

  • num_gpus ([int]): Default 0. Sets whether to run the worker with GPU support. Any positive number will mount all available GPU devices inside the worker.

  • params ([dict]): Optional. Allows you to specify parameters for a job. This must be a JSON-serializable dictionary whose values are supported by foundations.load_parameters(). When load_parameters() is called within the job, this params argument is returned to that job process. See the load_parameters() docs for loading parameters.

  • stream_job_logs ([bool]): Default True. Optional argument to specify whether logs should be streamed to the console.

Returns

  • deployment (Object) -- A deployment object which can be used to interact with the job

Notes

The project's requirements.txt will not be installed automatically if the worker entrypoint is overridden using submit. See the Custom workers docs for more details.

Example

import foundations
foundations.submit(scheduler_config="scheduler",
                    command= ["main.py", "myarg1", "myarg2"],
                    num_gpus=1,
                    stream_job_logs=False)
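
When submitting several jobs with different params, it can help to build the parameter dictionaries up front and confirm that each one is JSON serializable. The following is a sketch only; param_grid is a hypothetical helper, not part of the Foundations SDK, and the search space values are made up for illustration.

```python
import itertools
import json

# Hypothetical helper, not part of the Foundations SDK.
def param_grid(search_space):
    """Yield one parameter dictionary per combination in search_space."""
    keys = sorted(search_space)
    for combo in itertools.product(*(search_space[k] for k in keys)):
        params = dict(zip(keys, combo))
        json.dumps(params)  # raises TypeError if a value is not JSON serializable
        yield params

grid = list(param_grid({"learning_rate": [0.01, 0.001], "batch_size": [32, 64]}))
print(len(grid))  # 4

# Intended usage (arguments taken from the list above):
# for params in grid:
#     foundations.submit(scheduler_config="scheduler",
#                        command=["main.py"], params=params)
```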

BETA: Deployment Object

The object returned by job_deployment = foundations.submit(...) contains information about the job that it just launched. In its current form, there are 3 supported functions.

# Get back a specific parameter for the job
job_deployment.get_param(param_name: str) -> str
# Get back a specific metric for the job
job_deployment.get_metric(metric_name: str) -> str
# Get back a dictionary that contains the information stored in the job's row on the GUI
job_deployment.get_job_details() -> dict

Note

All of the calls are blocking. This means that if you call one on a job that is not finished, the function call will wait until the job finishes.
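
One way to use the blocking behaviour is to submit several jobs and then pick the best one by a metric. The sketch below assumes each job logs the metric exactly once as a numeric value; best_deployment is a hypothetical helper that relies only on the get_metric() method listed above.

```python
# Hypothetical helper, not part of the Foundations SDK.
def best_deployment(deployments, metric_name):
    """Return the deployment with the highest value for metric_name."""
    # get_metric() blocks until its job finishes, so this waits for every
    # job in turn. float() is applied because the signature above is
    # annotated as returning str.
    return max(deployments, key=lambda d: float(d.get_metric(metric_name)))

# Intended usage:
# deployments = [foundations.submit(...), foundations.submit(...)]
# winner = best_deployment(deployments, "accuracy")
```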

Warning

For hyperparameter search, we normally recommend setting the FOUNDATIONS_COMMAND_LINE environment variable to True to make sure that the search script does not run as a job. However, for the job deployment object to work it needs this environment variable to be either set to False or not set at all.

This means that your search script will show up as a job in the GUI. This "job" will run for as long as the search script does and will act strangely within the GUI (e.g. no logs will appear).

We are aware of this annoyance and have a fix in the works!

Get project metrics

Retrieves metadata, hyperparameters, metrics, and tags for all jobs associated with a project.

foundations.get_metrics_for_all_jobs(project_name)

Arguments

  • project_name ([str]): Name of the project to filter by

Returns

  • metrics (DataFrame) -- A Pandas DataFrame containing all of the results

Raises

  • ValueError -- An exception indicating that the requested project does not exist

Example

import foundations
foundations.get_metrics_for_all_jobs("my_project")

Load parameters

Loads job parameters as a dictionary from a file called foundations_job_parameters.json, which must exist in the root of the project. By default, this also logs all loaded parameters in the GUI.

foundations.load_parameters(log_parameters=True)

Arguments

  • log_parameters (bool): Default True. Optional way to specify whether or not to log all parameter values in the GUI and SDK for the job.

Returns

  • parameters (dict): A dictionary of all the user-defined parameters for the model, from foundations_job_parameters.json.

Raises

  • FileNotFoundError: When the foundations_job_parameters.json file is not found in the deployment directory.

Example

Sample foundations_job_parameters.json:

{
    "learning_rate": 0.125,
    "layers": [
        {
            "neurons": 5
        },
        {
            "neurons": 6
        }
    ]
}

import foundations
params = foundations.load_parameters()
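
To show what the returned dictionary looks like for the sample file, the sketch below parses the same JSON with the standard library and reads nested values from it. It assumes load_parameters() returns the parsed file contents unchanged; the variable names are illustrative.

```python
import json

# The sample file contents from above, parsed the way a JSON loader would.
sample = """
{
    "learning_rate": 0.125,
    "layers": [{"neurons": 5}, {"neurons": 6}]
}
"""
loaded = json.loads(sample)

learning_rate = loaded["learning_rate"]
layer_sizes = [layer["neurons"] for layer in loaded["layers"]]
print(learning_rate, layer_sizes)  # 0.125 [5, 6]
```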

Syncable directories

Foundations offers an interface to sync a directory within a job to a centralized location outside of that job. This directory can then be synced from a different job, allowing you to grab information from past jobs to see what has happened before or to build on the shoulders of giants (with the giants being your own previous work).

This feature is useful for advanced model search algorithms, especially when paired with jobs that launch other jobs. Synced directories can be used to quickly implement genetic search algorithms or Bayesian optimization.

foundations.artifacts.create_syncable_directory(key, directory_path=None, source_job_id=None)

Arguments

  • key (str): What your directory is called in the centralized location.

  • directory_path (str): Default None. The path to the directory within your job's environment.

  • source_job_id (str): Default None. The ID of a previous job that has a directory by the same name as the value given to "key". If this is not specified, the current job ID is used.

Returns

  • syncable_directory (SyncableDirectory):

Examples

The following example shows how you can create and write to a syncable directory from within a job, and then read and write to the same directory from following jobs.

import foundations
import pandas as pd

df = pd.DataFrame([[1, 2, 3]])

directory = foundations.create_syncable_directory("directory_key", "sync/path")
df.to_csv("sync/path/hello.csv")
directory.upload()

If the job gives back the job ID 42, you can use this to read the saved files from any following job.

import foundations
import pandas as pd

directory = foundations.create_syncable_directory("directory_key", "sync/path", "42")
df = pd.read_csv("sync/path/hello.csv")

If you want to write back to the same directory, do so the same way that you did in the first job.

import foundations
import pandas as pd

directory = foundations.create_syncable_directory("directory_key", "sync/path", "42")
df = pd.read_csv("sync/path/hello.csv")

new_df = df + 3
new_df.to_csv("sync/path/hello.csv")
directory.upload()

NOTE: To access the directory that a job uploaded to, in the state that you expect, always use that job's ID. For example, if you have 5 jobs that all read and write to a syncable directory with the same key, each job should use the previous job's ID.
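
The reasoning behind this note can be illustrated with a toy in-memory model. This is not the Foundations implementation; it assumes that each upload is stored separately under the key and the uploading job's ID, and the upload and download functions here are hypothetical stand-ins.

```python
# Toy in-memory model of the centralized location; not the Foundations
# implementation. Each upload is stored under (key, job_id).
storage = {}

def upload(key, job_id, files):
    storage[(key, job_id)] = dict(files)

def download(key, source_job_id):
    return dict(storage[(key, source_job_id)])

# Job 1 creates the directory contents and uploads them.
upload("directory_key", "1", {"hello.csv": 3})

# Each later job starts from the previous job's upload and adds 3.
for job_id in ["2", "3"]:
    previous_job_id = str(int(job_id) - 1)
    files = download("directory_key", previous_job_id)
    files["hello.csv"] += 3
    upload("directory_key", job_id, files)

print(download("directory_key", "3"))  # {'hello.csv': 9}
print(download("directory_key", "1"))  # {'hello.csv': 3}
```

Under this model, reading with an older job ID returns the older state, which is why a chain of jobs should pass along the ID of the job that ran immediately before.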

Syncing Tensorboard log directory

An extra special form of a syncable directory provides the ability to sync a regular Tensorboard logdir to a centralized storage location. Doing this not only allows you to retrieve files later while tying them to a specific job, but also automatically adds a tag to the job for you. Any job that has this tag can be sent to a Tensorboard server directly from the GUI.

foundations.set_tensorboard_logdir(path)

Arguments

  • path (str): The path to your Tensorboard logdir within the job's environment.