oracle.oci.oci_data_labeling_service_dataset – Manage a Dataset resource in Oracle Cloud Infrastructure

Note

This plugin is part of the oracle.oci collection (version 4.10.0).

You might already have this collection installed if you are using the ansible package. It is not included in ansible-core. To check whether it is installed, run ansible-galaxy collection list.

To install it, use: ansible-galaxy collection install oracle.oci.

To use it in a playbook, specify: oracle.oci.oci_data_labeling_service_dataset.

New in version 2.9.0 of oracle.oci

Synopsis

  • This module allows the user to create, update and delete a Dataset resource in Oracle Cloud Infrastructure.

  • For state=present, creates a new Dataset.

  • This resource has the following action operations in the oracle.oci.oci_data_labeling_service_dataset_actions module: add_dataset_labels, change_compartment, generate_dataset_records, remove_dataset_labels, rename_dataset_labels, snapshot.

Requirements

The below requirements are needed on the host that executes this module.

Parameters

Parameter Choices/Defaults Comments
annotation_format
string
The annotation format name required for labeling records.
Required for create using state=present.
api_user
string
The OCID of the user on whose behalf OCI APIs are invoked. If not set, then the value of the OCI_USER_ID environment variable, if any, is used. This option is required if the user is not specified through a configuration file (See config_file_location). To get the user's OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
api_user_fingerprint
string
Fingerprint for the key pair being used. If not set, then the value of the OCI_USER_FINGERPRINT environment variable, if any, is used. This option is required if the key fingerprint is not specified through a configuration file (See config_file_location). To get the key pair's fingerprint value, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
api_user_key_file
string
Full path and filename of the private key (in PEM format). If not set, then the value of the OCI_USER_KEY_FILE variable, if any, is used. This option is required if the private key is not specified through a configuration file (See config_file_location). If the key is encrypted with a pass-phrase, the api_user_key_pass_phrase option must also be provided.
api_user_key_pass_phrase
string
Passphrase used by the key referenced in api_user_key_file, if it is encrypted. If not set, then the value of the OCI_USER_KEY_PASS_PHRASE variable, if any, is used. This option is required if the key passphrase is not specified through a configuration file (See config_file_location).
auth_purpose
string
    Choices:
  • service_principal
The auth purpose which can be used in conjunction with 'auth_type=instance_principal'. The default auth_purpose for instance_principal is None.
auth_type
string
    Choices:
  • api_key ←
  • instance_principal
  • instance_obo_user
  • resource_principal
The type of authentication to use for making API requests. By default, auth_type="api_key" based authentication is performed and the API key (see api_user_key_file) in your config file is used. If the auth_type module option is not specified, the value of the OCI_ANSIBLE_AUTH_TYPE environment variable, if any, is used. Use auth_type="instance_principal" for instance principal based authentication when running Ansible playbooks within an OCI compute instance.
cert_bundle
string
The full path to a CA certificate bundle to be used for SSL verification. This will override the default CA certificate bundle. If not set, then the value of the OCI_ANSIBLE_CERT_BUNDLE variable, if any, is used.
compartment_id
string
The OCID of the compartment of the resource.
Required for create using state=present.
Required for update when environment variable OCI_USE_NAME_AS_IDENTIFIER is set.
Required for delete when environment variable OCI_USE_NAME_AS_IDENTIFIER is set.
config_file_location
string
Path to the configuration file. If not set, then the value of the OCI_CONFIG_FILE environment variable, if any, is used. Otherwise, defaults to ~/.oci/config.
config_profile_name
string
The profile to load from the config file referenced by config_file_location. If not set, then the value of the OCI_CONFIG_PROFILE environment variable, if any, is used. Otherwise, defaults to the "DEFAULT" profile in config_file_location.
dataset_format_details
dictionary
Required for create using state=present.
format_type
string / required
    Choices:
  • IMAGE
  • DOCUMENT
  • TEXT
The format type. DOCUMENT format is for record contents that are PDFs or TIFFs. IMAGE format is for record contents that are JPEGs or PNGs. TEXT format is for record contents that are TXT files.
text_file_type_metadata
dictionary
Applicable when format_type is 'TEXT'.
column_delimiter
string
A column delimiter.
column_index
integer / required
The index of a selected column. This is a zero-based index.
column_name
string
The name of a selected column.
escape_character
string
An escape character.
format_type
string / required
    Choices:
  • DELIMITED
It defines the format type of text files.
line_delimiter
string
A line delimiter.
dataset_id
string
Unique Dataset OCID
Required for update using state=present when environment variable OCI_USE_NAME_AS_IDENTIFIER is not set.
Required for delete using state=absent when environment variable OCI_USE_NAME_AS_IDENTIFIER is not set.

aliases: id
dataset_source_details
dictionary
Required for create using state=present.
bucket
string / required
The object storage bucket that contains the dataset data source.
namespace
string / required
The namespace of the bucket that contains the dataset data source.
prefix
string
A common path prefix shared by the objects that make up the dataset. Except for the CSV file type, records are not generated for objects whose names exactly match the prefix.
source_type
string / required
    Choices:
  • OBJECT_STORAGE
The source type. OBJECT_STORAGE allows the user to describe where in object storage the dataset is.
defined_tags
dictionary
The defined tags for this resource. Each key is predefined and scoped to a namespace. For example: `{"foo-namespace": {"bar-key": "value"}}`
This parameter is updatable.
description
string
A user-provided description of the dataset.
This parameter is updatable.
display_name
string
A user-friendly display name for the resource.
Required for create, update, delete when environment variable OCI_USE_NAME_AS_IDENTIFIER is set.
This parameter is updatable when OCI_USE_NAME_AS_IDENTIFIER is not set.

aliases: name
force_create
boolean
    Choices:
  • no ←
  • yes
Whether to attempt non-idempotent creation of a resource. By default, creating a resource is an idempotent operation and does not create the resource if it already exists. Setting this option to true forcefully creates a copy of the resource, even if it already exists. This option is mutually exclusive with key_by.
freeform_tags
dictionary
A simple key-value pair that is applied without any predefined name, type, or scope. It exists for cross-compatibility only. For example: `{"bar-key": "value"}`
This parameter is updatable.
initial_record_generation_configuration
dictionary
limit
float
The maximum number of records to generate.
key_by
list / elements=string
The list of attributes of this resource which should be used to uniquely identify an instance of the resource. By default, all the attributes of a resource are used to uniquely identify a resource.
label_set
dictionary
Required for create using state=present.
items
list / elements=dictionary
An ordered collection of labels that are unique by name.
name
string
A unique name for a label within its dataset.
labeling_instructions
string
The labeling instructions for human labelers, in rich text format.
This parameter is updatable.
region
string
The Oracle Cloud Infrastructure region to use for all OCI API requests. If not set, then the value of the OCI_REGION variable, if any, is used. This option is required if the region is not specified through a configuration file (See config_file_location). Please refer to https://docs.us-phoenix-1.oraclecloud.com/Content/General/Concepts/regions.htm for more information on OCI regions.
state
string
    Choices:
  • present ←
  • absent
The state of the Dataset.
Use state=present to create or update a Dataset.
Use state=absent to delete a Dataset.
tenancy
string
OCID of your tenancy. If not set, then the value of the OCI_TENANCY variable, if any, is used. This option is required if the tenancy OCID is not specified through a configuration file (See config_file_location). To get the tenancy OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
wait
boolean
    Choices:
  • no
  • yes ←
Whether to wait for create or delete operation to complete.
wait_timeout
integer
Time, in seconds, to wait when wait=yes. Defaults to 1200 for most of the services but some services might have a longer wait timeout.
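
As a hedged sketch of how the authentication and wait parameters above combine, the task below deletes a dataset from a playbook that runs on an OCI compute instance, using instance principal authentication instead of an API key. The region, the OCID and the 600-second timeout are placeholder assumptions, not recommended values.

```yaml
- name: Delete dataset using instance principal authentication
  oracle.oci.oci_data_labeling_service_dataset:
    # authentication (placeholder values; adjust for your environment)
    auth_type: instance_principal
    region: us-phoenix-1

    # resource to act on
    dataset_id: "ocid1.dataset.oc1..xxxxxxEXAMPLExxxxxx"
    state: absent

    # block until the delete completes, for at most 600 seconds
    wait: yes
    wait_timeout: 600
```

With auth_type set to instance_principal, the api_user, api_user_fingerprint and api_user_key_file options are left unset, since they describe API-key authentication.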

Examples

- name: Create dataset
  oci_data_labeling_service_dataset:
    # required
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    annotation_format: annotation_format_example
    dataset_source_details:
      # required
      source_type: OBJECT_STORAGE
      namespace: namespace_example
      bucket: bucket_example

      # optional
      prefix: prefix_example
    dataset_format_details:
      # required
      format_type: IMAGE
    label_set:
      # optional
      items:
      - # optional
        name: name_example

    # optional
    initial_record_generation_configuration:
      # optional
      limit: 3.4
    display_name: display_name_example
    description: description_example
    labeling_instructions: labeling_instructions_example
    freeform_tags: {'Department': 'Finance'}
    defined_tags: {'Operations': {'CostCenter': 'US'}}

- name: Update dataset
  oci_data_labeling_service_dataset:
    # required
    dataset_id: "ocid1.dataset.oc1..xxxxxxEXAMPLExxxxxx"

    # optional
    display_name: display_name_example
    description: description_example
    labeling_instructions: labeling_instructions_example
    freeform_tags: {'Department': 'Finance'}
    defined_tags: {'Operations': {'CostCenter': 'US'}}

- name: Update dataset using name (when environment variable OCI_USE_NAME_AS_IDENTIFIER is set)
  oci_data_labeling_service_dataset:
    # required
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    display_name: display_name_example

    # optional
    description: description_example
    labeling_instructions: labeling_instructions_example
    freeform_tags: {'Department': 'Finance'}
    defined_tags: {'Operations': {'CostCenter': 'US'}}

- name: Delete dataset
  oci_data_labeling_service_dataset:
    # required
    dataset_id: "ocid1.dataset.oc1..xxxxxxEXAMPLExxxxxx"
    state: absent

- name: Delete dataset using name (when environment variable OCI_USE_NAME_AS_IDENTIFIER is set)
  oci_data_labeling_service_dataset:
    # required
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    display_name: display_name_example
    state: absent
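
The examples above use the IMAGE format only. A minimal sketch of a TEXT dataset, which additionally needs text_file_type_metadata, might look like the following. The delimiter values and the label names (positive, negative) are illustrative assumptions, and annotation_format_example is a placeholder to replace with an annotation format that the service supports for text records.

```yaml
- name: Create dataset from delimited text files
  oracle.oci.oci_data_labeling_service_dataset:
    # required
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    annotation_format: annotation_format_example
    dataset_source_details:
      source_type: OBJECT_STORAGE
      namespace: namespace_example
      bucket: bucket_example
    dataset_format_details:
      format_type: TEXT
      text_file_type_metadata:
        format_type: DELIMITED
        column_index: 0          # required; zero-based index of the selected column
        column_delimiter: ","    # assumed comma-separated input
        line_delimiter: "\n"     # assumed newline-terminated rows
    label_set:
      items:
        - name: positive         # hypothetical label
        - name: negative         # hypothetical label
```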

Return Values

Common return values are documented here; the following fields are unique to this module:

Key Returned Description
dataset
complex
on success
Details of the Dataset resource acted upon by the current operation

Sample:
{'annotation_format': 'annotation_format_example', 'compartment_id': 'ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx', 'dataset_format_details': {'format_type': 'DOCUMENT', 'text_file_type_metadata': {'column_delimiter': 'column_delimiter_example', 'column_index': 56, 'column_name': 'column_name_example', 'escape_character': 'escape_character_example', 'format_type': 'DELIMITED', 'line_delimiter': 'line_delimiter_example'}}, 'dataset_source_details': {'bucket': 'bucket_example', 'namespace': 'namespace_example', 'prefix': 'prefix_example', 'source_type': 'OBJECT_STORAGE'}, 'defined_tags': {'Operations': {'CostCenter': 'US'}}, 'description': 'description_example', 'display_name': 'display_name_example', 'freeform_tags': {'Department': 'Finance'}, 'id': 'ocid1.resource.oc1..xxxxxxEXAMPLExxxxxx', 'initial_record_generation_configuration': {'limit': 10}, 'label_set': {'items': [{'name': 'name_example'}]}, 'labeling_instructions': 'labeling_instructions_example', 'lifecycle_details': 'lifecycle_details_example', 'lifecycle_state': 'CREATING', 'system_tags': {}, 'time_created': '2013-10-20T19:20:30+01:00', 'time_updated': '2013-10-20T19:20:30+01:00'}
 
annotation_format
string
on success
The annotation format name required for labeling records.

Sample:
annotation_format_example
 
compartment_id
string
on success
The OCID of the compartment of the resource.

Sample:
ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx
 
dataset_format_details
complex
on success

   
format_type
string
on success
The format type. DOCUMENT format is for record contents that are PDFs or TIFFs. IMAGE format is for record contents that are JPEGs or PNGs. TEXT format is for record contents that are TXT files.

Sample:
DOCUMENT
   
text_file_type_metadata
complex
on success

     
column_delimiter
string
on success
A column delimiter.

Sample:
column_delimiter_example
     
column_index
integer
on success
The index of a selected column. This is a zero-based index.

Sample:
56
     
column_name
string
on success
The name of a selected column.

Sample:
column_name_example
     
escape_character
string
on success
An escape character.

Sample:
escape_character_example
     
format_type
string
on success
It defines the format type of text files.

Sample:
DELIMITED
     
line_delimiter
string
on success
A line delimiter.

Sample:
line_delimiter_example
 
dataset_source_details
complex
on success

   
bucket
string
on success
The object storage bucket that contains the dataset data source.

Sample:
bucket_example
   
namespace
string
on success
The namespace of the bucket that contains the dataset data source.

Sample:
namespace_example
   
prefix
string
on success
A common path prefix shared by the objects that make up the dataset. Except for the CSV file type, records are not generated for objects whose names exactly match the prefix.

Sample:
prefix_example
   
source_type
string
on success
The source type. OBJECT_STORAGE allows the user to describe where in object storage the dataset is.

Sample:
OBJECT_STORAGE
 
defined_tags
dictionary
on success
The defined tags for this resource. Each key is predefined and scoped to a namespace. For example: `{"foo-namespace": {"bar-key": "value"}}`

Sample:
{'Operations': {'CostCenter': 'US'}}
 
description
string
on success
A user-provided description of the dataset.

Sample:
description_example
 
display_name
string
on success
A user-friendly display name for the resource.

Sample:
display_name_example
 
freeform_tags
dictionary
on success
A simple key-value pair that is applied without any predefined name, type, or scope. It exists for cross-compatibility only. For example: `{"bar-key": "value"}`

Sample:
{'Department': 'Finance'}
 
id
string
on success
The OCID of the Dataset.

Sample:
ocid1.resource.oc1..xxxxxxEXAMPLExxxxxx
 
initial_record_generation_configuration
complex
on success

   
limit
float
on success
The maximum number of records to generate.

Sample:
10
 
label_set
complex
on success

   
items
complex
on success
An ordered collection of labels that are unique by name.

     
name
string
on success
A unique name for a label within its dataset.

Sample:
name_example
 
labeling_instructions
string
on success
The labeling instructions for human labelers, in rich text format.

Sample:
labeling_instructions_example
 
lifecycle_details
string
on success
A message describing the current state in more detail. For example, it can be used to provide actionable information for a resource in FAILED or NEEDS_ATTENTION state.

Sample:
lifecycle_details_example
 
lifecycle_state
string
on success
The state of a dataset.
  • CREATING - The dataset is being created. It will transition to ACTIVE when it is ready for labeling.
  • ACTIVE - The dataset is ready for labeling.
  • UPDATING - The dataset is being updated. It and its related resources may be unavailable for other updates until it returns to ACTIVE.
  • NEEDS_ATTENTION - A dataset update operation has failed due to validation or other errors and needs attention.
  • DELETING - The dataset and its related resources are being deleted.
  • DELETED - The dataset has been deleted and is no longer available.
  • FAILED - The dataset has failed due to validation or other errors.

Sample:
CREATING
 
system_tags
dictionary
on success
The usage of system tag keys. These predefined keys are scoped to namespaces. For example: `{"orcl-cloud": {"free-tier-retained": "true"}}`

 
time_created
string
on success
The date and time the resource was created, in the timestamp format defined by RFC3339.

Sample:
2013-10-20T19:20:30+01:00
 
time_updated
string
on success
The date and time the resource was last updated, in the timestamp format defined by RFC3339.

Sample:
2013-10-20T19:20:30+01:00
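
To illustrate how the return values listed above can be consumed, the sketch below registers the module result and prints two of the documented fields. The variable name result is a hypothetical choice, and the create parameters are the same placeholders used in the Examples section.

```yaml
- name: Create dataset and capture the result
  oracle.oci.oci_data_labeling_service_dataset:
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    annotation_format: annotation_format_example
    dataset_source_details:
      source_type: OBJECT_STORAGE
      namespace: namespace_example
      bucket: bucket_example
    dataset_format_details:
      format_type: IMAGE
    label_set:
      items:
        - name: name_example
  register: result   # hypothetical variable name

- name: Show the dataset OCID and lifecycle state
  ansible.builtin.debug:
    msg: "Dataset {{ result.dataset.id }} is {{ result.dataset.lifecycle_state }}"
```

The dataset key is returned on success, so later tasks that read result.dataset should run only after the create task has succeeded.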


Authors

  • Oracle (@oracle)