Changelog

Last Updated: September 2022

One AI · Sep 30, 2022

September 2022

Transcription Language Skill

  • The Transcription Skill converts audio to text, which can then be analyzed and transformed by other Language Skills.
  • Labels extracted from transcriptions hold timestamps of the labeled words in the audio input.
  • Automated diarization: different speakers are identified, and exchanges are separated into different sections.
  • The pipeline API supports .wav & .mp3 file inputs. The Transcription Skill is required to process audio files.
  • Use the async endpoint (details below) to process large audio files asynchronously.

```
const pipeline = new oneai.Pipeline(
  oneai.skills.transcribe(),
  oneai.skills.emotions(),
  oneai.skills.summarize(),
);

const output = await pipeline.runFile('my-conversation.mp3');
console.log(output.transcription.text);         // transcribed text
console.log(output.transcription.emotions);     // emotions that appear in the conversation
console.log(output.transcription.summary.text); // summary of the conversation
```

New & Improved Language Skills

Headline Generation

  • Generate an appropriate headline for your input text.

Subheadings Generation

  • Generate a subheading for your input text.

Summarize

  • The Summary Skill now generates `origin` labels, mapping words in the summary to their position in the source input (string indices for text inputs, timestamps for audio inputs).
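As a rough illustration of how `origin` labels could be consumed for text inputs, the sketch below maps each summary fragment back to the source fragment it came from. The label shape used here (`type`, `span`, `originSpan`, each span a `[start, end)` pair of string indices) and the helper `resolveOrigins` are assumptions made for this example, not the documented output format.

```javascript
// Hypothetical label shape, assumed for illustration only:
//   { type: 'origin', span: [start, end], originSpan: [start, end] }
// `span` indexes into the summary text, `originSpan` into the source text.
function resolveOrigins(sourceText, summaryText, labels) {
  return labels
    .filter((label) => label.type === 'origin')
    .map((label) => ({
      summaryWords: summaryText.slice(label.span[0], label.span[1]),
      sourceWords: sourceText.slice(label.originSpan[0], label.originSpan[1]),
    }));
}

// Hand-written sample data, not real Skill output.
const source = 'The meeting was moved to Friday because the venue was booked.';
const summary = 'Meeting moved to Friday.';
const labels = [
  { type: 'origin', span: [0, 7], originSpan: [4, 11] },
  { type: 'origin', span: [17, 23], originSpan: [25, 31] },
];

console.log(resolveOrigins(source, summary, labels));
```

For audio inputs the same idea would apply with timestamps in place of string indices.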

Split-by-Topic

  • The Skill is now configurable, providing control over how many sections the input is split into.
  • Combine with the Headline / Subheadings Skills to generate a headline per section.

Numbers

  • Numbers and quantities that appear in text form are interpreted & provided as numeric data
  • Dates, times & time-durations in various formats are provided in a standardized date string
  • Added a `datetime_baseline` parameter that sets the "current time" used as a baseline when interpreting dates & times
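To make the idea of a baseline concrete, here is a small conceptual sketch of resolving a relative expression against a chosen reference time. It does not call the Numbers Skill and is not its implementation; the helper `resolveRelativeDay` and the three supported expressions are hypothetical, chosen only to show why the same text can yield different dates under different baselines.

```javascript
// Conceptual illustration only; not the Numbers Skill's implementation.
const DAY_OFFSETS = new Map([
  ['yesterday', -1],
  ['today', 0],
  ['tomorrow', 1],
]);

// Resolves a relative day expression against `baseline` (a Date),
// returning a standardized YYYY-MM-DD string in UTC.
function resolveRelativeDay(expression, baseline) {
  const offset = DAY_OFFSETS.get(expression.toLowerCase());
  if (offset === undefined) {
    throw new Error(`Unsupported expression: ${expression}`);
  }
  const resolved = new Date(baseline.getTime());
  resolved.setUTCDate(resolved.getUTCDate() + offset);
  return resolved.toISOString().slice(0, 10);
}

const baseline = new Date('2022-09-30T12:00:00Z');
console.log(resolveRelativeDay('tomorrow', baseline)); // 2022-10-01
```

With a different baseline, the same word "tomorrow" resolves to a different date, which is the behavior a baseline parameter is meant to control.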

Names

  • The Names Skill detects when names that appear in text reference real-world entities (people, companies, locations, etc.) based on context.
  • `name` labels are enriched with entity data when known entities are recognized.

Language-Detection

  • Automatically identifies the dominant language used in the input, which helps with further processing and translation.

Analytics API (beta)

  • The analytics engine makes large amounts of text data digestible, by clustering together texts with similar meaning and accumulating metadata generated by Language Skills (such as Sentiment & Topics).
  • The API accepts text items and organizes them in hierarchical clusters by meaning.
  • Generated clusters can be fetched from the API or reviewed in the Analytics section of the Studio (the UI is open-source).
  • Cluster collections can be queried with specific text items, fetching the cluster with the most similar meaning. This makes the Analytics API an effective tool for intent-based classification and search.
  • Text items can be enriched with metadata generated by Language Skills. Clusters then aggregate item metadata to derive insights at scale.
  • Quickstart guide: docs.oneai.com/docs/quick-start-analytics

Figure: Analytics API TreeMap UI, showing reviews of Amazon Echo grouped by subject with aggregate Sentiment data.
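Querying a collection for "the cluster with the most similar meaning" can be pictured as a nearest-neighbor lookup. The sketch below is a toy model under stated assumptions: it represents each cluster by a hand-written centroid vector and compares by cosine similarity. The source does not say how the analytics engine measures similarity, so the vectors, the `centroid` field, and the function names are all hypothetical.

```javascript
// Toy model only: the actual engine's similarity measure is not specified.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns { cluster, score } for the best match, or null if `clusters` is empty.
function findNearestCluster(queryVector, clusters) {
  let best = null;
  for (const cluster of clusters) {
    const score = cosineSimilarity(queryVector, cluster.centroid);
    if (best === null || score > best.score) {
      best = { cluster, score };
    }
  }
  return best;
}

// Hand-written stand-ins for meaning vectors.
const clusters = [
  { name: 'sound quality', centroid: [0.9, 0.1, 0.0] },
  { name: 'setup & connectivity', centroid: [0.1, 0.9, 0.1] },
  { name: 'price', centroid: [0.0, 0.2, 0.9] },
];

const match = findNearestCluster([0.2, 0.8, 0.0], clusters);
console.log(match.cluster.name); // setup & connectivity
```

Used this way, a query item is classified by the cluster it lands in, which is the intent-classification use case described above.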

Async API

  • The async API endpoints introduce asynchronous processing of large text inputs and audio files.
  • The upload request of the input and the response with the output are split into separate endpoints, so outputs can be retrieved without waiting for the entire processing time.
  • Use the `/async` endpoint to process raw text.
  • Use the `/async/files` endpoint to process binary encoded files.
  • Successful async requests return the `task_id` that was assigned to the input.
  • Task status and outputs can be fetched by polling the `/async/tasks` endpoint.
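
The submit-then-poll flow described above might look like the following sketch. Only the `task_id` parameter and the `/async/tasks` endpoint come from this changelog; the status strings (`RUNNING`, `COMPLETED`, `FAILED`), the `{ status, result }` response shape, and the helpers `pollTask` and `createFakeApi` are assumptions. The HTTP call is injected as `fetchTask`, so the example runs offline against an in-memory stand-in.

```javascript
// Assumed task statuses for illustration; the API's actual values may differ.
const DONE = 'COMPLETED';
const FAILED = 'FAILED';

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `fetchTask(taskId)` is supplied by the caller. In real use it would request
// the `/async/tasks` endpoint and return an object shaped { status, result }.
async function pollTask(taskId, fetchTask, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const task = await fetchTask(taskId);
    if (task.status === DONE) return task.result;
    if (task.status === FAILED) throw new Error(`Task ${taskId} failed`);
    await sleep(intervalMs);
  }
  throw new Error(`Task ${taskId} did not finish after ${maxAttempts} attempts`);
}

// In-memory stand-in for the API: reports RUNNING until the Nth poll.
function createFakeApi(pollsUntilDone) {
  let calls = 0;
  return async (taskId) => {
    calls += 1;
    return calls < pollsUntilDone
      ? { status: 'RUNNING' }
      : { status: DONE, result: { task_id: taskId, text: 'processed output' } };
  };
}

pollTask('demo-task', createFakeApi(3), { intervalMs: 10 })
  .then((output) => console.log(output.text)); // processed output
```

Capping the number of attempts keeps a stuck task from polling forever; a production client would likely also add backoff between requests.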