On-premise or hosted
Speech-To-Text models

State-of-the-art WER

Available languages

English

Dutch

French

Portuguese

Spanish

German

Italian

Coming soon:

Hebrew

Affordable Pricing

BETA

PAY-AS-YOU-GO HOSTED API

$0.20 / hour

$0.0033 / min

TRY IT FOR FREE

*NO CREDIT CARD REQUIRED

$10 credit, on us. Then pay-as-you-go. No expiration.
All supported languages, streaming and offline, same price.
Cloud Hosted by Banafo.
Websockets API.
Up to 10 concurrent requests (more available at no extra cost).
Low latency.
10 access tokens.
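
As a rough illustration of the pay-as-you-go arithmetic above, the sketch below converts the hourly rate into a per-minute cost and works out how much audio the starting credit covers. The rates are taken from this plan; the function names are hypothetical, and the sketch assumes billing is strictly proportional to audio duration with no rounding or minimum charge, which the page does not state.

```python
# Rates copied from the hosted plan above; helper names are illustrative only.
HOURLY_RATE_USD = 0.20
FREE_CREDIT_USD = 10.00


def cost_usd(audio_minutes: float) -> float:
    """Estimated pay-as-you-go cost in US dollars (assumes linear billing)."""
    return audio_minutes / 60 * HOURLY_RATE_USD


def free_hours() -> float:
    """Hours of audio covered by the starting credit."""
    return FREE_CREDIT_USD / HOURLY_RATE_USD


if __name__ == "__main__":
    print(f"per minute: ${cost_usd(1):.4f}")
    print(f"90-minute meeting: ${cost_usd(90):.2f}")
    print(f"free credit covers {free_hours():.0f} hours")
```

Under these assumptions the $10 credit corresponds to about 50 hours of audio.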

BETA

PRIVATE ON-PREMISE MODELS

starting at $500 / month

Prepaid credit for the month (minimum commitment).
Lowest cost per hour.
Self-hosted on your premises or in your private or public cloud, in Docker or on your favourite (recent) Linux distribution (Windows / macOS coming soon).
Websockets API.
Highest Privacy / Security.
Lowest possible latency.
CPU based, no GPU required.
Up to ~10 concurrent channels per CPU core (without loss of quality).
Streaming or offline.
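
For capacity planning, the per-core figure above can be turned into a rough core count. This is a minimal sketch, assuming the "up to ~10 channels per core" figure holds on your hardware and that capacity scales linearly with cores; neither is guaranteed by the page, and the function name is hypothetical.

```python
import math

# "Up to ~10" in the plan above: treat it as a best case, not a guarantee.
CHANNELS_PER_CORE = 10


def cores_needed(concurrent_channels: int,
                 channels_per_core: int = CHANNELS_PER_CORE) -> int:
    """Smallest whole number of CPU cores for the given concurrent channels."""
    if concurrent_channels < 0 or channels_per_core < 1:
        raise ValueError("channels must be >= 0 and channels_per_core >= 1")
    return math.ceil(concurrent_channels / channels_per_core)


if __name__ == "__main__":
    for channels in (5, 10, 25, 120):
        print(channels, "channels ->", cores_needed(channels), "core(s)")
```

In practice you would likely pass a lower `channels_per_core` to leave headroom for other workloads on the same machine.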

BETA

ENTERPRISE SOLUTIONS

TALK TO US

For businesses with tailored needs: large volumes, custom models, additional languages, or integrations.
From unique support needs to special pricing for charities and non-profit organizations.
Hosting of your choice: Banafo-hosted, on your premises, or in your private cloud.
Custom models (Different languages / Dialects).
On-device (WebAssembly, iOS, Android).
Higher volumes and concurrency.
Integrations with third-party apps and services.

Model Features

Ultimate Privacy. Your hardware. Data never leaves your servers.

No GPU needed

50x cheaper than Google ASR (starting at $0.01 / hour)

State-of-the-art WER (word error rate)
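
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic textbook implementation, not Banafo's evaluation tooling; it splits on whitespace and does no text normalization (casing, punctuation), which real evaluations usually add.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words.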

Models for Streaming audio

Great for

Low latency

Live captions

Accessibility

Voice command services

Chatbots

Virtual assistants


10x real-time on a single CPU core

Models for pre-recorded audio

Great for

Meeting transcripts - online and offline

Visual voicemail

Voice memo transcripts

Content creation

Productivity and analytics

Movie subtitles


20x real-time on a single CPU core
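
To make the real-time speed figures concrete, the sketch below estimates wall-clock processing time from a speed factor such as the 10x (streaming) and 20x (pre-recorded) values quoted on this page. The function name is hypothetical, and the `cores` parameter assumes throughput scales linearly across cores when files are processed in parallel, which the page does not claim.

```python
def processing_minutes(audio_minutes: float,
                       speed_factor: float,
                       cores: int = 1) -> float:
    """Estimated wall-clock minutes to transcribe the given amount of audio.

    speed_factor is how many times faster than real time one core runs;
    linear scaling across cores is an assumption of this sketch.
    """
    if speed_factor <= 0 or cores < 1:
        raise ValueError("speed_factor must be > 0 and cores >= 1")
    return audio_minutes / (speed_factor * cores)


if __name__ == "__main__":
    print(processing_minutes(60, speed_factor=20))           # one core
    print(processing_minutes(600, speed_factor=20, cores=4))  # four cores
```

At 20x real time, one hour of recorded audio would take roughly three minutes on a single core.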

Optimized for calls (8kHz or 16kHz)

Models optimized for call centers, tuned for the highest accuracy on recorded speech.

Websockets API

Punctuation

Word timestamps

Capitalization

No hallucinations

Frequently Asked Questions

What are the server requirements for the Self-Hosted Speech-To-Text solution?

Which languages does the ASR system support?

How is security ensured in the Self-Hosted solution?

What is the expected accuracy of the ASR models?

How does the system handle speech hallucinations?

Is the solution scalable, and how?

How easily can the solution be integrated into existing systems?

What is the processing speed of the Speech-To-Text models?

What is the pricing model for the Self-Hosted solution?

Who is behind the development and support of the solution?

Try the models. No login or credit card required.
